Re: [PERFORM] Possibly slow query

2005-01-31 Thread Manfred Koizar
On Wed, 26 Jan 2005 07:16:25 -0800, "Peter Darley"
<[EMAIL PROTECTED]> wrote:
>SELECT User_ID
>FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings
>WHERE Setting='Status') ASet
>WHERE A.User_ID IS NOT NULL
>   AND ASet.Assignment_ID IS NULL
>GROUP BY User_ID;

"ASet.Assignment_ID IS NULL" and "value IS NULL" as you had in your
original post don't necessarily result in the same set of rows.
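For example, given a hypothetical row such as

  INSERT INTO Assignment_Settings (Assignment_ID, Setting, Value)
  VALUES (1, 'Status', NULL);

the left join does find a 'Status' row for assignment 1, so
"ASet.Assignment_ID IS NULL" is false for it, while "ASet.Value IS NULL"
is still true.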

SELECT DISTINCT a.User_ID
  FROM Assignments a
   LEFT JOIN Assignment_Settings s
  ON (a.Assignment_ID=s.Assignment_ID
  AND s.Setting='Status')
 WHERE a.User_ID IS NOT NULL
   AND s.Value IS NULL;

Note how the join condition can contain subexpressions that only depend
on columns from one table.

BTW,
|neo=# \d assignment_settings
| [...]
| setting   | character varying(250) | not null
| [...]
|Indexes:
|[...]
|"assignment_settings_assignment_id_setting" unique, btree (assignment_id, 
setting)

storing the setting names in their own table and referencing them by id
might speed up some queries (and slow down others).  Certainly worth a
try ...
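A rough sketch of that layout (the table and column names below are
invented, not taken from your schema):

  CREATE TABLE settings (
      setting_id serial PRIMARY KEY,
      name       varchar(250) NOT NULL UNIQUE
  );
  -- assignment_settings would then carry an integer setting_id referencing
  -- settings, instead of repeating the varchar name in every row.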

Servus
 Manfred



Re: [PERFORM] Postgres server getting slow!!

2005-01-31 Thread N S

Thanks Joshua,

I tried running the vacuum command, vacuum on the database as well as vacuum,

but even after that, querying the database makes the memory shoot up
as I mentioned in the previous mail, and it never comes down.
Also, the old pids of established connections remain even after the
connection is closed.

Would backing up the complete database, then dropping and recreating it,
make any difference?
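
For reference, the maintenance Joshua describes below might look roughly
like this (the exact invocation is an assumption on my part):

  VACUUM FULL ANALYZE;   -- one-off, to compact the bloated tables and refresh stats
  VACUUM ANALYZE;        -- afterwards, run this regularly (e.g. nightly)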


Kindly suggest

Thanks in advance

regards

N S


On Mon, 31 Jan 2005 Joshua D.Drake wrote :
>N S wrote:
>
>>I checked to find out the cause of the problem; ppp is disconnecting properly and the user session is also closed smoothly.
>>But when a report query is run on the table containing 32500 records,
>>the memory shoots up from 50 MB to 500 MB (total memory is 512 MB RAM).
>>After that the memory usage never comes down. When some 4 or 5 users
>>connect, very little of the remaining memory is used, and finally the 6th or 7th user is denied database access. The server then becomes slow.
>>
>>Will running vacuum help to solve the problem?
>>
>Sounds like you need to run vacuum and analyze. It also sounds like you
>may need to run vacuum full the first time.
>
>vacuum needs to be run regularly as does analyze.
>
>Sincerely,
>
>Joshua D. Drake
>
>
>>
>>The total database dump is 50 MB and the /var/lib/pgsql/data contains
>>700 MB of data.
>>
>>Which parameters need to be increased in postgresql.conf?
>>
>>
>>Regds
>>
>>N S
>>
>>On Sun, 30 Jan 2005 N S wrote :
>> >
>> >
>> >Thanks Tom. I checked the client-side software. The software closes the connection when connected locally. But when connected through dialup,
>> >this problem occurs. I will check the ppp connection as well.
>> >Is there any method of killing old pids? And is there any performance tuning to be done in the postgresql.conf file?
>> >
>> >The database now contains 20K records. Will that cause a problem?
>> >
>> >Regds
>> >
>> >Narayanan
>> >
>> >On Sat, 29 Jan 2005 Tom Lane wrote :
>> > >"Narayanan Subramaniam Iyer" <[EMAIL PROTECTED]> writes:
>> > > > 1) When 3 or 4 clients connect to this server, the pids are created and
>> > > > those pids are not killed even after the client disconnects.
>> > >
>> > >In that case your clients are not really disconnecting.  Take a closer
>> > >look at your client-side software.
>> > >
>> > >                regards, tom lane
>> > >
>>
>>
>>
>>
>
>
>
>-- Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
>Postgresql support, programming shared hosting and dedicated hosting.
>+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
>PostgreSQL Replicator -- production quality replication for PostgreSQL
>






Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-31 Thread Alex Turner
fsync on.

Alex Turner
NetEconomist


On Fri, 28 Jan 2005 11:19:44 -0500, Merlin Moncure
<[EMAIL PROTECTED]> wrote:
> > With the right configuration you can get very serious throughput.  The
> > new system is processing over 2500 insert transactions per second.  We
> > don't need more RAM with this config.  The disks are fast enough.
> > 2500 transaction/second is pretty damn fast.
> 
> fsync on/off?
> 
> Merlin
> 
>



Re: [PERFORM] Automagic tuning

2005-01-31 Thread Markus Schaber
Hi, Christopher,

Christopher Kings-Lynne schrieb:
>> Are there any tools that help with postgres/postgis performance tuning?
>>
>> So they measure the acutal tuple costs and cpu power, or suggest optimal
>> values for the index sample counts?
>
> Have you turned on the stat_* settings in postgresql.conf and then
> examined the pg_stat_* system views?

As far as I could tell, those views only count things like fetched
rows and pages, and cache hits.

I would like something that really measures values like random_page_cost
or cpu_tuple_cost that are hardware-dependent.

I assume such a thing does not exist?

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com




Re: [PERFORM] Possibly slow query

2005-01-31 Thread Peter Darley
Manfred,
Yeah, that was a typo.  It should have been ASet.Value IS NULL.
I have considered storing the setting names by key, since I do have a
separate table with the names and a key as you suggest, but since my
application is only ~75% finished, it's still pretty important to have human
readable/editable tables.
Thanks,
Peter Darley

-Original Message-
From: Manfred Koizar [mailto:[EMAIL PROTECTED]
Sent: Monday, January 31, 2005 3:06 AM
To: Peter Darley
Cc: Richard Huxton; Pgsql-Performance
Subject: Re: [PERFORM] Possibly slow query


On Wed, 26 Jan 2005 07:16:25 -0800, "Peter Darley"
<[EMAIL PROTECTED]> wrote:
>SELECT User_ID
>FROM Assignments A NATURAL LEFT JOIN (SELECT * FROM Assignment_Settings
>WHERE Setting='Status') ASet
>WHERE A.User_ID IS NOT NULL
>   AND ASet.Assignment_ID IS NULL
>GROUP BY User_ID;

"ASet.Assignment_ID IS NULL" and "value IS NULL" as you had in your
original post don't necessarily result in the same set of rows.

SELECT DISTINCT a.User_ID
  FROM Assignments a
   LEFT JOIN Assignment_Settings s
  ON (a.Assignment_ID=s.Assignment_ID
  AND s.Setting='Status')
 WHERE a.User_ID IS NOT NULL
   AND s.Value IS NULL;

Note how the join condition can contain subexpressions that only depend
on columns from one table.

BTW,
|neo=# \d assignment_settings
| [...]
| setting   | character varying(250) | not null
| [...]
|Indexes:
|[...]
|"assignment_settings_assignment_id_setting" unique, btree
(assignment_id, setting)

storing the setting names in their own table and referencing them by id
might speed up some queries (and slow down others).  Certainly worth a
try ...

Servus
 Manfred




Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-31 Thread Olivier Sirven
On Friday 21 January 2005 19:18, Marty Scholes wrote:
> The indexes can be put on a RAM disk tablespace and that's the end of
> index problems -- just make sure you have enough memory available.  Also
> make sure that the machine can restart correctly after a crash: the
> tablespace is dropped and recreated, along with the indexes.  This will
> cause a machine restart to take some time.
Tell me if I am wrong, but it sounds to me like an endless problem. This
solution may work with small indexes (less than 4GB), but what happens when
the indexes grow? You would add more memory to your server? But there will
be a moment where you cannot add more, so what's next?
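
For anyone curious, the technique under discussion looks roughly like this
in 8.0 (the mount point and object names are invented for illustration):

  CREATE TABLESPACE ram_idx LOCATION '/mnt/ramdisk/pg_idx';
  CREATE INDEX big_table_key_idx ON big_table (key_col) TABLESPACE ram_idx;

  -- After a crash or reboot the ramdisk is empty, so the tablespace and the
  -- indexes in it must be dropped and recreated, as the quoted text notes.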



Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-31 Thread Marty Scholes
> Tell me if I am wrong, but it sounds to me like
> an endless problem
Agreed.  Such is the case with caching.  After doing some informal
benchmarking with 8.0 under Solaris, I am convinced that our major choke
point is WAL synchronization, at least for applications with a high
commit rate.

We have noticed a substantial improvement in performance with 8.0 vs 
7.4.6.  All of the update/insert problems seem to have gone away, save 
WAL syncing.

I may have to take back what I said about indexes.
Olivier Sirven wrote:
> On Friday 21 January 2005 19:18, Marty Scholes wrote:
>> The indexes can be put on a RAM disk tablespace and that's the end of
>> index problems -- just make sure you have enough memory available.  Also
>> make sure that the machine can restart correctly after a crash: the
>> tablespace is dropped and recreated, along with the indexes.  This will
>> cause a machine restart to take some time.
> Tell me if I am wrong, but it sounds to me like an endless problem. This
> solution may work with small indexes (less than 4GB), but what happens when
> the indexes grow? You would add more memory to your server? But there will
> be a moment where you cannot add more, so what's next?




[PERFORM] Performance degradation at client site

2005-01-31 Thread Bill Chandler
Hello, 

Client is seeing continual performance degradation on
updates and queries from a large database.  Any help
appreciated.

Client is using PostgreSQL 7.4.2 on a SPARCv9 650MHz CPU
with 2GB RAM, running Solaris.

We have the following tables:

EVENT_TBL
evt_id   bigserial, unique
d1   numeric(13)
obj_id   numeric(6)
d2   numeric(13)
val  varchar(22)
correction   numeric(1)
deltanumeric(13)

CONTROL_TBL
obj_id   numeric(6), unique
name varchar(22), unique
dtypenumeric(2)
dfreqnumeric(2)

Indexes:
EVENT_TBL.d1 (non-clustered)
EVENT_TBL.obj_id (non-clustered)
CONTROL_TBL.obj_id (non-clustered)
CONTROL_TBL.name (clustered)

Update processes run continually throughout the day in
which rows are inserted but none deleted.  The
EVENT_TBL is currently very big, w/ over 5 million
rows.  The CONTROL_TBL is fairly small w/ around 4000
rows.  We're doing a "VACUUM ANALYZE" on each table
after each update has been completed and changes
committed.  Each night we drop all the indexes and
recreate them.  
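A sketch of that nightly routine in SQL (the index names here are guesses,
not taken from the actual schema):

  VACUUM ANALYZE event_tbl;
  VACUUM ANALYZE control_tbl;

  DROP INDEX event_tbl_d1_idx;
  DROP INDEX event_tbl_obj_id_idx;
  CREATE INDEX event_tbl_d1_idx ON event_tbl (d1);
  CREATE INDEX event_tbl_obj_id_idx ON event_tbl (obj_id);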

Do I understand correctly, however, that when you
create a unique SERIAL column an index is
automatically created on that column?  If so, does
that sound like a possible culprit?  We are not doing
any reindexing on that index at all.  Could it be
suffering from index bloat?  Do we need to
periodically explicitly run the command:

reindex index event_tbl_evt_id_key;

???

Even seemingly simple commands are taking forever. 
For example:

select evt_id from event_tbl where evt_id=1;

takes over a minute to complete.


Here is a slightly more complicated example along with
its explain output:

select events.evt_id, ctrl.name, events.d1, events.val, events.d2,
       events.correction, ctrl.type, ctrl.freq
  from event_tbl events, control_tbl ctrl
 where events.obj_id = ctrl.obj_id and events.evt_id > 369
 order by events.evt_id limit 2000;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=0.00..6248.56 rows=2000 width=118)
   ->  Nested Loop  (cost=0.00..7540780.32 rows=2413606 width=118)
         ->  Index Scan using event_tbl_evt_id_key on event_tbl events  (cost=0.00..237208.57 rows=2413606 width=63)
               Filter: (evt_id > 369)
         ->  Index Scan using control_tbl_obj_id_idx on control_tbl ctrl  (cost=0.00..3.01 rows=1 width=75)
               Index Cond: ("outer".obj_id = ctrl.obj_id)
(6 rows)

This takes minutes to return 2000 rows.

Thank you in advance.

Bill






Re: [PERFORM] Performance degradation at client site

2005-01-31 Thread Tom Lane
Bill Chandler <[EMAIL PROTECTED]> writes:
> Update processes run continually throughout the day in
> which rows are inserted but none deleted.

What about row updates?

> Even seemingly simple commands are taking forever. 
> For example:
> select evt_id from event_tbl where evt_id=1;
> takes over a minute to complete.

Since evt_id is a bigint, you need to write that as

select evt_id from event_tbl where evt_id=1::bigint;

or various other locutions that have the same effect.  What you have is
a bigint-vs-int comparison, which is not indexable in releases before 8.0.

The same problem is occurring in your other example.
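
Applied to the larger query from the original post, that would look like
this (sketch):

  select events.evt_id, ctrl.name, events.d1, events.val, events.d2,
         events.correction, ctrl.type, ctrl.freq
    from event_tbl events, control_tbl ctrl
   where events.obj_id = ctrl.obj_id and events.evt_id > 369::bigint
   order by events.evt_id limit 2000;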

regards, tom lane



Re: [PERFORM] Performance degradation at client site

2005-01-31 Thread PFC

> Do I understand correctly, however, that when you
> create a unique SERIAL column an index is
> automatically created on that column?  If so, does
> that sound like a possible culprit?  We are not doing
> any reindexing on that index at all.  Could it be
> suffering from index bloat?  Do we need to
> periodically explicitly run the command:

SERIAL creates a sequence, not an index.
UNIQUE and PRIMARY KEY do create indexes.

Regards.
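
A quick illustration of where that implicit index comes from (the table
name is made up):

  CREATE TABLE demo_tbl (
      evt_id bigserial UNIQUE,   -- bigserial supplies the sequence/default;
                                 -- UNIQUE is what creates the btree index
      val    varchar(22)
  );
  -- \d demo_tbl then lists an implicit index such as "demo_tbl_evt_id_key".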


Re: [PERFORM] Performance degradation at client site

2005-01-31 Thread Bill Chandler
Tom,

Thank you!  I will have the client try that.  What
about the event_tbl_evt_id_key index question?  Could
that also be causing me difficulties?  Should I
periodically reindex it?

thanks,

Bill

--- Tom Lane <[EMAIL PROTECTED]> wrote:

> Bill Chandler <[EMAIL PROTECTED]> writes:
> > Update processes run continually throughout the
> day in
> > which rows are inserted but none deleted.
> 
> What about row updates?
> 
> > Even seemingly simple commands are taking forever.
> 
> > For example:
> > select evt_id from event_tbl where evt_id=1;
> > takes over a minute to complete.
> 
> Since evt_id is a bigint, you need to write that as
> 
> select evt_id from event_tbl where evt_id=1::bigint;
> 
> or various other locutions that have the same
> effect.  What you have is
> a bigint-vs-int comparison, which is not indexable
> in releases before 8.0.
> 
> The same problem is occurring in your other example.
> 
>   regards, tom lane
> 





Re: [PERFORM] Automagic tuning

2005-01-31 Thread Josh Berkus
Markus,

> As far as I could tell, those views only count things like fetched
> rows and pages, and cache hits.
>
> I would like something that really measures values like random_page_cost
> or cpu_tuple_cost that are hardware-dependent.
>
> I assume such a thing does not exist?

Nope.  You gotta whip out your calculator and run some queries.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Automagic tuning

2005-01-31 Thread Tom Lane
Josh Berkus  writes:
>> I would like something that really measures values like random_page_cost
>> or cpu_tuple_cost that are hardware-dependent.
>> 
>> I assume such a thing does not exist?

> Nope.  You gotta whip out your calculator and run some queries.

Preferably a whole lot of queries.  All the measurement techniques I can
think of are going to have a great deal of noise, so you shouldn't
twiddle these cost settings based on just a few examples.

regards, tom lane



[PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Cosimo Streppone
Hi all,
I've been following this list for nearly a year now.
I've always managed to get PostgreSQL 7.1.x right for the job,
which in my case is a large and complex OLTP system
that has run under Pg for 6 years now.
We were already planning the switch from 7.1 to 7.4 (or even 8.0).
The latest project we're facing has a transaction volume unlike
anything we've dealt with before. By "transaction" I mean
something involving 10 to 10,000 (or more) SQL queries
(a complex mix of insert/update/delete/select).
I'd like to ask:
1) What kind of performance gain can I expect switching from
   7.1 to 7.4 (or 8.0)? Obviously I'm doing my own testing,
   but I'm not very impressed by 8.0's speed; maybe I'm doing
   my testing on a low-end server...
2) The goal is to make the db handle 100 tps (something like
   100 users). What kind of server and storage should I provide?
   The actual servers our application runs on normally have
   2 Intel Xeon processors, 2-4 Gb RAM, RAID 0/1/5 SCSI
   disk storage with hard drives @ 10,000 rpm
3) Highest I/O throughput SCSI adapters? Adaptec?
4) Is it correct to suppose that multiple RAID 1 arrays
   can provide the fastest I/O ?
   I usually reserve one RAID1 array to db data directory,
   one RAID1 array to pg_xlog directory and one RAID1 array
   for os and application needs.
5) OS and Pg specific tuning?
   Usually I modify shared memory settings and most of postgresql.conf
   available settings for 7.1, like `effective_cache', `shared_buffers',
   `wal_buffers', `wal_files', and so on.
--
Cosimo


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Tom Lane
Cosimo Streppone <[EMAIL PROTECTED]> writes:
> 1) What kind of performance gain can I expect switching from
> 7.1 to 7.4 (or 8.0)? Obviously I'm doing my own testing,
> but I'm not very impressed by 8.0's speed; maybe I'm doing
> my testing on a low-end server...

Most people report a noticeable speedup in each new release; we hit
different things in different releases, but usually at least one
performance gain is useful to any one person.  For a jump as far as
from 7.1 to 8.0 I'm surprised that you're not seeing any gain at all.
What was your test case exactly?  Have you perhaps tuned your app
so specifically to 7.1 that you need to detune it?

regards, tom lane



[PERFORM] Index Slowing Insert >50x

2005-01-31 Thread Trevor Ball
I have a data collector function in a PostgreSQL 7.4 DB running on Linux that inserts approximately 1 records into a table every fifteen minutes. The table has two macaddr columns, one varchar(50) column, two timestamptz columns, five interval columns, one float8 column, and one int4 column. I have one multi-column B-tree index on the two macaddr columns, the varchar(50), and one of the timestamptz columns, in that order.
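
A sketch of the table and index shape just described (the column and index names are invented for illustration):

  CREATE TABLE collected_data (
      mac_a      macaddr,
      mac_b      macaddr,
      label      varchar(50),
      sampled_at timestamptz,
      ended_at   timestamptz,
      iv1 interval, iv2 interval, iv3 interval, iv4 interval, iv5 interval,
      metric     float8,
      n          int4
  );
  CREATE INDEX collected_data_key_idx
      ON collected_data (mac_a, mac_b, label, sampled_at);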

The 1-record insert takes approximately 2 minutes, which I thought seemed awfully slow, so I tried removing the index, and sure enough, without the index the insert took less than two seconds. I repeated the inserts many times (with and without the index) and there’s very little other activity on this server, so I’m confident of these results.

There are approximately 1 fixed combinations of the first three indexed columns, and the fourth is the current time, so essentially what the function is doing is inserting a set of values for each of those 1 fixed combinations for every fifteen minute period. I can see how this might be a worst-case scenario for an index, because the inserted rows are alone and evenly spaced through the index. Even so, it doesn’t seem reasonable to me that an index would slow an insert more than 50-fold, regardless of hardware or the nature of the index. Am I wrong? Can anybody suggest why this would be happening and what I might be able to do about it? In production the table will have several million records, and the index is necessary for data retrieval from this table to be feasible, so leaving the index off is not an option.

Thanks in advance,

Trevor Ball




[PERFORM] Very important choice

2005-01-31 Thread Lago, Bruno Almeida do

Hello my friends,

I'd like to know (based on your experience and technical details) which OS
is recommended for running PostgreSQL, keeping in mind 3 indicators:

1 - Performance (OS, network and I/O)
2 - OS stability
3 - File system integrity

Comparisons between Slackware, Gentoo and FreeBSD are welcome.

Which file system has the best performance and integrity:
XFS (Linux) or UFS (FreeBSD)?

*I've read that UFS is not a journaling FS. Is this right? How much does
this difference affect performance and integrity?

I don't have experience with FreeBSD, so I'd like to know if it is
possible to run XFS on FreeBSD 5.3.

Thank you,
Bruno Almeida do Lago


Re: [PERFORM] Automagic tuning

2005-01-31 Thread Jim C. Nasby
On Mon, Jan 31, 2005 at 03:26:12PM -0500, Tom Lane wrote:
> Josh Berkus  writes:
> >> I would like something that really measures values like random_page_cost
> >> or cpu_tuple_cost that are hardware-dependent.
> >> 
> >> I assume such a thing does not exist?
> 
> > Nope.  You gotta whip out your calculator and run some queries.
> 
> Preferably a whole lot of queries.  All the measurement techniques I can
> think of are going to have a great deal of noise, so you shouldn't
> twiddle these cost settings based on just a few examples.

Are there any examples of how you can take numbers from pg_stats_* or
explain analyze and turn them into configuration settings (such as
random_page_cost)?
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Jim C. Nasby
On Mon, Jan 31, 2005 at 09:41:32PM +0100, Cosimo Streppone wrote:
> 2) The goal is to make the db handle 100 tps (something like
>100 users). What kind of server and storage should I provide?
> 
>The actual servers our application runs on normally have
>2 Intel Xeon processors, 2-4 Gb RAM, RAID 0/1/5 SCSI
>disk storage with hard drives @ 10,000 rpm

You might look at Opterons, which theoretically have higher data
bandwidth. If you're doing anything data-intensive, like a sort in
memory, this could make a difference.

> 4) Is it correct to suppose that multiple RAID 1 arrays
>can provide the fastest I/O ?
>I usually reserve one RAID1 array to db data directory,
>one RAID1 array to pg_xlog directory and one RAID1 array
>for os and application needs.

RAID10 will be faster than RAID1. The key factor to a high performance
database is a high performance I/O system. If you look in the archives
you'll find people running postgresql on 30 and 40 drive arrays.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] Automagic tuning

2005-01-31 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> On Mon, Jan 31, 2005 at 03:26:12PM -0500, Tom Lane wrote:
>> Preferably a whole lot of queries.  All the measurement techniques I can
>> think of are going to have a great deal of noise, so you shouldn't
>> twiddle these cost settings based on just a few examples.

> Are there any examples of how you can take numbers from pg_stats_* or
> explain analize and turn them into configuration settings (such and
> random page cost)?

Well, the basic idea is to adjust random_page_cost so that the ratio of
estimated cost to real elapsed time (as shown by EXPLAIN ANALYZE) is the
same for seqscans and indexscans.  What you have to watch out for is
that the estimated cost model is oversimplified and doesn't take into
account a lot of real-world factors, such as the activity of other
concurrent processes.  The reason for needing a whole lot of tests is
essentially to try to average out the effects of those unmodeled
factors, so that you have a number that makes sense within the planner's
limited view of reality.
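
As a concrete (if hypothetical) sketch of that procedure -- the table and
column names are invented, and the numbers are only illustrative:

  -- 1. Time a plan that uses a sequential scan.
  EXPLAIN ANALYZE SELECT * FROM big_tbl WHERE some_col < 1000;

  -- 2. Force the index plan for the same query and time it too.
  SET enable_seqscan = off;
  EXPLAIN ANALYZE SELECT * FROM big_tbl WHERE some_col < 1000;
  SET enable_seqscan = on;

  -- 3. Compare (estimated cost / actual ms) for the two plans; if the index
  --    scan's ratio is much higher, try a lower random_page_cost, e.g.
  SET random_page_cost = 3;   -- session-level trial before editing postgresql.conf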

regards, tom lane



Re: [PERFORM] Index Slowing Insert >50x

2005-01-31 Thread Tom Lane
"Trevor Ball" <[EMAIL PROTECTED]> writes:
> ... it doesn't
> seem reasonable to me that an index would slow an insert more than
> 50-fold, regardless of hardware or the nature of the index.

Seems pretty slow to me too.  Can you provide a self-contained test
case?

One possibility is that depending on your platform and locale setting,
varchar comparisons can be a whole lot slower than a normal person would
consider sane.  If you're not using C locale, you might try C locale and
see if it helps.

regards, tom lane



Re: [PERFORM] Automagic tuning

2005-01-31 Thread Jim C. Nasby
On Tue, Feb 01, 2005 at 12:06:27AM -0500, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > On Mon, Jan 31, 2005 at 03:26:12PM -0500, Tom Lane wrote:
> >> Preferably a whole lot of queries.  All the measurement techniques I can
> >> think of are going to have a great deal of noise, so you shouldn't
> >> twiddle these cost settings based on just a few examples.
> 
> > Are there any examples of how you can take numbers from pg_stats_* or
> > explain analyze and turn them into configuration settings (such as
> > random_page_cost)?
> 
> Well, the basic idea is to adjust random_page_cost so that the ratio of
> estimated cost to real elapsed time (as shown by EXPLAIN ANALYZE) is the
> same for seqscans and indexscans.  What you have to watch out for is
> that the estimated cost model is oversimplified and doesn't take into
> account a lot of real-world factors, such as the activity of other
> concurrent processes.  The reason for needing a whole lot of tests is
> essentially to try to average out the effects of those unmodeled
> factors, so that you have a number that makes sense within the planner's
> limited view of reality.

Given that, I guess the next logical question is: what would it take to
collect stats on queries so that such an estimate could be made? And
would it be possible/make sense to gather stats useful for tuning the
other parameters?
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Cosimo Streppone
Tom Lane wrote:
> Cosimo writes:
>> 1) What kind of performance gain can I expect switching from
>>    7.1 to 7.4 (or 8.0)? Obviously I'm doing my own testing,
>>    but I'm not very impressed by 8.0's speed; maybe I'm doing
>>    my testing on a low-end server...
> Most people report a noticeable speedup in each new release
> [...]
> I'm surprised that you're not seeing any gain at all.
> What was your test case exactly?  Have you perhaps tuned your app
> so specifically to 7.1 that you need to detune it?
We tend to use the lowest common SQL features that will allow
us to work with any db, so probably the problem is the opposite,
there is no pg-specific overtuning.
Also, the real pg load, which would be my ideal test case,
is somewhat difficult to reproduce (~ 50 users with handhelds
and browser clients).
Another good test is a particular procedure that opens
several (~1000) consecutive transactions, composed of many
repeated selection queries with massive write loads on 6/7
different tables, each as big as 300/400k tuples.
Every transaction ends with either a commit or a rollback.
Indexing here should be ok, for I've analyzed every single query,
also under database "stress".
Probably one big issue is that I need to vacuum/reindex too often
to keep db performances at a good(tm) level. I realize that this
has been addressed in several ways with newer PGs.
However, I need to do a lot of application and performance
tests and do them more seriously. Then I'll report the results here.
--
Cosimo


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-01-31 Thread Cosimo Streppone
Jim C. Nasby wrote:
> On Mon, Jan 31, 2005 at 09:41:32PM +0100, Cosimo wrote:
>> 2) The goal is to make the db handle 100 tps (something like
>>    100 users). What kind of server and storage should I provide?
> You might look at Opterons, which theoretically have higher data
> bandwidth. If you're doing anything data-intensive, like a sort in
> memory, this could make a difference.

Would Opteron systems need a 64-bit postgresql (and OS, gcc, ...)
build to have that advantage?

>> 4) Is it correct to suppose that multiple RAID 1 arrays
>>    can provide the fastest I/O ?
>>    I usually reserve one RAID1 array to db data directory,
>>    one RAID1 array to pg_xlog directory and one RAID1 array
>>    for os and application needs.
> RAID10 will be faster than RAID1.

Sorry Jim, by RAID10 do you mean several raid1 arrays mounted on
different linux partitions? Or several raid1 arrays that
build up a raid0 array? In the latter case, who decides which
data goes in which raid1 array? The RAID adapter?

> The key factor to a high performance database is a high
> performance I/O system. If you look in the archives
> you'll find people running postgresql on 30 and 40
> drive arrays.
I'll do a search, thank you.
--
Cosimo