[ADMIN] OS monitoring tools I should be using?

2005-10-20 Thread Ron Mayer

Is there a good place that summarizes the OS tools I should
be using to monitor postgresql servers on various machines?

On *nix machines I have sar/sysstat constantly logging statistics;
but on Win* machines I don't know what the equivalent
should be.  Any recommendations?

For interactive monitoring I have an OK feel for vmstat
and top to watch memory & cpu use, and iostat to watch the
quantity of disk/IO per drive.  But there are still some
things I can't find easily, so any advice would be appreciated.
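
For concreteness, the sort of thing I currently run interactively (the exact
flags are just my habits, not recommendations):

vmstat 5       # memory, swap, and cpu activity every 5 seconds
top            # per-process cpu and memory
iostat -x 5    # per-device extended I/O stats (r/s, w/s, await, %util)
sar -b         # I/O and transfer-rate history from the sysstat logs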

For example, HPUX's iostat used to tell me number
of disk-seeks/second but I don't see the equivalent for Linux.

Rather than just ask that specific question; I'd be interested
to know in general what OS-level monitoring tools you guys
find most important for various platforms.



Re: [ADMIN] Postgresql DB on a live cd

2003-11-20 Thread Ron Mayer
Lucio wrote:
This application uses a static (read-only, no insert, modify or delete) 200MB
postgresql db, so can't put pgsql/data/base and pgsql/data/pg_xlog in ramdisk.
One thing I don't think I saw mentioned yet... I think certain queries 
create temporary tables (or at least use some temporary space) so
  /base/[whatevernumber]/pgsql_tmp
should be a symlink to something on the ramdisk as well.

(Kinda off-topic -- anecdotally it feels to me that some queries
go faster if I put /pgsql_tmp on a different device & disk controller.
If I verify this I'll post results to the -performance list.)
  Ron Mayer
  [EMAIL PROTECTED]


Re: [ADMIN] Postgresql DB on a live cd

2003-11-20 Thread Ron Mayer
Lucio wrote:
Hi,

I'm trying to create a demo CD (Knoppix) with a JDBC application.

This application uses a static (read-only, no insert, modify or delete) 200MB
postgresql db, so can't put pgsql/data/base and pgsql/data/pg_xlog in ramdisk.
One thing I don't think I saw mentioned yet:
I think some large sorts and temporary tables also use
  /base/[whatevernumber]/pgsql_tmp
so you probably want to make that a symlink to something on the ramdisk
as well.
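
A sketch of what I mean (untested; the data-directory and ramdisk paths are
just examples, and this would be done while mastering the CD image, since the
CD itself ends up read-only):

# [whatevernumber] below is your database's directory under base/
DBOID=[whatevernumber]
mkdir -p /mnt/ramdisk/pgsql_tmp
rm -rf /var/lib/pgsql/data/base/$DBOID/pgsql_tmp    # only temp-sort space
ln -s /mnt/ramdisk/pgsql_tmp /var/lib/pgsql/data/base/$DBOID/pgsql_tmp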


Re: [ADMIN] run httpd and postgresql on different machines ?

2005-01-18 Thread Ron Mayer
Chuming Chen wrote:
I want to set up a web site using apache httpd, php and postgresql. From 
the performance point of view, which architecture is better? 1) Run 
httpd and postgresql on the same machine; 2) Run postgresql on a separate 
machine. My concern is that the machine I am going to run httpd on has 
limited storage. I am expecting the postgresql database to grow 
once I set it up.
I had good luck with 4 very cheap (small, used, 1CPU, IDE disk)
machines running httpd/php/MONO-ASP.NET, and 1 more expensive
machine (with some internal failover capabilities - dual power
supplies, with a RAID array, with a support contract) running
postgresql.
The reasoning was one of cost/performance with the ability to
have likely-to-fail components fail with no downtime.
The cheapest way to scale the front-end machines with failover
capabilities was to use sub-$1000 slightly obsolete PCs.
The cheapest way I knew to provide limited scalability
and failover (at least for disk) for a database was a raid array.
With some of the newer replication features or pgpool, it might
be easier to scale "out" instead of "up"; but I have no experience
making that determination.
How about the rest of you guys:
if CPU demands on my database get to the point of needing 5 CPUs
with a read-mostly (90%) system, am I better off with
  1. lots of replication between small servers, or
  2. scaling up to a big server?
(the reason I'm CPU bound instead of disk bound is that many
of my queries are spatial operations with PostGIS like unions
and buffers of polygons).


[ADMIN] Does anyone have a script to monitor if FSM settings are sufficient?

2005-01-29 Thread Ron Mayer
I may be implementing a database at a customer's site that would
ideally run administration-free for a year or so -- or at least
be able to have scripts proactively detect that they need to call
us for support before the system starts behaving poorly.
Is there a way to parse the output of "vacuum" to tell me
whether my fsm settings aren't big enough to keep up, and/or
to indicate a need to vacuum more frequently?
Or am I better off looking at something else, like perhaps
log_min_duration_statement, and telling them to call us if
statements start getting slow for any reason?
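
For example, I was imagining something like this rough, untested sketch; the
grep pattern is from my memory of the 8.x "VACUUM VERBOSE" summary lines
(wording varies between versions), and "mydb" is just a stand-in:

#!/bin/sh
# Run a database-wide VACUUM VERBOSE and compare the "page slots are required"
# figure in its summary against max_fsm_pages; complain if the FSM is too small.
DB=mydb
OUT=`psql -d $DB -c 'VACUUM VERBOSE;' 2>&1`
NEEDED=`echo "$OUT" | grep -o '[0-9][0-9]* page slots are required' | awk '{print $1}'`
LIMIT=`psql -At -d $DB -c 'SHOW max_fsm_pages;'`
if [ -n "$NEEDED" ] && [ "$NEEDED" -gt "$LIMIT" ]; then
  echo "FSM too small on $DB: $NEEDED page slots needed, max_fsm_pages is $LIMIT -- call us"
fi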


Re: [ADMIN] Best practice - Vacuum. Replication suggestions and pg vs mysql

2005-03-17 Thread Ron Mayer
Tom Lane wrote:
David B <[EMAIL PROTECTED]> writes:
15-minute lock is a long time.
There's no lock, unless you are using VACUUM FULL which you shouldn't.
Or, I believe, if he has any GIST indexes (such as tsearch or
postgis ones).  At least it seems normal vacuum locks GIST indexes
for quite some time here.
I ended up turning off vacuum_cost_delay to minimize the
time those GIST indexes seemed locked.
   Ron
PS: If I'm right, I wonder if VACUUM or at least autovacuum
should automatically force vacuum_cost_delay to zero while
holding locks (i.e. while doing GIST indexes) to reduce the
time those locks are held.
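
For now, the workaround I mean looks roughly like this (a sketch; "mydb" and
"my_gist_table" are made-up names), since vacuum_cost_delay can be changed
per-session:

psql mydb <<'EOF'
SET vacuum_cost_delay = 0;     -- don't sleep while the GIST index is being vacuumed
VACUUM VERBOSE my_gist_table;
EOF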


Re: [ADMIN] PostgreSQL, INC. Support

2005-03-22 Thread Ron Mayer
Peter Eisentraut wrote:
Chris Hoover wrote:
Also, are there any other good companies to consider for support?
http://techdocs.postgresql.org/companies.php
Any reason Fujitsu (http://fastware.com.au/postgresql_support.html) isn't 
on that list?
I think it adds quite a bit of credibility when I tell customers
that if they need global support they can get it from a company
bigger than Oracle.


Re: [ADMIN] Stable Release?

2008-02-21 Thread Ron Mayer
Gergely CZUCZY wrote:
> On Thu, Feb 21, 2008 at 11:53:35AM -0500, Carol Walter wrote:
>> A colleague told me that the latest stable release of PostgreSQL is 8.1.x.  
>> I thought it was 8.2.4.  What is the latest stable release? 
>>  I thought there was a problem with autovacuum in the earlier releases.  Can 
>> anyone comment?
> Have you tried to check out the website of the project?
> Let me help you, it's http://www.postgresql.org/
> I'm sure you've read it, while you were looking for this list...
> 

I'm guessing Carol did see it but still has confusion over
whether a "major" or "latest" release is considered "stable".

I see the web site mentions that 8.3 is the "latest" release,
while other pages say 8.2.3 is the latest release[1], and
still others name other versions[2].

With all the bizarre numbering schemes software[3] uses,
where sometimes 2.0.0 means "unstable"; perhaps it'd be
nice if our versioning page[4] explicitly said "Our major
releases are stable releases that we consider suitable for
production".





[1] http://www.postgresql.org/docs/faqs.FAQ_chinese_trad.html
[2] http://www.postgresql.org/docs/faqs.FAQ_czech.html#item1.6
[3] http://www.linux.com/feature/45507
[4] http://www.postgresql.org/support/versioning



Re: [ADMIN] Latest postgres stable version

2008-02-26 Thread Ron Mayer

Tom Lane wrote:

On Tue, Feb 26, 2008 at 12:08 AM, Suresh Gupta VG <[EMAIL PROTECTED]> wrote:

May I know what is the stable pgsql release and latest?


This topic has been addressed very recently, see this thread:
http://archives.postgresql.org/pgsql-admin/2008-02/msg00238.php


Yeah - and as that thread showed, there are still misleading
web pages out there confusing non-native English speakers.

http://www.postgresql.org/docs/faqs.FAQ_chinese_trad.html#1.6

which apparently says that 8.2.3 is the latest, and

http://www.postgresql.org/docs/faqs.FAQ_czech.html#item1.6

which apparently says 8.2.5 is the latest.

Seems the web site might want to use some macro for the
latest versions to keep the various FAQs up-to-date.



Re: [ADMIN] 8.3.5 broken after power fail SOLVED

2009-02-21 Thread Ron Mayer
Naomi Walker wrote:
> Other than disaster tests, how would I know if I have a system that
> lies about fsync?

Well, the Linux kernel tries to detect it on bootup and
will give messages like this:
  %dmesg | grep 'disabling barriers'
  JBD: barrier-based sync failed on md1 - disabling barriers
  JBD: barrier-based sync failed on hda3 - disabling barriers
when it detects certain types of unreliable fsyncs.  The command
  %hdparm -I /dev/hdf | grep FLUSH_CACHE_EXT
will give you clues about whether the hard drive itself can even support
a non-lying fsync while its internal cache is enabled.


Sadly some filesystems (ext3) lie above and beyond what
Linux does - by only using the write barriers correctly
when the inode itself is modified, not when only the data is modified.
A test program here:
http://archives.postgresql.org/pgsql-performance/2008-08/msg00159.php
can detect those cases where the kernel & drive don't lie
about fsync but ext3 lies in spite of them; with more background
info here:
 http://article.gmane.org/gmane.linux.file-systems/21373
 http://thread.gmane.org/gmane.linux.kernel/646040


Elsewhere in the archives you can find programs that measure
how fast fsyncs happen on your hardware; you can then check
whether those numbers roughly match how fast your disks spin.
But then you still need to make sure the test program used the
same sync method that your postgres configuration files are
choosing.
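
A crude version of that timing test can be done without a special program,
assuming a reasonably recent GNU dd with oflag=dsync (which uses O_DSYNC
rather than fsync proper, so treat it only as an approximation; the path is
just an example -- run it on the same filesystem as your database):

# 500 synchronous 8kB writes; an honestly-syncing 7200rpm drive manages
# on the order of a hundred or two per second, so expect a few seconds.
# If this finishes nearly instantly, something is probably lying.
time dd if=/dev/zero of=/var/lib/pgsql/fsync_test bs=8k count=500 oflag=dsync
rm /var/lib/pgsql/fsync_test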

I wonder if the only really safe way is to run a very
write-intensive database script and kill your system in a number
of ways, including yanking power to the server, to the disk
arrays, etc., and see whether your database died.


>> We preach this again and again.  PostgreSQL can only survive a power
>> outage type failure ONLY if the hardware / OS / filesystem don't lie
>> about fsync.  If they do, all bets are off, and this kind of failure
>> means you should really failover to another machine or restore a
>> backup.
>>
>> It's why you have to do possibly destructive tests to see if your
>> server stands at least some chance of surviving this kind of failure,
>> log shipping for recovery, and / or replication of another form (slony
>> etc...) to have a reliable server.
>>
>> The recommendations for recovery of data are just that, recovery
>> oriented.  They can't fix a broken database at that point.  You need
>> to take it offline after this kind of failure if you can't trust your
>> hardware.
>>
>> Usually when it finds something wrong it just won't start up.




Re: [ADMIN] Updating a very large table

2009-04-24 Thread Ron Mayer
Kevin Grittner wrote:
> Chris Browne  wrote: 
>  
>> I'd suggest adding an index
>  
> The OP said the table had 15 indexes already.  I would guess one of
> those could be used.  Perhaps it has a primary key
>  
>>   update table1 set new_column = [whatever calculation]
>>   where new_column is null and
>> quasi_unique_column in
>>   (select quasi_unique_column from table1
>>where new_column is null limit 1000);
>  
> Or, if the primary key (or other unique or quasi-unique existing
> index) has multiple columns, this could still be done with:
>  
>   update table1 set new_column = [whatever calculation]
>   where new_column is null and
> (col1, col2) in
>   (select col1, col2 from table1
>where new_column is null limit 1000);
>  

Would doing something with ctid be even better?
Or does it have some risks I'm missing?  I'm thinking
of something like:

fli=# select max(ctid) from table1;
 max
-
 (183000,42)
(1 row)

Then

update table1 set new_column=[whatever] where ctid<'(10000,1)';
vacuum;
update table1 set new_column=[whatever] where ctid>'(10000,1)' and 
ctid<'(20000,1)';
vacuum;
...
update table1 set new_column=[whatever] where ctid>'(180000,1)';
vacuum;

and perhaps a final

update table1 set new_column=[whatever] where new_column is null;

to catch any rows this might have missed?


This also seems to make it easier to control how much the table
will bloat -- if I only want it to bloat 5% I divide max(ctid) by 20
to get each group's size.




Re: [ADMIN] hardware information

2009-09-17 Thread Ron Mayer
Scott Marlowe wrote:
> On Tue, Sep 15, 2009 at 3:58 AM, std pik  wrote:
>> Hello all..
>> I'm using PostgreSQL 8.3..
>> How can I get information about the hardware utilization:
>>   - CPU usage.
>>   - Disk space.
>>   - Memory allocation.
>> thank you.
> 
> Dude, there was a whole thread.  Are you receiving our emails?  If so,
> is there some part of the thread that you have questions about?  If not,
> I'm not sure how you'll answer this.

I wonder if we should set up a pgfoundry package that queries the
OS for the information we find particularly relevant, including
more subtle things like whether or not fsync is lying (and why),
and whether the OOM killer is enabled.

Then we could tell people who ask this to install the module
and include the output of pg_system_info() with their question.

I might be volunteering to try this for a couple platforms,
but don't really know what's involved in setting up a pgfoundry
project.
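
To make the idea concrete, even a dumb shell script that collects the usual
suspects would be a start -- a Linux-only, illustrative sketch (device names
and message wording vary by distro and kernel):

#!/bin/sh
echo "== fsync / write-barrier clues =="
dmesg | grep -i 'disabling barriers'
hdparm -I /dev/sda 2>/dev/null | grep -i 'write cache'
echo "== OOM-killer / overcommit setting =="
cat /proc/sys/vm/overcommit_memory
echo "== cpu, memory, disk =="
grep 'model name' /proc/cpuinfo | sort | uniq -c
free -m
df -h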



Re: [ADMIN] ANALYZE not working?

2003-01-06 Thread Ron Mayer

On Mon, 6 Jan 2003, Jeff Boes wrote:
>
> [...] Suddenly [...] ANALYZE isn't working properly (it is recording
> pg_class.reltuples far lower than the actual row count).

I had the same problem recently...
  http://archives.postgresql.org/pgsql-bugs/2002-08/msg00015.php
where "vacuum analyze" and "vacuum; analyze;" were giving me
three orders of magnitude differences in estimates.

Apparently "analyze" is somewhat dependant on the order in which
rows had been inserted in the table; since it just does a (small)
random sample rather than a full table scan.

The thread there has some detailed explanation from Tom about the
underlying cause of the different results and how to diagnose it.


The short summary for me: as a short-term fix I paid the price
of running the slower "vacuum analyze" more frequently, and later
I re-ordered the whole table
  create table tmp_table as select * from my_table order by foo;
  drop table my_table;
  alter table tmp_table rename to my_table;
which made "analyze;" give good estimates again.

Hope this helps.
   Ron







Re: [ADMIN] deadlock problem in Ad serving..

2003-01-20 Thread Ron Mayer

On Mon, 20 Jan 2003, Tom Lane wrote:
>
> Bhuvan A <[EMAIL PROTECTED]> writes:
> >> Error:...deadlock detected...
> > ... You can overcome this by locking the table in share row
> > exclusive mode also...
>
> ...use shorter transactions (one per page, not one per several pages).

Hmm... with his query:
  "update banner_stats set imp=imp+1 where uniqid=4330"

Is it true that the problem happens when updates are done
in a different order by two transactions like this:

 trans.1:  "update banner_stats set imp=imp+1 where uniqid=4330"
 trans.2:  "update banner_stats set imp=imp+1 where uniqid=10"
 trans.1:  "update banner_stats set imp=imp+1 where uniqid=10"
 trans.2:  "update banner_stats set imp=imp+1 where uniqid=4330"

If so, then could the problem be avoided if, in his application
logic, he always did the updates in the same order?  I.e., make
sure each transaction does its updates in the same order by
sorting the updates by uniqid in the client?
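
I.e. something like this in both code paths (just a sketch; the database
name is made up):

psql addb <<'EOF'
BEGIN;
-- always touch rows in ascending uniqid order, so two concurrent
-- transactions can never each hold a row lock the other one needs
UPDATE banner_stats SET imp = imp + 1 WHERE uniqid = 10;
UPDATE banner_stats SET imp = imp + 1 WHERE uniqid = 4330;
COMMIT;
EOF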

   Ron





Re: [ADMIN] deadlock problem in Ad serving..

2003-01-20 Thread Ron Mayer

On Mon, 20 Jan 2003, Tom Lane wrote:
>
> If I understood correctly, he's tracking webpage hits; so the updates
> are going to correspond to the sequence in which visitors move to
> different webpages.

Ah... I was thinking he was counting banners served within a
single page (perhaps a banner on top and a banner on the bottom),
and doing accounting of which banners were shown.  In that case
it might have been interesting to keep some of his information
in a transaction.

  start transaction...
    insert_a_new_cookie_record_if_it_didn't_exist...
    record the top_banner...
    record the bottom_banner...
  end transaction...

I've done something like that to count how many distinct users saw
particular ads.  In this case sorting the small number (2) of banners
in his application logic would be easy.

> I would suggest using a separate transaction for each webpage visited.
> Holding a transaction open across multiple page traversals is widely
> considered bad news for a number of reasons, not only this one.

I understand this part.

  Ron





Re: [ADMIN] Are 50 million rows a problem for postgres ?

2003-09-09 Thread Ron Mayer

> Hi all, I work in a telco and I have a huge amount of data (50 million rows),
> but I see a lack of performance with huge tables in postgres.
> Are 50 million rows the "limit" of postgres (with good performance)?

I have worked on a data warehouse (postgresql 7.3) with a
pretty standard star schema with over 250 million rows in
the central 'fact' table, and anywhere from 100 to over 10 million
records in the surrounding 'dimension' tables.

The most common queries were simple joins between 3 tables, with 
selects on one of the ids.  These took a few (1-60) seconds.
About 500,000 new records were loaded each night; and the ETL
processing and creating some aggregates took about 11 hours/night
with 7.3, and 9 hours/night with 7.4beta.
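
For what it's worth, those common queries were shaped roughly like this
(table and column names invented for illustration):

psql dw <<'EOF'
-- fact table joined to two dimension tables, filtered on one of the ids
SELECT f.*, d1.name, d2.name
  FROM fact f
  JOIN dim_customer d1 ON d1.id = f.customer_id
  JOIN dim_product  d2 ON d2.id = f.product_id
 WHERE d1.id = 12345;
EOF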

Hope this helps.

