Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Aaron Werman
The intuitive thing would be to put pg into a file system. 

/Aaron

On Thu, 21 Oct 2004 12:44:10 +0200, Leeuw van der, Tim
[EMAIL PROTECTED] wrote:
 Hi,
 
 I guess the difference is in 'severe hacking inside PG' vs. 'some unknown amount of 
 hacking that doesn't touch PG code'.
 
 Hacking PG internally to handle raw devices will meet with strong resistance from 
 large portions of the development team. I don't expect (m)any core devs of PG will 
 be excited about rewriting the entire I/O architecture of PG and duplicating large 
 amounts of OS type of code inside the application, just to try to attain an unknown 
 performance benefit.
 
 PG doesn't use one big file, as some databases do, but many small files. Now PG 
 would need to be able to do file-management, if you put the PG database on a raw 
 disk partition! That's icky stuff, and you'll find much resistance against putting 
 such code inside PG.
 So why not try to have the external FS know a bit about PG and its 
 directory-layout, and its IO requirements? Then such type of code can at least be 
 maintained outside the application, and will not be as much of a burden to the rest 
 of the application.
 
 (I'm not sure if it's a good idea to create a PG-specific FS in your OS of choice, 
 but it's certainly gonna be easier than getting FS code inside of PG)
 
 cheers,
 
 --Tim
 
 
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steinar H. Gunderson
 Sent: Thursday, October 21, 2004 12:27 PM
 To: [EMAIL PROTECTED]
 Subject: Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?
 
 On Thu, Oct 21, 2004 at 08:58:01AM +0100, Matt Clark wrote:
  I suppose I'm just idly wondering really.  Clearly it's against PG
  philosophy to build an FS or direct IO management into PG, but now it's so
  relatively easy to plug filesystems into the main open-source OSes, it
  struck me that there might be some useful changes to, say, XFS or ext3, that
  could be made that would help PG out.
 
 This really sounds like a poor replacement for just making PostgreSQL use raw
 devices to me. (I have no idea why that isn't done already, but presumably it
 isn't all that easy to get right. :-) )
 
 /* Steinar */
 --
 Homepage: http://www.sesse.net/
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings
 


-- 

Regards,
/Aaron

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Aaron Werman
Sounds like you need some way to match a subset of the data first,
rather than try indices that are bigger than the data. Can you add
operation indices, perhaps on the first 10 bytes of the keys in both
tables or on an integer hash of all of the strings? If so, you could
compute the set difference on the cheap operation match first, and
then do the exact comparison over that much smaller candidate set.
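
To make that concrete, a minimal sketch assuming the BIGMA and DIRTY tables
quoted below (the 10-byte prefix length and the substr() approach are only
illustrative):

-- small operation indices on just the first 10 bytes of each key
CREATE INDEX bigma_prefix10 ON public.bigma (substr(string, 1, 10));
CREATE INDEX dirty_prefix10 ON public.dirty (substr(string, 1, 10));

-- match on the cheap prefix first, then confirm on the full string, so the
-- exact comparison only runs over the prefix matches
SELECT b.string
FROM   public.bigma b
WHERE  NOT EXISTS (SELECT 1
                   FROM   public.dirty d
                   WHERE  substr(d.string, 1, 10) = substr(b.string, 1, 10)
                   AND    d.string = b.string);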

/Aaron


On Thu, 21 Oct 2004 17:34:17 +0300, Victor Ciurus [EMAIL PROTECTED] wrote:
 Hi all,
 
 I'm writing this because I've reached the limit of my imagination and
 patience! So here is it...
 
 2 tables:
 1 containing 27 million variable length, alpha-numeric records
 (strings) in 1 (one) field. (10 - 145 char length per record)
 1 containing 2.5 million variable length, alpha-numeric records
 (strings) in 1 (one) field.
 
 Tables were created using:
 CREATE TABLE public.BIGMA (string VARCHAR(255) NOT NULL) WITH OIDS; +
 CREATE INDEX BIGMA_INDEX ON public.BIGMA USING btree (string);
 and
 CREATE TABLE public.DIRTY (string VARCHAR(128) NOT NULL) WITH OIDS; +
 CREATE INDEX DIRTY_INDEX ON public.DIRTY USING btree (string);
 
 What I am requested to do is to keep all records from 'BIGMA' that do
 not appear in 'DIRTY'.
 So far I have tried solving this by going for:
 
 [explain] select * from BIGMA where string not in (select * from DIRTY);
QUERY PLAN
 
  Seq Scan on bigma  (cost=0.00..24582291.25 rows=500 width=145)
Filter: (NOT (subplan))
SubPlan
  -  Seq Scan on dirty  (cost=0.00..42904.63 rows=2503963 width=82)
 (4 rows)
 
 AND
 
 [explain] select * from bigma,dirty where bigma.email!=dirty.email;
   QUERY PLAN
 ---
  Nested Loop  (cost=20.00..56382092.13 rows=2491443185 width=227)
   Join Filter: (("inner".email)::text <> ("outer".email)::text)
-  Seq Scan on dirty  (cost=0.00..42904.63 rows=2503963 width=82)
-  Materialize  (cost=20.00..30.00 rows=1000 width=145)
  -  Seq Scan on bigma  (cost=0.00..20.00 rows=1000 width=145)
 (5 rows)
 
 Now the problem is that both of my previous tries seem to last
 forever! I'm not a pgsql guru, so that's why I'm asking you fellas to
 guide me right! I've tried this on mysql previously but there seems to
 be no way mysql can handle this large query.
 
 QUESTIONS:
 What can I do in order to make this work?
 Where do I make mistakes? Is there a way I can improve the performance
 in table design, query style, server setting so that I can get this
 monster going and producing a result?
 
 Thanks all for your precious time and answers!
 
 Victor C.
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
 


-- 

Regards,
/Aaron

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Free PostgreSQL Training, Philadelphia, Oct 30

2004-10-20 Thread Aaron Werman
I'm driving from Tenafly NJ and going to both sessions. If you're able
to get to the George Washington Bridge (A train to 178th Street [Port
Authority North] and a bus over the bridge), I can drive you down. I'm
not sure right now about the return because I have confused plans to
meet someone.

/Aaron


On Tue, 19 Oct 2004 14:43:29 -0400, Max Baker [EMAIL PROTECTED] wrote:
 On Wed, Oct 13, 2004 at 12:21:27PM -0400, Aaron Mulder wrote:
  All,
My company (Chariot Solutions) is sponsoring a day of free
  PostgreSQL training by Bruce Momjian (one of the core PostgreSQL
  developers).  The day is split into 2 sessions (plus a QA session):
 
   * Mastering PostgreSQL Administration
   * PostgreSQL Performance Tuning
 
Registration is required, and space is limited.  The location is
  Malvern, PA (suburb of Philadelphia) and it's on Saturday Oct 30.  For
  more information or to register, see
 
  http://chariotsolutions.com/postgresql.jsp
 
 I'm up in New York City and would be taking the train down to Philly.  Is
 anyone coming from Philly or New York that would be able to give me a lift
 to/from the train station?  Sounds like a great event.
 
 Cheers,
 -m
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly
 


-- 

Regards,
/Aaron

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Performance vs Schemas

2004-10-19 Thread Aaron Werman
Right - if you split a table into a lot of more selective tables, it can often
dramatically change the plan options (e.g. - in a single table, selectivity
for a query may be 1% and require an expensive nested loop while in the more
restrictive table it may match 14% of the data and do a cheaper scan).

Also - don't forget that just rebuilding a database cleanly can dramatically
improve performance.

The only dbms I know that indexes views is MS SQL Server 2000, where it is a
limited form of materialized queries. pg doesn't do MQs, but check out
functional indices.
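
For example, a functional index can often stand in for a simple indexed
view (table and column names here are hypothetical):

-- instead of an indexed view exposing lower(lastname), index the expression
CREATE INDEX cust_lower_lastname ON cust (lower(lastname));

-- a query written against the same expression can use the index directly
SELECT * FROM cust WHERE lower(lastname) = 'macaubas';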

/Aaron

- Original Message - 
From: Gregory S. Williamson [EMAIL PROTECTED]
To: Igor Maciel Macaubas [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Thursday, October 14, 2004 2:45 PM
Subject: Re: [PERFORM] Performance vs Schemas


Igor,

I'm not sure if it is proper to state that schemas are themselves speeding
things up.

As an example, we have data that is usually accessed by county; when we put
all of the data into one big table and select from it using a code for a
county of interest, the process is fairly slow as there are several hundred
thousand candidate rows from that county in a table with many millions of
rows. When we broke out certain aspects of the data into schemas (one per
county) the searches become very fast indeed because we can skip the
searching for a specific county code with the relevant tables and there is
less (unneeded) data in the table being searched.
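
As a purely hypothetical sketch of that layout (names invented for
illustration):

CREATE SCHEMA county_06001;
CREATE TABLE  county_06001.parcels (parcel_id bigint PRIMARY KEY, attrs text);

-- the query names the schema directly, so there is no county-code filter
-- and no other counties' rows in the table being scanned
SELECT count(*) FROM county_06001.parcels;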

As always, EXPLAIN ANALYZE ... is your friend in understanding what the
planner is doing with a given query.

See http://www.varlena.com/varlena/GeneralBits/Tidbits/ for some useful
information, especially under the performance tips section.

HTH,

Greg Williamson
DBA
GlobeXplorer LLC

-Original Message-
From: Igor Maciel Macaubas [mailto:[EMAIL PROTECTED]
Sent: Thu 10/14/2004 11:38 AM
To: [EMAIL PROTECTED]
Cc:
Subject: [PERFORM] Performance vs Schemas
Hi all,

I recently migrated my database from the 'public' schema to multiple schemas.
I have around 100 tables, and divided them into 14 different schemas, and then
adapted my application to use schemas as well.
I could perceive that the query / insert / update times got pretty much
faster than when I was using the old single schema, and I'd just like to
confirm with you whether using schemas speeds things up. Is that true?

What else can I do to speed up query processing - best practices,
recommendations ...? What about indexed views, does postgresql support them?

Regards,
Igor
--
[EMAIL PROTECTED]




---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Normal case or bad query plan?

2004-10-12 Thread Aaron Werman
Makes sense. See DB2 8.2 info on their new implementation of cross column
statistics. If this is common and you're willing to change code, you can
fake that by adding an operation index on some hash function of both columns,
and search for both columns and the hash.
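
A rough sketch of what I mean (the second bigint value is made up for
illustration; this only helps when the query knows both column values):

-- index a single hash expression over both columns
CREATE INDEX ip2location_pair_hash
    ON ip2location (md5(ip_address_from::text || ',' || ip_address_to::text));

-- search for both columns *and* the hash, so the very selective hash
-- predicate can drive the index scan
SELECT *
FROM   ip2location
WHERE  ip_address_from = 1040878301::bigint
  AND  ip_address_to   = 1040878558::bigint
  AND  md5(ip_address_from::text || ',' || ip_address_to::text)
      = md5('1040878301,1040878558');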

- Original Message - 
From: Kris Jurka [EMAIL PROTECTED]
To: Gabriele Bartolini [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, October 11, 2004 5:17 PM
Subject: Re: [PERFORM] Normal case or bad query plan?




 On Mon, 11 Oct 2004, Gabriele Bartolini wrote:


  ---------------------------------------------------------------------------
  Seq Scan on ip2location  (cost=0.00..30490.65 rows=124781 width=8)
                           (actual time=5338.120..40237.283 rows=1 loops=1)
    Filter: ((1040878301::bigint >= ip_address_from) AND
             (1040878301::bigint <= ip_address_to))
  Total runtime: 40237.424 ms
 

 I believe the problem is that pg's lack of cross-column statistics is
 producing the poor number of rows estimate.  The number of rows matching
 just the first 1040878301::bigint >= ip_address_from condition is 122774
 which is roughly 10% of the table.  I imagine the query planner
 believes that the other condition alone will match the other 90% of the
 table.  The problem is that it doesn't know that these two ranges'
 intersection is actually tiny.  The planner assumes a complete or nearly
 complete overlap so it thinks it will need to fetch 10% of the rows from
 both the index and the heap and chooses a seqscan.

 Kris Jurka

 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Data warehousing requirements

2004-10-07 Thread Aaron Werman
Consider how the fact table is going to be used, and review hacking it up
based on usage. Fact tables should be fairly narrow, so if there are extra
columns beyond keys and dimension keys consider breaking it into parallel
tables (vertical partitioning).

Horizontal partitioning is your friend; especially if it is large - consider
slicing the data into chunks. If the fact table is date driven it might be
worthwhile to break it into separate tables based on date key.  This wins in
reducing the working set of queries and in buffering. If there is a real
hotspot, such as current month's activity, you might want to keep a separate
table with just the (most) active data. Static tables of unchanged data can
simplify backups, etc., as well.
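
A minimal sketch of the date-driven split (table and column names here are
hypothetical):

-- one physical slice per month; the current month stays small and hot
CREATE TABLE fact_2004_09 (date_key date, user_key int, page_key int, hits int);
CREATE TABLE fact_2004_10 (date_key date, user_key int, page_key int, hits int);

-- a UNION ALL view glues the slices back together for cross-month reports
CREATE VIEW fact_all AS
    SELECT * FROM fact_2004_09
    UNION ALL
    SELECT * FROM fact_2004_10;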

Consider summary tables if you know what type of queries you'll hit.
Especially here, MVCC is not your friend because it has extra work to do for
aggregate functions.
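
For instance, a periodically rebuilt summary table along these (hypothetical)
lines lets reporting queries skip re-aggregating the detail rows:

CREATE TABLE fact_daily_summary AS
    SELECT date_key, page_key, count(*) AS requests, sum(hits) AS total_hits
    FROM   fact_2004_10
    GROUP  BY date_key, page_key;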

Cluster helps if you bulk load.

In most warehouses, the data is downstream data from existing operational
systems. Because of that you're not able to use database features to
preserve integrity. In most cases, the data goes through an
extract/transform/load process - and the output is considered acceptable.
So, no RI is correct for star or snowflake design. Pretty much no anything
else that adds intelligence - no triggers, no objects, no constraints of any
sort. Many designers try hard to avoid nulls.

On the hardware side - RAID5 might work here because of the low volume if
you can pay the write performance penalty. To size hardware you need to
estimate load in terms of transaction type (I usually make bucket categories
of small, medium, and large effort needs) and transaction rate. Then try to
estimate how much CPU and I/O they'll use.

/Aaron

Let us not speak of them; but look, and pass on.

- Original Message - 
From: Gabriele Bartolini [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, October 06, 2004 5:36 PM
Subject: [PERFORM] Data warehousing requirements


 Hi guys,

  I just discussed my problem on IRC. I am building a Web usage
  mining system based on Linux, PostgreSQL and C++ made up of an OLTP
  database which feeds several multi-purpose data warehouses about users'
  behaviour on HTTP servers.

  I modelled every warehouse using the star schema, with a fact table and
  then 'n' dimension tables linked using a surrogate ID.

  Discussing with the guys of the chat, I came up with these conclusions,
  regarding the warehouse's performance:

 1) don't use referential integrity in the facts table
 2) use INTEGER and avoid SMALLINT and NUMERIC types for dimensions' IDs
 3) use an index for every dimension's ID in the fact table

 As far as administration is concerned: run VACUUM ANALYSE daily and
 VACUUM FULL periodically.

 Is there anything else I should keep in mind?

 Also, I was looking for advice regarding hardware requirements for a
 data warehouse system that needs to satisfy online queries. I have indeed
  no idea at the moment. I can only predict about 4 million records a month
  in the fact table; does it make sense or not? Is it too much?

 Data needs to be easily backed up and eventually replicated.

 Having this in mind, what hardware architecture should I look for? How
 many hard disks do I need, what kind and what RAID solution do you suggest
 me to adopt (5 or 10 - I think)?

 Thank you so much,
 -Gabriele
 --
 Gabriele Bartolini: Web Programmer, ht://Dig  IWA/HWG Member, ht://Check
 maintainer
 Current Location: Prato, Toscana, Italia
 [EMAIL PROTECTED] | http://www.prato.linux.it/~gbartolini | ICQ#129221447
   Leave every hope, ye who enter!, Dante Alighieri, Divine Comedy, The
  Inferno

 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Caching of Queries

2004-10-01 Thread Aaron Werman
The context of the discussion was a hack to speed queries against static
tables, so MVCC is not relevant.  As soon as any work unit against a
referenced table commits, the cache is invalid, and in fact the table
shouldn't be a candidate for this caching for a while. In fact, this cache
would reduce some of the MVCC 'select count(*) from us_states' type of horrors.

(The attraction of a server side cache is obviously that it could *with no
server or app changes* dramatically improve performance. A materialized view
is a specialized denormalization-ish mechanism to optimize a category of
queries and requires the DBA to sweat the details. It is very hard to cache
things stochastically without writing a server. Trigger managed extracts
won't help you execute 1,000 programs issuing the query "select sec_level
from sec where division=23" each second or a big table loaded monthly.)



- Original Message - 
From: Jeff [EMAIL PROTECTED]
To: Mitch Pirtle [EMAIL PROTECTED]
Cc: Aaron Werman [EMAIL PROTECTED]; Scott Kirkwood
[EMAIL PROTECTED]; Neil Conway [EMAIL PROTECTED];
[EMAIL PROTECTED]; Tom Lane [EMAIL PROTECTED]
Sent: Monday, September 27, 2004 2:25 PM
Subject: Re: [PERFORM] Caching of Queries


 [ discussion of server side result caching ]

 and lets not forget PG's major fork it will throw into things:  MVCC
 The results of query A may hold true for txn 1, but not txn 2 and so on
 .
 That would have to be taken into account as well and would greatly
 complicate things.

 It is always possible to do a poor man's query cache with triggers..
 which would just leave you with basically a materialized view.

 --
 Jeff Trout [EMAIL PROTECTED]
 http://www.jefftrout.com/
 http://www.stuarthamm.net/


 ---(end of broadcast)---
 TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Slow update/insert process

2004-10-01 Thread Aaron Werman



Some quick notes:

- Using a side effect of a function to update the database feels bad to me
- How long does the "SELECT into varQueryRecord md5(upc.keyp..." function
  take / what does its explain look like?
- There are a lot of non-indexed columns on that delta master table, such
  as keyf_upc. I'm guessing you're doing 90,000 x {a lot of slow scans}
- My temptation would be to rewrite the processing to do a pass of updates,
  a pass of inserts, and then the SELECT
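
A rough sketch of that set-based rewrite, using the delta table from the
post below and a hypothetical work table candidate_hashes(keyf_upc, new_md5)
built once by the md5 SELECT:

-- pass 1: one UPDATE for every UPC already in the delta table
UPDATE pm.pm_delta_master_amazon
SET    status = CASE WHEN md5_hash = c.new_md5 THEN 'hold' ELSE 'add' END,
       last_touch_date = CURRENT_DATE
FROM   candidate_hashes c
WHERE  pm_delta_master_amazon.keyf_upc = c.keyf_upc;

-- pass 2: one INSERT for every UPC not in the delta table yet
INSERT INTO pm.pm_delta_master_amazon (keyf_upc, status, md5_hash, last_touch_date)
SELECT c.keyf_upc, 'add', c.new_md5, CURRENT_DATE
FROM   candidate_hashes c
WHERE  NOT EXISTS (SELECT 1
                   FROM   pm.pm_delta_master_amazon d
                   WHERE  d.keyf_upc = c.keyf_upc);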

  - Original Message -
  From: Patrick Hatcher
  To: [EMAIL PROTECTED]
  Sent: Friday, October 01, 2004 2:14 PM
  Subject: [PERFORM] Slow update/insert process

  Pg: 7.4.5
  RH 7.3
  8g Ram
  200 g drive space
  RAID0+1
  Tables vacuum on a nightly basis

  The following process below takes 8 hours to run on 90k records and I'm
  not sure where to begin to look for the bottleneck. This isn't the only
  updating on this database that seems to take a long time to complete. Is
  there something I should be looking for in my conf settings?

  TIA
  Patrick

  SQL:
  ---Bring back only selected records to run through the update process.
  --Without the function the SQL takes < 10 secs to return 90,000 records
  SELECT count(pm.pm_delta_function_amazon(upc.keyp_upc,'amazon'))
  FROM mdc_upc upc
  JOIN public.mdc_products prod ON upc.keyf_products = prod.keyp_products
  JOIN public.mdc_price_post_inc price ON prod.keyp_products = price.keyf_product
  JOIN public.mdc_attribute_product ap ON ap.keyf_products = prod.keyp_products
       AND keyf_attribute = 22
  WHERE upper(trim(ap.attributevalue)) NOT IN ('ESTEE LAUDER', 'CLINIQUE',
        'ORGINS', 'PRESCRIPTIVES', 'LANCÔME', 'CHANEL', 'ARAMIS', 'M.A.C',
        'TAG HEUER')
  AND keyf_producttype <> 222
  AND prod.action_publish = 1;

  Function:

  CREATE OR REPLACE FUNCTION pm.pm_delta_function_amazon(int4, "varchar")
  RETURNS bool AS
  'DECLARE
     varkeyf_upc       ALIAS FOR $1;
     varPassword       ALIAS FOR $2;
     varRealMD5        varchar;
     varDeltaMD5       varchar;
     varLastTouchDate  date;
     varQuery          text;
     varQuery1         text;
     varQueryMD5       text;
     varQueryRecord    record;
     varFuncStatus     boolean := false;
  BEGIN
     -- Check the password
     IF varPassword <> \'amazon\' THEN
        RETURN false;
     END IF;

     -- Get the md5 hash for this product
     SELECT INTO varQueryRecord
            md5(upc.keyp_upc || prod.description ||
                pm.pm_price_post_inc(prod.keyp_products)) AS md5
     FROM public.mdc_upc upc
     JOIN public.mdc_products prod ON upc.keyf_products = prod.keyp_products
     JOIN public.mdc_price_post_inc price ON price.keyf_product = prod.keyp_products
     WHERE upc.keyp_upc = varkeyf_upc
     LIMIT 1;

     IF NOT FOUND THEN
        RAISE EXCEPTION \'varRealMD5 is NULL. UPC ID is %\', varkeyf_upc;
     ELSE
        varRealMD5 := varQueryRecord.md5;
     END IF;

     -- Check that the product is in the delta table and return its hash
     -- for comparison
     SELECT INTO varQueryRecord md5_hash, last_touch_date
     FROM pm.pm_delta_master_amazon
     WHERE keyf_upc = varkeyf_upc
     LIMIT 1;

     IF NOT FOUND THEN
        -- ADD and exit
        INSERT INTO pm.pm_delta_master_amazon
               (keyf_upc, status, md5_hash, last_touch_date)
        VALUES (varkeyf_upc, \'add\', varRealMD5, CURRENT_DATE);
        varFuncStatus := true;
        RETURN varFuncStatus;
     ELSE
        -- Update the record
        -- If the hash matches then set the record to HOLD
        IF varRealMD5 = varQueryRecord.md5_hash THEN
           UPDATE pm.pm_delta_master_amazon
           SET status = \'hold\', last_touch_date = CURRENT_DATE
           WHERE keyf_upc = varkeyf_upc AND last_touch_date <> CURRENT_DATE;
           varFuncStatus := true;
        ELSE
           -- ELSE mark the item as ADD
           UPDATE pm.pm_delta_master_amazon
           SET status = \'add\', last_touch_date = CURRENT_DATE
           WHERE keyf_upc = varkeyf_upc;
           varFuncStatus := true;
        END IF;
     END IF;

     RETURN varFuncStatus;
  END;'
  LANGUAGE 'plpgsql' IMMUTABLE;

  TableDef:
  CREATE TABLE pm.pm_delta_master_amazon (
     keyf_upc         int4,
     status           varchar(6),
     md5_hash         varchar(40),
     last_touch_date  date
  )
  GO

  CREATE INDEX status_idx ON pm.pm_delta_master_amazon(status)
  GO

  CONF
  # WRITE AHEAD LOG
  #---------------------------------------------------------------------
  # - Settings -
  #fsync = true            # turns forced synchronization on or off
  #wal_sync_method = fsync # the default varies across platforms:
                           # fsync, fdatasync, open_sync, or open_datasync
  wal_buffers = 32         # min 4, 8KB each

  # - Checkpoints -
  checkpoint_segments = 50 # in logfile segments, min 1, 16MB each
  checkpoint_timeout = 600 # range 30-3600, in seconds
  #checkpoint_warning = 30 # 0 is off, in seconds
  #commit_delay = 0        # range 0-100000, in microseconds
  #commit_siblings = 5     # range 1-1000

  Patrick Hatcher
  Macys.Com


Re: [PERFORM] Caching of Queries

2004-10-01 Thread Aaron Werman
I'm not sure I understand your req fully. If the same request is repeatedly
done with same parameters, you could implement a proxy web server with a
croned script to purge stale pages. If there is substantially the same data
being summarized, doing your own summary tables works; if accessed enough,
they're in memory. I interleaved some notes into your posting.

- Original Message - 

From: Josh Berkus [EMAIL PROTECTED]

To: Postgresql Performance [EMAIL PROTECTED]

Sent: Friday, October 01, 2004 1:10 PM

Subject: Re: [PERFORM] Caching of Queries




 People:

 Transparent query caching is the industry standard for how these
things
 are handled.   However, Postgres' lack of this feature has made me
consider
 other approaches, and I'm starting to wonder if the standard query
caching
 -- where a materialized query result, or some reduction thereof, is cached
in
 database memory -- isn't the best way to cache things.  I'm going to
 abbreviate it SQC for the rest of this e-mail.

 Obviously, the draw of SQC is its transparency to developers.   With it,
the
 Java/Perl/PHP programmers and the DBA don't have to communicate at all -- 
you
 set it up, give it some RAM, and it just works.   As someone who
frequently
 has to consult based on limited knowledge, I can understand the appeal.

My sense is that pg is currently unique among popular dbmses in having the
majority of applications being homegrown (a chicken / egg / advocacy issue -
if I install a CMS, I'm not the DBA or the PHP programmer - and I don't want
to change the code; we'll see more about this when native WinPg happens).





 However, one of the problems with SQC, aside from the ones already
mentioned
 of stale data and/or cache-clearing, is that (at least in applications
like
  MySQL's) it is indiscriminate and caches, at least briefly, unique queries
as
 readily as common ones.   Possibly Oracle's implementation is more
 sophisticated; I've not had an opportunity.

I'm not sure I agree here. Stale data and caching choice are
optimizer/buffer manager choices and implementation can decide whether to
allow stale data. These are design choices involving development effort and
choices of where to spend server cycles and memory. All buffering choices
cache unique objects, I'm not sure why this is bad (but sensing you want
control of the choices). FWIW, this is my impression of other dbmses.

In MySQL, a global cache can be specified with size and globally, locally,
or through statement hints in queries to suggest caching results. I don't
believe that these could be used as common subexpressions (with an exception
of MERGE table component results). The optimizer knows nothing about the
cached results - SQL select statements are hashed, and can be replaced by
the cached statement/results on a match.

In DB2 and Oracle result sets are not cached. They have rich sets of
materialized view features (that match your requirements). They allow a
materialized view to be synchronous with table updates or asynchronous.
Synchronous is often an unrealistic option, and asynchronous materialized
views are refreshed at a specified schedule. The optimizers allow query
rewrite (in Oracle it is a session option) so one can connect to the
database and specify that the optimizer is allowed to replace subexpressions
with data from (possibly stale) materialized views. SQL Server 2K has more
restrictive synchronous MVs, but I've never used them.

So, in your example use in Oracle, you would need to define appropriate MVs
with a ½ hour refresh frequency, and hope that the planner would use them in
your queries. The only change in the app is on connection you would allow
use of asynchronous stale data.

You're suggesting an alternative involving identifying common, but
expensive, subexpressions and generating MVs for them. This is a pretty
sophisticated undertaking, and probably requires some theory research to
determine if it's viable.



 The other half of that problem is that an entire query is cached, rather
than
 just the relevant data to uniquely identify the request to the
application.
 This is bad in two respects; one that the entire query needs to be parsed
to
 see if a new query is materially equivalent, and that two materially
 different queries which could utilize overlapping ranges of the same
 underlying result set must instead cache their results separately, eating
up
 yet more memory.

There are two separate issues. The cost of parse/optimization and the cost
of results retrieval. Other dbmses hash statement text. This is a good
thing, and probably 3 orders of magnitude faster than parse and
optimization. (Oracle also has options to replace literals with parameters
and match parse trees instead of text, expecting parse costs to be less than
planning costs.) MySQL on a match simply returns the result set. Oracle and
DB2 attempt to rewrite queries to use the DBA selected extracts. The MySQL
approach seems to be almost what you're describing: all it needs 

Re: [PERFORM] Caching of Queries

2004-09-28 Thread Aaron Werman

- Original Message - 
From: Tom Lane [EMAIL PROTECTED]
To: Aaron Werman [EMAIL PROTECTED]
Cc: Iain [EMAIL PROTECTED]; Jim C. Nasby [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Tuesday, September 28, 2004 9:58 AM
Subject: Re: [PERFORM] Caching of Queries


  Aaron Werman [EMAIL PROTECTED] writes:
   I imagine a design where a shared plan cache would consist of the plans,
   indexed by a statement hash and again by dependant objects.  A statement to
   be planned would be hashed and matched to the cache. DDL would need to
   synchronously destroy all dependant plans. If each plan maintains a validity
                                                                       ^^^^^^^^
   flag, changing the cache wouldn't have to block so I don't see where there
                                              ^^^^^
   would be contention.

 You have contention to access a shared data structure *at all* -- for
 instance readers must lock out writers.  Or didn't you notice the self-
 contradictions in what you just said?

 Our current scalability problems dictate reducing such contention, not
 adding whole new sources of it.

You're right - that seems unclear. What I meant is that there can be a
global hash table that is never locked, and the hashes point to chains of
plans that are only locally locked for maintenance, such as gc and chaining
hash collisions. If maintenance was relatively rare and only local, my
assumption is that it wouldn't have global impact.

The nice thing about plan caching is that it can be sloppy, unlike block
cache, because it is only an optimization tweak. So, for example, if the
plan has atomic reference times or counts there is no need to block, since
overwriting is not so bad. If the multiprocessing planner chains the same
plan twice, the second one would ultimately age out.

/Aaron


 regards, tom lane


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

2004-09-14 Thread Aaron Werman
 Mark Cotner wrote:

  The time has come to reevaluate/rearchitect an
  application which I built about 3 years ago.  There
  are no performance concerns with MySQL, but it would
  benefit greatly from stored procedures, views, etc.


From: Mischa Sandberg [EMAIL PROTECTED]

 If your company is currently happy with MySQL, there probably are other
 (nontechnical) reasons to stick with it. I'm impressed that you'd
 consider reconsidering PG.

I'd like to second Mischa on that issue. In general, if you migrate an
*existing* application from one RDBMS to another, you should expect
performance to decrease significantly. This is always true in a well
performing system even if the replacement technology is more sophisticated.
This is because of several factors.

Even if you try to develop in totally agnostic generic SQL, you are always
customizing to a feature set, namely the ones in the current system. Any
existing application has had substantial tuning and tweaking, and the new
one is at a disadvantage. Moreover, an existing system is a Skinnerian
reward/punishment system to the developers and DBAs, rewarding or punishing
them for very environment specific choices - resulting in an application,
dbms, OS, and platform that are both explicitly and unconsciously customized
to work together in a particular manner.

The net effect is a rule of thumb that I use:

NEVER reimplement an existing system unless the project includes substantial
functional improvement.

Every time I've broken that rule, I've found that users' expectations, based
on the application they are used to, are locked in. Any place where the new
system is slower, the users are dissatisfied; where it exceeds expectations
it isn't appreciated: the users are used to the old system quirks, and the
improvements only leave them uncomfortable since the system acts
differently. (I've broken the rule on occasion for standardization
conversions.)

My expectation is that pg will not get a fair shake here. If you do it - I'd
like to see the results anyway.

/Aaron

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

2004-09-14 Thread aaron werman
From: Harald Lau (Sector-X) [EMAIL PROTECTED]
...
 From: Mischa Sandberg [EMAIL PROTECTED]

  If your company is currently happy with MySQL, there probably are
  other (nontechnical) reasons to stick with it. I'm impressed that
  you'd consider reconsidering PG.

 I'd like to second Mischa on that issue.
Though both of you are right from my point of view, I don't think
it's very useful to discuss this item here.
It is kinda windy for the list, but the point is that a big part of 
performance is developer expectation and user expectation. I'd hope to lower 
expectations before we see an article in eWeek. Perhaps this thread should 
move to the advocacy list until the migration needs specific advice.

_
Get ready for school! Find articles, homework help and more in the Back to 
School Guide! http://special.msn.com/network/04backtoschool.armx

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Looking for ideas on how to speed up warehouse loading

2004-04-23 Thread Aaron Werman
By definition, it is equivalent to:

SELECT t1.id, t2.url FROM referral_temp t2 LEFT /*OUTER*/ JOIN d_referral t1
ON t2.url = t1.referral_raw_url
union all
SELECT null, url FROM referral_temp WHERE url is null
ORDER BY 1;



/Aaron

- Original Message - 
From: Joe Conway [EMAIL PROTECTED]
To: Sean Shanny [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, April 23, 2004 12:38 AM
Subject: Re: [PERFORM] Looking for ideas on how to speed up warehouse
loading


 Sean Shanny wrote:
  explain analyze SELECT t1.id, t2.url FROM referral_temp t2 LEFT OUTER
  JOIN d_referral t1 ON t2.url = t1.referral_raw_url ORDER BY t1.id;

  What I would like to know is if there are better ways to do the join?  I
  need to get all the rows back from the referral_temp table as they are
  used for assigning FK's for the fact table later in processing.  When I
  iterate over the values that I get back those with t1.id = null I assign
  a new FK and push both into the d_referral table as new entries as well
  as a text file for later use.  The matching records are written to a
  text file for later use.

 Would something like this work any better (without disabling index scans):

 SELECT t1.id, t2.url
 FROM referral_temp t2, d_referral t1
 WHERE t1.referral_raw_url = t2.url;

 process rows with a match

 SELECT t1.id, t2.url
 FROM referral_temp t2
 WHERE NOT EXISTS
 (select 1 FROM d_referral t1 WHERE t1.referral_raw_url = t2.url);

 process rows without a match

 ?

 Joe

 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Help with performance problems

2004-04-23 Thread Aaron Werman
Your second server has queuing (load averages are highish), only 2 processes
running, and almost all cycles are idle. You need to track down your
bottleneck. Have you looked at iostat/vmstat? I think it would be useful to
post these, ideally both before and after full vacuum analyze.

/Aaron


- Original Message - 
From: Chris Hoover [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, April 23, 2004 9:31 AM
Subject: [PERFORM] Help with performance problems


I need some help.  I have 5 db servers running our databases, and they all
are having various degrees of performance problems.  The problems we are
experiencing are:

1.  General slowness
2.  High loads

All of our db's are running on Dell Poweredge 2650 with 2 P4 Xeons (2.8 -
3.06 GHz) with 8 to 12 GB of memory.  The databases are running on attached
Dell Powervault 220s running raid5.

The databases were created and taken into production before I started working
here and are very flat.  Most of the major tables have a combined primary key
using an int field and a single char field.  There are some additional
indexes on some tables.  Most queries I see in the logs are running at less
than .01 seconds with many significantly slower.

We are trying to narrow down the performance problem to either the db or the
hardware.  As the dba, I need to  try and get these db's tuned to the best
possible way considering the current db state.  We are in the beginning of a
complete db redesign and application re-write, but the completion and
deployment of the new db and app are quite a ways off.

Anyway, we are running the following:
PE 2650 w/ 2 cpus (2.8-3.06) - HT on
8-12 GB memory
OS on raid 0
DB's on Powervaults 220S using raid 5 (over 6 disks)
Each Postgresql cluster has 2 db up to almost 170db's (project to level out
the num of db's/cluster is being started)
DB's are no bigger than a few GB in size (largest is about 11GB according to
a du -h)
Running RH ES 2.1

Here is the postgresql.conf from the server with the 11GB db:

max_connections = 64
shared_buffers = 32768 # 256MB=32768(buffs)*8192(bytes/buff)
max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes
sort_mem = 4096 # 256MB=4096(bytes/proc)*64(procs or conns)
checkpoint_segments = 16 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 30 # range 30-3600, in seconds
effective_cache_size = 131072 # typically 8KB each
log_connections = true
log_pid = true
log_statement = true
log_duration = true
log_timestamp = true
stats_start_collector = true
stats_reset_on_server_start = true
stats_command_string = true
stats_row_level = true
stats_block_level = true
LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

Here is top (server running pretty good right now)
  9:28am  up 25 days, 16:02,  2 users,  load average: 0.54, 0.33, 0.22
94 processes: 91 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 64.0% user,  0.1% system,  0.0% nice, 34.0% idle
CPU1 states: 29.0% user,  9.0% system,  0.0% nice, 60.0% idle
CPU2 states:  2.0% user,  0.1% system,  0.0% nice, 96.0% idle
CPU3 states:  0.0% user,  0.0% system,  0.0% nice, 100.0% idle
Mem:  7720072K av, 7711648K used,    8424K free,  265980K shrd,  749888K buff
Swap: 2096440K av,   22288K used, 2074152K free                 6379304K cached

Here is top from another server (with the most db's):
 9:31am  up 25 days, 16:05,  5 users,  load average: 2.34, 3.39, 4.28
147 processes: 145 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states:  6.0% user,  1.0% system,  0.0% nice, 91.0% idle
CPU1 states:  9.0% user,  4.0% system,  0.0% nice, 85.0% idle
CPU2 states:  9.0% user,  3.0% system,  0.0% nice, 86.0% idle
CPU3 states:  9.0% user,  4.0% system,  0.0% nice, 85.0% idle
Mem:  7721096K av, 7708040K used,   13056K free,  266132K shrd, 3151336K buff
Swap: 2096440K av,   24208K used, 2072232K free                 3746596K cached

Thanks for any help/advice,

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] possible improvement between G4 and G5

2004-04-20 Thread Aaron Werman
There are a few things that you can do to help force yourself to be I/O
bound. These include:

- RAID 5 for write intensive applications, since multiple writes per synch
write is good. (There is a special case for logging or other streaming
sequential writes on RAID 5)

- Data journaling file systems are helpful in stress testing your
checkpoints

- Using midsized battery backed up write through buffering controllers. In
general, if you have a small cache, you see the problem directly, and a huge
cache will balance out load and defer writes to quieter times. That is why a
midsized cache is so useful in showing stress in your system only when it is
being stressed.

Only partly in jest,
/Aaron

BTW - I am truly curious about what happens to your system if you use
separate RAID 0+1 for your logs, disk sorts, and at least the most active
tables. This should reduce I/O load by an order of magnitude.

Vivek Khera [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
  JB == Josh Berkus [EMAIL PROTECTED] writes:

 JB Aaron,
  I do consulting, so they're all over the place and tend to be complex.
Very
  few fit in RAM, but still are very buffered. These are almost all
backed
  with very high end I/O subsystems, with dozens of spindles with battery
  backed up writethrough cache and gigs of buffers, which may be why I
worry
  so much about CPU. I have had this issue with multiple servers.

 JB Aha, I think this is the difference.  I never seem to be able to
 JB get my clients to fork out for adequate disk support.  They are
 JB always running off single or double SCSI RAID in the host server;
 JB not the sort of setup you have.

 Even when I upgraded my system to a 14-spindle RAID5 with 128M cache
 and 4GB RAM on a dual Xeon system, I still wind up being I/O bound
 quite often.

 I think it depends on what your working set turns out to be.  My
 workload really spans a lot more of the DB than I can end up caching.

 -- 
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Vivek Khera, Ph.D.Khera Communications, Inc.
 Internet: [EMAIL PROTECTED]   Rockville, MD  +1-301-869-4449 x806
 AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

 ---(end of broadcast)---
 TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Moving postgres to FC disks

2004-04-20 Thread Aaron Werman



I agree on not linking and adding non-SAN disk dependency to your DB. I'm
trying to understand your FS reasoning. I have never seen XFS run faster
than ReiserFS in any situation (or for that matter beat any FS in
performance except JFS). XFS has some nifty very large file features, but
we're talking about 30G and all modern FSs support 2G files.

My tendency would be to stay on ext3, since it is the default RH FS. I
would review site preference and the SAN recommended FS and see if they
add any compelling points.

/Aaron

  - Original Message -
  From: Joshua D. Drake
  To: Anjan Dave
  Cc: [EMAIL PROTECTED]
  Sent: Tuesday, April 20, 2004 8:27 PM
  Subject: Re: [PERFORM] Moving postgres to FC disks

    -With the db size being as big as, say, 30+GB, how do I move it on the
    new logical drive? (stop postgresql, and simply move it over somehow
    and make a link?)

  I would stop the database, move the data directory to the new volume
  using rsync, then start up postgresql pointed at the new data directory.
  Providing everything is working correctly you can then remove the old
  data directory.

    -Currently, the internal RAID volume is ext3 filesystem. Any
    recommendations for the filesystem on the new FC volume? Reiserfs?

  XFS

    DBs are 7.4.1 (RH9), and 7.2.3 (RH8).

    Appreciate any pointers.

    Thanks,
    Anjan

  --
  Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
  Postgresql support, programming shared hosting and dedicated hosting.
  +1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
  PostgreSQL Replicator -- production quality replication for PostgreSQL

Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Aaron Werman
With fsync on, I'm thinking 50 inserts, if autocommitting, is 50 TPS = ~100 IOs
per second (50 WAL + checkpoint IO) = roughly the I/O rate of a single drive.

Huang - Are you using a single drive for pg? If so, there is a safety
problem of both the data and logs used for recovery on the same drive. If
the drive crashes, there is nothing left for recovery.

Also, there is a big contention issue, since the log is a fast sequential
write, and checkpointing is random. If the log is on a separate drive,
you'll probably see insert speed at disk sequential write speed, since the
other drive(s) should hopefully be able to keep up when checkpointing. If
they share the same drive, you'll see an initial burst of inserts, then a
order of magnitude performance drop-off as soon as you checkpoint - because
the disk is interleaving the log and data writes.

fsync off is only appropriate for externally recoverable processes, such as
loading an empty server from a file.

/Aaron

- Original Message - 
From: Richard Huxton [EMAIL PROTECTED]
To: huang yaqin [EMAIL PROTECTED]; Tom Lane [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, April 07, 2004 4:33 AM
Subject: Re: [PERFORM] good pc but bad performance,why?


On Wednesday 07 April 2004 05:00, huang yaqin wrote:
 hello

 Thanks, you are right.
 I use "postmaster -o -F" to start my PG and performance improved
 greatly.

I don't think Tom was recommending turning fsync off. If you have a system
crash/power glitch then the database can become corrupted.

If you are happy with the possibility of losing your data, write performance
will improve noticeably.

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Aaron Werman

- Original Message - 
From: Tom Lane [EMAIL PROTECTED]
To: Qing Zhao [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, April 06, 2004 1:47 AM
Subject: Re: [PERFORM] possible improvement between G4 and G5


 Qing Zhao [EMAIL PROTECTED] writes:
  We have got a G5 64-bit processor to replace an old G4 32-bit
  processor.  Given everything else equal, should we see a big
  improvement on PG's performance?

 Nope.  Database performance typically depends on disk performance first,
 and RAM size second.

I'm surprised by this thought. I tend to hit CPU bottlenecks more often than
I/O ones. In most applications, db I/O is a combination of buffer misses and
logging, which are both reasonably constrained. RAM size seems to me to be
the best way to improve performance, and then CPU which is needed to perform
the in-memory searching, locking, versioning,  and processing, and finally
I/O (this is not the case in small I/O subsystems - if you have less than a
dozen drives, you're easily I/O bound). I/O is often the thing I tune first,
because I can do it in place without buying hardware.

Conceptually, an RDBMS converts slow random I/O into in memory processing
and sequential logging writes. If successful, it should reduce the I/O
overhead.

/Aaron

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Aaron Werman


- Original Message - 
From: Josh Berkus [EMAIL PROTECTED]
To: Aaron Werman [EMAIL PROTECTED]; Qing Zhao [EMAIL PROTECTED];
Tom Lane [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, April 06, 2004 2:52 PM
Subject: Re: [PERFORM] possible improvement between G4 and G5


 Aaron,

  I'm surprised by this thought. I tend to hit CPU bottlenecks more often
than
  I/O ones. In most applications, db I/O is a combination of buffer misses
and
  logging, which are both reasonably constrained.

 Not my experience at all.  In fact, the only times I've seen modern
platforms
 max out the CPU was when:
 a) I had bad queries with bad plans, or
 b) I had reporting queires that did a lot of calculation for display
(think
 OLAP).

 Otherwise, on the numerous servers I administrate, RAM spikes, and I/O
 bottlenecks, but the CPU stays almost flat.

 Of course, most of my apps are large databases (i.e. too big for RAM) with
a
 heavy transaction-processing component.

 What kind of applications are you running?

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco



hot air

I do consulting, so they're all over the place and tend to be complex. Very
few fit in RAM, but still are very buffered. These are almost all backed
with very high end I/O subsystems, with dozens of spindles with battery
backed up writethrough cache and gigs of buffers, which may be why I worry
so much about CPU. I have had this issue with multiple servers.

Consider an analysis db with 10G data. Of that, 98% of the access is read
and only 2% write (that is normal for almost anything that is not order
entry, even transaction processing with thorough cross validation). Almost
all the queries access 10%, or 1G of the data. Of the reads, they average ~3
level b-trees, with the first 2 levels certainly cached, and the last ones
often cached. Virtually all the I/O activity is logical reads against
buffer. A system with a 100 transactions which on average access 200 rows
does 98% of 200 rows x 100 transactions x 3 logical I/Os per read = 58,800
logical reads, of which actually maybe a hundred are physical reads.  It
also does 2% of 200 rows x 100 transactions x (1 table logical I/O and say 2
index logical writes) per write = 1,200 logical writes to log, of which
there are 100 transaction commit synch writes, and in reality less than that
because of queuing against logs (there are also 1,200 logical writes
deferred to checkpoint, of which it is likely to only be 40 physical writes
because of page overlaps).

Transaction processing is a spectrum between activity logging, and database
centric design. The former, where actions are stored in the database is
totally I/O bound with the engine acting as a thin layer of logical to
physical mapping. Database centric processing makes the engine a functional
server of discrete actions - and is a big CPU hog.

What my CPU tends to be doing is a combination of general processing,
complex SQL processing: nested loops and sorting and hashing and triggers
and SPs.

I'm curious about you having flat CPU, which is not my experience. Are your
apps mature and stable?

/hot air

/Aaron

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] single index on more than two coulumns a bad thing?

2004-04-05 Thread Aaron Werman
You're absolutely correct that the general rule is to lead a composite index
with the highest cardinality index columns for fastest selectivity. Indices
and all physical design are based on usage. In this case of unique indices
supporting primary keys in a hierarchy, it depends. For selection of small
sets of arbitrary rows, your arrangement is best. For hierarchy based
queries, such as "for grandparent of foo, and parent of bar, give average
age of sons" - the hierarchy based index is often more efficient.
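
A hypothetical example of the kind of query that favours the
(grandfather, father, ...) ordering - the leading columns are bound by
equality, so a single index range scan finds all the matching sons:

SELECT avg(s.age)
FROM   son_table s
WHERE  s.grandfather = 'foo'
  AND  s.father      = 'bar';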

Surrogate keys have a role, and can improve performance, but also carry an
enormous penalty of intentionally obfuscating logical keys and data
semantics, and almost always lead to data errors not being caught because
they obscure irrational relationships. I hate them, but use them frequently
in high transaction rate operational systems where there is much functional
validation outside the dbms (and the apps behave therefore like object
databases and surrogate keys are network database pointers) and in data
warehousing (where downstream data cannot be corrected anyway).

/Aaron

- Original Message - 
From: Leeuw van der, Tim [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, April 04, 2004 5:06 PM
Subject: Re: [PERFORM] single index on more than two coulumns a bad thing?


Hi Aaron,

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of
 Aaron Werman
 Sent: Friday, April 2, 2004 13:57


 another thing that I have all over the place is a hierarchy:
 index on grandfather_table(grandfather)
 index on father_table(grandfather, father)
 index on son_table(grandfather, father, son)


It depends on your data-distribution, but I find that in almost all cases
it's beneficial to have your indexes the other way round in such cases:

index on grandfather_table(grandfather)
index on father_table(father, grandfather)
index on son_table(son, father, grandfather)

That usually gives a less common, more selective value at the start of the
index, making the initial selection in the index smaller.

And AFAIK I don't have to rewrite my queries for that; the planner doesn't
care about the order of expressions in the query that are on the same level.

That said, I tend to use 'surrogate keys'; keys generated from sequences or
auto-number columns for my tables. It makes the tables less readable, but
the indexes remain smaller.


Greetings,

--Tim



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Aaron Werman
Almost any cross dbms migration shows a drop in performance. The engine
effectively trains developers and administrators in what works and what
doesn't. The initial migration thus compares a tuned to an untuned version.

/Aaron

- Original Message - 
From: Josh Berkus [EMAIL PROTECTED]
To: Gary Doades [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, April 03, 2004 1:59 PM
Subject: Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.


 Gary,

  There are no indexes on the columns involved in the update, they are
  not required for my usual select statements. This is an attempt to
  slightly denormalise the design to get the performance up comparable
  to SQL Server 2000. We hope to move some of our databases over to
  PostgreSQL later in the year and this is part of the ongoing testing.
  SQLServer's query optimiser is a bit smarter than PostgreSQL's (yet)
  so I am hand optimising some of the more frequently used
  SQL and/or tweaking the database design slightly.

 Hmmm ... that hasn't been my general experience on complex queries.
However,
 it may be due to a difference in ANALYZE statistics.   I'd love to see you
 increase your default_stats_target, re-analyze, and see if PostgreSQL gets
 smarter.

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco


 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] single index on more than two coulumns a bad thing?

2004-04-02 Thread Aaron Werman
another thing that I have all over the place is a hierarchy:
index on grandfather_table(grandfather)
index on father_table(grandfather, father)
index on son_table(grandfather, father, son)

almost all of my indices are composite. Are you thinking about composite
indices with low cardinality leading columns?

/Aaron

- Original Message - 
From: Josh Berkus [EMAIL PROTECTED]
To: Palle Girgensohn [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Thursday, April 01, 2004 7:35 PM
Subject: Re: [PERFORM] single index on more than two coulumns a bad thing?


 Palle,

  Is it always bad to create index xx on yy (field1, field2, field3);

  No, it is seldom bad, in fact.  I have some indexes that run up to seven
  columns, because they are required for unique keys.

 Indexes of 3-4 columns are often *required* for many-to-many join tables.

 I'm afraid that you've been given some misleading advice.

  I guess the problem is that the index might often grow bigger than the
  table, or at least big enough not to speed up the queries?

 Well, yes ... a 4-column index on a 5-column table could be bigger than
the
 table if allowed to bloat and not re-indexed.   But that's just a reason
for
 better maintenance.

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco


 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Aaron Werman
Are you talking about
http://www.potentialtech.com/wmoran/postgresql.php#conclusion
- Original Message - 
From: Subbiah, Stalin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; Matt Clark [EMAIL PROTECTED]; Subbiah, Stalin
[EMAIL PROTECTED]; 'Andrew Sullivan' [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 3:42 PM
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


 Has anyone done performance benchmark testing with solaris sparc/intel
 linux?
 I once read a post here, which had benchmarking test results for using
 different filesystem like xfs, ext3, ext2, ufs etc. i couldn't find that
 link anymore and google is failing on me, so anyone have the link handy.

 Thanks!

 -Original Message-
 From: Josh Berkus [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, March 23, 2004 12:13 PM
 To: Matt Clark; Subbiah, Stalin; 'Andrew Sullivan';
 [EMAIL PROTECTED]
 Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


 Matt, Stalin,

  As for the compute intensive side (complex joins & sorts etc), the Dell will
  most likely beat the Sun by some distance, although what the Sun lacks in
  CPU power it may make up a bit in memory bandwidth/latency.

 Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do
 better than you think, comparatively.  On all the Dell servers I've used so
 far, I've not seen performance that comes even close to the hardware specs.

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco

 ---(end of broadcast)---
 TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] atrocious update performance

2004-03-15 Thread Aaron Werman
Bulk updates are generally dogs (not just in pg), so I avoid doing them by
doing faster selects and inserts. You can create a new table using 'create
table as' to produce your target results. This is real fast - avoiding the
row iteration in insert, allowing the select optimizer to run and no index
overhead. Then alter/rename, add indexes and whatever else hangs off the
table (or if you're lazy do an insert/select into the original target
table). I often see 2 orders of magnitude improvement doing this, and no
need to vacuum.
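
A rough sketch of that approach - every name here is made up for
illustration, and details like grants, defaults and foreign keys still have
to be re-added by hand:

    begin;
    create table big_table_new as        -- one set-oriented select, no per-row update
        select b.id, b.payload, l.new_value
          from big_table b
          left join lookup l on l.id = b.id;
    create unique index big_table_new_id_idx on big_table_new (id);
    alter table big_table rename to big_table_old;
    alter table big_table_new rename to big_table;
    commit;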

/Aaron

- Original Message - 
From: Rosser Schwarz [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, March 15, 2004 3:28 PM
Subject: [PERFORM] atrocious update performance


We're in the throes of an MS SQL to PostgreSQL migration; our databases
include a number of ~5M row tables.  We decided to take this opportunity
to clean up and slightly re-normalize our schemas, given what we've
learned about the data over its lifetime and such, else we wouldn't be
experiencing any of the following (we could instead just dump and `copy
from`).

We have a temporary table, public.tempprod, containing 4.7M rows, one
for each row in account.cust.  account.cust has, among others, two
columns, prod and subprod, which we're trying to update from tempprod
joined against prod.  The update tends to take unnecessarily long --
rather, we've finally had to kill it after it had been running obscenely long.

The table:

# \d account.cust
              Table "account.cust"
   Column   |            Type             |                      Modifiers
------------+-----------------------------+------------------------------------------------------
 custid     | bigint                      | not null default nextval('account.custid_seq'::text)
 ownerid    | integer                     | not null
 origid     | text                        | not null
 pname      | text                        |
 fname      | text                        |
 mname      | text                        |
 lname      | text                        |
 suffix     | text                        |
 addr1      | text                        |
 addr2      | text                        |
 addr3      | text                        |
 city       | text                        |
 state      | text                        |
 zip        | text                        |
 zipplus    | text                        |
 homeph     | text                        |
 workph     | text                        |
 otherph    | text                        |
 ssn        | text                        |
 isactive   | boolean                     | default true
 createddt  | timestamp without time zone | default now()
 prodid     | bigint                      |
 subprodid  | bigint                      |
Indexes:
    "cust_pkey" primary key, btree (custid)
    "ix_addr1" btree (addr1) WHERE (addr1 IS NOT NULL)
    "ix_addr2" btree (addr2) WHERE (addr2 IS NOT NULL)
    "ix_city" btree (city) WHERE (city IS NOT NULL)
    "ix_fname" btree (fname) WHERE (fname IS NOT NULL)
    "ix_homeph" btree (homeph) WHERE (homeph IS NOT NULL)
    "ix_lname" btree (lname) WHERE (lname IS NOT NULL)
    "ix_mname" btree (mname) WHERE (mname IS NOT NULL)
    "ix_origid" btree (origid)
    "ix_ssn" btree (ssn) WHERE (ssn IS NOT NULL)
    "ix_state" btree (state) WHERE (state IS NOT NULL)
    "ix_workph" btree (workph) WHERE (workph IS NOT NULL)
    "ix_zip" btree (zip) WHERE (zip IS NOT NULL)

We're currently running on a dual Xeon 700 (I know, I know; it's what
we've got) with 2.5GB RAM and 4x36GB SCSI in hardware RAID 5 (Dell
Perc3 something-or-other controller).  If we can demonstrate that
PostgreSQL will meet our needs, we'll be going production on a dual
Opteron, maxed memory, with a 12-disk Fibre Channel array.

The query is:

update account.cust set prodid =
(select p.prodid from account.prod p
join public.tempprod t on t.pool = p.origid
where custid = t.did)

And then, upon its completion, s/prod/subprod/.

That shouldn't run overnight, should it, let alone for -days-?
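
(An aside, not from the original message: PostgreSQL also accepts a
join-style form of this update, sketched below with the thread's table and
column names. It assumes t.did is unique per customer, and whether it wins
depends on the plan it gets.)

    update account.cust
       set prodid = p.prodid
      from public.tempprod t, account.prod p
     where t.pool = p.origid
       and cust.custid = t.did;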

In experimenting with ways of making the updates take less time, we tried
adding product and subproduct columns to tempprod, and updating those.
That seemed to work marginally better:

explain analyze update public.tempprod set prodid =
(select account.prod.prodid::bigint
from account.prod
where public.tempprod.pool::text = account.prod.origid::text)

Seq Scan on tempprod  (cost=0.00..9637101.35 rows=4731410 width=56) (actual
time=24273.467..16090470.438 rows=4731410 loops=1)
  SubPlan
    ->  Limit  (cost=0.00..2.02 rows=2 width=8) (actual time=0.134..0.315
        rows=1 loops=4731410)
          ->  Seq Scan on prod  (cost=0.00..2.02 rows=2 width=8) (actual
              time=0.126..0.305 rows=1 loops=4731410)
                Filter: (($0)::text = (origid)::text)
Total runtime: 2284551.962 ms

But then going from public.tempprod to account.cust again takes days.  I
just cancelled an update that's been running since last Thursday.
Alas, 

Re: [PERFORM] atrocious update performance

2004-03-15 Thread Aaron Werman
The original point was about a very slow update of an entire table with a
plan that looped, and over a dozen conditional indices - vs. a 'create as'
in a CPU starved environment. I stand by my statement about observing the
orders of magnitude difference. In theory I agree that the update should be
in the same order of magnitude as the create as, but in practice I disagree.
I also think something is wrong on the logical side (besides FKs, are there
any triggers?) but was responding to the Gordian knot issue of bailing out
of pg.

Can you post a sample extract, Rosser? Otherwise, I'll try to put together a
sample of a slow mass join update.

/Aaron

- Original Message - 
From: Tom Lane [EMAIL PROTECTED]
To: Rosser Schwarz [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, March 15, 2004 7:08 PM
Subject: Re: [PERFORM] atrocious update performance


 Rosser Schwarz [EMAIL PROTECTED] writes:
  You can create a new table using 'create table as' to produce your
  target results. This is real fast ...
  I often see 2 orders of magnitude improvement doing this, and no
  need to vacuum.

  Indeed:
  Query returned successfully with no result in 582761 ms.
  Though I must say, ten minutes is nominally more than two orders of
  magnitude performance improvement, versus several days.

 Hm.  There is no way that inserting a row is two orders of magnitude
 faster than updating a row --- they both require storing a new row and
 making whatever index entries are needed.  The only additional cost of
 the update is finding the old row (not a very big deal AFAICS in the
 examples you gave) and marking it deleted (definitely cheap).  So
 there's something awfully fishy going on here.

 I'm inclined to suspect an issue with foreign-key checking.  You didn't
 give us any details about foreign key relationships your cust table is
 involved in --- could we see those?  And the schemas of the other tables
 involved?

 Also, exactly which PG version is this?

 regards, tom lane

 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Scaling further up

2004-03-14 Thread Aaron Werman
Sorry about not chiming in before - I've been too swamped to think. I agree
with most of the points, but a lot of these posts are interesting and seem
to describe systems from an SA perspective rather than my DBA-centric view.

- Original Message - 
From: Marty Scholes [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, March 10, 2004 6:29 PM
Subject: Re: [PERFORM] Scaling further up


 I have some suggestions based on my anecdotal experience.

 1. This is a relatively small DB -- the working set will likely be in
 RAM at any moment in time, making read I/O time mostly irrelevant.

 2. The killer will be write times -- specifically log writes.  Small and
 heavily synchronized writes, log and data writes, will drag down an
 impressive hardware RAID setup.  We run mirrored hardware RAID 5 arrays
 with write back cache and are constantly seeking ways to improve write
 performance.  We do a lot of batch processing, though, so we do a lot of
 write I/Os.

My experience with RAID5 for streaming sequential writes is bad. This is
sometimes helped by the hardware caching covering the cost of the additional
I/Os for striping (write-through RAID5 + big cache acts like RAID 1+0 until
you run out of cache). Batch processing is different from high-concurrency
transactions because it needs faster volume streaming, while TP is dependent
on the speed of ack'ing (few big writes with less synchronous waits vs. lots
of small writes which serialize everyone). (RAID 3 worked for me in the past
for logging, but I haven't used it in years.)


 3. Be very careful with battery backed write cache.  It usually works
 as advertised.  More than once in the past decade I have seen
 spontaneous cache corruption after power loss.  The corruption usually
 happens when some admin, including me, has assumed that the cache will
 ALWAYS survive a power failure unblemished and has no plan B.  Make
 sure you have a contingency plan for corruption, or don't enable the
cache.

I agree strongly. There is also the same problem with disk write back cache
and even with SCSI controllers with write through enabled. PITR would help
here. A lot of these problems are due to procedural error post crash.


 4. RAID 10 will likely have bigger stripe sizes on the RAID 0 portion of
 the setup, and might hinder, not help small write I/O performance.

In a high volume system without write caching you are almost always going to
see queuing, which can make the larger buffer mostly irrelevant, if it's not
huge. Write caching thrives on big block sizes (which is a key reason why
Symmetrix doesn't do worse than it does) by reducing I/O counts. Most shops
I've set up or seen use mirroring or RAID 10 for logs. Note also that many
RAID 10 controllers in a non-write-cached setup allow a race between the two
writers, acknowledging when the first of the two completes - increasing
throughput by about 1/4.


 5. Most (almost all) of the I/O time will be due to the access time
 (head seek + head settle + rotational latency) and very little of the
 I/O time will due to data transfer time.  In other words, getting drives
 that provide faster transfer rates will barely improve performance.  The
 secret is lowering the access time.

True. This is very much a latency story. Even in volume batch work, the
access times often point to some other system-configuration bottleneck that
bites well before you hit raw I/O capacity.


 6. A relatively cheap way to drastically drop the access time is to get
 large drive(s) and only use a portion of them for storage.  The less
 space used on the drive, the less area the heads need to cover for
 seeks.  At one extreme, you could make the partition the size of a
 single cylinder.  This would make access time (ignoring OS and
 controller overhead) identical to rotational latency, which is as low as
 4.2 ms for a cheap 7200 RPM drive.

This is a good strategy for VLDB, and may not be relevant in this case.

Also - big sequential writes and 15K rpm drives, in the case of
write-through, are a beautiful thing - they look like a manufacturer's demo. A
primary performance role of an RDBMS is to convert random I/O to sequential
(by buffering reads and using a streaming log to defer random writes to
checkpoints). RDBMSs are the prime beneficiaries of the drive speed
improvements - since logging, backups, and copies are about the only things
(ignoring bad perl scripts and find commands) that generate loads of 50+
MB/sec.
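
A sketch of the postgresql.conf knobs behind that log/checkpoint behavior -
the values are illustrative only, not a recommendation:

    wal_buffers = 64            # wal buffer pages (8 kB each); default is 8
    checkpoint_segments = 32    # more segments spread checkpoints further apart
    checkpoint_timeout = 900    # seconds between forced checkpoints
    commit_delay = 0            # group-commit knob; only helps with many
                                # concurrent committers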

/Aaron

---(end of broadcast)---
TIP 8: explain analyze is your friend