Re: [PERFORM] Index on two columns not used

2006-10-21 Thread Péter Kovács

Sorry for the amateurish question, but what are heap tuples?

Also, my understanding is that the following statement applies only to 
composite indexes: PostgreSQL can't use the values stored in the index 
to check the join condition. I assume that PostgreSQL will be able to 
use single-column indexes for join conditions. Is this correct?


Thank you,
Peter

Heikki Linnakangas wrote:

Arnaud Lesauvage wrote:

I have two tables, each with a 2-column index.
In the first table, the first column of the index is the primary key, 
the second one is an integer field.

In the second table, the two columns are the primary key.
When I join these two tables, the 2-column index of the first table 
is not used.

Why does the query planner think that this plan is better?

ALTER TABLE geo.subcities_names
  ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid, 
language_id);


CREATE INDEX subcities_gid_language_id
  ON geo.subcities
  USING btree
  (gid, official_language_id);

EXPLAIN ANALYZE
SELECT * FROM geo.subcities sc, geo.subcities_names scn
WHERE sc.gid = scn.subcity_gid AND sc.official_language_id = 
scn.language_id;


My theory:

There are no additional restrictions besides the join condition, so the 
system has to scan both tables completely. It chooses to use a full 
index scan instead of a seq scan to be able to do a merge join. 
Because it's going to have to scan the indexes completely anyway, it 
chooses the smallest index, which is subcities_pkey.


You'd think that the system could do the merge using just the indexes, 
and only fetch the heap tuples for matches. If that were the case, 
using the 2-column index would indeed be a good idea. However, 
PostgreSQL can't use the values stored in the index to check the join 
condition, so all the heap tuples are fetched anyway. There was just 
recently discussion about this on this list: 
http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php.
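
(As a quick sanity check, something along these lines should show the 
relative index sizes the planner is choosing between; pg_relation_size 
is the 8.1 built-in, adjust to your installation:)

SELECT relname, pg_relation_size(oid) AS size_bytes
FROM pg_class
WHERE relname IN ('subcities_pkey', 'subcities_gid_language_id');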






Re: [PERFORM] pgBench on Windows

2006-10-21 Thread Magnus Hagander
 Hello Performancers,
 
 Does anyone have a pgBench tool running on Windows?

Does the one that ships in the installer not work?

//Magnus



Re: [PERFORM] Is ODBC that slow?

2006-10-21 Thread Carlo Stonebanks
 carlo: please, please, get your mail server to quit telling me your
 mailbox is full :)

Merlin, sorry about that. This is the first I've heard of it.

Carlo 





[PERFORM] Optimizing disk throughput on quad Opteron

2006-10-21 Thread John Philips
Hello,

I'm working out specs for a new database server to be
purchased for our organization.  The applications the
server will handle are mainly related to network
operations (monitoring, logging, statistical/trend
reports, etc.).  Disk I/O will be especially high in
relation to processing network stats.

You can find a diagram of my initial
spec here: 
http://img266.imageshack.us/img266/9171/dbserverdiagramuc3.jpg

Server will be a HP ProLiant DL585 G2 with four
dual-core 2.6GHz processors and 8GB of RAM.

I can always throw in more RAM.  I'm trying to find
the most effective way to maximize disk throughput, as
the list archives suggest that it is the choke point
in most cases.  I separated the storage into multiple
arrays on multiple controllers, and plan to have 512MB
RAM on each controller using BBWC.  The plan is to
utilize multiple tablespaces as well as partitioned
tables when necessary.  (Note that the
StorageWorks 4214R enclosure with Ultra3 disks is used
because we already have it lying around.)
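
For reference, the partitioning I have in mind is the usual
inheritance-based approach, roughly like this (table and column
names are just placeholders):

CREATE TABLE netstats (
  sample_time timestamptz NOT NULL,
  device_id   integer NOT NULL,
  bytes_in    bigint,
  bytes_out   bigint
);

-- one child table per month, constrained so the planner can skip it
CREATE TABLE netstats_2006_10 (
  CHECK (sample_time >= '2006-10-01' AND sample_time < '2006-11-01')
) INHERITS (netstats);

-- needed for the planner to prune partitions (8.1+)
SET constraint_exclusion = on;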

I heard some say that the transaction log should be on
its own array; others say it doesn't hurt to have it
on the same array as the OS.  Is it really worthwhile
to put it on its own array?

Can you guys see any glaring bottlenecks in my layout?
Any other suggestions to offer (throw in more
controllers, different RAID layout, etc.)?  Our budget
limit is $50k.

Thanks!

P.S. I know there was a very similar thread started by
Ben Suffolk recently, I'd still like to have your
eyes of experience look at my proposed layout :-)












Re: [PERFORM] Is ODBC that slow?

2006-10-21 Thread Joshua D. Drake

 Yeah, but they had to back off from that plan, and AFAIK it only uses
 libpq for the auth stuff and then switches to dealing with the protocol
 directly.
 
 I don't know what the reasoning was, though :-)  I guess Joshua would
 know.  I'm not involved in that project.  I only know that recently a
 user posted some measurements showing that ODBCng was way slower than
 psqlODBC, and it was discovered that it was using v3 Prepare/Bind/
 Execute, which was problematic performance-wise due to the planner
 issues with that.  So AFAIK it currently parses the statements
 internally before passing them to the server.

That is correct: we were using PostgreSQL server-side prepare, which has
been shown to be ridiculously slow. So we moved to client-side prepare,
and ODBCng now moves very, very quickly.
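
(The distinction, roughly: with server-side prepare the statement is
planned once without knowing the parameter values, e.g.:

PREPARE stats_q (integer) AS
  SELECT * FROM some_table WHERE some_col = $1;  -- generic plan, values unknown
EXECUTE stats_q (42);

whereas client-side prepare interpolates the values and sends an
ordinary statement, so the planner sees the actual constants. Table and
column names above are placeholders.)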

You can see results here:

http://projects.commandprompt.com/public/odbcng/wiki/Performance

As in, it moves quickly enough to compete with other bindings such as
DBD::Pg.

One of the libpq behaviors we wanted to avoid was receiving the entire
result set before anything is handed to the client. With ODBCng we have
a buffering option that receives all results directly over the wire.

This can increase performance quite a bit in specific circumstances, but
it also has the downside of using more memory on the ODBC client.

We also have a security-through-obscurity feature as described here:

http://projects.commandprompt.com/public/odbcng/wiki/PatternMatch


Sincerely,

Joshua D. Drake





-- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [PERFORM] Optimizing disk throughput on quad Opteron

2006-10-21 Thread Joshua D. Drake

 I heard some say that the transaction log should be on
 it's own array, others say it doesn't hurt to have it
 on the same array as the OS.  Is it really worthwhile
 to put it on it's own array?
 
 Can you guys see any glaring bottlenecks in my layout?
  Any other suggestions to offer (throw in more
 controllers, different RAID layout, etc.)?  Our budget
 limit is $50k.

You should easily be able to fit within $50k since you already have the
storage device. I would suggest the following:

1. Throw in as much RAM as you can.
2. Yes, put the transaction logs on a separate array. There are a couple
of reasons for this:

  1. Transaction logs are written sequentially, so a RAID 1 is enough.
  2. You don't have to use a journaled fs for the transaction logs, so it
is really fast.

3. IIRC the MSA 30 can take 14 drives. Make sure you put in all 14
drives and delegate two of them to hot-spare duty.

I actually wonder if you would be better off putting the indexes on
tablespace A and keeping your core data set on the larger StorageWorks
array...
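
Something along these lines, for example (a minimal sketch; the
tablespace name, path, and index definition are invented for
illustration):

-- tablespace on the array dedicated to indexes (path is hypothetical)
CREATE TABLESPACE index_space LOCATION '/mnt/array_a/pg_indexes';

-- hypothetical index; only the TABLESPACE clause matters here
CREATE INDEX netstats_device_id_idx
  ON netstats (device_id)
  TABLESPACE index_space;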

Sincerely,

Joshua D. Drake




 

-- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [PERFORM] Optimizing disk throughput on quad Opteron

2006-10-21 Thread Ben Suffolk

You can find a diagram of my initial
spec here:
http://img266.imageshack.us/img266/9171/dbserverdiagramuc3.jpg

Can you guys see any glaring bottlenecks in my layout?
 Any other suggestions to offer (throw in more
controllers, different RAID layout, etc.)?  Our budget
limit is $50k.


The thing I would ask is: would you not be better off with SAS drives?

Given the comments on Dell and the highlighted issues, I have been  
looking at HP and the Smart Array P600 controller with 512MB BBWC.  
Although I am looking to stick with the 8 internal disks, rather than  
use external ones.

The HP Smart Array 50 is the external array for SAS drives. I have not  
really looked into it much, though.


Regards

Ben


