Re: [PERFORM] select count(*) performance

2007-08-13 Thread valgog
On Aug 11, 5:54 pm, Detlef Rudolph [EMAIL PROTECTED] wrote:
 Hello Group,

 I've tried VACUUM ANALYSE; that doesn't help
 much, but VACUUM FULL improves performance, down
 from about 40 secs to 8. I think in future I would
 use the reltuples value from pg_class for the table.

 Thanks a lot for your answers and a good Sunday,
 Det

Just do not forget that reltuples is computed and updated in pg_class
only during the vacuuming or analyzing of a table... so the value is
only an APPROXIMATION.
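
For reference, a minimal sketch of that catalog lookup (the table name
is illustrative; the accuracy of the estimate depends on how recently
the table was vacuumed or analyzed):

SELECT relname, reltuples::bigint AS approx_rows
  FROM pg_class
 WHERE relname = 'mytable';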

-- Valentine




Re: [PERFORM] Dell Hardware Recommendations

2007-08-13 Thread Vivek Khera


On Aug 10, 2007, at 4:36 PM, Merlin Moncure wrote:


I'm not so sure I agree.  They are using LSI firmware now (and so is
everyone else).  The servers are well built (highly subjective, I
admit) and configurable.  I have had some bad experiences with IBM
gear (Adaptec controller though), and white box parts, 3ware, etc.  I
can tell you that Dell got us the storage and the server in record
time.

I do agree on Adaptec, however.


Ok, perhaps you got luckier... I have two PowerVault 220 rack mounts
with U320 SCSI drives in them. With an LSI 320-2X controller, it
*refuses* to recognize some of the drives (channel 1 on either
array).  Dell blames LSI, LSI blames Dell's backplane.  This is
consistent across multiple controllers we tried, and two different
Dell disk arrays.  Dropping the SCSI speed to 160 is the only way to
make them work.  I tend to believe LSI here.


The Adaptec 2230SLP controller recognizes the arrays fine, but tends
to drop devices at inopportune moments.  Re-seating dropped devices
starts a rebuild, but the speed is recognized as 1 and the rebuild
takes two lifetimes to complete unless you insert a reboot of the
system in there.  Totally unacceptable.  Again, dropping the SCSI
rate to 160 seems to make it more stable.





Re: [PERFORM] How to ENABLE SQL capturing???

2007-08-13 Thread Jonathan Ellis
On 8/10/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

 smiley2211 wrote:
  Jeff,
 
  You are CORRECT...my queries were going to /var/log/messages...had to get
  the Linux Admin to grant me READ access to the file...

 You may want to actually get that to stop. Syslog is a notorious
 performance bottleneck for PostgreSQL.

Can you elaborate?  The only reference to this I could find was a
thread from 2004 where someone wasn't rotating his logs.

-Jonathan



Re: [PERFORM] Help optimize view

2007-08-13 Thread Relyea, Mike
 -----Original Message-----
 From: Tom Lane [mailto:[EMAIL PROTECTED] 
 Sent: Friday, August 10, 2007 5:44 PM
 To: Relyea, Mike
 Cc: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Help optimize view 
 
 Try increasing join_collapse_limit --- you have just enough 
 tables here that the planner isn't going to consider all 
 possible join orders.
 And it sorta looks like it's picking a bad one.
 
   regards, tom lane
 

I tried increasing join_collapse_limit; a different plan was chosen,
but with no significant change in run time.

I've included a re-send of my original post; it looks like it didn't go
through - it's not in the archives.  I've also included an explain
analyze before and after the join_collapse_limit change.

I have the following view as part of a larger, aggregate query that is
running slower than I'd like.  There are 4 views total, each very
similar to this one.  Each of the views is then left joined with data
from some other tables to give me the final result that I'm looking for.

I'm hoping that if I can get some insight in to how to make this view
execute faster, I can apply that learning to the other 3 views and
thereby decrease the run time for my aggregate query.

I'm running 8.2.4 on Windows XP with a single 10K rpm disk dedicated to
the data directory and 1.5 GB memory.
shared_buffers = 12288
work_mem = 262144
maintenance_work_mem = 131072
max_fsm_pages = 204800
random_page_cost = 2.0
effective_cache_size = 10000
autovacuum = on



EXPLAIN ANALYZE SELECT PrintSamples.MachineID,
PrintSamples.PrintCopyID, tblColors.ColorID,
avg(ParameterValues.ParameterValue) AS Mottle_NMF
   FROM AnalysisModules
   JOIN (tblColors
   JOIN (tblTPNamesAndColors
   JOIN PrintSamples ON tblTPNamesAndColors.TestPatternName::text
= PrintSamples.TestPatternName::text
   JOIN (DigitalImages
   JOIN PrintSampleAnalyses ON DigitalImages.ImageID =
PrintSampleAnalyses.ImageID
   JOIN (ParameterNames
   JOIN (Measurements
   JOIN ParameterValues ON Measurements.MeasurementID =
ParameterValues.MeasurementID) ON ParameterNames.ParameterID =
ParameterValues.ParameterID) ON PrintSampleAnalyses.psaID =
Measurements.psaID) ON PrintSamples.PrintSampleID =
DigitalImages.PrintSampleID) ON tblColors.ColorID =
tblTPNamesAndColors.ColorID) ON AnalysisModules.MetricID =
Measurements.MetricID
  GROUP BY PrintSamples.MachineID, PrintSamples.PrintCopyID,
tblColors.ColorID, AnalysisModules.AnalysisModuleName,
ParameterNames.ParameterName, PrintSamples.TestPatternName
 HAVING AnalysisModules.AnalysisModuleName::text = 'NMF'::text AND
ParameterNames.ParameterName::text = 'NMF'::text AND
tblColors.ColorID < 3 AND PrintSamples.TestPatternName::text ~~
'IQAF-TP8%'::text;

HashAggregate  (cost=519801.96..519898.00 rows=7683 width=70) (actual time=121101.027..121146.385 rows=14853 loops=1)
  ->  Hash Join  (cost=286101.76..519667.51 rows=7683 width=70) (actual time=52752.600..120989.713 rows=15123 loops=1)
        Hash Cond: (Measurements.MetricID = AnalysisModules.MetricID)
        ->  Hash Join  (cost=286099.98..519260.45 rows=87588 width=61) (actual time=52752.502..120933.784 rows=15123 loops=1)
              Hash Cond: (ParameterValues.MeasurementID = Measurements.MeasurementID)
              ->  Nested Loop  (cost=8054.81..238636.75 rows=454040 width=21) (actual time=165.510..67811.086 rows=289724 loops=1)
                    ->  Seq Scan on ParameterNames  (cost=0.00..1.94 rows=1 width=17) (actual time=0.012..0.026 rows=1 loops=1)
                          Filter: ((ParameterName)::text = 'NMF'::text)
                    ->  Bitmap Heap Scan on ParameterValues  (cost=8054.81..231033.70 rows=608089 width=12) (actual time=165.481..67094.656 rows=289724 loops=1)
                          Recheck Cond: (ParameterNames.ParameterID = ParameterValues.ParameterID)
                          ->  Bitmap Index Scan on PVParameterID_idx  (cost=0.00..7902.79 rows=608089 width=0) (actual time=141.013..141.013 rows=289724 loops=1)
                                Index Cond: (ParameterNames.ParameterID = ParameterValues.ParameterID)
              ->  Hash  (cost=259861.12..259861.12 rows=1454724 width=48) (actual time=52573.270..52573.270 rows=961097 loops=1)
                    ->  Hash Join  (cost=8139.75..259861.12 rows=1454724 width=48) (actual time=1399.575..50896.641 rows=961097 loops=1)
                          Hash Cond: (Measurements.psaID = PrintSampleAnalyses.psaID)
                          ->  Seq Scan on Measurements  (cost=0.00..199469.09 rows=7541009 width=12) (actual time=6.697..37199.702 rows=7539838 loops=1)
                          ->  Hash  (cost=7949.67..7949.67 rows=15206 width=44) (actual time=1392.743..1392.743 rows=18901 loops=1)
                                ->  Hash Join  (cost=5069.24..7949.67 rows=15206 width=44) (actual time=986.589..1358.908 rows=18901 loops=1)
                                      Hash Cond: (PrintSampleAnalyses.ImageID = DigitalImages.ImageID)
                                      ->  Seq Scan on PrintSampleAnalyses  (cost=0.00..2334.25 rows=78825 width=8) (actual time=13.747..158.867 rows=78859 loops=1)
                                      ->  Hash  (cost=4879.10..4879.10 rows=15211

Re: [PERFORM] How to ENABLE SQL capturing???

2007-08-13 Thread Joshua D. Drake

Jonathan Ellis wrote:
 On 8/10/07, Joshua D. Drake [EMAIL PROTECTED] wrote:

 smiley2211 wrote:
 Jeff,

 You are CORRECT...my queries were going to /var/log/messages...had to get
 the Linux Admin to grant me READ access to the file...
 You may want to actually get that to stop. Syslog is a notorious
 performance bottleneck for PostgreSQL.
 
 Can you elaborate?  The only reference to this I could find was a
 thread from 2004 where someone wasn't rotating his logs.

I am not sure what to elaborate on :). Syslog is slow, logging to file
isn't. Although both will certainly slow down your installation quite a
bit, syslog will slow it down more.

If I recall correctly, it is because syslog is blocking.
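
For reference, a sketch of the 8.2-era postgresql.conf settings for
logging to a file instead of syslog (the directory and filename values
are illustrative):

log_destination = 'stderr'
redirect_stderr = on            # renamed logging_collector in later releases
log_directory = 'pg_log'        # relative to the data directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'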

Joshua D. Drake



 
 -Jonathan
 


--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] How to ENABLE SQL capturing???

2007-08-13 Thread Joshua D. Drake

Gregory Stark wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 
 If I recall correctly, it is because syslog is blocking.
 
 Are you sure it isn't just that syslog fsyncs its log files after every log
 message?

Nope, I am not sure at all ;). Darcy actually found the issue and can
speak better to it; I never use syslog and have always logged directly
to file.

 I don't think the individual syslogs are synchronous but if syslog
 falls behind the buffer will fill and throttle the sender.
 
 If your Postgres data is on the same device as the syslogs those fsyncs will
 probably cause a big slowdown directly on Postgres's I/O as well.
 
 You can turn off the fsyncs in syslog by putting a "-" before the filename.
 


--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] How to ENABLE SQL capturing???

2007-08-13 Thread Gregory Stark
Joshua D. Drake [EMAIL PROTECTED] writes:

 If I recall correctly, it is because syslog is blocking.

Are you sure it isn't just that syslog fsyncs its log files after every log
message? I don't think the individual syslogs are synchronous but if syslog
falls behind the buffer will fill and throttle the sender.

If your Postgres data is on the same device as the syslogs those fsyncs will
probably cause a big slowdown directly on Postgres's I/O as well.

You can turn off the fsyncs in syslog by putting a "-" before the filename.
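
For reference, a sketch of that syslog.conf change (the facility and
file path are illustrative):

# default: syslog fsyncs after every message
local0.*        /var/log/postgresql.log
# a leading "-" disables the per-message fsync
local0.*        -/var/log/postgresql.log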

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [PERFORM] Help optimize view

2007-08-13 Thread Kevin Grittner
 On Mon, Aug 13, 2007 at 10:35 AM, in message
[EMAIL PROTECTED], Relyea,
Mike [EMAIL PROTECTED] wrote: 
 I'm running 8.2.4 on Windows XP with 1.5 GB memory.
 shared_buffers = 12288
 effective_cache_size = 10000
 
For starters, you might want to adjust one or both of these.  It looks to me
like you're telling it that it only has 78.125 MB cache space.  That will
make it tend to want to scan entire tables, on the assumption that the cache
hit ratio will be poor for random reads.
 
Since you're on 8.2.4, you can use units of measure to help make this easier
to read.  You could, for example, say:
 
shared_buffers = 96MB
effective_cache_size = 1200MB
 
-Kevin
 





Re: [PERFORM] Help optimize view

2007-08-13 Thread Relyea, Mike
  On Mon, Aug 13, 2007 at 10:35 AM, in message
 [EMAIL PROTECTED]
 .net, Relyea, Mike [EMAIL PROTECTED] wrote: 
  I'm running 8.2.4 on Windows XP with 1.5 GB memory.
  shared_buffers = 12288
  effective_cache_size = 10000
  
 For starters, you might want to adjust one or both of these.  
 It looks to me like you're telling it that it only has 78.125 
 MB cache space.  That will make it tend to want to scan 
 entire tables, on the assumption that the cache hit ratio 
 will be poor for random reads.
  
 Since you're on 8.2.4, you can use units of measure to help 
 make this easier to read.  You could, for example, say:
  
 shared_buffers = 96MB
 effective_cache_size = 1200MB
  
 -Kevin

I've increased shared_buffers to 128MB, and restarted the server.  My
total run time didn't really change.

SELECT set_config('effective_cache_size', '1000MB', false);
(I have another app that uses about 500MB.)
SELECT set_config('join_collapse_limit', '20', false);

explain analyze SELECT PrintSamples.MachineID,
PrintSamples.PrintCopyID, tblColors.ColorID,
avg(ParameterValues.ParameterValue) AS Mottle_NMF
   FROM AnalysisModules
   JOIN (tblColors
   JOIN (tblTPNamesAndColors
   JOIN PrintSamples ON tblTPNamesAndColors.TestPatternName::text
= PrintSamples.TestPatternName::text
   JOIN (DigitalImages
   JOIN PrintSampleAnalyses ON DigitalImages.ImageID =
PrintSampleAnalyses.ImageID
   JOIN (ParameterNames
   JOIN (Measurements
   JOIN ParameterValues ON Measurements.MeasurementID =
ParameterValues.MeasurementID) ON ParameterNames.ParameterID =
ParameterValues.ParameterID) ON PrintSampleAnalyses.psaID =
Measurements.psaID) ON PrintSamples.PrintSampleID =
DigitalImages.PrintSampleID) ON tblColors.ColorID =
tblTPNamesAndColors.ColorID) ON AnalysisModules.MetricID =
Measurements.MetricID
  GROUP BY PrintSamples.MachineID, PrintSamples.PrintCopyID,
tblColors.ColorID, AnalysisModules.AnalysisModuleName,
ParameterNames.ParameterName, PrintSamples.TestPatternName
 HAVING AnalysisModules.AnalysisModuleName::text = 'NMF'::text AND
ParameterNames.ParameterName::text = 'NMF'::text AND
tblColors.ColorID < 3 AND PrintSamples.TestPatternName::text ~~
'IQAF-TP8%'::text;

HashAggregate  (cost=489274.71..489372.94 rows=7858 width=70) (actual time=117632.844..117663.228 rows=14853 loops=1)
  ->  Hash Join  (cost=256774.03..489137.20 rows=7858 width=70) (actual time=50297.022..117530.665 rows=15123 loops=1)
        Hash Cond: (ParameterValues.MeasurementID = Measurements.MeasurementID)
        ->  Nested Loop  (cost=8054.81..238636.75 rows=454040 width=21) (actual time=172.341..66959.288 rows=289724 loops=1)
              ->  Seq Scan on ParameterNames  (cost=0.00..1.94 rows=1 width=17) (actual time=0.020..0.034 rows=1 loops=1)
                    Filter: ((ParameterName)::text = 'NMF'::text)
              ->  Bitmap Heap Scan on ParameterValues  (cost=8054.81..231033.70 rows=608089 width=12) (actual time=172.297..66241.380 rows=289724 loops=1)
                    Recheck Cond: (ParameterNames.ParameterID = ParameterValues.ParameterID)
                    ->  Bitmap Index Scan on PVParameterID_idx  (cost=0.00..7902.79 rows=608089 width=0) (actual time=147.690..147.690 rows=289724 loops=1)
                          Index Cond: (ParameterNames.ParameterID = ParameterValues.ParameterID)
        ->  Hash  (cost=247087.84..247087.84 rows=130510 width=57) (actual time=50109.022..50109.022 rows=15123 loops=1)
              ->  Hash Join  (cost=8141.52..247087.84 rows=130510 width=57) (actual time=11095.022..50057.777 rows=15123 loops=1)
                    Hash Cond: (Measurements.psaID = PrintSampleAnalyses.psaID)
                    ->  Hash Join  (cost=1.77..234364.57 rows=661492 width=21) (actual time=31.457..48123.380 rows=289724 loops=1)
                          Hash Cond: (Measurements.MetricID = AnalysisModules.MetricID)
                          ->  Seq Scan on Measurements  (cost=0.00..199469.09 rows=7541009 width=12) (actual time=10.920..37814.792 rows=7539838 loops=1)
                          ->  Hash  (cost=1.71..1.71 rows=5 width=17) (actual time=0.066..0.066 rows=5 loops=1)
                                ->  Seq Scan on AnalysisModules  (cost=0.00..1.71 rows=5 width=17) (actual time=0.032..0.049 rows=5 loops=1)
                                      Filter: ((AnalysisModuleName)::text = 'NMF'::text)
                    ->  Hash  (cost=7949.67..7949.67 rows=15206 width=44) (actual time=1424.025..1424.025 rows=18901 loops=1)
                          ->  Hash Join  (cost=5069.24..7949.67 rows=15206 width=44) (actual time=1007.901..1387.787 rows=18901 loops=1)
                                Hash Cond: (PrintSampleAnalyses.ImageID = DigitalImages.ImageID)
                                ->  Seq Scan on PrintSampleAnalyses  (cost=0.00..2334.25 rows=78825 width=8) (actual time=4.432..153.090 rows=78859 loops=1)
                                ->  Hash  (cost=4879.10..4879.10 rows=15211 width=44) (actual time=1003.424..1003.424 rows=18901 loops=1)
                                      ->  Hash Join  (cost=2220.11..4879.10 rows=15211 width=44) (actual time=348.841..968.194 rows=18901 loops=1)
                                            Hash Cond: (DigitalImages.PrintSampleID = PrintSamples.PrintSampleID)
                                            ->  Seq Scan on DigitalImages

[PERFORM] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Julius Stroffek

Hi All,

Tomas Kovarik and I have presented at PGCon 2007 in Ottawa
the ideas about other possible optimizer algorithms to be used
in PostgreSQL.

We are quite new to the PostgreSQL project, so it took us some
time to go through the sources and explore the possibilities
for how things could be implemented.

There is a proposal attached to this mail about the interface
we would like to implement for switching between different
optimizers. Please review it and provide feedback to us.
Thank you.

Regards

Julius Stroffek

Proposal for Pluggable Optimizer Interface
==========================================

Overview
========

At PGCon 2007 in Ottawa we presented a couple of other
approaches and algorithms that can be used for query optimization; see
http://www.pgcon.org/2008/papers/Execution_Plan_Optimization_Techniques_Julius_Stroffek.pdf
We focused on algorithms for searching the space of
all possible join orders, including bushy trees.
The possible algorithms include:

 * Dynamic Programming (already implemented
   in PostgreSQL)
 * Genetic Algorithm (already implemented
   in PostgreSQL)
 * Dijkstra's Algorithm
 * A* Search Algorithm
 * Greedy “Nearest Neighbor” Algorithm
 * Hill-Climbing
 * Simulated Annealing
 * Iterative Improvement
 * Two Phase Optimization
 * Toured Simulated Annealing

Choosing the best algorithm from the above list is
difficult. We have to weigh the length of the optimizer
computation against the quality of the solution we would
like to achieve. We may want to do some hybrid optimization
– run a couple of the algorithms from the list above and choose
the best solution found. We might run some very fast
algorithm at the beginning and, depending on the solution
cost, decide whether it is worthwhile to try
to optimize the plan even more (using another algorithm
with a longer running time but producing better solutions).
Therefore we would like to propose an interface which
can be used to switch between different optimizer algorithms
and/or allow users to write and use their own implementations.
It would allow the community to ship more optimizer
algorithms as contrib modules, and users could then decide
which of those algorithms should be used for their queries.

Creating an optimizer
=====================

We propose to create a catalog holding the available
optimizers in the system, called pg_optimizer. Users could
then use a SET command to switch between different
optimizers.

postgres=# select * from pg_optimizer;
 optname |      optproc
---------+-------------------
 geqo    | geqo_optimizer
 dynamic | dynamic_optimizer
 greedy  | greedy_optimizer
(3 rows)

postgres=# set optimizer=greedy;
SET


Optimizer Invocation Point
==========================
There is code in the function make_rel_from_joinlist which
decides whether to invoke dynamic programming or the genetic
algorithm for query optimization. We propose to place
the invocation of the plugged-in optimizer at the same place,
with the same parameters with which the functions geqo and
make_one_rel_by_joins are currently invoked.

Creating and dropping an optimizer
==================================
The optimizer function has to be implemented as a C-language
function using “Version 1 Calling Conventions”. The return
type of the function is RelOptInfo * and the arguments
passed to the function are

 1. PlannerInfo *root
 2. int levels_needed
 3. List *initial_rels

The proper “CREATE FUNCTION” statement has to be used
to create the optimizer function.

 CREATE FUNCTION greedyoptimizer(internal, int, internal)
 RETURNS internal
 AS 'mylib', 'greedy_optimizer'
 LANGUAGE C;

Once the optimizer function is created, the user may create
an optimizer using the function with the statement

 CREATE OPTIMIZER greedy (
 function = greedyoptimizer,
 comment = 'Greedy Nearest Neighbor Optimizer'
 );

If the user decides not to use the optimizer anymore,
he can invoke

 DROP OPTIMIZER greedy;

The user also has to drop the optimizer function with

 DROP FUNCTION greedyoptimizer;

Project TODO List
=================
1. Create a pg_optimizer catalog to hold the available
   optimizers.
2. Create wrappers above the current dynamic
   programming and genetic algorithm optimizers
   to be used to call those implementations.
3. Modify the parser and add the functions to handle
   and execute the CREATE/DROP OPTIMIZER statements.
4. Modify GUC so that it is possible to switch
   between optimizers.
5. Change the code at the optimizer invocation
   point so that the appropriate optimizer function
   is called.
6. Handle object dependencies – make an entry
   in pg_depend recording that an optimizer depends
   on its function.
7. Implement a '\dO' command that will list the
   available optimizers.
8. Create a contrib module and ship some other
   optimizer algorithms.
9. Any other suggestions, comments, and changes
   that come out of the review of this
   proposal.

Things to Decide
================
1. Rights. Who can create/drop optimizers?
   Who can use 

Re: [PERFORM] Help optimize view

2007-08-13 Thread Kevin Grittner
 On Mon, Aug 13, 2007 at  1:48 PM, in message
[EMAIL PROTECTED], Relyea,
Mike [EMAIL PROTECTED] wrote: 
 I've increased shared_buffers to 128MB, and restarted the server.  My
 total run time didn't really change.
 
Please forgive me if this guess doesn't help either, but could you try
eliminating the GROUP BY items which don't echo values in the select
list, and moving the HAVING conditions to a WHERE clause?  Something like:
 
explain analyze
SELECT
PrintSamples.MachineID,
PrintSamples.PrintCopyID,
tblColors.ColorID,
avg(ParameterValues.ParameterValue) AS Mottle_NMF
  FROM AnalysisModules
  JOIN
  (
tblColors
JOIN
(
  tblTPNamesAndColors
  JOIN PrintSamples
ON (tblTPNamesAndColors.TestPatternName::text = 
PrintSamples.TestPatternName::text)
  JOIN
  (
DigitalImages
JOIN PrintSampleAnalyses
  ON (DigitalImages.ImageID = PrintSampleAnalyses.ImageID)
JOIN
(
  ParameterNames
  JOIN
  (
Measurements
JOIN ParameterValues
  ON Measurements.MeasurementID = 
ParameterValues.MeasurementID
  ) ON ParameterNames.ParameterID = ParameterValues.ParameterID
) ON PrintSampleAnalyses.psaID = Measurements.psaID
  ) ON PrintSamples.PrintSampleID = DigitalImages.PrintSampleID
) ON tblColors.ColorID = tblTPNamesAndColors.ColorID
  ) ON AnalysisModules.MetricID = Measurements.MetricID
  WHERE AnalysisModules.AnalysisModuleName::text = 'NMF'::text
AND ParameterNames.ParameterName::text = 'NMF'::text
AND PrintSamples.TestPatternName::text ~~ 'IQAF-TP8%'::text
    AND tblColors.ColorID < 3
  GROUP BY
PrintSamples.MachineID,
PrintSamples.PrintCopyID,
tblColors.ColorID
;
 
I'd also be inclined to simplify the FROM clause by eliminating the parentheses 
and putting the ON conditions closer to where they are used, but that would be 
more for readability than any expectation that it would affect the plan.
 
-Kevin
 




Re: [PERFORM] Help optimize view

2007-08-13 Thread Tom Lane
Relyea, Mike [EMAIL PROTECTED] writes:
 I've increased shared_buffers to 128MB, and restarted the server.  My
 total run time didn't really change.

It doesn't look like you can hope for much in terms of improving the
plan.  The bulk of the time is going into scanning ParameterValues and
Measurements, but AFAICS there is no way for the query to pull fewer
rows from those tables than it is doing, and the size of the join means
that a nestloop indexscan is likely to suck.  (You could try forcing one
by setting enable_hashjoin and enable_mergejoin to OFF, but I don't have
much hope for that.)
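
For reference, a sketch of the experiment being suggested (the settings
affect only the current session; re-run the problem query in place of
the ellipsis):

SET enable_hashjoin = off;
SET enable_mergejoin = off;
EXPLAIN ANALYZE SELECT ...;
RESET enable_hashjoin;
RESET enable_mergejoin;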

If you haven't played with work_mem yet, increasing that might make the
hash joins go a bit faster --- but it looks like most of the time is
going into the raw relation scans, so there's not going to be a lot of
win to be had there either.

Basically, joining lots of rows like this takes a while.  If you have to
have a faster answer, I can only suggest rethinking your table design.
Sometimes denormalization of the schema is necessary for performance.

regards, tom lane



Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Stefan Kaltenbrunner
Julius Stroffek wrote:
 Hi All,
 
 Tomas Kovarik and I have presented at PGCon 2007 in Ottawa
 the ideas about other possible optimizer algorithms to be used
 in PostgreSQL.
 
 We are quite new to PostgreSQL project so it took us some
 time to go through the sources end explore the possibilities
 how things could be implemented.
 
 There is a proposal attached to this mail about the interface
 we would like to implement for switching between different
 optimizers. Please review it and provide a feedback to us.
 Thank You.

hmm - how is that proposal different from what got implemented with:

http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php


Stefan



[PERFORM] Stable function optimisation

2007-08-13 Thread Philipp Specht

Hello!

Here's my test database:

# table
CREATE TABLE public.t
(
  id integer NOT NULL,
  a integer NOT NULL,
  CONSTRAINT pk_t PRIMARY KEY (id)
);
CREATE INDEX idx_t_a
  ON public.t
  USING btree
  (a);

# function
CREATE OR REPLACE FUNCTION public.f()
  RETURNS integer AS
$BODY$BEGIN
RETURN 1;
END$BODY$
  LANGUAGE 'plpgsql' STABLE;

# view
CREATE OR REPLACE VIEW public.v AS
 SELECT t.id, t.a
   FROM public.t
  WHERE public.f() = t.a;



# f() is stable

test=# explain analyze select * from public.v;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..1991.00 rows=51200 width=8) (actual time=0.060..458.476 rows=50003 loops=1)
   Filter: (f() = a)
 Total runtime: 626.341 ms
(3 rows)

# changing f() to immutable

test=# explain analyze select * from public.v;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..1741.00 rows=51200 width=8) (actual time=0.165..199.215 rows=50003 loops=1)
   Filter: (1 = a)
 Total runtime: 360.819 ms
(3 rows)

# changing f() to volatile

test=# explain analyze select * from public.v;
                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..1991.00 rows=5 width=8) (actual time=0.217..560.426 rows=50003 loops=1)
   Filter: (f() = a)
 Total runtime: 732.655 ms
(3 rows)



The biggest question here is: why is the runtime of the query with
the stable function not near the runtime of the immutable function?
It's definitely one query, and the manual states that a stable
function's result does not change within one statement and can
therefore be optimised.


Is this a pg problem or did I do something wrong?

Thank you for your help!

Philipp



Re: [PERFORM] Help optimize view

2007-08-13 Thread Relyea, Mike
  On Mon, Aug 13, 2007 at  1:48 PM, in message
 [EMAIL PROTECTED]
 .net, Relyea, Mike [EMAIL PROTECTED] wrote: 
  I've increased shared_buffers to 128MB, and restarted the 
 server.  My 
  total run time didn't really change.
  
 Please forgive me if this guess doesn't help either, but
 could you try eliminating the GROUP BY items which don't
 echo values in the select list, and moving the HAVING
 conditions to a WHERE clause?  Something like:
  
 explain analyze
 SELECT
 PrintSamples.MachineID,
 PrintSamples.PrintCopyID,
 tblColors.ColorID,
 avg(ParameterValues.ParameterValue) AS Mottle_NMF
   FROM AnalysisModules
   JOIN
   (
 tblColors
 JOIN
 (
   tblTPNamesAndColors
   JOIN PrintSamples
 ON (tblTPNamesAndColors.TestPatternName::text = 
 PrintSamples.TestPatternName::text)
   JOIN
   (
 DigitalImages
 JOIN PrintSampleAnalyses
   ON (DigitalImages.ImageID = 
 PrintSampleAnalyses.ImageID)
 JOIN
 (
   ParameterNames
   JOIN
   (
 Measurements
 JOIN ParameterValues
   ON Measurements.MeasurementID = 
 ParameterValues.MeasurementID
   ) ON ParameterNames.ParameterID = 
 ParameterValues.ParameterID
 ) ON PrintSampleAnalyses.psaID = Measurements.psaID
   ) ON PrintSamples.PrintSampleID = 
 DigitalImages.PrintSampleID
 ) ON tblColors.ColorID = tblTPNamesAndColors.ColorID
   ) ON AnalysisModules.MetricID = Measurements.MetricID
   WHERE AnalysisModules.AnalysisModuleName::text = 'NMF'::text
 AND ParameterNames.ParameterName::text = 'NMF'::text
 AND PrintSamples.TestPatternName::text ~~ 'IQAF-TP8%'::text
 AND tblColors.ColorID < 3
   GROUP BY
 PrintSamples.MachineID,
 PrintSamples.PrintCopyID,
 tblColors.ColorID
 ;
  
 I'd also be inclined to simplify the FROM clause by 
 eliminating the parentheses and putting the ON conditions 
 closer to where they are used, but that would be more for 
 readability than any expectation that it would affect the plan.
  
 -Kevin

Thanks for your help.  Re-writing the view like this may have bought me
something.  I've pasted the explain analyze results below.  It's tough to
tell because I also increased some of the statistics.  From what Tom
says, it sounds like if I want the data returned faster I'm likely to
have to get beefier hardware.

ALTER TABLE ParameterValues ALTER MeasurementID SET STATISTICS 500;

ALTER TABLE ParameterValues ALTER ParameterID SET STATISTICS 500;

ANALYZE ParameterValues;

ALTER TABLE Measurements ALTER COLUMN MetricID SET STATISTICS 500;

ALTER TABLE Measurements ALTER COLUMN psaID SET STATISTICS 500;

ANALYZE Measurements;

Running the above SQL:

HashAggregate  (cost=461541.53..461634.88 rows=7468 width=16) (actual time=110002.041..110024.777 rows=14853 loops=1)
  ->  Hash Join  (cost=230789.57..461464.70 rows=7683 width=16) (actual time=56847.814..109936.722 rows=15123 loops=1)
        Hash Cond: (Measurements.MetricID = AnalysisModules.MetricID)
        ->  Hash Join  (cost=230787.80..461057.64 rows=87588 width=20) (actual time=56847.697..109884.122 rows=15123 loops=1)
              Hash Cond: (ParameterValues.MeasurementID = Measurements.MeasurementID)
              ->  Nested Loop  (cost=6353.15..234044.47 rows=454038 width=8) (actual time=179.154..52780.680 rows=289724 loops=1)
                    ->  Seq Scan on ParameterNames  (cost=0.00..1.94 rows=1 width=4) (actual time=0.012..0.027 rows=1 loops=1)
                          Filter: ((ParameterName)::text = 'NMF'::text)
                    ->  Bitmap Heap Scan on ParameterValues  (cost=6353.15..228047.32 rows=479617 width=12) (actual time=179.123..52102.572 rows=289724 loops=1)
                          Recheck Cond: (ParameterNames.ParameterID = ParameterValues.ParameterID)
                          ->  Bitmap Index Scan on PVParameterID_idx  (cost=0.00..6233.25 rows=479617 width=0) (actual time=152.752..152.752 rows=289724 loops=1)
                                Index Cond: (ParameterNames.ParameterID = ParameterValues.ParameterID)
              ->  Hash  (cost=206253.42..206253.42 rows=1454498 width=20) (actual time=56657.022..56657.022 rows=961097 loops=1)
                    ->  Nested Loop  (cost=5069.24..206253.42 rows=1454498 width=20) (actual time=932.249..55176.315 rows=961097 loops=1)
                          ->  Hash Join  (cost=5069.24..7949.67 rows=15206 width=16) (actual time=908.275..1257.120 rows=18901 loops=1)
                                Hash Cond: (PrintSampleAnalyses.ImageID = DigitalImages.ImageID)
                                ->  Seq Scan on PrintSampleAnalyses  (cost=0.00..2334.25 rows=78825 width=8) (actual time=10.440..139.945 rows=78859 loops=1)
                                ->  Hash  (cost=4879.10..4879.10 rows=15211 width=16) (actual time=897.776..897.776 rows=18901 loops=1)
                                      ->  Hash Join  (cost=2220.11..4879.10 rows=15211 width=16) (actual time=297.330..868.632 rows=18901 loops=1)
                                            Hash Cond: (DigitalImages.PrintSampleID = PrintSamples.PrintSampleID)
                                            ->  Seq Scan on DigitalImages  (cost=0.00..1915.50

Re: [PERFORM] Stable function optimisation

2007-08-13 Thread Tom Lane
Philipp Specht [EMAIL PROTECTED] writes:
 The biggest question here is: Why is the runtime of the query with  
 the stable function not near the runtime of the immutable function?  

Stable functions don't get folded to constants.

 It's definitely one query and the manual states that a stable  
 function does not change in one statement and therefore can be  
 optimised.

That's not the type of optimization that gets done with it.  What
STABLE is for is marking functions that are safe to use in index
conditions.  If you'd been using an indexable condition you'd have
seen three different behaviors here.

(I see that you do have an index on t.a, but apparently there are
too many matching rows for the planner to think the index is worth
using.)
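
For what it's worth, one workaround (not suggested in this thread) is to
wrap the stable function in a scalar subquery; the planner evaluates it
once as an InitPlan, and the result is then usable in an indexable
condition against idx_t_a:

SELECT * FROM public.t WHERE a = (SELECT public.f());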

regards, tom lane



Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Julius Stroffek wrote:
 There is a proposal attached to this mail about the interface
 we would like to implement for switching between different
 optimizers. Please review it and provide a feedback to us.

 hmm - how is that proposal different from what got implemented with:
 http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php

Well, it's a very different level of abstraction.  The planner_hook
would allow you to replace the *entire* planner, but if you only want to
replace GEQO (that is, only substitute some other heuristics for partial
search of a large join-order space), doing it from planner_hook will
probably require duplicating a great deal of code.  A hook right at the
place where we currently choose geqo or regular would be a lot easier
to experiment with.

Replacing GEQO sounds like a fine area for investigation to me; I've
always been dubious about whether it's doing a good job.  But I'd prefer
a simple hook function pointer designed in the same style as
planner_hook (ie, intended to be overridden by a loadable module).
The proposed addition of a system catalog and SQL-level management
commands sounds like a great way to waste a lot of effort on mere
decoration, before ever getting to the point of being able to
demonstrate that there's any value in it.  Also, while we might accept
a small hook-function patch for 8.3, there's zero chance of any of that
other stuff making it into this release cycle.

regards, tom lane



Re: [PERFORM] Help optimize view

2007-08-13 Thread Kevin Grittner
 On Mon, Aug 13, 2007 at  4:00 PM, in message
[EMAIL PROTECTED], Relyea,
Mike [EMAIL PROTECTED] wrote: 
 
 Re-writing the view like this maybe bought me something.
 Tough to tell because I also increased some of the statistics.
 
I don't know whether it was the finer-grained statistics or the simplification,
but it bought you a new plan.  I don't know if the seven second improvement
is real or within the run-to-run variation, though; it could be because you
happened to be better-cached at the time.
 
 From what Tom
 says, it sounds like if I want the data returned faster I'm likely to
 have to get beefier hardware.
 
That's not what he suggested.  If you introduce redundancy in a controlled
fashion, you could have a single table with an index to more quickly get you
to the desired set of data.  That can be maintained on an ongoing basis
(possibly using triggers) or could be materialized periodically or prior to
running a series of reports or queries.
 
Such redundancies violate the normalization rules which are generally used
in database design, but some denormalization is often needed for acceptable
performance.
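
A minimal sketch of the periodically-materialized variant (the table
and view names here are hypothetical, not from the thread):

-- One-time setup: a denormalized summary table plus an index.
CREATE TABLE mottle_summary AS SELECT * FROM v_mottle_nmf;
CREATE INDEX mottle_summary_machine_idx ON mottle_summary (MachineID);

-- Refresh before running a series of reports:
TRUNCATE mottle_summary;
INSERT INTO mottle_summary SELECT * FROM v_mottle_nmf;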
 
-Kevin
 





Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Julius Stroffek

Stefan,

thanks for pointing this out. I missed this change.

We would like to place the hooks at a different place in the planner;
we would like to replace just the non-deterministic algorithm searching
for the best order of joins and keep the rest of the planner untouched.


I am not quite sure about the usage, from the user's point of view, of
what got implemented. I have read just the code of the patch. Are there
more explanations somewhere else?


I understood that if the user creates his own implementation of the
planner, which can be stored in some external library, he has to provide
some C-language function as a hook activator which will assign the
desired value to the planner_hook variable. Both the activator function
and the new planner implementation have to be located in the same
dynamic library, which will be loaded when the CREATE FUNCTION statement
is used on the hook activator function.


Am I correct? Have I missed something?

If the above is the case, then it is exactly what we wanted, except we
would like to have the hook in a different place.


There are more things in the proposal, such as a new pg_optimizer catalog
and a different way of configuring the hooks. However, these things are
not mandatory for the functionality but are more user-friendly.


Thanks

Julo

Stefan Kaltenbrunner wrote:

Julius Stroffek wrote:
  

Hi All,

Tomas Kovarik and I have presented at PGCon 2007 in Ottawa
the ideas about other possible optimizer algorithms to be used
in PostgreSQL.

We are quite new to PostgreSQL project so it took us some
time to go through the sources end explore the possibilities
how things could be implemented.

There is a proposal attached to this mail about the interface
we would like to implement for switching between different
optimizers. Please review it and provide a feedback to us.
Thank You.



hmm - how is that proposal different from what got implemented with:

http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php


Stefan
  


Re: [PERFORM] Performance on writable views

2007-08-13 Thread Jim Nasby

On Aug 11, 2007, at 8:58 AM, Joshua D. Drake wrote:

Heikki Linnakangas wrote:

Enrico Weigelt wrote:

I'm often using writable views as interfaces to clients, so
they only see virtual objects and never have to cope with
the actual storage, i.e. to give some client a totally
denormalized view of certain things, containing only the
information required for certain kinds of operations.



Now I've got the strange feeling that this makes updates
slow, since it always has to run the whole view query to
fetch a record to be updated (i.e. to get OLD.*).


There is some overhead in rewriting the query, but it shouldn't be
significantly slower than issuing the statements behind the view
directly. I wouldn't worry about it, unless you have concrete evidence
that it's causing problems.


I don't know about that; at least when using rules for partitioning,
the impact can be significant in comparison to triggers.


That's because you have to re-evaluate the input query for each rule  
that's defined, so even if you only have rules for 2 partitions in a  
table (which is really about the minimum you can have, at least for  
some period of overlap surrounding the time when you switch to a new  
partition), you're looking at evaluating every input query twice.


In this case, the rules presumably are just simply re-directing DML,  
so there'd only be one rule in play at a time. That means the only  
real overhead is in the rewrite engine.
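
For reference, a sketch of the kind of redirect rule under discussion
(table and column names are hypothetical):

CREATE RULE measurements_ins_2007_07 AS
  ON INSERT TO measurements
  WHERE NEW.ts >= '2007-07-01' AND NEW.ts < '2007-08-01'
  DO INSTEAD INSERT INTO measurements_2007_07 VALUES (NEW.*);

Every INSERT against measurements is rewritten against each such rule's
WHERE clause, which is why the overhead grows with the number of rules,
whereas a trigger makes the routing decision once per row.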

--
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Tom Lane
Julius Stroffek [EMAIL PROTECTED] writes:
 I understood that if the user creates his own implementation of the
 planner, which can be stored in some external library, he has to provide
 some C-language function as a hook activator which will assign the
 desired value to the planner_hook variable. Both the activator function
 and the new planner implementation have to be located in the same
 dynamic library, which will be loaded when the CREATE FUNCTION statement
 is used on the hook activator function.

You could do it that way if you wanted, but a minimalistic solution is
just to install the hook from the _PG_init function of a loadable
library, and then LOAD is sufficient for a user to execute the thing.
There's a small example at
http://archives.postgresql.org/pgsql-patches/2007-05/msg00421.php

Also, having the loadable module add a custom GUC variable would likely
be a preferable solution for control purposes than making specialized
functions.  I attach another small hack I made recently, which simply
scales all the planner's relation size estimates by a scale_factor GUC;
this is handy for investigating how a plan will change with relation
size, without having to actually create gigabytes of test data.

 There are more things in the proposal, such as a new pg_optimizer catalog
 and a different way of configuring the hooks. However, these things are
 not mandatory for the functionality but are more user-friendly.

Granted, but at this point we are talking about infrastructure for
planner-hackers to play with, not something that's intended to be a
long-term API for end users.  It may or may not happen that we ever
need a user API for this at all.  I think a planner that just does the
right thing is far preferable to one with a lot of knobs that users
have to know how to twiddle, so I see this more as scaffolding on which
someone can build and test the replacement for GEQO; which ultimately
would go in without any user-visible API additions.

regards, tom lane

#include "postgres.h"

#include <math.h>				/* for ceil() */

#include "fmgr.h"
#include "commands/explain.h"
#include "optimizer/plancat.h"
#include "optimizer/planner.h"
#include "utils/guc.h"


PG_MODULE_MAGIC;

void		_PG_init(void);
void		_PG_fini(void);

static double scale_factor = 1.0;

static void my_get_relation_info(PlannerInfo *root, Oid relationObjectId,
								 bool inhparent, RelOptInfo *rel);


/*
 * Get control during planner's get_relation_info() function, which sets up
 * a RelOptInfo struct based on the system catalog contents.  We can modify
 * the struct contents to cause the planner to work with a hypothetical
 * situation rather than what's actually in the catalogs.
 *
 * This simplistic example just scales all relation size estimates by a
 * user-settable factor.
 */
static void
my_get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
					 RelOptInfo *rel)
{
	ListCell   *ilist;

	/* Do nothing for an inheritance parent RelOptInfo */
	if (inhparent)
		return;

	rel->pages = (BlockNumber) ceil(rel->pages * scale_factor);
	rel->tuples = ceil(rel->tuples * scale_factor);

	foreach(ilist, rel->indexlist)
	{
		IndexOptInfo *ind = (IndexOptInfo *) lfirst(ilist);

		ind->pages = (BlockNumber) ceil(ind->pages * scale_factor);
		ind->tuples = ceil(ind->tuples * scale_factor);
	}
}


/*
 * _PG_init()	- library load-time initialization
 *
 * DO NOT make this static nor change its name!
 */
void
_PG_init(void)
{
	/* Get into the hooks we need to be in all the time */
	get_relation_info_hook = my_get_relation_info;
	/* Make scale_factor accessible through GUC */
	DefineCustomRealVariable("scale_factor",
							 "",
							 "",
							 &scale_factor,
							 0.0001,
							 1e9,
							 PGC_USERSET,
							 NULL,
							 NULL);
}


/*
 * _PG_fini()   - library unload-time finalization
 *
 * DO NOT make this static nor change its name!
 */
void
_PG_fini(void)
{
	/* Get out of all the hooks (just to be sure) */
	get_relation_info_hook = NULL;
}



Re: [PERFORM] Help optimize view

2007-08-13 Thread Kevin Grittner
 On Mon, Aug 13, 2007 at  4:25 PM, in message
[EMAIL PROTECTED], Kevin Grittner
[EMAIL PROTECTED] wrote: 
 On Mon, Aug 13, 2007 at  4:00 PM, in message
 [EMAIL PROTECTED], Relyea,
 Mike [EMAIL PROTECTED] wrote: 
  
 From what Tom
 says, it sounds like if I want the data returned faster I'm likely to
 have to get beefier hardware.
  
 That's not what he suggested.  If you introduce redundancy in a controlled
 fashion, you could have a single table with an index to more quickly get you
 to the desired set of data.  That can be maintained on an ongoing basis
 (possibly using triggers) or could be materialized periodically or prior to
 running a series of reports or queries.
  
 Such redundancies violate the normalization rules which are generally used
 in database design, but some denormalization is often needed for acceptable
 performance.
 
One last thought regarding your table structure -- I noticed you were often
joining on column names ending in ID and selecting using column names
ending in Name, where the values for the name columns were only a few
characters long.  It is not always a good idea to create a meaningless ID
number for a primary key if you have a meaningful value (or combination of
values) which would uniquely identify a row.
 
If you were able to use the columns in your search criteria as keys, you
would have them in the Measurements table without creating any troublesome
redundancy.  You could then add Measurements indexes on these columns, and
your query might run in under a second.
 
The downside of meaningful keys (oft cited by proponents of the technique)
is that if you decide that everything with an AnalysisModuleName of
'NMF' should now be named 'NMX', you would have to update all rows which
contain the old value.  To be able to do this safely and reliably, you would
want to use DOMAIN definitions rigorously.  If you link through meaningless
ID numbers (and what would be the point of changing those?) you can change
'NMF' to 'NMX' in one place, and everything would reflect the new value,
since it would always join to one place for those characters.
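
A minimal sketch of the meaningful-key approach (all names here are
hypothetical; the DOMAIN keeps the key's type consistent wherever it
appears, and ON UPDATE CASCADE handles the rename case):

CREATE DOMAIN module_name AS varchar(20);

CREATE TABLE AnalysisModules (
  AnalysisModuleName module_name PRIMARY KEY
);

CREATE TABLE Measurements (
  MeasurementID      serial PRIMARY KEY,
  AnalysisModuleName module_name
    REFERENCES AnalysisModules ON UPDATE CASCADE
);
CREATE INDEX Measurements_module_idx ON Measurements (AnalysisModuleName);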
 
-Kevin
 




Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Josh Berkus
Tom,

 Also, while we might accept
 a small hook-function patch for 8.3, there's zero chance of any of that
 other stuff making it into this release cycle.

I don't think anyone was thinking about 8.3.  This is pretty much 8.4
stuff; Julius is just raising it now because they don't want to go down
the wrong path and waste everyone's time.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface

2007-08-13 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Tom,
 Also, while we might accept
 a small hook-function patch for 8.3, there's zero chance of any of that
 other stuff making it into this release cycle.

 I don't think anyone was thinking about 8.3.  This is pretty much 8.4 
 stuff; Julius is just raising it now because they don't want to go down 
 the wrong path and waste everyone's time.

Well, if they get the hook in now, then in six months or so when they
have something to play with, people would be able to play with it.
If not, there'll be zero uptake till after 8.4 is released...

regards, tom lane



Re: [PERFORM] Dell Hardware Recommendations

2007-08-13 Thread Dave Cramer


On 13-Aug-07, at 9:50 AM, Vivek Khera wrote:



On Aug 10, 2007, at 4:36 PM, Merlin Moncure wrote:


I'm not so sure I agree.  They are using LSI firmware now (and so is
everyone else).  The servers are well built (highly subjective, I
admit) and configurable.  I have had some bad experiences with IBM
gear (Adaptec controller though), and white box parts, 3ware, etc.  I
can tell you that Dell got us the storage and the server in record
time.

I do agree on Adaptec, however.


Ok, perhaps you got luckier... I have two PowerVault 220 rack
mounts with U320 SCSI drives in them. With an LSI 320-2X
controller, it *refuses* to recognize some of the drives (channel 1
on either array).  Dell blames LSI, LSI blames Dell's backplane.
This is consistent across multiple controllers we tried, and two
different Dell disk arrays.  Dropping the SCSI speed to 160 is the
only way to make them work.  I tend to believe LSI here.


This is the crux of the argument here. Perc/5 is a Dell trademark.
They can ship any hardware they want and call it a Perc/5.


Dave
The Adaptec 2230SLP controller recognizes the arrays fine, but tends
to drop devices at inopportune moments.  Re-seating dropped devices
starts a rebuild, but the speed is recognized as 1 and the rebuild
takes two lifetimes to complete unless you insert a reboot of the
system in there.  Totally unacceptable.  Again, dropping the SCSI
rate to 160 seems to make it more stable.







