[PERFORM] Explicit joins

2011-05-05 Thread Rishabh Kumar Jain
I referred chapter 14.3 of postgres document version 9.0.

explicit joins help the planner in planninng  thus improve performance.
On what relations are explicit joins to be added??

I am getting data from 10 tables in a view.
I don't know on which pair of tables I have to add explicit joins to improve
performance.

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Explicit-joins-tp4372000p4372000.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres 9.0.4 + Hot Standby + FusionIO Drive + Performance = Query failed ERROR: catalog is missing 1 attribute(s) for relid 172226

2011-05-05 Thread Sethu Prasad
I did the hot standby configured earlier and at that time I started
using(querying) the standby database.

May be something missed on the archive command.


On Wed, May 4, 2011 at 8:44 PM, Josh Berkus j...@agliodbs.com wrote:


  While doing so I got the *Query failed ERROR: catalog is missing 1
  attribute(s) for relid 172226*. Any idea on this error? Is that
 combination
  PG + HotSB + Fusion IO Drive is not advisable?!

 What were you doing when you got this error?

 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] amazon ec2

2011-05-05 Thread Dave Page
On Thu, May 5, 2011 at 1:02 AM, Josh Berkus j...@agliodbs.com wrote:

 FWIW, EnterpriseDB's InfiniCache provides the same caching benefit. The 
 way that works is when PG goes to evict a page from shared buffers that page 
 gets compressed and stuffed into a memcache cluster. When PG determines that 
 a given page isn't in shared buffers it will then check that memcache 
 cluster before reading the page from disk. This allows you to cache amounts 
 of data that far exceed the amount of memory you could put in a physical 
 server.

 So memcached basically replaces the filesystem?

No, it sits in between shared buffers and the filesystem, effectively
providing an additional layer of extremely large, compressed cache.
Even on a single server there can be benefits over larger shared
buffers due to the compression.

 That sounds cool, but I'm wondering if it's actually a performance
 speedup.  Seems like it would only be a benefit for single-row lookups;
 any large reads would be a mess.

Depends on the database and the workload - if you can fit your entire
100GB database in cache, and your workload is read intensive then the
speedups are potentially huge (I've seen benchmarks showing 20x+).
Write intensive workloads, less so, similarly if the working set is
far larger than your cache size.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Explicit joins

2011-05-05 Thread Kevin Grittner
Rishabh Kumar Jain rishabh.w...@yahoo.com wrote:
 
 I am getting data from 10 tables in a view.
 I don't know on which pair of tables I have to add explicit joins
 to improve performance.
 
There's usually some fairly natural order in terms of understanding
the request.  I find it's often good to try to state in words what
data I want to see (*not* how I think I could get that data, but
describe logically which set of data I want), and list the tables in
the order the appear in that description.  The description will
naturally tend to include or imply your join conditions and other
selection criteria.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7

2011-05-05 Thread Brian Connolly
We have a generated query from our web application which takes far longer to
complete in 9.0.4, than in 8.3.7 (60sec in 9.0.4 ~10sec in 8.3.7)

The query plan generated in 9.0, includes a Materialize step which takes the
bulk of the time for the query.  If I disable materialize (by running set
enable_material='off';) the query takes ~6 seconds to run (NOTE: the query
plan chosen in this case is very similar to the plan chosen in 8.3.7).
From reading
http://rhaas.blogspot.com/2010/04/materialization-in-postgresql-90.html, I
realize that the planner will be more aggressive in choosing Materialize in
9.0.  Is there a way to modify the planner cost settings to minimize it's
use in cases like this?

Below, I have included the query in question and the explain analyze
output for the case where enable_material is on. .

NOTE: I will send explain analyze output for when enable_material is 'off'
and information on the postgresql version, settings and server configuration
in a follow up email due the length of email restrictions on the mailing
list.

(A vacuum analyze was run prior to running the queries for this email)

Any help that you can provide would be greatly appreciated.



Query

 SELECT *
FROM (
 SELECT
 Peter_SizeByType.RowId AS RowId,
 Peter_SizeByType.Name AS Name,
 Peter_SizeByType.Type AS Type,
 Peter_SizeByType.ZAve AS ZAve,
 Peter_SizeByType.TimeLabel AS TimeLabel,
 Peter_SizeByType.StorageTemperature AS StorageTemperature,
 Peter_SizeByType.Pdl AS Pdl,
 Peter_SizeByType.Cumulants AS Cumulants,
 Peter_SizeByType.AnalysisTool AS AnalysisTool,
 Peter_SizeByType.meanCountRate AS meanCountRate,
 Peter_SizeByType.ExtractionNumber AS ExtractionNumber,
 Peter_SizeByType.TestNumber AS TestNumber,
 Peter_SizeByType.Sort AS Sort
 FROM (SELECT PS_2.rowid AS RowId,
 F_3.Name AS Name,
 F_3.Type AS Type,
 PS_2.zave AS ZAve,
 PS_2.timelabel AS TimeLabel,
 PS_2.storagetemperature AS StorageTemperature,
 PS_2.pdl AS Pdl,
 PS_2.cumulants AS Cumulants,
 PS_2.analysistool AS AnalysisTool,
 PS_2.meancountrate AS meanCountRate,
 PS_2.extractionnumber AS ExtractionNumber,
 PS_2.testnumber AS TestNumber,
 T_4.sort AS Sort
 FROM (SELECT c69d129_particle_size_result_fields_5.analysistool AS
analysistool, c69d129_particle_size_result_fields_5.cumulants AS cumulants,
c69d129_particle_size_result_fields_5.extractionnumber AS extractionnumber,
c69d129_particle_size_result_fields_5.pdl AS pdl,
c69d129_particle_size_result_fields_5.rowid AS rowid,
 (SELECT RunId FROM exp.Data WHERE RowId =
c69d129_particle_size_result_fields_5.DataId)
  AS Run, c69d129_particle_size_result_fields_5$Run$.container AS
Run_Folder, c69d129_particle_size_result_fields_5$Run$Folder$.entityid AS
Run_Folder_EntityId, c69d129_particle_size_result_fields_5$Run$.name AS
Run_Name, c69d129_particle_size_result_fields_5.storagetemperature AS
storagetemperature, c69d129_particle_size_result_fields_5.testnumber AS
testnumber, c69d129_particle_size_result_fields_5.timelabel AS timelabel,
c69d129_particle_size_result_fields_5.zave AS zave,
c69d129_particle_size_result_fields_5.meancountrate AS meancountrate,
c69d129_particle_size_result_fields_5.rowid AS rowid1
 FROM (SELECT * FROM assayresult.c69d129_particle_size_result_fields
 WHERE (((SELECT Container FROM exp.Data WHERE RowId = DataId) IN
('d938da12-1b43-102d-a8a2-78911b79dd1c'
c69d129_particle_size_result_fields_5
 LEFT OUTER JOIN (SELECT * FROM exp.experimentrun
 WHERE (((protocollsid LIKE
'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size')))
AND (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')))
c69d129_particle_size_result_fields_5$Run$ ON ((SELECT RunId FROM exp.Data
WHERE RowId = c69d129_particle_size_result_fields_5.DataId) =
c69d129_particle_size_result_fields_5$Run$.rowid)
 LEFT OUTER JOIN core.containers
c69d129_particle_size_result_fields_5$Run$Folder$ ON
(c69d129_particle_size_result_fields_5$Run$.container =
c69d129_particle_size_result_fields_5$Run$Folder$.entityid)) PS_2
 INNER JOIN (SELECT Formulations_6.container AS Folder,
Formulations_6$Folder$.entityid AS Folder_EntityId, Formulations_6.name AS
Name, Formulations_6.rowid AS RowId,
 CAST((SELECT StringValue FROM exp.ObjectProperty WHERE
exp.ObjectProperty.PropertyId = 560 AND exp.ObjectProperty.ObjectId =
Formulations_6$LSID$_C.ObjectId) AS VARCHAR(4000))
  AS Type
 FROM (SELECT * FROM exp.material
 WHERE (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')) AND
((cpastype = 'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations')))
Formulations_6
 LEFT OUTER JOIN core.containers Formulations_6$Folder$ ON
(Formulations_6.container = Formulations_6$Folder$.entityid)
 LEFT OUTER JOIN exp.object Formulations_6$LSID$_C ON
(Formulations_6.lsid = Formulations_6$LSID$_C.objecturi) AND
Formulations_6.Container = 'd938da12-1b43-102d-a8a2-78911b79dd1c') F_3 ON
PS_2.Run_Name=F_3.Name || '.xls' AND

Re: [PERFORM] VX_CONCURRENT flag on vxfs( 5.1 or later) for performance for postgresql?

2011-05-05 Thread Robert Haas
On Sat, Apr 30, 2011 at 4:51 AM, Hsien-Wen Chu chu.hsien@gmail.com wrote:
 since the block size is 8k for the default, and it consisted with many
 tuple/line; as my understand, if any tuple/line is changed(maybe
 update, insert, delete). the block will be marked as dirty block. and
 then it will be flashed to disk by bgwriter.

True...

 so my question is if the data block(8k) is aligned with the file
 system block?  if it is aligned with file system block, so what's the
 potential issue make it is not safe for direct io. (please  assume
 vxfs, vxvm and the disk sector is aligned ).please correct me if any
 incorrect.

It's not about safety - it's about performance.  On a machine with
64GB of RAM, a typical setting for shared_buffers set to 8GB.  If you
start reading blocks into the PostgreSQL cache - or writing them out
of the cache - in a way that bypasses the filesystem cache, you're
going to have only 8GB of cache, instead of some much larger amount.
More cache = better performance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] ask the database engine tuning on the server

2011-05-05 Thread Kevin Grittner
Didik Prasetyo prasetyodidi...@yahoo.com wrote:
 I had a problem with performance engine database, I use the server
 with the following specifications
 
   1. its storage configuration?
Storage SCSI 15K RAID 5
   2. how his network?
   2 gigabit bonding.
 3. type / behavior of applications that connect to the db?
   Direct connects one segment.
 4. use the machine?
DB only
 5. its os? * nix or windows?
   Linux
 6. how large the existing data? growth?
   growth 100M / day
 
 I had a problem with memory is always exhausted, causing
 database stack or very slowly, I wanted to ask how to do the
 tuning postgresql.conf settings for the above case, or whether
 there is another solution
 
I'm not entirely sure what problem you are seeing.  If you post
again, please show your version and configuration.  An easy way to
do this is to run the query on this page and pastte the output into
your post:
 
http://wiki.postgresql.org/wiki/Server_Configuration
 
For general tuning advice you should read this page:
 
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 
If there is a particular query which is causing problems, please
post detail related to that query.  See this page:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
If it is some other sort of problem, it helps to provide more detail
and to copy and paste any error messages:
 
http://wiki.postgresql.org/wiki/Guide_to_reporting_problems
 
A couple general points which might apply:
 
(1) RAID 5 is OK for reads, but is slow for random writes in heavy
loads.  This is mitigated somewhat by having a good RAID controller
with a battery backed RAM cache configured for write-back.
 
(2) You need to be running autovacuum and you need to avoid
long-running transactions (including those which show as idle in
transaction) to avoid bloat, which can cause your database to grow
rapidly.
 
I hope this helps,
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7

2011-05-05 Thread Brian Connolly
Here is the explain analyze output for when enable_material is 'off' and
information on the postgresql version, settings and server configuration (I
had to send a follow up email due the length of email restrictions on the
mailing list.)

(I apologize for the length of these email messages. And if this message
does not get threaded properly as I did not receive a copy of my original
message to mailing list)



Explain Analyze output with enable_material='off'


Limit  (cost=231785.69..231785.94 rows=101 width=71) (actual
time=6616.943..6617.130 rows=101 loops=1)
   -  Sort  (cost=231785.69..231785.95 rows=104 width=71) (actual
time=6616.940..6617.003 rows=101 loops=1)
 Sort Key: material.name
 Sort Method:  top-N heapsort  Memory: 39kB
 -  Nested Loop Left Join  (cost=239.46..231782.21 rows=104
width=71) (actual time=3.840..6484.883 rows=67044 loops=1)
   Join Filter: ((material.container)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
   -  Hash Join  (cost=239.46..230812.42 rows=104 width=155)
(actual time=3.785..2676.643 rows=67044 loops=1)
 Hash Cond: ((experimentrun.name)::text = ((
material.name)::text || '.xls'::text))
 -  Nested Loop  (cost=3.27..230557.06 rows=93
width=129) (actual time=0.170..2502.106 rows=67044 loops=1)
   -  Nested Loop  (cost=0.00..229848.97 rows=93
width=137) (actual time=0.153..1368.990 rows=67044 loops=1)
 -  Nested Loop  (cost=0.00..2.33 rows=1
width=74) (actual time=0.016..0.025 rows=1 loops=1)
   -  Seq Scan on containers
c69d129_particle_size_result_fields_5$run$folder$  (cost=0.00..1.16 rows=1
width=37) (actual time=0.010..0.013 rows=1 loops=1)
 Filter: ((entityid)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
   -  Seq Scan on containers
formulations_6$folder$  (cost=0.00..1.16 rows=1 width=37) (actual
time=0.003..0.007 rows=1 loops=1)
 Filter:
((formulations_6$folder$.entityid)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
 -  Nested Loop  (cost=0.00..229845.71
rows=93 width=63) (actual time=0.133..1244.558 rows=67044 loops=1)
   -  Seq Scan on
c69d129_particle_size_result_fields  (cost=0.00..229742.02 rows=348
width=59) (actual time=0.018..572.402 rows=69654 loops=1)
 Filter: (((SubPlan 3))::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
 SubPlan 3
   -  Index Scan using pk_data
on data  (cost=0.00..3.27 rows=1 width=37) (actual time=0.004..0.005 rows=1
loops=69654)
 Index Cond: (rowid =
$2)
   -  Index Scan using pk_indexvarchar
on indexvarchar  (cost=0.00..0.29 rows=1 width=10) (actual time=0.004..0.006
rows=1 loops=69654)
 Index Cond:
((indexvarchar.listid = 17) AND ((indexvarchar.key)::text =
(c69d129_particle_size_result_fields.timelabel)::text))
   -  Index Scan using pk_experimentrun on
experimentrun  (cost=3.27..4.33 rows=1 width=74) (actual time=0.005..0.006
rows=1 loops=67044)
 Index Cond: (experimentrun.rowid = (SubPlan
4))
 Filter:
(((experimentrun.protocollsid)::text ~~
'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size'::text)
AND ((experimentrun.container)::text = 'd938da12-1b43-102d-a8a2-7891
1b79dd1c'::text))
 SubPlan 4
   -  Index Scan using pk_data on data
 (cost=0.00..3.27 rows=1 width=4) (actual time=0.004..0.005 rows=1
loops=67044)
 Index Cond: (rowid = $2)
 SubPlan 4
   -  Index Scan using pk_data on data
 (cost=0.00..3.27 rows=1 width=4) (actual time=0.004..0.005 rows=1
loops=67044)
 Index Cond: (rowid = $2)
 -  Hash  (cost=226.51..226.51 rows=774 width=96)
(actual time=3.587..3.587 rows=1303 loops=1)
   Buckets: 1024  Batches: 1  Memory Usage: 158kB
   -  Seq Scan on material  (cost=0.00..226.51
rows=774 width=96) (actual time=0.071..2.354 rows=1303 loops=1)
 Filter: (((container)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) AND ((cpastype)::text =
'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations'::text))
   -  Index Scan using uq_object on object

Re: [PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7

2011-05-05 Thread Bosco Rama
Hey Brian,

Brian Connolly wrote:
 (I had to send a follow up email due the length of email restrictions on the
 mailing list.)

A tip for when you have this problem in the future -- turn off html mail.
It will reduce your email message length by 50% - 90%.

HTH

Bosco.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor query plan chosen in 9.0.3 vs 8.3.7

2011-05-05 Thread Tom Lane
Brian Connolly bc...@labkey.com writes:
 Any help that you can provide would be greatly appreciated.

I'd suggest trying to get rid of the weird little subselects, like this
one:

  ... SELECT * FROM assayresult.c69d129_particle_size_result_fields
  WHERE (((SELECT Container FROM exp.Data WHERE RowId = DataId) IN
 ('d938da12-1b43-102d-a8a2-78911b79dd1c'))) ...

If you turned that into a regular join between
c69d129_particle_size_result_fields and Data, the planner probably
wouldn't be nearly as confused about how many rows would result.
It's the way-off rowcount estimate for this construct that's
causing most of the problem, AFAICS:

   -  Seq Scan on 
c69d129_particle_size_result_fields  (cost=0.00..229742.02 rows=348 width=59) 
(actual time=0.018..572.402 rows=69654 loops=1)
 Filter: (((SubPlan 3))::text = 
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
 SubPlan 3
   -  Index Scan using pk_data on 
data  (cost=0.00..3.27 rows=1 width=37) (actual time=0.004..0.005 rows=1 
loops=69654)
 Index Cond: (rowid = $2)

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance