Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Claus Guttesen
> The execution time has not improved. I am going to increase the
> shared_buffers now keeping the work_mem same.

Have you performed a vacuum analyze?

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentler gamester is the soonest winner.

Shakespeare

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Akos Gabriel
On Thu, 26 Feb 2009 09:00:07 +0100, Claus Guttesen kome...@gmail.com wrote:

>> The execution time has not improved. I am going to increase the
>> shared_buffers now keeping the work_mem same.
>
> Have you performed a vacuum analyze?

and reindex

-- 
Regards,
Gábriel Ákos
-=E-Mail :akos.gabr...@i-logic.hu|Web:  http://www.i-logic.hu=-
-=Tel/fax:+3612367353|Mobil:+36209278894 =-



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Farhan Husain
On Wed, Feb 25, 2009 at 4:10 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:

> Farhan Husain russ...@gmail.com wrote:
>> The machine postgres is running on has 4 GB of RAM.
>
> In addition to the other suggestions, you should be sure that
> effective_cache_size is set to a reasonable value, which would
> probably be somewhere in the neighborhood of '3GB'.  This doesn't
> affect actual RAM allocation, but gives the optimizer a rough idea how
> much data is going to be kept in cache, between both the PostgreSQL
> shared_memory setting and the OS cache.  It can make better choices
> with more accurate information.
>
> -Kevin


I reran the query with new values of work_mem, effective_cache_size and
shared_buffers. There is no change in runtime. Here is the output:

ingentadb=# show work_mem;
 work_mem
----------
 16MB
(1 row)

ingentadb=# show shared_buffers;
 shared_buffers
----------------
 64MB
(1 row)

ingentadb=# show effective_cache_size;
 effective_cache_size
----------------------
 2GB
(1 row)

ingentadb=# EXPLAIN ANALYZE select A0.Subj, A2.Obj From jena_g1t1_stmt A0,
jena_g1t1_stmt A1, jena_g1t1_stmt A2 Where
A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
A0.Obj='Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1' AND
A0.GraphID=1 AND A0.Subj=A1.Subj AND
A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
A1.GraphID=1 AND A0.Subj=A2.Subj AND
A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
A2.GraphID=1;

                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=698313.99..711229.09 rows=733195 width=134) (actual time=7659407.195..7659418.630 rows=30 loops=1)
   Merge Cond: ((a0.subj)::text = (a1.subj)::text)
   ->  Sort  (cost=84743.03..84822.90 rows=31949 width=208) (actual time=77.269..77.300 rows=30 loops=1)
         Sort Key: a0.subj
         Sort Method:  quicksort  Memory: 24kB
         ->  Nested Loop  (cost=0.00..82352.69 rows=31949 width=208) (actual time=4.821..66.390 rows=30 loops=1)
               ->  Index Scan using jena_g1t1_stmt_ixo on jena_g1t1_stmt a0  (cost=0.00..5428.34 rows=487 width=74) (actual time=2.334..2.675 rows=30 loops=1)
                     Index Cond: ((obj)::text = 'Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'::text)
                     Filter: (((prop)::text = 'Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf'::text) AND (graphid = 1))
               ->  Index Scan using jena_g1t1_stmt_ixsp on jena_g1t1_stmt a2  (cost=0.00..157.32 rows=51 width=134) (actual time=2.114..2.119 rows=1 loops=30)
                     Index Cond: (((a2.subj)::text = (a0.subj)::text) AND ((a2.prop)::text = 'Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage'::text))
                     Filter: (a2.graphid = 1)
   ->  Materialize  (cost=613570.96..627129.46 rows=1084680 width=74) (actual time=7659329.799..7659334.251 rows=31 loops=1)
         ->  Sort  (cost=613570.96..616282.66 rows=1084680 width=74) (actual time=7659329.781..7659334.185 rows=31 loops=1)
               Sort Key: a1.subj
               Sort Method:  external merge  Disk: 282480kB
               ->  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59 rows=1084680 width=74) (actual time=0.042..46465.020 rows=3192000 loops=1)
                     Filter: ((graphid = 1) AND ((prop)::text = 'Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type'::text) AND ((obj)::text = 'Uv::http://metastore.ingenta.com/ns/structure/Article'::text))
 Total runtime: 7659420.128 ms
(19 rows)


I will try out other suggestions posted yesterday now.

Thanks,

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Farhan Husain
On Wed, Feb 25, 2009 at 6:07 PM, Scott Carey sc...@richrelevance.com wrote:

  I will second Kevin’s suggestion.  Unless you think you will have more
 than a few dozen concurrent queries, start with work_mem around 32MB.
 For the query here, a very large work_mem might help it hash join depending
 on the data... But that’s not the real problem here.

 The real problem is that it does a huge scan of all of the a1 table, and
 sorts it.  It's pretty clear that this table has incorrect statistics.  It
 thinks that it will get about 1 million rows back in the scan, but it is
 actually 3 million in the scan.

 Crank up the statistics target on that table from the default to at least
 100, perhaps even 1000.  This is a large table, the default statistics
 target of 10 is not good for large tables with skewed column data.  The
 columns to try increasing the target on are those filtered in the explain:
 graphid, prop, and obj.  Then run vacuum analyze on that table (a1).  The
 planner should then have better stats and will likely be able to use a
 better plan for the join.

 The other tables involved in the join also seem to have  bad statistics.
  You might just take the easiest solution and change the global statistics
 target and vacuum analyze the tables involved:

 set default_statistics_target = 50;
 vacuum analyze jena_g1t1_stmt ;

 (test the query)

 Repeat for several values of the default statistics target.  You can run
 “explain” before running the actual query, to see if the plan changed.  If
 it has not, the time will not likely change.
 The max value for the statistics target is 1000, which makes analyzing and
 query planning slower, but more accurate.  In most cases, dramatic
 differences can happen between the default of 10 and values of 25 or 50.
  Sometimes, you have to go into the hundreds, and it is safer to do this on
 a per-column basis once you get to larger values.

 For larger databases, I recommend increasing the default to 20 to 40 and
 re-analyzing all the tables.
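
The per-column approach described above can be sketched as follows. This is
only an illustration of the advice, not a tested recipe; the target value of
100 is an example, to be tuned per the testing loop above:

```sql
-- Raise the statistics target only on the filtered columns of the large
-- table (graphid, prop, obj), then re-analyze so the planner sees the
-- new, finer-grained histograms.
ALTER TABLE jena_g1t1_stmt ALTER COLUMN graphid SET STATISTICS 100;
ALTER TABLE jena_g1t1_stmt ALTER COLUMN prop    SET STATISTICS 100;
ALTER TABLE jena_g1t1_stmt ALTER COLUMN obj     SET STATISTICS 100;
VACUUM ANALYZE jena_g1t1_stmt;
```

Per-column settings override default_statistics_target for just those
columns, so the rest of the database keeps the cheaper default.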






 On 2/25/09 3:11 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:

  Farhan Husain russ...@gmail.com wrote:
  Kevin Grittner kevin.gritt...@wicourts.gov wrote:
   Farhan Husain russ...@gmail.com wrote:
   The machine postgres is running on has 4 GB of RAM.
 
  In addition to the other suggestions, you should be sure that
  effective_cache_size is set to a reasonable value, which would
  probably be somewhere in the neighborhood of '3GB'.

  The execution time has not improved. I am going to increase the
  shared_buffers now keeping the work_mem same.

 Increasing shared_buffers is good, but it has been mentioned that this
 will not change the plan, which currently scans and sorts the whole
 table for a1.  Nobody will be surprised when you report minimal
 change, if any.  If you have not changed effective_cache_size (be sure
 not to confuse this with any of the other configuration values) it
 will think you only have 128MB of cache, which will be off by a factor
 of about 24 from reality.
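
For the 4 GB machine in this thread, that advice would amount to something
like the following sketch (the exact value depends on what else the box
runs):

```sql
-- effective_cache_size is a planner hint, not an allocation: it should
-- approximate shared_buffers plus the OS filesystem cache.
SET effective_cache_size = '3GB';  -- or set it in postgresql.conf and reload
SHOW effective_cache_size;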

 Also, I'm going to respectfully differ with some of the other posts on
 the best setting for work_mem.  Most benchmarks I've run and can
 remember seeing posted found best performance for this at somewhere
 between 16MB and 32MB.  You do have to be careful if you have a large
 number of concurrent queries, however, and keep it lower.  In most
 such cases, though, you're better off using a connection pool to limit
 concurrent queries instead.

 -Kevin


  Thanks a lot Scott! I think that was the problem. I just changed the
default statistics target to 50 and ran explain. The plan changed and I ran
explain analyze. Now it takes a fraction of a second!

Thanks to all of you who wanted to help me. I would be happy if someone does
me one last favor. I want to know how these query plans are generated and
how the parameters you suggested to change affects it. If there is any
article, paper or book on it please give me the name or url.

Here is the output of my latest tasks:

ingentadb=# set default_statistics_target=50;
SET
ingentadb=# show default_statistics_target;
 default_statistics_target
---------------------------
 50
(1 row)

ingentadb=# vacuum analyze jena_g1t1_stmt;
VACUUM
ingentadb=# EXPLAIN select A0.Subj, A2.Obj From jena_g1t1_stmt A0,
jena_g1t1_stmt A1, jena_g1t1_stmt A2 Where
A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
A0.Obj='Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1' AND
A0.GraphID=1 AND A0.Subj=A1.Subj AND
A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
A1.GraphID=1 AND A0.Subj=A2.Subj AND
A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
A2.GraphID=1;

                                QUERY PLAN

Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Kevin Grittner
 Farhan Husain russ...@gmail.com wrote: 
 Thanks a lot Scott! I think that was the problem. I just changed the
 default statistics target to 50 and ran explain. The plan changed
 and I ran explain analyze. Now it takes a fraction of a second!
 
Yeah, the default of 10 has been too low.  In 8.4 it is being raised
to 100.
 
 Thanks to all of you who wanted to help me. I would be happy if
 someone does me one last favor. I want to know how these query plans
 are generated and how the parameters you suggested to change affects
 it. If there is any article, paper or book on it please give me the
 name or url.
 
In terms of tuning in general, you might start with these:
 
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 
http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html
 
To understand the mechanics of the optimizer you might be best off
downloading the source code and reading through the README files and
comments in the source code.
 
-Kevin



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Scott Marlowe
On Thu, Feb 26, 2009 at 12:10 PM, Steve Clark scl...@netwolves.com wrote:

 Can this be set in the postgresql.conf file?
 default_statistics_target = 50

Yep.  It will take effect after a reload and after the current
connection has been reset.
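
For example (assuming superuser access), the reload can be done from psql
rather than by restarting the server; this is a sketch of the steps just
described:

```sql
-- After editing default_statistics_target in postgresql.conf:
SELECT pg_reload_conf();     -- signal the server to re-read the file
-- reconnect (or open a new session), then verify:
SHOW default_statistics_target;
```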

If you want, you can also set a default for a database or a role,
fine-tuning as needed.
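
A sketch of those per-database and per-role defaults (the role name here is
made up for illustration):

```sql
ALTER DATABASE ingentadb SET default_statistics_target = 50;
ALTER ROLE reporting_user SET default_statistics_target = 100;
-- new sessions for that database/role pick up these values automatically
```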



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-26 Thread Steve Clark

Kevin Grittner wrote:
Farhan Husain russ...@gmail.com wrote: 

Thanks a lot Scott! I think that was the problem. I just changed the
default statistics target to 50 and ran explain. The plan changed
and I ran explain analyze. Now it takes a fraction of a second!
 
Yeah, the default of 10 has been too low.  In 8.4 it is being raised
to 100.
 

Thanks to all of you who wanted to help me. I would be happy if
someone does me one last favor. I want to know how these query plans
are generated and how the parameters you suggested to change affects
it. If there is any article, paper or book on it please give me the
name or url.
 
In terms of tuning in general, you might start with these:
 
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 
http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html
 
To understand the mechanics of the optimizer you might be best off
downloading the source code and reading through the README files and
comments in the source code.
 
-Kevin



Hello List,

Can this be set in the postgresql.conf file?
default_statistics_target = 50

Thanks,
Steve



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 12:49 PM, Robert Haas robertmh...@gmail.com wrote:

 You still haven't answered the work_mem question, and you probably
 want to copy the list, rather than just sending this to me.

 ...Robert

 On Wed, Feb 25, 2009 at 1:34 PM, Farhan Husain russ...@gmail.com wrote:
 
 
  On Tue, Feb 24, 2009 at 8:21 PM, Robert Haas robertmh...@gmail.com
 wrote:
 
   Can you please elaborate a bit?
  
   I thought that A0.Prop would ignore the composite index created on the
   columns subj and prop but this does not seem to be the case.
 
  Yeah, I think you're barking up the wrong tree here.  I think Tom had
  the correct diagnosis - what do you get from show work_mem?
 
  What kind of machine are you running this on?  If it's a UNIX-ish
  machine, what do you get from free -m and uname -a?
 
  ...Robert
 
  Here is the machine info:
 
  Machine: SunOS 5.10 Generic_127111-11 sun4u sparc SUNW, Sun-Fire-880
  Memory: 4 GB
  Number of physical processors: 2
 
 
  --
  Mohammad Farhan Husain
  Research Assistant
  Department of Computer Science
  Erik Jonsson School of Engineering and Computer Science
  University of Texas at Dallas
 


Did you mean the work_mem field in the config file?


-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Robert Haas
Just start up psql and type:

show work_mem;

(You could look in the config file too I suppose.)

...Robert

On Wed, Feb 25, 2009 at 1:53 PM, Farhan Husain russ...@gmail.com wrote:


 On Wed, Feb 25, 2009 at 12:49 PM, Robert Haas robertmh...@gmail.com wrote:

 You still haven't answered the work_mem question, and you probably
 want to copy the list, rather than just sending this to me.

 ...Robert

 On Wed, Feb 25, 2009 at 1:34 PM, Farhan Husain russ...@gmail.com wrote:
 
 
  On Tue, Feb 24, 2009 at 8:21 PM, Robert Haas robertmh...@gmail.com
  wrote:
 
   Can you please elaborate a bit?
  
   I thought that A0.Prop would ignore the composite index created on
   the
   columns subj and prop but this does not seem to be the case.
 
  Yeah, I think you're barking up the wrong tree here.  I think Tom had
  the correct diagnosis - what do you get from show work_mem?
 
  What kind of machine are you running this on?  If it's a UNIX-ish
   machine, what do you get from free -m and uname -a?
 
  ...Robert
 
  Here is the machine info:
 
  Machine: SunOS 5.10 Generic_127111-11 sun4u sparc SUNW, Sun-Fire-880
  Memory: 4 GB
  Number of physical processors: 2
 
 
  --
  Mohammad Farhan Husain
  Research Assistant
  Department of Computer Science
  Erik Jonsson School of Engineering and Computer Science
  University of Texas at Dallas
 

 Did you mean the work_mem field in the config file?


 --
 Mohammad Farhan Husain
 Research Assistant
 Department of Computer Science
 Erik Jonsson School of Engineering and Computer Science
 University of Texas at Dallas




Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 12:58 PM, Robert Haas robertmh...@gmail.com wrote:

 Just start up psql and type:

 show work_mem;

 (You could look in the config file too I suppose.)

 ...Robert

 On Wed, Feb 25, 2009 at 1:53 PM, Farhan Husain russ...@gmail.com wrote:
 
 
  On Wed, Feb 25, 2009 at 12:49 PM, Robert Haas robertmh...@gmail.com
 wrote:
 
  You still haven't answered the work_mem question, and you probably
  want to copy the list, rather than just sending this to me.
 
  ...Robert
 
  On Wed, Feb 25, 2009 at 1:34 PM, Farhan Husain russ...@gmail.com
 wrote:
  
  
   On Tue, Feb 24, 2009 at 8:21 PM, Robert Haas robertmh...@gmail.com
   wrote:
  
Can you please elaborate a bit?
   
I thought that A0.Prop would ignore the composite index created on
the
columns subj and prop but this does not seem to be the case.
  
   Yeah, I think you're barking up the wrong tree here.  I think Tom had
   the correct diagnosis - what do you get from show work_mem?
  
   What kind of machine are you running this on?  If it's a UNIX-ish
    machine, what do you get from free -m and uname -a?
  
   ...Robert
  
   Here is the machine info:
  
   Machine: SunOS 5.10 Generic_127111-11 sun4u sparc SUNW, Sun-Fire-880
   Memory: 4 GB
   Number of physical processors: 2
  
  
   --
   Mohammad Farhan Husain
   Research Assistant
   Department of Computer Science
   Erik Jonsson School of Engineering and Computer Science
   University of Texas at Dallas
  
 
  Did you mean the work_mem field in the config file?
 
 
  --
  Mohammad Farhan Husain
  Research Assistant
  Department of Computer Science
  Erik Jonsson School of Engineering and Computer Science
  University of Texas at Dallas
 


I did it, it does not show anything. Here is what I have got from the config
file:


#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 32MB   # min 128kB or max_connections*16kB
# (change requires restart)
temp_buffers = 1024MB   # min 800kB
#max_prepared_transactions = 5  # can be 0 or more
# (change requires restart)
# Note:  Increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 1792MB   # min 64kB
#maintenance_work_mem = 16MB# min 1MB
#max_stack_depth = 32MB # min 100kB

# - Free Space Map -

max_fsm_pages = 204800  # min max_fsm_relations*16, 6 bytes each
# (change requires restart)
#max_fsm_relations = 1000   # min 100, ~70 bytes each
# (change requires restart)

# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
# (change requires restart)
#shared_preload_libraries = ''  # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0  # 0-1000 milliseconds
#vacuum_cost_page_hit = 1   # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20    # 0-10000 credits
#vacuum_cost_limit = 200    # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100    # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0  # 0-10.0 multiplier on buffers scanned/round


Please note that this (1792MB) is the highest that I could set for work_mem.
-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Robert Haas
 Please note that this (1792MB) is the highest that I could set for work_mem.

Yeah, that's almost certainly part of your problem.

You need to make that number MUCH smaller.  You probably want a value
like 1MB or 5MB, or maybe 20MB if you really have a lot of memory.

That's insanely high.

...Robert



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Scott Marlowe
On Wed, Feb 25, 2009 at 12:05 PM, Farhan Husain russ...@gmail.com wrote:

 On Wed, Feb 25, 2009 at 12:58 PM, Robert Haas robertmh...@gmail.com wrote:

 Just start up psql and type:

 show work_mem;

 I did it, it does not show anything.

Did you remember the ; symbol?

 Here is what I have got from the config
 file:

 shared_buffers = 32MB   # min 128kB or max_connections*16kB
     # (change requires restart)

Assuming your machine has more than a few hundred megs of ram in it,
this is kinda small.  Generally setting it to 10 to 25% of total
memory is about right.
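
On the 4 GB machine in this thread, the 10 to 25% guideline would look
roughly like this in postgresql.conf (a sketch, not a tuned value):

```
shared_buffers = 512MB        # ~12% of 4 GB RAM; change requires restart
```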

 work_mem = 1792MB   # min 64kB

That's crazy high.  work_mem is the amount of memory a single sort can
use.  Each query can have multiple sorts.  So, if you have 10 users
running 10 sorts, you could use 100*1792MB memory.

Look to set it in the 1MB to 16MB range.  If testing shows a few queries
can do better with more work_mem, then look at setting it higher for
just that one connection.
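
Raising work_mem for just one connection, as suggested above, can look like
this (a sketch; the value is illustrative):

```sql
SET work_mem = '64MB';   -- affects only this session
-- run the sort-heavy query here ...
RESET work_mem;          -- return to the server default
```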



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 1:52 PM, Robert Haas robertmh...@gmail.com wrote:

  Please note that this (1792MB) is the highest that I could set for
 work_mem.

 Yeah, that's almost certainly part of your problem.

 You need to make that number MUCH smaller.  You probably want a value
 like 1MB or 5MB or maybe if you have really a lot of memory 20MB.

 That's insanely high.

 ...Robert


Initially, it was the default value (32MB). Later I played with that value
thinking that it might improve the performance. But all the values resulted
in same amount of time.

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Robert Haas
On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com wrote:
 Initially, it was the default value (32MB). Later I played with that value
 thinking that it might improve the performance. But all the values resulted
 in same amount of time.

Well, if you set it back to what we consider to be a reasonable value,
rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
what to do next.

...Robert



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 3:30 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com wrote:
  Initially, it was the default value (32MB). Later I played with that
 value
  thinking that it might improve the performance. But all the values
 resulted
  in same amount of time.

 Well, if you set it back to what we consider to be a reasonable value,
 rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
 what to do next.

 ...Robert


Right now I am running the query again with 32MB work_mem. It is taking a
long time as before. However, I have kept the following values unchanged:

shared_buffers = 32MB   # min 128kB or max_connections*16kB
temp_buffers = 1024MB   # min 800kB

Do you think I should change them to something else?

Thanks,


-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Scott Marlowe
On Wed, Feb 25, 2009 at 2:32 PM, Farhan Husain russ...@gmail.com wrote:

 On Wed, Feb 25, 2009 at 3:30 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com wrote:
  Initially, it was the default value (32MB). Later I played with that
  value
  thinking that it might improve the performance. But all the values
  resulted
  in same amount of time.

 Well, if you set it back to what we consider to be a reasonable value,
 rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
 what to do next.

 ...Robert

 Right now I am running the query again with 32MB work_mem. It is taking a
 long time as before. However, I have kept the following values unchanged:

 shared_buffers = 32MB   # min 128kB or max_connections*16kB

That's REALLY small for pgsql.  Assuming your machine has at least 1G
of ram, I'd set it to 128M to 256M as a minimum.



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Robert Haas
On Wed, Feb 25, 2009 at 4:32 PM, Farhan Husain russ...@gmail.com wrote:
 On Wed, Feb 25, 2009 at 3:30 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com wrote:
  Initially, it was the default value (32MB). Later I played with that
  value
  thinking that it might improve the performance. But all the values
  resulted
  in same amount of time.

 Well, if you set it back to what we consider to be a reasonable value,
 rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
 what to do next.

 ...Robert

 Right now I am running the query again with 32MB work_mem. It is taking a
 long time as before. However, I have kept the following values unchanged:

 shared_buffers = 32MB   # min 128kB or max_connections*16kB

 temp_buffers = 1024MB   # min 800kB

 Do you think I should change them to something else?

It would probably be good to change them, but I don't think it's going
to fix the problem you're having with this query.

...Robert



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 3:35 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 On Wed, Feb 25, 2009 at 2:32 PM, Farhan Husain russ...@gmail.com wrote:
 
  On Wed, Feb 25, 2009 at 3:30 PM, Robert Haas robertmh...@gmail.com
 wrote:
 
  On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com
 wrote:
   Initially, it was the default value (32MB). Later I played with that
   value
   thinking that it might improve the performance. But all the values
   resulted
   in same amount of time.
 
  Well, if you set it back to what we consider to be a reasonable value,
  rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
  what to do next.
 
  ...Robert
 
  Right now I am running the query again with 32MB work_mem. It is taking a
  long time as before. However, I have kept the following values unchanged:
 
  shared_buffers = 32MB   # min 128kB or
 max_connections*16kB

 That's REALLY small for pgsql.  Assuming your machine has at least 1G
 of ram, I'd set it to 128M to 256M as a minimum.


As I wrote in a previous email, I had the value set to 1792MB (the highest I
could set) and had the same execution time. This value is not helping me to
bring down the execution time.

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Robert Haas
  shared_buffers = 32MB   # min 128kB or
  max_connections*16kB

 That's REALLY small for pgsql.  Assuming your machine has at least 1G
 of ram, I'd set it to 128M to 256M as a minimum.

 As I wrote in a previous email, I had the value set to 1792MB (the highest I
 could set) and had the same execution time. This value is not helping me to
 bring down the execution time.

No, you increased work_mem, not shared_buffers.  You might want to go
and read the documentation:

http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html

But at any rate, the large work_mem was producing a very strange plan.
It may help to see what the system does without that setting.  But
changing shared_buffers will not change the plan, so let's not worry
about that right now.

...Robert



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
It was only after I got this high execution time that I started to look into
the configuration file and change those values. I tried several combinations
in which all those values were higher than the default values. I got no
improvement in runtime. The machine postgres is running on has 4 GB of RAM.

On Wed, Feb 25, 2009 at 3:40 PM, Robert Haas robertmh...@gmail.com wrote:

   shared_buffers = 32MB   # min 128kB or
   max_connections*16kB
 
  That's REALLY small for pgsql.  Assuming your machine has at least 1G
  of ram, I'd set it to 128M to 256M as a minimum.
 
  As I wrote in a previous email, I had the value set to 1792MB (the
 highest I
  could set) and had the same execution time. This value is not helping me
 to
  bring down the execution time.

 No, you increased work_mem, not shared_buffers.  You might want to go
 and read the documentation:


 http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html

 But at any rate, the large work_mem was producing a very strange plan.
  It may help to see what the system does without that setting.  But
 changing shared_buffers will not change the plan, so let's not worry
 about that right now.

 ...Robert




-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Scott Marlowe
On Wed, Feb 25, 2009 at 2:38 PM, Farhan Husain russ...@gmail.com wrote:


 On Wed, Feb 25, 2009 at 3:35 PM, Scott Marlowe scott.marl...@gmail.com
 wrote:

 On Wed, Feb 25, 2009 at 2:32 PM, Farhan Husain russ...@gmail.com wrote:
 
  On Wed, Feb 25, 2009 at 3:30 PM, Robert Haas robertmh...@gmail.com
  wrote:
 
  On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com
  wrote:
   Initially, it was the default value (32MB). Later I played with that
   value
   thinking that it might improve the performance. But all the values
   resulted
   in same amount of time.
 
  Well, if you set it back to what we consider to be a reasonable value,
  rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
  what to do next.
 
  ...Robert
 
  Right now I am running the query again with 32MB work_mem. It is taking
  a
  long time as before. However, I have kept the following values
  unchanged:
 
  shared_buffers = 32MB   # min 128kB or
  max_connections*16kB

 That's REALLY small for pgsql.  Assuming your machine has at least 1G
 of ram, I'd set it to 128M to 256M as a minimum.

 As I wrote in a previous email, I had the value set to 1792MB (the highest I
 could set) and had the same execution time. This value is not helping me to
 bring down the execution time.

No, that was work_mem.  This is shared_buffers.
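
For reference, the two settings being mixed up here live in postgresql.conf; a sketch along the lines of the advice in this thread (the values are illustrative only, and changing shared_buffers requires a server restart on 8.3):

```
# postgresql.conf -- illustrative values, not a recommendation for every box
shared_buffers = 256MB   # shared buffer cache, allocated once at startup
work_mem = 32MB          # per-sort/per-hash memory, used per backend
```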

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 3:55 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 On Wed, Feb 25, 2009 at 2:38 PM, Farhan Husain russ...@gmail.com wrote:
 
 
  On Wed, Feb 25, 2009 at 3:35 PM, Scott Marlowe scott.marl...@gmail.com
  wrote:
 
  On Wed, Feb 25, 2009 at 2:32 PM, Farhan Husain russ...@gmail.com
 wrote:
  
   On Wed, Feb 25, 2009 at 3:30 PM, Robert Haas robertmh...@gmail.com
   wrote:
  
   On Wed, Feb 25, 2009 at 3:44 PM, Farhan Husain russ...@gmail.com
   wrote:
Initially, it was the default value (32MB). Later I played with
 that
value
thinking that it might improve the performance. But all the values
resulted
in same amount of time.
  
   Well, if you set it back to what we consider to be a reasonable
 value,
   rerun EXPLAIN ANALYZE, and post that plan, it might help us tell you
   what to do next.
  
   ...Robert
  
   Right now I am running the query again with 32MB work_mem. It is
 taking
   a
   long time as before. However, I have kept the following values
   unchanged:
  
   shared_buffers = 32MB   # min 128kB or
   max_connections*16kB
 
  That's REALLY small for pgsql.  Assuming your machine has at least 1G
  of ram, I'd set it to 128M to 256M as a minimum.
 
  As I wrote in a previous email, I had the value set to 1792MB (the
 highest I
  could set) and had the same execution time. This value is not helping me
 to
  bring down the execution time.

 No, that was work_mem.  This is shared_buffers.


Oh, sorry for the confusion. I will change the shared_buffer once the
current run finishes.

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Akos Gabriel
On Wed, 25 Feb 2009 15:43:49 -0600, Farhan Husain russ...@gmail.com wrote:

OK, you have two options:

1. Learn to read carefully, and differentiate between work_mem and
shared_buffers options. Lower work_mem and raise shared_buffers as
others wrote.
2. Leave Postgresql alone and go for Oracle or Microsoft SQL...

Rgds,
Akos

 It was only after I got this high execution time when I started to
 look into the configuration file and change those values. I tried
 several combinations in which all those values were higher than the
 default values. I got no improvement in runtime. The machine postgres
 is running on has 4 GB of RAM.
 
 On Wed, Feb 25, 2009 at 3:40 PM, Robert Haas robertmh...@gmail.com
 wrote:
 
shared_buffers = 32MB   # min 128kB or
max_connections*16kB
  
   That's REALLY small for pgsql.  Assuming your machine has at
   least 1G of ram, I'd set it to 128M to 256M as a minimum.
  
   As I wrote in a previous email, I had the value set to 1792MB (the
  highest I
   could set) and had the same execution time. This value is not
   helping me
  to
   bring down the execution time.
 
  No, you increased work_mem, not shared_buffers.  You might want to
  go and read the documentation:
 
 
  http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html
 
  But at any rate, the large work_mem was producing a very strange
  plan. It may help to see what the system does without that
  setting.  But changing shared_buffers will not change the plan, so
  let's not worry about that right now.
 
  ...Robert
 
 
 
 


-- 
Best regards,
Gábriel Ákos
-=E-Mail :akos.gabr...@i-logic.hu|Web:  http://www.i-logic.hu=-
-=Tel/fax:+3612367353|Mobil:+36209278894 =-



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
I am trying to find the reason of the problem so going to Oracle or
something else is not the solution. I tried with several combinations of
those parameters before posting the problem here. I have read
http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html before
and I think I understood what it said.

2009/2/25 Akos Gabriel akos.gabr...@i-logic.hu

 On Wed, 25 Feb 2009 15:43:49 -0600, Farhan Husain russ...@gmail.com wrote:

 OK, you have two options:

 1. Learn to read carefully, and differentiate between work_mem and
 shared_buffers options. Lower work_mem and raise shared_buffers as
 others wrote.
 2. Leave Postgresql alone and go for Oracle or Microsoft SQL...

 Rgds,
 Akos

  It was only after I got this high execution time when I started to
  look into the configuration file and change those values. I tried
  several combinations in which all those values were higher than the
  default values. I got no improvement in runtime. The machine postgres
  is running on has 4 GB of RAM.
 
  On Wed, Feb 25, 2009 at 3:40 PM, Robert Haas robertmh...@gmail.com
  wrote:
 
 shared_buffers = 32MB   # min 128kB or
 max_connections*16kB
   
That's REALLY small for pgsql.  Assuming your machine has at
least 1G of ram, I'd set it to 128M to 256M as a minimum.
   
As I wrote in a previous email, I had the value set to 1792MB (the
   highest I
could set) and had the same execution time. This value is not
helping me
   to
bring down the execution time.
  
   No, you increased work_mem, not shared_buffers.  You might want to
   go and read the documentation:
  
  
  
 http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html
  
   But at any rate, the large work_mem was producing a very strange
   plan. It may help to see what the system does without that
   setting.  But changing shared_buffers will not change the plan, so
   let's not worry about that right now.
  
   ...Robert
  
 
 
 


 --
 Best regards,
 Gábriel Ákos
 -=E-Mail :akos.gabr...@i-logic.hu|Web:  http://www.i-logic.hu=-
 -=Tel/fax:+3612367353|Mobil:+36209278894 =-





-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Kevin Grittner
 Farhan Husain russ...@gmail.com wrote: 
 The machine postgres is running on has 4 GB of RAM.
 
In addition to the other suggestions, you should be sure that
effective_cache_size is set to a reasonable value, which would
probably be somewhere in the neighborhood of '3GB'.  This doesn't
affect actual RAM allocation, but gives the optimizer a rough idea how
much data is going to be kept in cache, between both the PostgreSQL
shared_memory setting and the OS cache.  It can make better choices
with more accurate information.
 
-Kevin
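
A minimal sketch of the suggestion above; since effective_cache_size is only a planner hint and allocates nothing, it can also be changed per session to see its effect on the plan (the 3GB figure is the neighborhood suggested for this 4 GB machine):

```sql
-- illustrative: tell the planner roughly how much cache is available
SET effective_cache_size = '3GB';
-- then re-run EXPLAIN ANALYZE on the problem query and compare the plan
```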



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Farhan Husain
On Wed, Feb 25, 2009 at 4:10 PM, Kevin Grittner kevin.gritt...@wicourts.gov
 wrote:

  Farhan Husain russ...@gmail.com wrote:
  The machine postgres is running on has 4 GB of RAM.

 In addition to the other suggestions, you should be sure that
 effective_cache_size is set to a reasonable value, which would
 probably be somewhere in the neighborhood of '3GB'.  This doesn't
 affect actual RAM allocation, but gives the optimizer a rough idea how
 much data is going to be kept in cache, between both the PostgreSQL
 shared_memory setting and the OS cache.  It can make better choices
 with more accurate information.

 -Kevin


Here is the latest output:

ingentadb=# EXPLAIN ANALYZE select A0.Subj, A2.Obj From jena_g1t1_stmt A0,
jena_g1t1_stmt A1, jena_g1t1_stmt A2 Where A0.Prop='Uv::
http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND A0.Obj='Uv::
http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1' AND
A0.GraphID=1 AND A0.Subj=A1.Subj AND A1.Prop='Uv::
http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND A1.Obj='Uv::
http://metastore.ingenta.com/ns/structure/Article' AND A1.GraphID=1 AND
A0.Subj=A2.Subj AND A2.Prop='Uv::
http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
A2.GraphID=1;

                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=799852.37..812767.47 rows=733195 width=134) (actual
time=5941553.710..5941569.192 rows=30 loops=1)
   Merge Cond: ((a0.subj)::text = (a1.subj)::text)
   ->  Sort  (cost=89884.41..89964.28 rows=31949 width=208) (actual
time=243.711..243.731 rows=30 loops=1)
 Sort Key: a0.subj
 Sort Method:  quicksort  Memory: 24kB
 ->  Nested Loop  (cost=0.00..84326.57 rows=31949 width=208) (actual
time=171.255..232.765 rows=30 loops=1)
   ->  Index Scan using jena_g1t1_stmt_ixo on jena_g1t1_stmt a0
(cost=0.00..5428.34 rows=487 width=74) (actual time=96.735..97.070 rows=30
loops=1)
 Index Cond: ((obj)::text = 'Uv::
http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'::text)
 Filter: (((prop)::text = 'Uv::
http://prismstandard.org/namespaces/1.2/basic/isPartOf'::text) AND (graphid
= 1))
   ->  Index Scan using jena_g1t1_stmt_ixsp on jena_g1t1_stmt
a2  (cost=0.00..161.37 rows=51 width=134) (actual time=4.513..4.518 rows=1
loops=30)
 Index Cond: (((a2.subj)::text = (a0.subj)::text) AND
((a2.prop)::text = 'Uv::
http://prismstandard.org/namespaces/1.2/basic/startingPage'::text))
 Filter: (a2.graphid = 1)
   ->  Materialize  (cost=709967.96..723526.46 rows=1084680 width=74)
(actual time=5941309.876..5941318.552 rows=31 loops=1)
 ->  Sort  (cost=709967.96..712679.66 rows=1084680 width=74) (actual
time=5941309.858..5941318.488 rows=31 loops=1)
   Sort Key: a1.subj
   Sort Method:  external merge  Disk: 282480kB
   ->  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59
rows=1084680 width=74) (actual time=0.054..44604.597 rows=3192000 loops=1)
 Filter: ((graphid = 1) AND ((prop)::text = 'Uv::
http://www.w3.org/1999/02/22-rdf-syntax-ns#type'::text) AND ((obj)::text =
'Uv::http://metastore.ingenta.com/ns/structure/Article'::text))
 Total runtime: 5941585.248 ms
(19 rows)

ingentadb=# show work_mem;
 work_mem
----------
 1MB
(1 row)

ingentadb=# show shared_buffers;
 shared_buffers
----------------
 32MB
(1 row)

ingentadb=# show temp_buffers;
 temp_buffers
--------------
 131072
(1 row)


The execution time has not improved. I am going to increase the
shared_buffers now keeping the work_mem same.

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Kevin Grittner
 Farhan Husain russ...@gmail.com wrote: 
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  Farhan Husain russ...@gmail.com wrote:
  The machine postgres is running on has 4 GB of RAM.

 In addition to the other suggestions, you should be sure that
 effective_cache_size is set to a reasonable value, which would
 probably be somewhere in the neighborhood of '3GB'.
 
 The execution time has not improved. I am going to increase the
 shared_buffers now keeping the work_mem same.
 
Increasing shared_buffers is good, but it has been mentioned that this
will not change the plan, which currently scans and sorts the whole
table for a1.  Nobody will be surprised when you report minimal
change, if any.  If you have not changed effective_cache_size (be sure
not to confuse this with any of the other configuration values) it
will think you only have 128MB of cache, which will be off by a factor
of about 24 from reality.
 
Also, I'm going to respectfully differ with some of the other posts on
the best setting for work_mem.  Most benchmarks I've run and can
remember seeing posted found best performance for this at somewhere
between 16MB and 32MB.  You do have to be careful if you have a large
number of concurrent queries, however, and keep it lower.  In most
such cases, though, you're better off using a connection pool to limit
concurrent queries instead.
 
-Kevin
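
The 16MB-32MB range suggested above can be tried per session, without editing postgresql.conf (illustrative):

```sql
SET work_mem = '16MB';
-- re-run EXPLAIN ANALYZE on the query, then try:
SET work_mem = '32MB';
```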



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Scott Carey
I will second Kevin's suggestion.  Unless you think you will have more than a 
few dozen concurrent queries, start with work_mem around 32MB.
For the query here, a very large work_mem might help it hash join depending on 
the data... But that's not the real problem here.

The real problem is that it does a huge scan of all of the a1 table, and sorts 
it.  It's pretty clear that this table has incorrect statistics: it thinks it 
will get about 1 million rows back from the scan, but the scan actually returns 
3 million.

Crank up the statistics target on that table from the default to at least 100, 
perhaps even 1000.  This is a large table, and the default statistics target of 10 
is not good for large tables with skewed column data.  The columns to try increasing 
the target on are those filtered in the explain: graphid, prop, and obj.  
Then run vacuum analyze on that table (a1).  The planner should then have 
better stats and will likely be able to use a better plan for the join.

The other tables involved in the join also seem to have bad statistics.  You 
might just take the easiest solution and change the global statistics target 
and vacuum analyze the tables involved:

set default_statistics_target = 50;
vacuum analyze jena_g1t1_stmt ;

(test the query)

Repeat for several values of the default statistics target.  You can run 
explain before running the actual query, to see if the plan changed.  If it 
has not, the time will not likely change.
The max value for the statistics target is 1000, which makes analyzing and 
query planning slower, but more accurate.  In most cases, dramatic differences 
can happen between the default of 10 and values of 25 or 50.  Sometimes, you 
have to go into the hundreds, and it is safer to do this on a per-column basis 
once you get to larger values.

For larger databases, I recommend increasing the default to 20 to 40 and 
re-analyzing all the tables.
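
A sketch of the per-column variant described above, using the column names from the table definition earlier in the thread (the target of 100 is illustrative):

```sql
-- raise the statistics target only for the columns filtered in the plan
ALTER TABLE jena_g1t1_stmt ALTER COLUMN graphid SET STATISTICS 100;
ALTER TABLE jena_g1t1_stmt ALTER COLUMN prop    SET STATISTICS 100;
ALTER TABLE jena_g1t1_stmt ALTER COLUMN obj     SET STATISTICS 100;
ANALYZE jena_g1t1_stmt;  -- rebuild statistics with the new targets
```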





On 2/25/09 3:11 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:

 Farhan Husain russ...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  Farhan Husain russ...@gmail.com wrote:
  The machine postgres is running on has 4 GB of RAM.

 In addition to the other suggestions, you should be sure that
 effective_cache_size is set to a reasonable value, which would
 probably be somewhere in the neighborhood of '3GB'.

 The execution time has not improved. I am going to increase the
 shared_buffers now keeping the work_mem same.

Increasing shared_buffers is good, but it has been mentioned that this
will not change the plan, which currently scans and sorts the whole
table for a1.  Nobody will be surprised when you report minimal
change, if any.  If you have not changed effective_cache_size (be sure
not to confuse this with any of the other configuration values) it
will think you only have 128MB of cache, which will be off by a factor
of about 24 from reality.

Also, I'm going to respectfully differ with some of the other posts on
the best setting for work_mem.  Most benchmarks I've run and can
remember seeing posted found best performance for this at somewhere
between 16MB and 32MB.  You do have to be careful if you have a large
number of concurrent queries, however, and keep it lower.  In most
such cases, though, you're better off using a connection pool to limit
concurrent queries instead.

-Kevin




Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-25 Thread Robert Haas
 Here is the latest output:

 ingentadb=# EXPLAIN ANALYZE select A0.Subj, A2.Obj From jena_g1t1_stmt A0,
 jena_g1t1_stmt A1, jena_g1t1_stmt A2 Where
 A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
 A0.Obj='Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'
 AND A0.GraphID=1 AND A0.Subj=A1.Subj AND
 A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
 A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
 A1.GraphID=1 AND A0.Subj=A2.Subj AND
 A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
 A2.GraphID=1;

 QUERY
 PLAN
 
  Merge Join  (cost=799852.37..812767.47 rows=733195 width=134) (actual
 time=5941553.710..5941569.192 rows=30 loops=1)
    Merge Cond: ((a0.subj)::text = (a1.subj)::text)
    -  Sort  (cost=89884.41..89964.28 rows=31949 width=208) (actual
 time=243.711..243.731 rows=30 loops=1)
  Sort Key: a0.subj
  Sort Method:  quicksort  Memory: 24kB
  -  Nested Loop  (cost=0.00..84326.57 rows=31949 width=208) (actual
 time=171.255..232.765 rows=30 loops=1)
    -  Index Scan using jena_g1t1_stmt_ixo on jena_g1t1_stmt a0
 (cost=0.00..5428.34 rows=487 width=74) (actual time=96.735..97.070 rows=30
 loops=1)
  Index Cond: ((obj)::text =
 'Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'::text)
  Filter: (((prop)::text =
 'Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf'::text) AND
 (graphid = 1))
    -  Index Scan using jena_g1t1_stmt_ixsp on jena_g1t1_stmt
 a2  (cost=0.00..161.37 rows=51 width=134) (actual time=4.513..4.518 rows=1
 loops=30)
  Index Cond: (((a2.subj)::text = (a0.subj)::text) AND
 ((a2.prop)::text =
 'Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage'::text))
  Filter: (a2.graphid = 1)
    -  Materialize  (cost=709967.96..723526.46 rows=1084680 width=74)
 (actual time=5941309.876..5941318.552 rows=31 loops=1)
  -  Sort  (cost=709967.96..712679.66 rows=1084680 width=74) (actual
 time=5941309.858..5941318.488 rows=31 loops=1)
    Sort Key: a1.subj
    Sort Method:  external merge  Disk: 282480kB
    -  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59
 rows=1084680 width=74) (actual time=0.054..44604.597 rows=3192000 loops=1)
  Filter: ((graphid = 1) AND ((prop)::text =
 'Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type'::text) AND
 ((obj)::text =
 'Uv::http://metastore.ingenta.com/ns/structure/Article'::text))
  Total runtime: 5941585.248 ms
 (19 rows)

Can you do this:

select * from pg_statistic where starelid = 'jena_g1t1_stmt'::regclass;

Thanks,

...Robert



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-24 Thread Farhan Husain
The result set should have 31 rows, that is correct.

On Mon, Feb 23, 2009 at 7:53 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 On Mon, Feb 23, 2009 at 6:24 PM, Farhan Husain russ...@gmail.com wrote:
 This sort here:

 -  Sort  (cost=565372.46..568084.16 rows=1084680 width=74) (actual
  time=5410606.604..5410606.628 rows=31 loops=1)
   Sort Key: a1.subj
   Sort Method:  quicksort  Memory: 489474kB
   -  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59
  rows=1084680 width=74) (actual time=0.043..44005.780 rows=3192000
 loops=1)

 Seems to be the problem.  There are a few things that seem odd: the
 first is that it estimates it will return 1M rows, but returns only 31.
  The other is that sorting 31 rows is taking 5410606 milliseconds.

 My first guess would be to crank up the statistics on a1.subj to a few
 hundred (going up to a thousand if necessary) re-analyzing and seeing
 if the query plan changes.

 I'm not expert enough on explain analyze to offer any more.




-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-24 Thread Farhan Husain
On Tue, Feb 24, 2009 at 1:28 AM, Claus Guttesen kome...@gmail.com wrote:

  I am doing a performance comparison between running Jena with MySQL and
  Postgres. I used the 8.3-community version of Postgres and MySQL 5.0.67.
 I
  have run several queries to both MySQL and Postgres and all of them took
  similar amount of time to execute except one. For the following query to
 a
  table having 10,003,728 rows, MySQL takes 0.11 seconds to return results
  whereas Postgres takes like 1 hour and 20 minutes!
 
  Query:
 
  select A0.Subj, A2.Obj From jena_g1t1_stmt A0, jena_g1t1_stmt A1,
  jena_g1t1_stmt A2 Where
  A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
  A0.Obj='Uv::
  http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'
  AND A0.GraphID=1 AND A0.Subj=A1.Subj AND
  A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
  A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
  A1.GraphID=1 AND A0.Subj=A2.Subj AND
  A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage'
 AND
  A2.GraphID=1;
 
  Table:
 
  Table public.jena_g1t1_stmt
   Column  |  Type  | Modifiers
  ---------+------------------------+-----------
   subj| character varying(250) | not null
   prop| character varying(250) | not null
   obj | character varying(250) | not null
   graphid | integer|
  Indexes:
  jena_g1t1_stmt_ixo btree (obj)
  jena_g1t1_stmt_ixsp btree (subj, prop)

 Isn't it missing an index on the column prop?

 select ... where A0.Prop='foo' and ...

 --
 regards
 Claus

 When lenity and cruelty play for a kingdom,
 the gentler gamester is the soonest winner.

 Shakespeare


Can you please elaborate a bit?

Thanks,

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-24 Thread Claus Guttesen
  Query:
 
  select A0.Subj, A2.Obj From jena_g1t1_stmt A0, jena_g1t1_stmt A1,
  jena_g1t1_stmt A2 Where
  A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
 
  A0.Obj='Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'
  AND A0.GraphID=1 AND A0.Subj=A1.Subj AND
  A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
  A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
  A1.GraphID=1 AND A0.Subj=A2.Subj AND
  A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage'
  AND
  A2.GraphID=1;
 
  Table:
 
      Table public.jena_g1t1_stmt
   Column  |  Type  | Modifiers
  ---------+------------------------+-----------
   subj    | character varying(250) | not null
   prop    | character varying(250) | not null
   obj | character varying(250) | not null
   graphid | integer    |
  Indexes:
      jena_g1t1_stmt_ixo btree (obj)
      jena_g1t1_stmt_ixsp btree (subj, prop)

 Isn't it missing an index on the column prop?

 select ... where A0.Prop='foo' and ...
 --
 Can you please elaborate a bit?

I thought that a match on A0.Prop alone would ignore the composite index created
on the columns subj and prop, but this does not seem to be the case.

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentler gamester is the soonest winner.

Shakespeare



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-24 Thread Robert Haas
 Can you please elaborate a bit?

 I thought that A0.Prop would ignore the composite index created on the
 columns subj and prop but this does not seem to be the case.

Yeah, I think you're barking up the wrong tree here.  I think Tom had
the correct diagnosis - what do you get from show work_mem?

What kind of machine are you running this on?  If it's a UNIX-ish
machine, what do you get from free -m and uname -a?

...Robert



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Gregory Stark
Farhan Husain russ...@gmail.com writes:

 I can provide any other information needed and also the data if anyone
 wants.

What did the query plans look like in both databases?

In Postgres you can get the query plan with

EXPLAIN ANALYZE select ...

You can leave out the ANALYZE if you can't wait until the query completes, but
it will have much less information with which to diagnose any problems.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!
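
For example, against the table from this thread (any query works; count(*) is just a simple illustration):

```sql
EXPLAIN SELECT count(*) FROM jena_g1t1_stmt;          -- estimated plan only
EXPLAIN ANALYZE SELECT count(*) FROM jena_g1t1_stmt;  -- runs it, adds actual rows/times
```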



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Scott Marlowe
On Mon, Feb 23, 2009 at 4:16 PM, Farhan Husain russ...@gmail.com wrote:
 Hello,

 I am doing a performance comparison between running Jena with MySQL and
 Postgres. I used the 8.3-community version of Postgres and MySQL 5.0.67. I
 have run several queries to both MySQL and Postgres and all of them took
 similar amount of time to execute except one. For the following query to a
 table having 10,003,728 rows, MySQL takes 0.11 seconds to return results
 whereas Postgres takes like 1 hour and 20 minutes!

 Query:

 select A0.Subj, A2.Obj From jena_g1t1_stmt A0, jena_g1t1_stmt A1,
 jena_g1t1_stmt A2 Where
 A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
 A0.Obj='Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'
 AND A0.GraphID=1 AND A0.Subj=A1.Subj AND
 A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
 A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
 A1.GraphID=1 AND A0.Subj=A2.Subj AND
 A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
 A2.GraphID=1;

 Table:

 Table public.jena_g1t1_stmt
  Column  |  Type  | Modifiers
 ---------+------------------------+-----------
  subj| character varying(250) | not null
  prop| character varying(250) | not null
  obj | character varying(250) | not null
  graphid | integer|
 Indexes:
 jena_g1t1_stmt_ixo btree (obj)
 jena_g1t1_stmt_ixsp btree (subj, prop)

 Machine: SunOS 5.10 Generic_127111-11 sun4u sparc SUNW, Sun-Fire-880
 Memory: 4 GB
 Number of physical processors: 2

 I tried to re-arrage the query but each time the amount of time needed is
 the same. Can anyone help me find the answer to why Postgres is taking so
 much time?

 I can provide any other information needed and also the data if anyone
 wants.

What is the locale of your database?  I.e.:

# show lc_collate ;
 lc_collate
-------------
 en_US.UTF-8
(1 row)

If it's not C, then string compares are probably going to need special
indexes (varchar_pattern_ops) to work the way you expect.  Look
here for more information:

http://www.postgresql.org/docs/8.3/static/indexes-opclass.html
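
For illustration, such an index would look like this (the index name here is made up; as noted later in the thread, these indexes help LIKE/pattern queries but not the plain '=' comparisons used in this query):

```sql
CREATE INDEX jena_g1t1_stmt_prop_vpo
    ON jena_g1t1_stmt (prop varchar_pattern_ops);
```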



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Farhan Husain
On Mon, Feb 23, 2009 at 5:27 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 On Mon, Feb 23, 2009 at 4:16 PM, Farhan Husain russ...@gmail.com wrote:
  Hello,
 
  I am doing a performance comparison between running Jena with MySQL and
  Postgres. I used the 8.3-community version of Postgres and MySQL 5.0.67.
 I
  have run several queries to both MySQL and Postgres and all of them took
  similar amount of time to execute except one. For the following query to
 a
  table having 10,003,728 rows, MySQL takes 0.11 seconds to return results
  whereas Postgres takes like 1 hour and 20 minutes!
 
  Query:
 
  select A0.Subj, A2.Obj From jena_g1t1_stmt A0, jena_g1t1_stmt A1,
  jena_g1t1_stmt A2 Where
  A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
  A0.Obj='Uv::
  http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'
  AND A0.GraphID=1 AND A0.Subj=A1.Subj AND
  A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
  A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
  A1.GraphID=1 AND A0.Subj=A2.Subj AND
  A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage'
 AND
  A2.GraphID=1;
 
  Table:
 
  Table public.jena_g1t1_stmt
   Column  |  Type  | Modifiers
  ---------+------------------------+-----------
   subj| character varying(250) | not null
   prop| character varying(250) | not null
   obj | character varying(250) | not null
   graphid | integer|
  Indexes:
  jena_g1t1_stmt_ixo btree (obj)
  jena_g1t1_stmt_ixsp btree (subj, prop)
 
  Machine: SunOS 5.10 Generic_127111-11 sun4u sparc SUNW, Sun-Fire-880
  Memory: 4 GB
  Number of physical processors: 2
 
  I tried to re-arrage the query but each time the amount of time needed is
  the same. Can anyone help me find the answer to why Postgres is taking so
  much time?
 
  I can provide any other information needed and also the data if anyone
  wants.

 What is the locale of your database?  I.e.:

 # show lc_collate ;
  lc_collate
 -
  en_US.UTF-8
 (1 row)

 If it's not C then string compares are going to probably need special
 indexes to work the way you expect them. (varchar pattern ops).  Look
 here for more information:

 http://www.postgresql.org/docs/8.3/static/indexes-opclass.html


Here it is:

ingentadb=# show lc_collate;
   lc_collate
-----------------
 en_US.ISO8859-1
(1 row)

Do you think this is the source of the problem?

Thanks,

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Guillaume Smet
On Tue, Feb 24, 2009 at 12:27 AM, Scott Marlowe scott.marl...@gmail.com wrote:
 If it's not C then string compares are going to probably need special
 indexes to work the way you expect them. (varchar pattern ops).  Look
 here for more information:

 http://www.postgresql.org/docs/8.3/static/indexes-opclass.html

It's only relevant for pattern matching (e.g. LIKE or regexp). AFAICS,
the OP only uses plain equals in his query.

An EXPLAIN ANALYZE output would be nice to diagnose the problem.

-- 
Guillaume



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Farhan Husain
On Mon, Feb 23, 2009 at 5:27 PM, Gregory Stark st...@enterprisedb.com wrote:

 Farhan Husain russ...@gmail.com writes:

  I can provide any other information needed and also the data if anyone
  wants.

 What did the query plans look like in both databases?

 In Postgres you can get the query plan with

 EXPLAIN ANALYZE select ...

 You can leave out the ANALYZE if you can't wait until the query completes,
 but it will have much less information to diagnose any problems.

 --
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!


I am doing the EXPLAIN ANALYZE now, it will take about 1 hour and 20 minutes
again. I will get back to you once it is finished. Do you know how to get
the query plan in MySQL?

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Scott Marlowe
On Mon, Feb 23, 2009 at 4:33 PM, Guillaume Smet
guillaume.s...@gmail.com wrote:
 On Tue, Feb 24, 2009 at 12:27 AM, Scott Marlowe scott.marl...@gmail.com 
 wrote:
  If it's not C then string compares are probably going to need special
  indexes to work the way you expect (varchar pattern ops).  Look
  here for more information:

 http://www.postgresql.org/docs/8.3/static/indexes-opclass.html

 It's only relevant for pattern matching (eg LIKE or regexp). AFAICS,
 the OP only uses plain equals in his query.

True, I had a bit of a headache trying to read that unindented query.
(OP, here's a hint: if you want people to read your queries / code,
indent them in some way that makes them fairly readable.)  Note that
varchar_pattern_ops indexes can't be used for straight equality compares
either.
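For left-anchored LIKE searches under a non-C locale, a pattern-ops index is
the usual workaround; a sketch of the idea (the index name is illustrative):

```sql
-- Supports WHERE subj LIKE 'prefix%' in a non-C locale.
CREATE INDEX jena_g1t1_stmt_subj_pat
    ON jena_g1t1_stmt (subj varchar_pattern_ops);

-- Note: this index will NOT be chosen for plain equality
-- (WHERE subj = 'some-value'); keep a regular btree index for that.
```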



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Scott Marlowe
On Mon, Feb 23, 2009 at 4:35 PM, Farhan Husain russ...@gmail.com wrote:


 On Mon, Feb 23, 2009 at 5:27 PM, Gregory Stark st...@enterprisedb.com
 wrote:

 Farhan Husain russ...@gmail.com writes:

  I can provide any other information needed and also the data if anyone
  wants.

 What did the query plans look like in both databases?

 In Postgres you can get the query plan with

 EXPLAIN ANALYZE select ...

 You can leave out the ANALYZE if you can't wait until the query completes,
 but it will have much less information to diagnose any problems.

 --
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

 I am doing the EXPLAIN ANALYZE now, it will take about 1 hour and 20 minutes
 again. I will get back to you once it is finished. Do you know how to get
 the query plan in MySQL?

Explain works in mysql.  It just doesn't tell you a whole lot, because
the query planner's dumb as a brick.  Note that often that stupid
query planner makes queries run really fast.  When it doesn't, there's
not a lot of tuning you can do to fix it.

What does plain explain on pgsql tell you?  Please attach output back
to the list from it for us to peruse.



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Farhan Husain
On Mon, Feb 23, 2009 at 5:27 PM, Gregory Stark st...@enterprisedb.comwrote:

 Farhan Husain russ...@gmail.com writes:

  I can provide any other information needed and also the data if anyone
  wants.

 What did the query plans look like in both databases?

 In Postgres you can get the query plan with

 EXPLAIN ANALYZE select ...

 You can leave out the ANALYZE if you can't wait until the query completes,
 but it will have much less information to diagnose any problems.

 --
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!


Here is the output:

ingentadb=# EXPLAIN ANALYZE select A0.Subj, A2.Obj From jena_g1t1_stmt A0,
jena_g1t1_stmt A1, jena_g1t1_stmt A2 Where A0.Prop='Uv::
http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND A0.Obj='Uv::
http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1' AND
A0.GraphID=1 AND A0.Subj=A1.Subj AND A1.Prop='Uv::
http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND A1.Obj='Uv::
http://metastore.ingenta.com/ns/structure/Article' AND A1.GraphID=1 AND
A0.Subj=A2.Subj AND A2.Prop='Uv::
http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
A2.GraphID=1;

                                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=652089.37..665004.47 rows=733195 width=134) (actual
time=5410683.129..5410690.033 rows=30 loops=1)
   Merge Cond: ((a0.subj)::text = (a1.subj)::text)
   -  Sort  (cost=86716.91..86796.78 rows=31949 width=208) (actual
time=76.395..76.423 rows=30 loops=1)
 Sort Key: a0.subj
 Sort Method:  quicksort  Memory: 24kB
 -  Nested Loop  (cost=0.00..84326.57 rows=31949 width=208) (actual
time=4.146..65.409 rows=30 loops=1)
   -  Index Scan using jena_g1t1_stmt_ixo on jena_g1t1_stmt a0
(cost=0.00..5428.34 rows=487 width=74) (actual time=1.980..2.142 rows=30
loops=1)
 Index Cond: ((obj)::text = 'Uv::
http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'::text)
 Filter: (((prop)::text = 'Uv::
http://prismstandard.org/namespaces/1.2/basic/isPartOf'::text) AND (graphid
= 1))
   -  Index Scan using jena_g1t1_stmt_ixsp on jena_g1t1_stmt
a2  (cost=0.00..161.37 rows=51 width=134) (actual time=2.101..2.104 rows=1
loops=30)
 Index Cond: (((a2.subj)::text = (a0.subj)::text) AND
((a2.prop)::text = 'Uv::
http://prismstandard.org/namespaces/1.2/basic/startingPage'::text))
 Filter: (a2.graphid = 1)
   -  Sort  (cost=565372.46..568084.16 rows=1084680 width=74) (actual
time=5410606.604..5410606.628 rows=31 loops=1)
 Sort Key: a1.subj
 Sort Method:  quicksort  Memory: 489474kB
 -  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59
rows=1084680 width=74) (actual time=0.043..44005.780 rows=3192000 loops=1)
   Filter: ((graphid = 1) AND ((prop)::text = 'Uv::
http://www.w3.org/1999/02/22-rdf-syntax-ns#type'::text) AND ((obj)::text =
'Uv::http://metastore.ingenta.com/ns/structure/Article'::text))
 Total runtime: 5410691.012 ms
(18 rows)


-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Scott Marlowe
On Mon, Feb 23, 2009 at 6:24 PM, Farhan Husain russ...@gmail.com wrote:
This sort here:

-  Sort  (cost=565372.46..568084.16 rows=1084680 width=74) (actual
 time=5410606.604..5410606.628 rows=31 loops=1)
  Sort Key: a1.subj
  Sort Method:  quicksort  Memory: 489474kB
  -  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59
 rows=1084680 width=74) (actual time=0.043..44005.780 rows=3192000 loops=1)

Seems to be the problem.  There are a few things that seem odd, the
first is that it estimates it will return 1M rows, but returns only 31.
 The other is that sorting 31 rows is taking 5410606 milliseconds.

My first guess would be to crank up the statistics on a1.subj to a few
hundred (going up to a thousand if necessary) re-analyzing and seeing
if the query plan changes.

I'm not expert enough on explain analyze to offer any more.
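The statistics-target suggestion above might be tried as follows (the value
500 is an example; re-test after each change):

```sql
-- Raise the per-column statistics target so the planner samples more
-- distinct values for subj, then refresh the statistics.
ALTER TABLE jena_g1t1_stmt ALTER COLUMN subj SET STATISTICS 500;
ANALYZE jena_g1t1_stmt;
-- Re-run EXPLAIN ANALYZE and check whether the estimated row count
-- for the join input moves closer to the actual count.
```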



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Tom Lane
Scott Marlowe scott.marl...@gmail.com writes:
 On Mon, Feb 23, 2009 at 6:24 PM, Farhan Husain russ...@gmail.com wrote:
 This sort here:

 -  Sort  (cost=565372.46..568084.16 rows=1084680 width=74) (actual
 time=5410606.604..5410606.628 rows=31 loops=1)
   Sort Key: a1.subj
   Sort Method:  quicksort  Memory: 489474kB
   -  Seq Scan on jena_g1t1_stmt a1  (cost=0.00..456639.59
   rows=1084680 width=74) (actual time=0.043..44005.780 rows=3192000 
 loops=1)

 Seems to be the problem.  There are a few things that seem odd, the
  first is that it estimates it will return 1M rows, but returns only 31.
  The other is that sorting 31 rows is taking 5410606 milliseconds.

Uh, no, it's sorting 3192000 rows --- look at the input scan.  Evidently
only the first 31 of those rows are getting fetched out of the sort,
though.

regards, tom lane



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Tom Lane
Farhan Husain russ...@gmail.com writes:
 Here is the output:

I see a couple of things going on here:

* The planner is choosing to use sort-and-mergejoin for the second join.
This requires sorting all of jena_g1t1_stmt.  If it had accurately
estimated the output size of the first join (ie 30 rows not 30K rows)
it'd likely have gone for a nestloop join instead, assuming that you
have an index on jena_g1t1_stmt.subj.  You need to try to reduce the
1000X error factor in that estimate.  I'm not sure how much it will
help to increase the stats targets on the input tables, but that's
the first thing to try.

* Considering that the sort is all in memory, 5400 seconds seems like
a heck of a long time even for sorting 3M rows.  Since we already found
out you're using a non-C locale, the sort comparisons are ultimately
strcoll() calls, and I'm betting that you are on a platform where
strcoll() is really really slow.  Another possibility is that you don't
really have 500MB of memory to spare, and the sort workspace is getting
swapped in and out (read thrashed...).  Setting work_mem higher than
you can afford is a bad idea.
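A session-level experiment along these lines avoids committing to a global
setting (the value shown is an example, not a recommendation):

```sql
-- Try a smaller sort workspace for this session only and re-run the
-- query; if the runtime improves, the larger setting was likely
-- pushing the sort workspace into swap.
SET work_mem = '64MB';

-- ... re-run the EXPLAIN ANALYZE of the three-way join here ...

-- Put the session back to the server default afterwards.
RESET work_mem;
```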

In comparison to mysql, I think that their planner will use an indexed
nestloop if it possibly can, which makes it look great on this type
of query (but not so hot if the query actually does need to join a
lot of rows).

regards, tom lane



Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Farhan Husain
On Mon, Feb 23, 2009 at 5:27 PM, Gregory Stark st...@enterprisedb.comwrote:

 Farhan Husain russ...@gmail.com writes:

  I can provide any other information needed and also the data if anyone
  wants.

 What did the query plans look like in both databases?

 In Postgres you can get the query plan with

 EXPLAIN ANALYZE select ...

 You can leave out the ANALYZE if you can't wait until the query completes,
 but it will have much less information to diagnose any problems.

 --
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!


Here is what I got from MySQL:

mysql> explain Select A0.Subj, A2.Obj From jena_g1t1_stmt A0, jena_g1t1_stmt
A1, jena_g1t1_stmt A2 Where A0.Prop='Uv::
http://prismstandard.org/namespaces/1.2/basic/isPartOf:' AND A0.Obj='Uv::
http://ww
w.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1:' AND A0.GraphID=1 AND
A0.Subj=A1.Subj AND A1.Prop='Uv::
http://www.w3.org/1999/02/22-rdf-syntax-ns#type:' AND A1.Obj='Uv::
http://metastore.ingenta
.com/ns/structure/Article:' AND A1.GraphID=1 AND A0.Subj=A2.Subj AND
A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage:'
AND A2.GraphID=1;
+----+-------------+-------+------+------------------------------------+-------------------+---------+-------------------------+------+-------------+
| id | select_type | table | type | possible_keys                      | key               | key_len | ref                     | rows | Extra       |
+----+-------------+-------+------+------------------------------------+-------------------+---------+-------------------------+------+-------------+
|  1 | SIMPLE      | A0    | ref  | jena_g1t1_stmtXSP,jena_g1t1_stmtXO | jena_g1t1_stmtXO  | 102     | const                   |  628 | Using where |
|  1 | SIMPLE      | A1    | ref  | jena_g1t1_stmtXSP,jena_g1t1_stmtXO | jena_g1t1_stmtXSP | 204     | ingentadb.A0.Subj,const |    1 | Using where |
|  1 | SIMPLE      | A2    | ref  | jena_g1t1_stmtXSP                  | jena_g1t1_stmtXSP | 204     | ingentadb.A0.Subj,const |    1 | Using where |
+----+-------------+-------+------+------------------------------------+-------------------+---------+-------------------------+------+-------------+
3 rows in set (0.00 sec)


-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas


Re: [PERFORM] Abnormal performance difference between Postgres and MySQL

2009-02-23 Thread Claus Guttesen
 I am doing a performance comparison between running Jena with MySQL and
 Postgres. I used the 8.3-community version of Postgres and MySQL 5.0.67. I
 have run several queries to both MySQL and Postgres and all of them took
 similar amount of time to execute except one. For the following query to a
 table having 10,003,728 rows, MySQL takes 0.11 seconds to return results
 whereas Postgres takes like 1 hour and 20 minutes!

 Query:

 select A0.Subj, A2.Obj From jena_g1t1_stmt A0, jena_g1t1_stmt A1,
 jena_g1t1_stmt A2 Where
 A0.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/isPartOf' AND
 A0.Obj='Uv::http://www.utdallas.edu/~farhan.husain/IngentaConnect/issue1_1'
 AND A0.GraphID=1 AND A0.Subj=A1.Subj AND
 A1.Prop='Uv::http://www.w3.org/1999/02/22-rdf-syntax-ns#type' AND
 A1.Obj='Uv::http://metastore.ingenta.com/ns/structure/Article' AND
 A1.GraphID=1 AND A0.Subj=A2.Subj AND
 A2.Prop='Uv::http://prismstandard.org/namespaces/1.2/basic/startingPage' AND
 A2.GraphID=1;

 Table:

       Table "public.jena_g1t1_stmt"
  Column  |          Type          | Modifiers
 ---------+------------------------+-----------
  subj    | character varying(250) | not null
  prop    | character varying(250) | not null
  obj     | character varying(250) | not null
  graphid | integer                |
 Indexes:
     "jena_g1t1_stmt_ixo" btree (obj)
     "jena_g1t1_stmt_ixsp" btree (subj, prop)

Isn't it missing an index on the column prop?

select ... where A0.Prop='foo' and ...
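If a single-column index on prop is worth testing, it could be added like
this (the index name is illustrative; whether the planner actually uses it
depends on how selective prop is):

```sql
-- Index the column used in all three equality filters, then refresh
-- statistics so the planner can consider the new index.
CREATE INDEX jena_g1t1_stmt_ixp ON jena_g1t1_stmt (prop);
ANALYZE jena_g1t1_stmt;
```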

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentler gamester is the soonest winner.

Shakespeare
