Thanks for your response Vasillis. I've run pgbench on both machines `./pgbench -c 10 -t 10000 pgbench` getting 99.800650 tps on my local machine and 23.825332 tps on the server so quite a significant difference. Could this purely be down to the CPU clock speed or is it likely something else causing the issue? I have run ANALYZE on both databases and tried the queries a number of times on each to make sure the results are consistent, this is the case.
On 8 April 2013 18:19, Vasilis Ventirozos <v.ventiro...@gmail.com> wrote: > > Hello Mark, > PostgreSQL currently doesn't support parallel query so a faster cpu even > if it has less cores would be faster for a single query, about benchmarking > you can try pgbench that you will find in the contrib, > the execution plan may be different because of different statistics, have > you analyzed both databases when you compared the execution plans ? > > Vasilis Ventirozos > > > Been trying to progress with this today. Decided to setup the database on >> my local machine to try a few things and I'm getting much more sensible >> results and a totally different query plan >> http://explain.depesz.com/s/KGd in this case the query took about a >> minute but does sometimes take around 80 seconds. >> >> The config is exactly the same between the two database. The databases >> them selves are identical with all indexes the same on the tables. >> >> The server has an 2 x Intel Xeon E5420 running at 2.5Ghz each, 16GB RAM >> and the database is just on a SATA HDD which is a Western Digital >> WD5000AAKS. >> My desktop has a single i5-3570K running at 3.4Ghz, 16GB RAM and the >> database is running on a SATA HDD which is a Western Digital WD1002FAEX-0 >> >> Could anyone offer any reasoning as to why the plan would be so different >> across the two machines? I would have thought that the server would perform >> a lot better since it has more cores or is postgres more affected by the >> CPU speed? Could anyone suggest a way to bench mark the machines for their >> postgres performance? >> >> Thanks again for everyones input, >> >> Mark >> >> >> On 7 April 2013 23:22, Mark Davidson <m...@4each.co.uk> wrote: >> >>> Takes a little longer with the INNER join unfortunately. Takes about >>> ~3.5 minutes, here is the query plan http://explain.depesz.com/s/EgBl. >>> >>> With the JOIN there might not be a match if the data does not fall >>> within one of the areas that is selected in the IN query. >>> >>> So if we have data id (10) that might fall in areas ( 1, 5, 8, 167 ) but >>> the user might be querying areas ( 200 ... 500 ) so no match in area would >>> be found just to be absolutely clear. >>> >>> Is it worth considering adding additional statistics on any of the >>> columns? And / Or additional INDEXES or different types INDEX? Would it be >>> worth restructuring the query starting with areas and working to join data >>> to that? >>> >>> >>> On 7 April 2013 16:15, Kevin Grittner <kgri...@ymail.com> wrote: >>> >>>> Greg Williamson <gwilliamso...@yahoo.com> wrote: >>>> >>>> >> Thanks for your response. I tried doing what you suggested so >>>> >> that table now has a primary key of >>>> >> ' CONSTRAINT data_area_pkey PRIMARY KEY(area_id , data_id ); ' >>>> >> and I've added the INDEX of >>>> >> 'CREATE INDEX data_area_data_id_index ON data_area USING btree >>>> (data_id );' >>>> >>>> Yeah, that is what I was suggesting. >>>> >>>> >> unfortunately it hasn't resulted in an improvement of the query >>>> >> performance. >>>> >>>> > Did you run analyze on the table after creating the index ? >>>> >>>> That probably isn't necessary. Statistics are normally on relations >>>> and columns; there are only certain special cases where an ANALYZE >>>> is needed after an index build, like if the index is on an >>>> expression rather than a list of columns. >>>> >>>> Mark, what happens if you change that left join to a normal (inner) >>>> join? Since you're doing an inner join to data_area and that has a >>>> foreign key to area, there should always be a match anyway, right? >>>> The optimizer doesn't recognize that, so it can't start from the >>>> area and just match to the appropriate points. >>>> >>>> -- >>>> Kevin Grittner >>>> EnterpriseDB: http://www.enterprisedb.com >>>> The Enterprise PostgreSQL Company >>>> >>> >>> >> >