Sorry Vasilis, I missed you asking that. I just did './pgbench -i pgbench' and didn't specifically set any values. I can try some specific ones if you can suggest any.
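From a quick look at the pgbench docs it seems the initialization step accepts a scale factor (-s) and a fill factor (-F), so presumably something like the following; the values here are picked purely for illustration:

    # scale 50 => 50 rows in pgbench_branches, ~5,000,000 in pgbench_accounts;
    # fillfactor 90 leaves 10% of each heap page free
    ./pgbench -i -s 50 -F 90 pgbench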
On 8 April 2013 21:28, Vasilis Ventirozos <v.ventiro...@gmail.com> wrote:

> On Mon, Apr 8, 2013 at 11:18 PM, Mark Davidson <m...@4each.co.uk> wrote:
>
>> Wow, my results are absolutely appalling compared to both of those, which
>> is really interesting. Are you running postgres 9.2.4 on both instances?
>> Any specific configuration changes?
>> Thinking there must be something up with my setup to be getting such a
>> low tps compared with you.
>
> Both installations are 9.2.4 and both dbs have absolutely default
> configurations. I can't really explain why there is so much difference
> between our results; I can only imagine the initialization, which is why I
> asked how you populated your pgbench database (scale factor / fill factor).
>
> Vasilis Ventirozos
>
>> On 8 April 2013 21:02, Vasilis Ventirozos <v.ventiro...@gmail.com> wrote:
>>
>>> -c 10 means 10 clients, so that should take advantage of all your cores
>>> (see below):
>>>
>>> %Cpu0 : 39.3 us, 21.1 sy, 0.0 ni, 38.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu1 : 38.0 us, 25.0 sy, 0.0 ni, 26.0 id, 4.2 wa, 0.0 hi, 6.8 si, 0.0 st
>>> %Cpu2 : 39.3 us, 20.4 sy, 0.0 ni, 39.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu3 : 40.0 us, 18.7 sy, 0.0 ni, 40.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu4 : 13.9 us, 7.1 sy, 0.0 ni, 79.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu5 : 13.1 us, 8.4 sy, 0.0 ni, 78.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu6 : 14.8 us, 6.4 sy, 0.0 ni, 78.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu7 : 15.7 us, 6.7 sy, 0.0 ni, 77.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>>
>>> I am pasting the results of the same test on an i7-2600 with 16GB and a
>>> SATA3 SSD, and the results from a VM with 2 cores and a normal 7200 rpm HDD.
>>>
>>> -- DESKTOP
>>> vasilis@Disorder ~ $ pgbench -c 10 -t 10000 bench
>>> starting vacuum...end.
>>> transaction type: TPC-B (sort of)
>>> scaling factor: 1
>>> query mode: simple
>>> number of clients: 10
>>> number of threads: 1
>>> number of transactions per client: 10000
>>> number of transactions actually processed: 100000/100000
>>> tps = 1713.338111 (including connections establishing)
>>> tps = 1713.948478 (excluding connections establishing)
>>>
>>> -- VM
>>> postgres@pglab1:~/postgresql-9.2.4/contrib/pgbench$ ./pgbench -c 10 -t 10000 bench
>>> starting vacuum...end.
>>> transaction type: TPC-B (sort of)
>>> scaling factor: 1
>>> query mode: simple
>>> number of clients: 10
>>> number of threads: 1
>>> number of transactions per client: 10000
>>> number of transactions actually processed: 100000/100000
>>> tps = 1118.976496 (including connections establishing)
>>> tps = 1119.180126 (excluding connections establishing)
>>>
>>> I am assuming that you didn't populate your pgbench db with the default
>>> values; if you tell me how you did, I will be happy to re-run the test and
>>> see the differences.
>>>
>>> On Mon, Apr 8, 2013 at 10:31 PM, Mark Davidson <m...@4each.co.uk> wrote:
>>>
>>>> Thanks for your response Vasilis. I've run pgbench on both machines
>>>> with `./pgbench -c 10 -t 10000 pgbench`, getting 99.800650 tps on my local
>>>> machine and 23.825332 tps on the server, so quite a significant difference.
>>>> Could this purely be down to the CPU clock speed, or is it likely
>>>> something else causing the issue?
>>>> I have run ANALYZE on both databases and tried the queries a number of
>>>> times on each to make sure the results are consistent; this is the case.
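Following up on my own numbers above: both of Vasilis's runs and mine report "scaling factor: 1". If I'm reading the pgbench docs correctly, scale 1 means a single row in pgbench_branches, so with -c 10 all ten clients queue up to update that one row, and the TPC-B test largely measures lock contention and commit latency. The docs seem to suggest a scale at least as large as the client count, e.g.:

    ./pgbench -i -s 10 pgbench        # 10 branch rows, ~1,000,000 accounts
    ./pgbench -c 10 -t 10000 pgbench  # clients now mostly update different branch rows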
>>>>
>>>> On 8 April 2013 18:19, Vasilis Ventirozos <v.ventiro...@gmail.com> wrote:
>>>>
>>>>> Hello Mark,
>>>>> PostgreSQL currently doesn't support parallel query, so a faster CPU,
>>>>> even if it has fewer cores, will be faster for a single query. For
>>>>> benchmarking you can try pgbench, which you will find in contrib. The
>>>>> execution plan may be different because of different statistics; have
>>>>> you analyzed both databases when you compared the execution plans?
>>>>>
>>>>> Vasilis Ventirozos
>>>>>
>>>>>> Been trying to progress with this today. Decided to set up the
>>>>>> database on my local machine to try a few things, and I'm getting much
>>>>>> more sensible results and a totally different query plan:
>>>>>> http://explain.depesz.com/s/KGd. In this case the query took about a
>>>>>> minute, but does sometimes take around 80 seconds.
>>>>>>
>>>>>> The config is exactly the same between the two databases. The
>>>>>> databases themselves are identical, with all indexes the same on the
>>>>>> tables.
>>>>>>
>>>>>> The server has 2 x Intel Xeon E5420 CPUs running at 2.5GHz each, 16GB
>>>>>> RAM, and the database is just on a SATA HDD, a Western Digital
>>>>>> WD5000AAKS.
>>>>>> My desktop has a single i5-3570K running at 3.4GHz, 16GB RAM, and the
>>>>>> database is running on a SATA HDD, a Western Digital WD1002FAEX-0.
>>>>>>
>>>>>> Could anyone offer any reasoning as to why the plan would be so
>>>>>> different across the two machines? I would have thought that the server
>>>>>> would perform a lot better since it has more cores, or is postgres more
>>>>>> affected by the CPU speed? Could anyone suggest a way to benchmark the
>>>>>> machines for their postgres performance?
>>>>>>
>>>>>> Thanks again for everyone's input,
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On 7 April 2013 23:22, Mark Davidson <m...@4each.co.uk> wrote:
>>>>>>
>>>>>>> Takes a little longer with the INNER join, unfortunately: about
>>>>>>> 3.5 minutes. Here is the query plan:
>>>>>>> http://explain.depesz.com/s/EgBl
>>>>>>>
>>>>>>> With the JOIN there might not be a match if the data does not fall
>>>>>>> within one of the areas that is selected in the IN query.
>>>>>>>
>>>>>>> So, to be absolutely clear: data id (10) might fall in areas
>>>>>>> (1, 5, 8, 167), but the user might be querying areas (200 ... 500), in
>>>>>>> which case no match on area would be found.
>>>>>>>
>>>>>>> Is it worth considering adding additional statistics on any of the
>>>>>>> columns? And/or additional indexes, or a different type of index? Would
>>>>>>> it be worth restructuring the query, starting with areas and working to
>>>>>>> join data to that?
>>>>>>>
>>>>>>> On 7 April 2013 16:15, Kevin Grittner <kgri...@ymail.com> wrote:
>>>>>>>
>>>>>>>> Greg Williamson <gwilliamso...@yahoo.com> wrote:
>>>>>>>>
>>>>>>>> >> Thanks for your response. I tried doing what you suggested, so
>>>>>>>> >> that table now has a primary key of
>>>>>>>> >> 'CONSTRAINT data_area_pkey PRIMARY KEY (area_id, data_id);'
>>>>>>>> >> and I've added the INDEX of
>>>>>>>> >> 'CREATE INDEX data_area_data_id_index ON data_area USING btree (data_id);'
>>>>>>>>
>>>>>>>> Yeah, that is what I was suggesting.
>>>>>>>>
>>>>>>>> >> unfortunately it hasn't resulted in an improvement of the query
>>>>>>>> >> performance.
>>>>>>>>
>>>>>>>> > Did you run analyze on the table after creating the index?
>>>>>>>>
>>>>>>>> That probably isn't necessary.
>>>>>>>> Statistics are normally on relations
>>>>>>>> and columns; there are only certain special cases where an ANALYZE
>>>>>>>> is needed after an index build, like if the index is on an
>>>>>>>> expression rather than a list of columns.
>>>>>>>>
>>>>>>>> Mark, what happens if you change that left join to a normal (inner)
>>>>>>>> join? Since you're doing an inner join to data_area and that has a
>>>>>>>> foreign key to area, there should always be a match anyway, right?
>>>>>>>> The optimizer doesn't recognize that, so it can't start from the
>>>>>>>> area and just match to the appropriate points.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Kevin Grittner
>>>>>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>>>>> The Enterprise PostgreSQL Company
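The full query under discussion never appears in this thread (only the explain.depesz.com plans), but a minimal sketch of the change Kevin is describing, using the table names that do appear (data, data_area, area) and with the id and filter columns assumed, might look like:

    -- sketch only: column names are assumed, not taken from the thread
    SELECT d.*
      FROM data d
      JOIN data_area da ON da.data_id = d.id
      JOIN area a       ON a.id = da.area_id   -- was: LEFT JOIN area a ...
     WHERE da.area_id IN (1, 5, 8, 167);       -- area list supplied by the user

Because data_area.area_id carries a foreign key to area, the inner join cannot drop any rows the left join would have kept, which is why Kevin expects the two forms to return the same results while giving the planner the option of starting from area.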