Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres
On 08/02/12 21:15, Peter van Hardenberg wrote: Having read the thread, I don't really see how I could study what a more principled value would be. That said, I have access to a very large fleet in which to collect data, so I'm all ears for suggestions about how to measure, and would gladly share the results with the list. Peter

On Wed, Feb 8, 2012 at 5:39 PM, Scott Marlowe scott.marl...@gmail.com wrote: On Wed, Feb 8, 2012 at 5:50 PM, Josh Berkus j...@agliodbs.com wrote: On 2/7/12 4:59 PM, Peter van Hardenberg wrote: Per the thread from last month, I've updated the default random_page_cost on Heroku Postgres to reduce the expected cost of a random page on all new databases. This is because Heroku uses AWS storage, which has fast seeks but poor throughput compared to internal disk on a standard system, BTW. Also, judging by the other thread, it might be something closer to 1.2 to 1.4 or so.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

You can execute several queries with the three different values provided by Scott and Josh:
- SET random_page_cost = 2.0, first execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost = 1.4, second execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost = 1.2, third execution of the queries with EXPLAIN ANALYZE

And then you can compare the patterns behind these query executions.

Regards,
-- Marcos Luis Ortíz Valmaseda Sr. Software Engineer (UCI) http://marcosluis2186.posterous.com http://www.linkedin.com/in/marcosluis2186 Twitter: @marcosluis2186 Fin a la injusticia, LIBERTAD AHORA A NUESTROS CINCO COMPATRIOTAS QUE SE ENCUENTRAN INJUSTAMENTE EN PRISIONES DE LOS EEUU!
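The comparison suggested above can be run in a single psql session. A minimal sketch follows; the table, index, and query are hypothetical stand-ins for your own workload, and SET only affects the current session, so production queries are untouched:

```sql
-- Sketch: compare plans for one query under different random_page_cost values.
-- "orders" and "customer_id" are placeholder names, not from the thread.
SET random_page_cost = 2.0;
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

SET random_page_cost = 1.4;
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

SET random_page_cost = 1.2;
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

RESET random_page_cost;  -- back to the server default
```

Run each EXPLAIN ANALYZE a couple of times per setting so caching effects do not dominate the comparison.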
http://www.antiterroristas.cu http://justiciaparaloscinco.wordpress.com
Re: [PERFORM] random_page_cost = 2.0 on Heroku Postgres
On 09/02/12 00:09, Greg Sabino Mullane wrote: You can execute several queries with the three different values provided by Scott and Josh:
- SET random_page_cost = 2.0, first execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost = 1.4, second execution of the queries with EXPLAIN ANALYZE
- SET random_page_cost = 1.2, third execution of the queries with EXPLAIN ANALYZE

Well, such a tool would ideally be smarter than that, such that you would run EXPLAIN, compare to the previous plan, and only run EXPLAIN ANALYZE if the plan changed. One could even decrement rpc slowly and find out at what point the plan changes, which would be more interesting than testing arbitrary numbers. Would lead to some really sweet graphs as well. :)

Well, the MyYearBook.com guys built something similar called Posuta. I don't know if the project is still alive, but we can ask them (michael.glaesem...@myyearbook.com). http://area51.myyearbook.com Posuta could be a starting point for it. It uses Ruby and Clojure for the core functionality, and jQuery/Flot for graphics.

-- Marcos Luis Ortíz Valmaseda Sr. Software Engineer (UCI) http://marcosluis2186.posterous.com http://www.linkedin.com/in/marcosluis2186 Twitter: @marcosluis2186
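Greg's "decrement rpc and watch for the plan to change" idea reduces to comparing successive EXPLAIN outputs. A minimal sketch of that comparison logic in Python; the plan texts here are invented placeholders, since in a real tool they would come from running EXPLAIN through psql or a database driver:

```python
def plan_change_points(plans):
    """Given (random_page_cost, plan_text) pairs ordered by decreasing
    cost, return the cost values at which the plan text first differs
    from the previous one."""
    changes = []
    prev = None
    for rpc, plan in plans:
        # Only a changed plan is worth a full EXPLAIN ANALYZE run.
        if prev is not None and plan != prev:
            changes.append(rpc)
        prev = plan
    return changes

# Hypothetical captured EXPLAIN outputs for one query:
plans = [
    (4.0, "Seq Scan on orders"),
    (2.0, "Seq Scan on orders"),
    (1.4, "Index Scan using orders_idx on orders"),
    (1.2, "Index Scan using orders_idx on orders"),
]
print(plan_change_points(plans))  # [1.4]
```

The interesting output is exactly the list of crossover points, which is also what you would plot for the "sweet graphs" Greg mentions.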
Re: [PERFORM] High load,
Another piece of advice is to look at the presentation by Alexander Dymo at RailsConf 2009, called "Advanced Performance Optimization of Rails Applications", available at http://en.oreilly.com/rails2009/public/schedule/detail/8615 This talk is focused on Rails and PostgreSQL, based on the development of Acunote's project management platform, http://blog.pluron.com

----- Original message ----- From: Andy Colson a...@squeakycode.net To: Michael Kohl michael.k...@tupalo.com CC: pgsql-performance@postgresql.org Sent: Thursday, 27 January 2011 12:20:18 GMT -05:00 US/Canada Eastern Subject: Re: [PERFORM] High load, On 1/27/2011 9:09 AM, Michael Kohl wrote: On Thu, Jan 27, 2011 at 4:06 PM, Andy Colson a...@squeakycode.net wrote: Have you run each of your queries through explain analyze lately? A code review including checking of queries is on our agenda. You are vacuuming/autovacuuming, correct? Sure :-) Thank you, Michael Oh, also, when the box is really busy, have you watched vmstat to see if you start swapping? -Andy

-- Ing. Marcos Luís Ortíz Valmaseda System Engineer -- Database Administrator Centro de Tecnologías de Gestión de Datos (DATEC) Universidad de las Ciencias Informáticas http://postgresql.uci.cu
Re: [PERFORM] Queries becoming slow under heavy load
When you say that with a lot of concurrent access queries get very slow: how many concurrent connections to your server have you had? More than the max_connections value? If you want to have many concurrent connections, you should consider using a connection pooling system like pgbouncer or pgpool.

What are the values for:
- work_mem
- shared_buffers
- maintenance_work_mem
- effective_cache_size
- effective_io_concurrency
- server_version

What is your platform?

Regards
-- Ing. Marcos Luís Ortíz Valmaseda System Engineer -- Database Administrator Centro de Tecnologías de Gestión de Datos (DATEC) Universidad de las Ciencias Informáticas http://postgresql.uci.cu
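The settings asked about above can all be read in one query from inside the database; a small sketch (pg_settings and pg_stat_activity are standard catalog views):

```sql
-- Sketch: report the settings mentioned above, plus the connection limit.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('work_mem', 'shared_buffers', 'maintenance_work_mem',
               'effective_cache_size', 'effective_io_concurrency',
               'max_connections', 'server_version');

-- Current connection count, to compare against max_connections:
SELECT count(*) AS current_connections FROM pg_stat_activity;
```

If current_connections is regularly close to max_connections, that is the signal to put pgbouncer or pgpool in front of the server rather than raising the limit.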
Re: [PERFORM] Possible to improve query plan?
What is the type of your application? You can see this in the Performance Whack-a-Mole presentation from Josh Berkus at PgCon 2009:
- Web application [W]
- Online Transaction Processing (OLTP) [O]
- Data Warehousing (DW) [D]

Based on the type of your application, you can configure postgresql.conf to get better performance from your PostgreSQL server. PostgreSQL postgresql.conf baseline:
shared_buffers = 25% of RAM
work_mem = 512 kB [W], 2 MB [O], 128 MB [D], but no more than RAM/max_connections
maintenance_work_mem = 1/16 of RAM
checkpoint_segments = 8 [W], 16-64 [O], [D]
wal_buffers = 1 MB [W], 8 MB [O], [D]
effective_cache_size = 2/3 of RAM

Regards
Ing. Marcos Luís Ortíz Valmaseda Linux User # 418229 PostgreSQL DBA Centro de Tecnologías Gestión de Datos (DATEC) http://postgresql.uci.cu http://www.postgresql.org http://it.toolbox.com/blogs/sql-apprentice

----- Original message ----- From: Jeremy Palmer jpal...@linz.govt.nz To: Andy Colson a...@squeakycode.net CC: pgsql-performance@postgresql.org Sent: Monday, 17 January 2011 0:13:25 GMT -05:00 US/Canada Eastern Subject: Re: [PERFORM] Possible to improve query plan? Hi Andy, Yes, important omissions: Server version: 8.4.6 OS: Windows Server 2003 Standard Ed :( The work_mem is 50 MB. I tried setting work_mem to 500 MB, but it didn't make a huge difference in query execution time. But then again the OS disk caching is probably taking over here. OK, here's the new plan with work_mem = 50 MB: http://explain.depesz.com/s/xwv And here another plan with work_mem = 500 MB: http://explain.depesz.com/s/VmO Thanks, Jeremy

-----Original Message----- From: Andy Colson [mailto:a...@squeakycode.net] Sent: Monday, 17 January 2011 5:57 p.m. To: Jeremy Palmer Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Possible to improve query plan? Hum... yeah, it looks like it takes no time at all to pull data from the individual indexes, and then bitmap them. I'm not sure what the bitmap heap scan is, or why it's slow.
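As a concrete illustration, the baseline above worked out for a hypothetical 16 GB web-type [W] server might look like this in postgresql.conf. All numbers follow the rules of thumb above, not measured values, and checkpoint_segments applies to the 8.x/9.x era discussed in this thread:

```
# Sketch: rule-of-thumb baseline for a hypothetical 16 GB web [W] server.
shared_buffers = 4GB             # 25% of RAM
work_mem = 512kB                 # [W]; 2MB for [O], 128MB for [D]
maintenance_work_mem = 1GB       # 1/16 of RAM
checkpoint_segments = 8          # [W]; 16-64 for [O]/[D]
wal_buffers = 1MB                # [W]; 8MB for [O]/[D]
effective_cache_size = 11GB      # roughly 2/3 of RAM
```

Treat these as starting points to be validated with EXPLAIN ANALYZE on your own workload, not as tuned values.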
Hopefully someone smarter will come along. Also, it's weird that explain.depesz.com didn't parse and show your entire plan. Hum... you seem to have ending quotes on some of the lines? One other thought: quicksort Memory: 23960kB. It needs 20+ MB to sort... it could be that your sort is swapping to disk. What sort of PG version is this? What are you using for work_mem? (You could try to bump it up a little; it's possible to set it for the session only, no need for a server restart, and see if that'd help.) And sorry, but it's my bedtime; good luck though. -Andy
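The session-only work_mem experiment suggested above looks like this in psql; the query itself is a placeholder for the slow query from the thread:

```sql
-- Sketch: raise work_mem for this session only and re-test the sort.
SET work_mem = '500MB';
-- EXPLAIN ANALYZE <the slow query goes here>;
-- Look for "Sort Method: quicksort" instead of "external merge Disk".
RESET work_mem;  -- back to the server default
```

Because SET is per-session, this is safe to try on a live server; only the connection running the experiment is affected.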
Re: [PERFORM] NOT IN substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2
Query is:

SELECT DISTINCT m.id, coalesce(m.givenname,''), coalesce(m.midname,''), m.surname
FROM marinerstates ms, vessels vsl, mariner m
WHERE m.id = ms.marinerid AND ms.vslid = vsl.id
  AND ms.state = 'Active'
  AND coalesce(ms.endtime, now())::date >= '2006-07-15'
  AND ms.starttime::date <= '2007-01-11'
  AND m.marinertype = 'Mariner'
  AND m.id NOT IN
      (SELECT DISTINCT mold.id
       FROM marinerstates msold, vessels vslold, mariner mold
       WHERE mold.id = msold.marinerid AND msold.vslid = vslold.id
         AND msold.state = 'Active'
         AND coalesce(msold.endtime, now())::date >= '2006-07-15'
         AND msold.starttime::date <= '2007-01-11'
         AND EXISTS
             (SELECT 1 FROM marinerstates msold2
              WHERE msold2.marinerid = msold.marinerid
                AND msold2.state = 'Active'
                AND msold2.id <> msold.id
                AND msold2.starttime < msold.starttime
                AND (msold.starttime - msold2.endtime) <= '18 months')
         AND mold.marinertype = 'Mariner')
ORDER BY m.surname, coalesce(m.givenname,''), coalesce(m.midname,'');

I get the following execution times (with \timing):
FBSD_DEV : query : 240.419 ms
LINUX_PROD : query : 219.568 ms
FBSD_TEST : query : 2285.509 ms
LINUX_TEST : query : 5788.988 ms

Rewriting the query in the NOT EXISTS variation like:

SELECT DISTINCT m.id, coalesce(m.givenname,''), coalesce(m.midname,''), m.surname
FROM marinerstates ms, vessels vsl, mariner m
WHERE m.id = ms.marinerid AND ms.vslid = vsl.id
  AND ms.state = 'Active'
  AND coalesce(ms.endtime, now())::date >= '2006-07-15'
  AND ms.starttime::date <= '2007-01-11'
  AND m.marinertype = 'Mariner'
  AND NOT EXISTS
      (SELECT DISTINCT mold.id
       FROM marinerstates msold, vessels vslold, mariner mold
       WHERE mold.id = msold.marinerid AND msold.vslid = vslold.id
         AND msold.state = 'Active'
         AND coalesce(msold.endtime, now())::date >= '2006-07-15'
         AND msold.starttime::date <= '2007-01-11'
         AND EXISTS
             (SELECT 1 FROM marinerstates msold2
              WHERE msold2.marinerid = msold.marinerid
                AND msold2.state = 'Active'
                AND msold2.id <> msold.id
                AND msold2.starttime < msold.starttime
                AND (msold.starttime - msold2.endtime) <= '18 months')
         AND mold.marinertype = 'Mariner'
         AND mold.id = m.id)
ORDER BY m.surname, coalesce(m.givenname,''), coalesce(m.midname,'');

gives:
FBSD_DEV : query : 154.000 ms
LINUX_PROD : query : 153.408 ms
FBSD_TEST : query : 137.000 ms
LINUX_TEST : query : 404.000 ms

Well, in the release notes of the PostgreSQL 8.4 documentation, the developers recommend using NOT EXISTS instead of NOT IN, because the former performs better. So you can use it that way. Some other questions:
- Do you have a partial index on marinerstates.marinerid where this condition is satisfied?
- Do you have an index on mariner.id?
- Can you provide an EXPLAIN of these queries on the PostgreSQL 9.0 machines?

Regards
Ing. Marcos Luís Ortíz Valmaseda Linux User # 418229 PostgreSQL DBA Centro de Tecnologías Gestión de Datos (DATEC) http://postgresql.uci.cu http://www.postgresql.org http://it.toolbox.com/blogs/sql-apprentice
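The partial-index question above could be answered with something like the following sketch; the index names are hypothetical, and whether the planner actually uses the partial index depends on the query's WHERE clause matching its predicate:

```sql
-- Partial index covering only the rows the subqueries touch:
CREATE INDEX marinerstates_active_marinerid_idx
    ON marinerstates (marinerid)
    WHERE state = 'Active';

-- Plain index on mariner.id (often already present as the primary key):
CREATE INDEX mariner_id_idx ON mariner (id);
```

A partial index like this is smaller than a full one, so the repeated probes from the EXISTS subquery stay in cache more easily.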
Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex
Meena_Ramkumar wrote: How to run vacuumdb and reindex for a Postgres DB on a non-stop server? Can it be done without shutting down the server? If so, what will be the performance degradation percentage?

To execute VACUUM you don't need to stop the server; it runs as just another process on it. If you are using a recent version of PostgreSQL, you can enable autovacuum on the server, which takes care of this, or use VACUUM on a suitable schedule. You should avoid VACUUM FULL, because it is very slow and requires exclusive locks on the tables it processes; it reduces the table size on disk, but it doesn't reduce the index size, and it can actually make indexes larger. With autovacuum = on, you can avoid running VACUUM frequently by hand.

The performance degradation depends on the number of tables and databases that you have on your server.

REINDEX is another task that you can execute periodically on your server, but if you don't want to affect the production workload, the best thing to do is to drop the index and reissue the CREATE INDEX CONCURRENTLY command.

Regards
-- Ing. Marcos Luís Ortíz Valmaseda -- Twitter: http://twitter.com/@marcosluis2186 -- FreeBSD Fan/User -- http://www.freebsd.org/es -- Linux User # 418229 -- Database Architect/Administrator -- PostgreSQL RDBMS -- http://www.postgresql.org -- http://planetpostgresql.org -- http://www.postgresql-es.org -- Data WareHouse -- Business Intelligence Apprentice -- http://www.tdwi.org -- Ruby on Rails Fan/Developer -- http://rubyonrails.org -- Comunidad Técnica Cubana de PostgreSQL http://postgresql.uci.cu http://personas.grm.uci.cu/+marcos Centro de Tecnologías de Gestión de Datos (DATEC) Contacto: Correo: centa...@uci.cu Telf: +53 07-837-3737 +53 07-837-3714 Universidad de las Ciencias Informáticas http://www.uci.cu
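The drop-and-rebuild-concurrently advice above looks like this in practice; the table and index names are hypothetical placeholders:

```sql
-- Sketch: rebuild an index without blocking writes.
-- Build the replacement first; CONCURRENTLY avoids the exclusive lock
-- a plain CREATE INDEX or REINDEX would take:
CREATE INDEX CONCURRENTLY orders_customer_idx_new ON orders (customer_id);

-- Then swap it in:
DROP INDEX orders_customer_idx;
ALTER INDEX orders_customer_idx_new RENAME TO orders_customer_idx;
```

Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and if it fails it leaves an invalid index behind that must be dropped manually.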
Re: [PERFORM] 10K vs 15k rpm for analytics
Pierre C wrote: On Tue, 09 Mar 2010 08:00:50 +0100, Greg Smith g...@2ndquadrant.com wrote: Scott Carey wrote: For high sequential throughput, nothing is as optimized as XFS on Linux yet. It has weaknesses elsewhere, however. When files are extended one page at a time (as postgres does), fragmentation can be pretty high on some filesystems (ext3, but NTFS is the absolute worst) if several files (indexes + table) grow simultaneously. XFS has delayed allocation, which really helps. I'm curious what you feel those weaknesses are. Handling lots of small files, especially deleting them, is really slow on XFS. Databases don't care about that. There is also the dark side of delayed allocation: if your application is broken, it will manifest itself very painfully. Since XFS keeps a lot of unwritten stuff in the buffers, an app that doesn't fsync correctly can lose lots of data if you don't have a UPS. Fortunately, postgres handles fsync like it should. A word of advice though: a few years ago, we lost a few terabytes on XFS (after that, restoring from backup was quite slow!) because a faulty SCSI cable crashed the server, then crashed it again during xfs_repair. So if you do xfs_repair on a suspicious system, please image the disks first.

And then: which file system do you recommend for the PostgreSQL data directory? I was seeing that ZFS brings very cool features for that. The problem with ZFS is that this file system is only available on Solaris, OpenSolaris, FreeBSD and Mac OS X Server, and not on Linux systems. What do you think about that?

Regards
-- Ing. Marcos Luís Ortíz Valmaseda Comunidad Técnica Cubana de PostgreSQL http://postgresql.uci.cu Universidad de las Ciencias Informáticas http://www.uci.cu
Re: [PERFORM] Multiple data base on same server
elias ghanem wrote: Hi, I'm using PostgreSQL 8.4. I need to install multiple PostgreSQL DBs on one server, but I have some questions: Are there any problems (performance-wise or otherwise) if I have 10 to 15 DBs on the same server? Each DB needs 10 tablespaces, so if I create 10 different tablespaces for each DB I will have 100 to 150 tablespaces on the same server. Can this also cause any problems? Thanks

It depends on the specs of the server. If it is a good server, for example with 16 to 32 GB of RAM and 8 to 16 processors, with a good SAN using RAID-1 for the pg_xlog directory and RAID-10 for $PGDATA (ZFS if you are using Solaris or FreeBSD, xfs or ext3 if using Linux), on a 64-bit operating system, I think that this load can be supported. There are installations of PostgreSQL with more than 400 databases, but that environment is heavily distributed across several servers.

About the tablespaces: is it really necessary to have 10 tablespaces for each database? Normally, you can move the table or tables with the most activity to a fast disk array (I'm thinking of an SSD array), put the indexes in another tablespace if you have many, and with pl/proxy, for example, you could handle the partitioning of your data. There is no need to have 100 or 150 tablespaces on the same server. You can spread this over a SAN; you can have two or more main PostgreSQL servers and several slaves with the data replicated in case of data corruption on the main servers.

Remember to review the performance configuration of the PostgreSQL servers: work_mem, shared_buffers, etc.

Regards, and I hope these comments help you.
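Moving just the hot objects, as suggested above, takes only a couple of statements; the paths and object names here are hypothetical:

```sql
-- Sketch: one fast (e.g. SSD-backed) tablespace instead of ten per database.
CREATE TABLESPACE fastdisk LOCATION '/mnt/ssd/pgdata';

-- Move only the busiest table and its index there:
ALTER TABLE busy_table SET TABLESPACE fastdisk;
ALTER INDEX busy_table_pkey SET TABLESPACE fastdisk;
```

Note that ALTER TABLE ... SET TABLESPACE rewrites the table and locks it while it copies, so schedule the move for a quiet period.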
Re: [PERFORM] [GENERAL] Strange performance degradation
Lorenzo Allegrucci wrote: Matthew Wakeling wrote: On Mon, 23 Nov 2009, Lorenzo Allegrucci wrote: Anyway, how can I get rid of those idle in transaction processes? Can I just kill -15 them, or is there a less drastic way to do it? Are you crazy? Sure, if you want to destroy all of the changes made to the database in that transaction and thoroughly confuse the client application, you can send a TERM signal to a backend, but the consequences to your data are on your own head. I'm not crazy, it was just a question... Anyway, problem solved in the Django application.

Matthew replied to you that way because that is not a good way to do it, not because you are crazy. You can find better ways to do this.

Regards
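One of those better ways is to inspect and, if truly necessary, terminate the backends from inside the database rather than with kill. A sketch using the 8.4-era catalog column names (procpid and current_query were later renamed pid and query); the pid in the comment is a placeholder:

```sql
-- Sketch: find "idle in transaction" backends and how long they have been open.
SELECT procpid, usename, xact_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY xact_start;

-- If one really must go, pg_terminate_backend() (available from 8.4)
-- is the supported route; it still rolls the transaction back:
-- SELECT pg_terminate_backend(12345);
```

The real fix, as the thread concludes, is in the application: make sure it commits or rolls back promptly instead of holding transactions open.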
Re: [PERFORM] RAID card recommendation
Gurgel, Flavio wrote: - Richard Neill rn...@cam.ac.uk wrote: Matthew Wakeling wrote: We're about to purchase a new server to store some of our old databases, and I was wondering if someone could advise me on a RAID card. We want to make a 6-drive SATA RAID array out of 2TB drives, and it will be RAID 5 or 6 because there will be zero write traffic. The priority is stuffing as much storage into a small 2U rack as possible, with performance less important. We will be running Debian Linux. People have mentioned Areca as making good RAID controllers. We're looking at the Areca ARC-1220 PCI-Express x8 SATA II as a possibility. Does anyone have an opinion on whether it is a turkey or a star? Another possibility is a 3ware card of some description.

Do you actually need a RAID card at all? It's just another point of failure: the Linux software RAID (mdadm) is pretty good. Also, be very wary of RAID 5 for an array that size. It is highly probable that, if one disk has failed, then during the recovery process you may lose a second disk. The unrecoverable error rate on standard disks is about 1 in 10^14 bits; your disk array is 10^11 bits in size... We got bitten by this. Richard

Linux kernel software RAID is fully supported in Debian Lenny, is quite cheap to implement, and is powerful. I would avoid SATA disks, but that's just me. SAS controllers and disks are expensive but worth every penny spent on them. Prefer RAID 1+0 over RAID 5, not only because of the risk of losing a second disk, but because I have seen 3 cases of performance issues caused by RAID 5. It's said here that performance is not the problem, but think twice, because a good application tends to scale fast to several users. Of course, keep a good continuous backup strategy for your databases and don't trust just the mirroring of disks in a RAID fashion. Flavio Henrique A. Gurgel Consultor -- 4Linux tel. 55-11-2125.4765 fax. 55-11-2125.4777 www.4linux.com.br

Can you elaborate on those performance issues caused by RAID 5? This is one of the solutions we use here in my country to store the data of our PostgreSQL databases. Which model do you recommend: RAID 0, RAID 1, RAID 5, or RAID 10?
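Richard's mdadm suggestion for this 6-drive box might look like the following sketch on Debian. The device names are hypothetical, the commands are destructive to whatever is on those disks, and RAID 10 is used here per the thread's preference over RAID 5:

```shell
# Sketch: 6-drive Linux software RAID 10 with mdadm (Debian).
# /dev/sdb..../dev/sdg are placeholder device names; this wipes them.
mdadm --create /dev/md0 --level=10 --raid-devices=6 \
      /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg

# Filesystem on top of the array:
mkfs.ext3 /dev/md0

# Persist the array definition across reboots:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
```

For the archival, read-mostly workload described, RAID 6 on the same drives trades some capacity for tolerance of a second disk failure during a rebuild, which is exactly the risk Richard warns about.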
Re: [PERFORM] SSD + RAID
This is very fast. On IT Toolbox there are many whitepapers about it, in the ERP and DataCenter sections specifically. All the tests that we do should be shared on the project wiki.

Regards

On Nov 13, 2009, at 7:02 AM, Karl Denninger wrote: Laszlo Nagy wrote: Hello, I'm about to buy SSD drive(s) for a database. For decision making, I used this tech report: http://techreport.com/articles.x/16255/9 http://techreport.com/articles.x/16255/10 Here are my concerns:
* I need at least 32GB disk space. So a DRAM-based SSD is not a real option. I would have to buy 8x4GB memory, which costs a fortune. And then it would still not have redundancy.
* I could buy two X25-E drives and have 32GB disk space, and some redundancy. This would cost about $1600, not counting the RAID controller. It is on the edge.
* I could also buy many cheaper MLC SSD drives. They cost about $140. So even with 10 drives, I'm at $1400. I could put them in RAID 6, have much more disk space (256GB), high redundancy and possibly good read/write speed. Of course then I need to buy a good RAID controller.

My question is about the last option. Are there any good RAID cards that are optimized (or can be optimized) for SSD drives? Do any of you have experience in using many cheaper SSD drives? Is it a bad idea? Thank you, Laszlo

Note that some RAID controllers (3ware in particular) refuse to recognize the MLC drives; in particular, they act as if the OCZ Vertex series do not exist when connected. I don't know what they're looking for (perhaps some indication that actual rotation is happening?), but this is a potential problem; make sure your adapter can talk to these things! BTW, I have done some benchmarking with PostgreSQL against these drives and they are SMOKING fast.
-- Karl