Re: [PERFORM] Improving PostgreSQL insert performance
On 06/10/2017 07:32 PM, Alvaro Herrera wrote:
> Frits Jalvingh wrote:
>> So, I am still very interested in getting normal inserts faster,
>> because that will gain speed for all work. If Oracle can do it, and
>> Postgres is able to insert fast with copy, where lies the bottleneck
>> with the insert command? There seems to be quite a performance hit
>> with the JDBC driver itself (as the stored procedure is a lot
>> faster), so I can look into that. But even after that there is quite
>> a gap.
>
> Did you try inserting multiple tuples in one command? Something like
> INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')
> It's supposed to be faster than single-row inserts, though I don't
> know by how much.

When I did the testing of the patch originally I saw significant
improvements, e.g. 8x in early versions. The thread is here:

https://www.postgresql.org/message-id/flat/44C4451A.4010906%40joeconway.com#44c4451a.4010...@joeconway.com

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
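[For readers following along: Alvaro's multi-row VALUES suggestion looks
like this in practice. The table and column names below are made up for
illustration only.]

-- hypothetical table, for illustration
create table abc (col1 text, col2 text);

-- one round trip, one parse/plan, three rows:
insert into abc (col1, col2) values
  ('a1', 'a2'),
  ('b1', 'b2'),
  ('c1', 'c2');

Each multi-row statement amortizes the per-statement and per-round-trip
overhead across all of its rows, which is where the speedup comes from.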
Re: [PERFORM] Do work_mem and shared buffers have 1g or 2g limit on 64 bit linux?
On 06/13/2015 10:43 AM, Joshua D. Drake wrote:
> On 06/13/2015 10:27 AM, Kaijiang Chen wrote:
>> Hi, I am using postgresql 9.2.10 on centos 6.2, 64 bit version. The
>> server has 512 GB mem. The jobs are mainly OLAP like, so I need
>> larger work_mem and shared buffers. From the source code, there is a
>> constant MaxAllocSize == 1GB. So, I wonder whether work_mem and
>> shared buffers can exceed 2GB on a 64 bit Linux server?
>
> work_mem IIRC can go past 2GB but has never been proven to be
> effective after that. It does depend on the version you are running.

Starting with 9.4, work_mem and maintenance_work_mem can be usefully set
to > 2 GB. I've done testing with index creation, for example, and you
can set maintenance_work_mem high enough (obviously depending on how
much RAM you have and how big the sort memory footprint is) to get the
entire sort to happen in memory without spilling to disk. In some of
those cases I saw the time required to create indexes drop by a factor
of 3 or more...YMMV. I have not tested with large work_mem to encourage
hash aggregate plans, but I suspect there is a lot to be gained there as
well.
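[A sketch of the index-creation test described above; the table, index,
and memory size are illustrative assumptions, and values above 2GB
require 9.4 or later:]

-- allow the entire index-build sort to happen in memory
set maintenance_work_mem = '8GB';  -- illustrative; size to your RAM
create index foo_idx on foo (bar);
reset maintenance_work_mem;

Setting this per-session just for the build avoids reserving that much
memory for every maintenance operation on the server.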
HTH,

Joe

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hints (was Poor performance using CTE)
On 11/21/2012 08:05 AM, Heikki Linnakangas wrote:
> Rather than telling the planner what to do or not to do, I'd much
> rather have hints that give the planner more information about the
> tables and quals involved in the query. A typical source of bad plans
> is when the planner gets its cost estimates wrong. So rather than
> telling the planner to use a nested loop join for "a INNER JOIN b ON
> a.id = b.id", the user could tell the planner that there are only 10
> rows that match the "a.id = b.id" qual. That gives the planner the
> information it needs to choose the right plan on its own. That kind of
> hints would be much less implementation specific and much more likely
> to still be useful, or at least not outright counter-productive, in a
> future version with a smarter planner. You could also attach that kind
> of hints to tables and columns, which would be more portable and nicer
> than decorating all queries.

I like this idea, but also think that if we have a syntax to allow
hints, it would be nice to have a simple way to ignore all hints (yes, I
suppose I'm suggesting yet another GUC). That way, after sprinkling your
SQL with hints, you could easily and periodically (e.g. after a Postgres
upgrade) test what would happen if the hints were removed.

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
Re: [PERFORM] Hints (was Poor performance using CTE)
On 11/21/2012 09:28 AM, Craig James wrote:
> On Wed, Nov 21, 2012 at 9:25 AM, Joe Conway <m...@joeconway.com> wrote:
>> I like this idea, but also think that if we have a syntax to allow
>> hints, it would be nice to have a simple way to ignore all hints
>> (yes, I suppose I'm suggesting yet another GUC). That way after
>> sprinkling your SQL with hints, you could easily periodically (e.g.
>> after a Postgres upgrade) test what would happen if the hints were
>> removed.
>
> Or a three-way choice: allow, ignore, or generate an error. That would
> allow developers to identify where hints are being used.

+1

Joe
Re: [PERFORM] Queries with conditions using bitand operator
On 07/13/2010 04:48 AM, Elias Ghanem wrote:
> Hi, I have a table ARTICLE containing a field STATUS that represents a
> number in binary format (for ex: 10011101). My application issues
> queries with where conditions that use the BITAND operator on this
> field (for ex: select * from article where status & 4 = 4). Thus I'm
> facing performance problems with these select queries: the queries are
> too slow. Since I'm using the BITAND operator in my conditions,
> creating an index on the status field is useless, and since the second
> operand varies (status & 4 = 4; status & 8 = 8; status & 16 = 16...) a
> functional index is also useless (because a functional index requires
> the use of a function that accepts only table columns as input
> parameters: constants are not accepted). So is there a way to enhance
> the performance of these queries?

You haven't given a lot of info to help us help you, but would something
along these lines be useful to you?

drop table if exists testbit;
create table testbit(
  id serial primary key,
  article text,
  status int
);

insert into testbit (article, status)
select 'article ' || generate_series::text,
       generate_series % 256
from generate_series(1,1000000);

create index idx1  on testbit(article) where status & 1 = 1;
create index idx2  on testbit(article) where status & 2 = 2;
create index idx4  on testbit(article) where status & 4 = 4;
create index idx8  on testbit(article) where status & 8 = 8;
create index idx16 on testbit(article) where status & 16 = 16;
create index idx32 on testbit(article) where status & 512 = 512;

update testbit set status = status + 512 where id in (42, 4242, 424242);

explain analyze select * from testbit where status & 512 = 512;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using idx32 on testbit (cost=0.00..4712.62 rows=5000
   width=22) (actual time=0.080..0.085 rows=3 loops=1)
 Total runtime: 0.170 ms

HTH,

Joe
Re: [PERFORM] Highly Efficient Custom Sorting
On 07/06/2010 12:42 PM, Eliot Gable wrote:
> Thanks for suggesting array_unnest(). I think that will actually prove
> more useful to me than the other example I'm using for extracting my
> data from an array. I was actually planning on computing the order on
> the first call and storing it in a linked list which gets returned one
> item at a time until all rows have been returned. Also, I found a code
> example using Google that showed someone storing data across function
> calls using that pointer. I used their example to produce this:
>
> <snip>
> if (SRF_IS_FIRSTCALL())
> {
>     funcctx = SRF_FIRSTCALL_INIT();
>     /* This is where we stick our sorted data for returning later */
>     funcctx->user_fctx =
>         MemoryContextAlloc(funcctx->multi_call_memory_ctx,
>                            sizeof(sort_data));
>     oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
>     data = (sort_data *) funcctx->user_fctx;
> </snip>
>
> I have a structure set up that is typedef'd to sort_data which stores
> pointers to various things that I need to survive across the calls.
> Since this seems to be what you are suggesting, I assume this is the
> correct approach.

This approach works, but you could also use the SFRM_Materialize mode
and calculate the entire result set in one go. That tends to be simpler.
See, for example, crosstab_hash() in contrib/tablefunc.

FWIW, there are also some good examples of array handling in PL/R, e.g.
pg_array_get_r() in pg_conversion.c.

HTH,

Joe
Re: [PERFORM] Need to increase performance of a query
On 06/10/2010 12:56 PM, Anne Rosset wrote:
> Craig James wrote:
>> create index item_rank_null_idx on item_rank(pf_id)
>>   where item_rank.pf_id is null;
>
> Hi Craig, I tried again after adding your suggested index but I didn't
> see any improvements (it seems that the index is not used):
>
>    Filter: ((rank IS NOT NULL) AND (pf_id IS NULL) AND
>             ((project_id)::text = 'proj2783'::text))
>  Total runtime: 11.988 ms
> (6 rows)
>
> Time: 13.654 ms

try:

create index item_rank_null_idx on item_rank(pf_id)
  where rank IS NOT NULL AND pf_id IS NULL;

Joe
Re: [PERFORM] Need to increase performance of a query
On 06/10/2010 01:10 PM, Joe Conway wrote:
> try:
>
> create index item_rank_null_idx on item_rank(pf_id)
>   where rank IS NOT NULL AND pf_id IS NULL;

oops -- that probably should be:

create index item_rank_null_idx on item_rank(project_id)
  where rank IS NOT NULL AND pf_id IS NULL;

Joe
Re: [PERFORM] Need to increase performance of a query
On 06/10/2010 01:21 PM, Anne Rosset wrote:
> I tried that and it didn't make any difference. Same query plan.

A little experimentation suggests this might work:

create index item_rank_project on item_rank(project_id, rank)
  where pf_id IS NULL;

Joe
Re: [PERFORM] crosstab speed
Jeremiah Elliott wrote:
> ok, I have an application that I am trying to speed up. It's a
> reporting application that makes heavy use of the crosstab function.
> <snip>
> here is an example query
> The crosstab function is taking between 5 and 15 seconds to return.

Please run the two embedded queries independently, i.e.

select ARRAY[site::text,
             product_line_description::text,
             report_sls::text,
             fy_period::text] as COL_HEADER,
       fy_year,
       sum(invoice_value)
from order_data_tbl
where fy_year is not null
group by site::text, product_line_description::text,
         report_sls::text, fy_period::text, fy_year
order by site::text, product_line_description::text,
         report_sls::text, fy_period::text;

-- and --

select fy_year
from order_data_tbl
where fy_year is not null
group by fy_year
order by fy_year;

How long does each take? crosstab cannot run any faster than the sum of
these two queries run on their own. If the second one doesn't change
often, can you pre-calculate it, perhaps once a day?

Joe
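[One hedged way to do the pre-calculation suggested above, reusing the
column names from the example; the table name fy_year_list and the
nightly schedule are assumptions for illustration:]

-- materialize the category list once
create table fy_year_list as
  select fy_year
  from order_data_tbl
  where fy_year is not null
  group by fy_year
  order by fy_year;

-- refresh, e.g. from a nightly job:
truncate fy_year_list;
insert into fy_year_list
  select fy_year
  from order_data_tbl
  where fy_year is not null
  group by fy_year
  order by fy_year;

The crosstab call can then read its category list from fy_year_list
instead of re-aggregating order_data_tbl on every report.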
Re: [PERFORM] Performance penalty for remote access of postgresql
Stephen Frost wrote:
> * Guoping Zhang ([EMAIL PROTECTED]) wrote:
>> Obviously, if there is no better solution, the TCP round trip penalty
>> will stop us doing so, as we do have a performance requirement.
>
> Actually, can't you stick multiple inserts into a given 'statement'?
> ie: insert into abc (123); insert into abc (234);
> I'm not 100% sure if that solves the round-trip issue, but it might.
> Also, it looks like we might have multi-value insert support in 8.2
> (I truly hope so anyway), so you could do something like this:
> insert into abc (123),(234);

Yeah, see my post from last night on PATCHES. Something like
"insert into abc (123); insert into abc (234); ..." actually seems to
work pretty well as long as you don't drive the machine into swapping.
If you're doing a very large number of INSERTs, break it up into
bite-sized chunks and you should be fine.

Joe

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
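[A sketch of the "bite-sized chunks" idea, on a hypothetical
single-column table abc. Grouping each chunk in a transaction is my
addition here, not something stated above: it lets many INSERTs share
one commit, which further reduces per-row overhead:]

begin;
insert into abc values (123);
insert into abc values (234);
-- ... a few hundred more rows per chunk ...
commit;

begin;
-- next chunk of a few hundred rows
commit;

Keeping each chunk to a few hundred or a few thousand rows bounds the
memory used per statement batch while still amortizing the round-trip
and commit costs.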
Re: [PERFORM] Big differences in plans between 8.0 and 8.1
Gabriele Turchi wrote:
> On Sat, 15/07/2006 at 13.04 -0700, Joe Conway wrote:
>> Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
>> during the day. This will only update the statistics, and should be
>> very low impact.
>
> This is my solution too... but: is it enough? Or else: is there a
> better way to do this? If performance in the better case is 50 times
> faster than the worse case, during an hour (50/100 records inserted in
> registrazioni) how far can performance fall before the new ANALYZE is
> run? Otherwise, can running ANALYZE more frequently badly affect the
> overall performance?

One thing I noticed is that in both plans there is a seq scan on
registrazioni. Given that performance degrades so quickly as records are
inserted into registrazioni, I'm wondering if you're missing an index.
What indexes do you have on registrazioni?

Joe
Re: [PERFORM] Big differences in plans between 8.0 and 8.1
Gabriele Turchi wrote:
> Running an ANALYZE really changes the plan; now it is as fast as
> before (8.0). On the production system a VACUUM FULL ANALYZE is run
> every morning after a clean-up, when the registrazioni table is empty.
> During the day this table fills up (about 500 records a day), and
> apparently performance free-falls very quickly. This behaviour has not
> changed between the old and the new installation. Can you suggest an
> easy way to collect and keep up-to-date these statistics in a very
> low-impact way?

Why not just periodically (once an hour?) run "ANALYZE registrazioni;"
during the day. This will only update the statistics, and should be very
low impact.

HTH,

Joe
Re: [PERFORM] Dynamically loaded C function performance
Jim C. Nasby wrote:
> On Fri, May 05, 2006 at 03:47:53PM -0700, Adam Palmblad wrote:
>> Hi, we've got a C function that we use here, and we find that for
>> every connection, the first run of the function is much slower than
>> any subsequent runs (50ms compared to 8ms). Besides using connection
>> pooling, are there any options to improve performance?
>
> In my experience, connection startup takes a heck of a lot longer than
> 50ms, so why are you worrying about 50ms for the first run of a
> function? BTW, sorry, but I don't know a way to speed this up, either.

I think Tom nailed the solution already in a nearby reply -- see
preload_libraries on this page:

http://www.postgresql.org/docs/8.1/interactive/runtime-config-resource.html

Joe
Re: [PERFORM] [sfpug] DATA directory on network attached storage
Aditya wrote:
> We have not, AFAICT, had any problems with the traffic over NFS as far
> as reliability -- I'm sure there is a performance penalty, but the
> reliability and scalability gains more than offset that.

My experience agrees with yours. However we did find one gotcha -- see
the thread starting here for details:

http://archives.postgresql.org/pgsql-hackers/2004-12/msg00479.php

In a nutshell, be careful when using an nfs mounted data directory
combined with an init script that creates a new data dir when it doesn't
find one.

> FWIW, if I were to do this anew, I would probably opt for iSCSI over
> GigE with a NetApp.

Any particular reason? Our NetApp technical rep advised nfs over iSCSI,
IIRC because of performance.

Joe
Re: [PERFORM] [sfpug] DATA directory on network attached storage
Aditya wrote:
> On Mon, Apr 11, 2005 at 10:59:51AM -0700, Joe Conway wrote:
>> Any particular reason? Our NetApp technical rep advised nfs over
>> iSCSI, IIRC because of performance.
>
> I would mount the Netapp volume(s) as a block level device on my
> server using iSCSI (vs. a file-based device like NFS) so that
> filesystem parameters could be more finely tuned, and one could really
> make use of jumbo frames over GigE.

Actually, we're using jumbo frames over GigE with nfs too.

> I'm not sure I understand why NFS would perform better than iSCSI --
> in any case, some large Oracle dbs at my current job are moving to
> iSCSI on Netapp, and in that environment both Oracle and Netapp advise
> iSCSI (probably because Oracle uses the block-level device directly),
> so I suspect the difference in performance is minimal.

We also have Oracle DBs via nfs mounted Netapp, again per the local
guru's advice. It might be one of those things that is still being
debated even within Netapp's ranks (or maybe our info is dated - worth a
check).

Thanks,

Joe
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
Alex Turner wrote:
> I'm not advocating that people switch to Oracle at all. It's still
> much more expensive than Postgresql, and for most small and medium
> applications Postgresql is much easier to manage and maintain. I would
> just like to make sure people get their facts straight. I worked for a
> company that selected MS SQL Server because it was 'cheaper' than
> Oracle, when in fact, with the correct Oracle pricing, Oracle was
> cheaper and had superior features. I would have preferred that they
> use Postgresql, which for the project in question would have been more
> appropriate and cost much less in hardware and software requirements,
> but they had to have an 'Industry Standard'. Oracle ended up costing
> $10k, with licenses at $149 ea for 25 users, and the support contract
> wasn't that much of a bear -- I can't remember exactly how much, I
> think it was around $1800/yr.

My facts were straight, and they come from firsthand experience. The
point is, it is easy to get trapped into thinking to yourself, "great, I
can get a dual CPU Oracle server for ~$10K, that's not too bad". But
then later you figure out you really need table partitioning or RAC, and
suddenly you have to jump directly to multiple 6 figures. The entry
level Oracle pricing is mainly a marketing gimmick -- it is intended to
get you hooked.

Also note that the per-named-user license scheme is subject to per-CPU
minimums that guarantee you'll never spend less than half the per-CPU
price. Oracle's licensing is so complex that there are businesses out
there that subsist solely on helping companies figure it out to save
money, and they take a cut of the savings. Oracle's own account reps had
a hard time answering this question: does a hyperthreaded Intel CPU
count as 1 or 2 CPUs from a licensing standpoint? We were eventually
told 1, but that the decision was subject to change in the future.

Joe
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
Alex Turner wrote:
> I appreciate your information, but it's not valid. Most people don't
> need RAC or table partitioning.

From a small company perspective, maybe, but not in the least invalid
for larger companies.

> Many of the features in Oracle EE are just not available in Postgresql
> at all, and many aren't available in any version of SQL Server (table
> partitioning, bitmap indexes and others).

I never claimed otherwise. I said the low end product gets you hooked.
Once you're hooked, you'll start to wish for all the whiz-bang features
-- after all, that's why you picked Oracle in the first place.

> Just because Oracle reps are a little clueless sometimes doesn't mean
> that the product pricing sucks. The minimum user requirement for
> Standard One is 5 users. 5 * $149 = $745, much less than half the
> price of a dual or single CPU config.

And what happens once you need a quad server?

> I'm sorry that you had a bad experience with Oracle, but Oracle is a
> fine product that is available for not a lot of $$ if you are willing
> to use a bit of elbow grease to learn how it works and don't need
> enterprise features, which many other database products simply don't
> have, or work very poorly.

I never said I had a bad experience with Oracle. I pointed out the
gotchas. We have several large Oracle boxes running, several MSSQL, and
several Postgres -- they all have their strengths and weaknesses. 'Nuff
said -- this thread is way off topic now...

Joe
Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft
Greg Sabino Mullane wrote:
> Don't forget your support contract cost, as well as licenses for each
> of your servers: development, testing, QA, etc. Is it really as cheap
> as $5K? I've heard that for any fairly modern system it's much more,
> but that may be wrong.

Sort of -- see:
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=15105

"It is available on single server systems supporting up to a maximum of
2 CPUs"

Also note that most industrial strength features (like table
partitioning, RAC, OLAP, Enterprise Manager plugins, etc, etc) are
high-priced options (mostly $10K to $20K per CPU), and they can only be
used with the Enterprise edition (which is $40K/CPU, *not* $2.5K/CPU):
http://oraclestore.oracle.com/OA_HTML/ibeCCtpSctDspRte.jsp?section=10103

And you are correct, they expect to be paid for each dev, test, and QA
machine too. The $5K edition is just there to get you hooked ;-) By the
time you add up what you really want/need, figure you'll spend a couple
of orders of magnitude higher, and then 20% per year for ongoing
maintenance/upgrades/support.

Joe
Re: [PERFORM] Trying to create multi db query in one large querie
Spiegelberg, Greg wrote:
> My experience with dblink() is that each dblink() is executed
> serially.

Correct. If you really want to do multiple queries simultaneously, you
would need to write a function very similar to dblink_record, but using
asynchronous libpq calls to both remote hosts. See:

http://www.postgresql.org/docs/current/static/libpq-async.html

HTH,

Joe
Re: [PERFORM] Alternatives to Dell?
Josh Berkus wrote:
> Thing is, some companies are required to use 1st-tier or at least
> 2nd-tier vendors for hardware; they won't home-build. For those
> people, what vendors do others on this list recommend? What have been
> your good/bad experiences?

I've had very good experiences with IBM hardware, and found their sales
and support to be responsive.

Joe
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres --
Iain wrote:
> Joe's example wasn't excluding partitions, as he didn't use a
> predicated UNION ALL view to select from. His queries use an indexed
> column that allows the various partitions to be probed at low cost,
> and he was satisfied with that.

Right.

> My point in my previous post was that you could still do all of that
> if you wanted to, by building the predicated view with UNION ALL of
> each of the child tables.

Right. It doesn't look that much different:

create or replace view foo_vw as
  select * from foo_2004_01
    where f2 >= '2004-jan-01' and f2 <= '2004-jan-31'
  union all
  select * from foo_2004_02
    where f2 >= '2004-feb-01' and f2 <= '2004-feb-29'
  union all
  select * from foo_2004_03
    where f2 >= '2004-mar-01' and f2 <= '2004-mar-31'
;

-- needed just for illustration since these are toy tables
set enable_seqscan to false;

explain analyze select * from foo_vw where f2 = '2004-feb-15';
                                QUERY PLAN
----------------------------------------------------------------------------
 Subquery Scan foo_vw (cost=0.00..14.54 rows=3 width=16)
   (actual time=0.022..0.027 rows=1 loops=1)
   ->  Append (cost=0.00..14.51 rows=3 width=16)
       (actual time=0.019..0.022 rows=1 loops=1)
     ->  Subquery Scan "*SELECT* 1" (cost=0.00..4.84 rows=1 width=16)
         (actual time=0.004..0.004 rows=0 loops=1)
       ->  Index Scan using foo_2004_01_idx2 on foo_2004_01
           (cost=0.00..4.83 rows=1 width=16)
           (actual time=0.003..0.003 rows=0 loops=1)
             Index Cond: ((f2 >= '2004-01-01'::date) AND
                          (f2 <= '2004-01-31'::date) AND
                          (f2 = '2004-02-15'::date))
     ->  Subquery Scan "*SELECT* 2" (cost=0.00..4.84 rows=1 width=16)
         (actual time=0.013..0.015 rows=1 loops=1)
       ->  Index Scan using foo_2004_02_idx2 on foo_2004_02
           (cost=0.00..4.83 rows=1 width=16)
           (actual time=0.009..0.010 rows=1 loops=1)
             Index Cond: ((f2 >= '2004-02-01'::date) AND
                          (f2 <= '2004-02-29'::date) AND
                          (f2 = '2004-02-15'::date))
     ->  Subquery Scan "*SELECT* 3" (cost=0.00..4.84 rows=1 width=16)
         (actual time=0.001..0.001 rows=0 loops=1)
       ->  Index Scan using foo_2004_03_idx2 on foo_2004_03
           (cost=0.00..4.83 rows=1 width=16)
           (actual time=0.001..0.001 rows=0 loops=1)
             Index Cond: ((f2 >= '2004-03-01'::date) AND
                          (f2 <= '2004-03-31'::date) AND
                          (f2 = '2004-02-15'::date))
 Total runtime: 0.188 ms
(12 rows)

regression=# explain analyze select * from foo where f2 = '2004-feb-15';
                                QUERY PLAN
----------------------------------------------------------------------------
 Result (cost=1.00..10073.70 rows=20 width=16)
   (actual time=0.059..0.091 rows=1 loops=1)
   ->  Append (cost=1.00..10073.70 rows=20 width=16)
       (actual time=0.055..0.086 rows=1 loops=1)
     ->  Seq Scan on foo (cost=1.00..10022.50 rows=5 width=16)
         (actual time=0.001..0.001 rows=0 loops=1)
           Filter: (f2 = '2004-02-15'::date)
     ->  Index Scan using foo_2004_01_idx2 on foo_2004_01 foo
         (cost=0.00..17.07 rows=5 width=16)
         (actual time=0.045..0.045 rows=0 loops=1)
           Index Cond: (f2 = '2004-02-15'::date)
     ->  Index Scan using foo_2004_02_idx2 on foo_2004_02 foo
         (cost=0.00..17.07 rows=5 width=16)
         (actual time=0.008..0.009 rows=1 loops=1)
           Index Cond: (f2 = '2004-02-15'::date)
     ->  Index Scan using foo_2004_03_idx2 on foo_2004_03 foo
         (cost=0.00..17.07 rows=5 width=16)
         (actual time=0.029..0.029 rows=0 loops=1)
           Index Cond: (f2 = '2004-02-15'::date)
 Total runtime: 0.191 ms
(11 rows)

The main difference being that the view needs to be recreated every time
a table is added or dropped, whereas with the inherited tables method
that isn't needed.

Joe
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres --
[EMAIL PROTECTED] wrote:
> Joe Conway <[EMAIL PROTECTED]> wrote on 15.09.2004, 06:30:24:
>> We're not completely done with our data conversion (from a commercial
>> RDBMS), but so far the results have been excellent. Similar to what
>> others have said in this thread, the conversion involved
>> restructuring the data to better suit Postgres, and the application
>> (data analysis/mining vs. the source system, which is operational).
>> As a result we've compressed a 1TB database down to ~0.4TB, and seen
>> at least one typical query reduced from ~9 minutes down to ~40
>> seconds.
>
> Sounds interesting. Does the performance gain come from partition
> elimination of the inherited tables under the root? I take it the
> compression comes from use of arrays, avoiding the need for additional
> rows and key overhead?

Sorry, in trying to be concise I was not very clear. I'm using the term
"compression" very generally here. I'll try to give a bit more
background.

The original data source is a database schema designed for use by an
operational application that my company sells, to provide enhanced
management of equipment that we also sell. The application needs to be
very flexible in exactly what data it stores in order to be useful
across a wide variety of equipment models and versions. In order to do
that there is a very large central "transaction" table that stores
name/value pairs in varchar columns. The name/value pairs come from
parsed output of the equipment, and as such there is a fair amount of
redundancy and unneeded data that ends up getting stored. At each
installation in the field this table can get very large (> 1 billion
rows). Additionally, the application prematerializes a variety of
summaries for use by the operators using the GUI.

We collect the data exported from each of the systems in the field and
accumulate it in a single central database for data mining and analysis.
This is the database that is actually being converted.

By "compression" I really mean that unneeded and redundant data is being
stripped out, and data known to be of a certain datatype is stored in
that type instead of varchar (e.g. values known to be int are stored as
int). Also, the summaries are not being converted (although we do some
post processing to create new materialized summaries).

My points in telling this were:
- the use of inherited tables to partition this huge number of rows and
  yet allow simple query access to it seems to work well, at least in
  early validation tests
- had we simply taken the original database and slammed it into
  Postgres with no further thought, we would not have seen the big
  improvements, and thus the project might have been seen as a failure
  (even though it saves substantial $)

Hope that's a bit more clear. I'm hoping to write up a more detailed
case study once we've cut the Postgres system into production and the
dust settles a bit.

Joe
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres --
Josh Berkus wrote:
>> - the use of inherited tables to partition this huge number of rows
>>   and yet allow simple query access to it seems to work well, at
>>   least in early validation tests
>> - had we simply taken the original database and slammed it into
>>   Postgres with no further thought, we would not have seen the big
>>   improvements, and thus the project might have been seen as a
>>   failure (even though it saves substantial $)
>
> Any further thoughts on developing this into true table partitioning?

Just that I'd love to see it happen ;-) Maybe someday I'll be able to
find the time to work on it myself, but for the moment I'm satisfied
with the workarounds we've made.

Joe
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres
Simon Riggs wrote:

> Joe, your application is very interesting. I've just read your OSCON paper. I'd like to talk more about that. Very similar to Kalido. ...but back to partitioning momentarily: does the performance gain come from partition elimination of the inherited tables under the root?

I think the major part of the performance gain comes from the fact that the source database has different needs in terms of partitioning criteria because of its different purpose. The data is basically partitioned by customer installation instead of by date. Our converted scheme partitions by date, which is in line with the analytical queries run at the corporate office. Again, this is an argument in favor of not simply porting what you're handed.

We might get similar query performance with a single large table and multiple partial indexes (e.g. one per month), but there would be one tradeoff and one disadvantage to that:

1) The indexes would need to be generated periodically -- this is a tradeoff, since we currently need to create inherited tables at the same periodicity

2) It would be much more difficult to roll off a month's worth of data when needed. The general idea is that each month we create a new monthly table, then archive and drop the oldest monthly table. If all the data were in one big table we would have to delete many millions of rows from a (possibly) multibillion-row table, and then vacuum that table -- no thanks ;-)

Joe

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres
Iain wrote:

>> That's exactly what we're doing, but using inherited tables instead of a union view. With inheritance, there is no need to rebuild the view each time a table is added or removed. Basically, in our application, tables are partitioned by either month or week, depending on the type of data involved, and queries are normally date qualified.
>
> That sounds interesting. I have to admit that I haven't touched inheritance in pg at all yet, so I find it hard to imagine how this would work. If you have a chance, would you mind elaborating on it just a little?

OK, see below:

=====================================================================
create table foo(f1 int, f2 date, f3 float8);

create table foo_2004_01() inherits (foo);
create table foo_2004_02() inherits (foo);
create table foo_2004_03() inherits (foo);

create index foo_2004_01_idx1 on foo_2004_01(f2);
create index foo_2004_02_idx1 on foo_2004_02(f2);
create index foo_2004_03_idx1 on foo_2004_03(f2);

insert into foo_2004_02 values(1,'2004-feb-15',3.14);

-- needed just for illustration since these are toy tables
set enable_seqscan to false;

explain analyze select * from foo where f2 = '2004-feb-15';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=1.00..10061.32 rows=16 width=16) (actual time=0.224..0.310 rows=1 loops=1)
   ->  Append  (cost=1.00..10061.32 rows=16 width=16) (actual time=0.214..0.294 rows=1 loops=1)
         ->  Seq Scan on foo  (cost=1.00..10022.50 rows=5 width=16) (actual time=0.004..0.004 rows=0 loops=1)
               Filter: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_01_idx1 on foo_2004_01 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.101..0.101 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_02_idx1 on foo_2004_02 foo  (cost=0.00..4.68 rows=1 width=16) (actual time=0.095..0.101 rows=1 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_03_idx1 on foo_2004_03 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.066..0.066 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
 Total runtime: 0.582 ms
(11 rows)

create table foo_2004_04() inherits (foo);
create index foo_2004_04_idx1 on foo_2004_04(f2);

explain analyze select * from foo where f2 = '2004-feb-15';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=1.00..10078.38 rows=21 width=16) (actual time=0.052..0.176 rows=1 loops=1)
   ->  Append  (cost=1.00..10078.38 rows=21 width=16) (actual time=0.041..0.159 rows=1 loops=1)
         ->  Seq Scan on foo  (cost=1.00..10022.50 rows=5 width=16) (actual time=0.004..0.004 rows=0 loops=1)
               Filter: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_01_idx1 on foo_2004_01 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.012..0.012 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_02_idx1 on foo_2004_02 foo  (cost=0.00..4.68 rows=1 width=16) (actual time=0.016..0.022 rows=1 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_03_idx1 on foo_2004_03 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.008..0.008 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
         ->  Index Scan using foo_2004_04_idx1 on foo_2004_04 foo  (cost=0.00..17.07 rows=5 width=16) (actual time=0.095..0.095 rows=0 loops=1)
               Index Cond: (f2 = '2004-02-15'::date)
 Total runtime: 0.443 ms
(13 rows)
=====================================================================

For loading data, we COPY into foo, and have a trigger that redirects the rows to the appropriate partition.

Notice that the partitions which do not contain any data of interest are still probed for data, but since they have none it is very quick. In a real life example I got the following results just this afternoon:

- aggregate row count = 471,849,665
- total number of inherited tables = 216 (many are future dated and therefore contain no data)
- select one month's worth of data for one piece of equipment by serial number (49,257 rows) = 526.015 ms

Not too bad -- quick enough for my needs. BTW, this is using NFS mounted storage (NetApp NAS).

Joe

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
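The COPY-redirect trigger itself isn't shown in the post; a minimal sketch of the idea follows. The function and trigger names are hypothetical, and it uses EXECUTE ... USING and format(), which require a much newer PostgreSQL than the 7.4-era server discussed in this thread.

```sql
-- Sketch only: route each row inserted into foo (including rows
-- arriving via COPY, which fires row triggers) to the child table
-- for its month. Child tables are named foo_YYYY_MM, as above.
create or replace function foo_redirect() returns trigger as $$
begin
    execute format('insert into %I select ($1).*',
                   'foo_' || to_char(new.f2, 'YYYY_MM'))
        using new;
    return null;  -- suppress the insert into the parent table
end;
$$ language plpgsql;

create trigger foo_redirect_trg
    before insert on foo
    for each row execute procedure foo_redirect();
```

On modern releases the same effect is normally achieved with declarative partitioning (PARTITION BY RANGE) rather than a hand-written trigger.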
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres
Christopher Browne wrote:

> In the last exciting episode, [EMAIL PROTECTED] (Joe Conway) wrote:
>> That's exactly what we're doing, but using inherited tables instead of a union view. With inheritance, there is no need to rebuild the view each time a table is added or removed. Basically, in our application, tables are partitioned by either month or week, depending on the type of data involved, and queries are normally date qualified.
>
> Where does the constraint come in that'll allow most of the data to be excluded?

Not sure I follow this.

> Or is this just that the entries are all part of bigtable so that the self join is only 2-way?

We don't have a need for self-joins in our application. We do use a crosstab function to materialize some transposed views of the data, however. That allows us to avoid self-joins in the cases where we might otherwise need them.

Joe

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres
Chris Browne wrote:

> Might we set up the view as:
>
>   create view combination_of_logs as
>     select * from table_1 where txn_date between 'this' and 'that'
>     union all
>     select * from table_2 where txn_date between 'this2' and 'that2'
>     union all
>     select * from table_3 where txn_date between 'this3' and 'that3'
>     union all
>     select * from table_4 where txn_date between 'this4' and 'that4'
>     union all
>     ... ad infinitum
>     union all
>     select * from table_n where txn_date > 'start_of_partition_n';
>
> and expect that to help, as long as the query that hooks up to this has date constraints? We'd have to regenerate the view with new fixed constants each time we set up the tables, but that sounds like it could work...

That's exactly what we're doing, but using inherited tables instead of a union view. With inheritance, there is no need to rebuild the view each time a table is added or removed. Basically, in our application, tables are partitioned by either month or week, depending on the type of data involved, and queries are normally date qualified.

We're not completely done with our data conversion (from a commercial RDBMSi), but so far the results have been excellent. Similar to what others have said in this thread, the conversion involved restructuring the data to better suit Postgres, and the application (data analysis/mining vs. the source system which is operational). As a result we've compressed a 1TB database down to ~0.4TB, and seen at least one typical query reduced from ~9 minutes down to ~40 seconds.

Joe

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] NAS, SAN or any alternate solution ?
[EMAIL PROTECTED] wrote: Would NAS or SAN be good solutions ? (I've read that NAS uses NFS which could slow down the transfer rate ??) Has anyone ever tried one of these with postgresql ? Not (yet) with Postgres, but my company has run ~100GB Oracle database on NAS (NetApp) for the past couple of years. We've found it to outperform local attached storage, and it has been extremely reliable and flexible. Our DBAs wouldn't give it up without a fight. Joe ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] postgresql and openmosix migration
Bill wrote:

> Ok, so maybe someone on this group will have a better idea. We have a database of financial information, and this has literally millions of entries. I have installed indices, but for the rather computationally demanding processes we like to use, like a select query to find the commodity with the highest monthly or annual returns, the computer generally runs unacceptably slowly. So, other than clustering, how could I achieve a speed increase in these complex queries? Is this better in mysql or postgresql?

If the bottleneck is really computational, not I/O, you might try PL/R in conjunction with the rpvm R package. rpvm allows R to make use of PVM to split its load among a cluster. See:

R: http://www.r-project.org/
PL/R: http://www.joeconway.com/plr/
rpvm: http://cran.r-project.org/src/contrib/Descriptions/rpvm.html
      http://cran.r-project.org/doc/packages/rpvm.pdf

I haven't had a chance to play with this myself yet, but I hope to relatively soon.

HTH,

Joe

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
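As a rough illustration of the PL/R approach, a sketch follows. The table and column names are hypothetical, PL/R must already be installed, and the R body here is trivial; the point is only where the computation would live. PL/R exposes unnamed arguments to the R code as arg1, arg2, and so on.

```sql
-- Sketch: push a numeric computation into R via PL/R.
create or replace function r_max_return(float8[]) returns float8 as '
    max(arg1)
' language 'plr';

-- hypothetical table storing an array of monthly returns per commodity
select commodity, r_max_return(monthly_returns) as best_month
from commodity_returns;
```

With rpvm loaded inside the R function body, the heavy computation could in turn be farmed out to PVM nodes, which is the clustering angle suggested above.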
Re: [PERFORM] Looking for ideas on how to speed up warehouse loading
Sean Shanny wrote:

> explain analyze SELECT t1.id, t2.url FROM referral_temp t2 LEFT OUTER JOIN d_referral t1 ON t2.url = t1.referral_raw_url ORDER BY t1.id;
>
> What I would like to know is if there are better ways to do the join? I need to get all the rows back from the referral_temp table as they are used for assigning FKs for the fact table later in processing. When I iterate over the values that I get back, those with t1.id = null I assign a new FK and push both into the d_referral table as new entries, as well as into a text file for later use. The matching records are written to a text file for later use.

Would something like this work any better (without disabling index scans)?

SELECT t1.id, t2.url
FROM referral_temp t2, d_referral t1
WHERE t1.referral_raw_url = t2.url;

-- process rows with a match

SELECT t2.url
FROM referral_temp t2
WHERE NOT EXISTS
  (SELECT 1 FROM d_referral t1
   WHERE t1.referral_raw_url = t2.url);

-- process rows without a match

Joe

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Wierd context-switching issue on Xeon
Joe Conway wrote:

>> In isolation, test_run.sql should do essentially no syscalls at all once it's past the initial ramp-up. On a machine that's functioning per expectations, multiple copies of test_run show a relatively low rate of semop() calls --- a few per second, at most --- and maybe a delaying select() here and there.

Here's results for 7.4 on a dual Athlon server running Fedora Core:

CPU states:          user    nice  system     irq  softirq  iowait    idle
          total     86.0%    0.0%   52.4%    0.0%     0.0%    0.0%   61.2%
          cpu00     37.6%    0.0%   29.7%    0.0%     0.0%    0.0%   32.6%
          cpu01     48.5%    0.0%   22.7%    0.0%     0.0%    0.0%   28.7%

   procs                      memory      swap          io     system    cpu
 r  b  swpd   free   buff      cache  si  so  bi   bo     in      cs
 1  0 120448 25764  48300  109457600   0 124 170  187
 1  0 120448 25780  48300  109457600   0   0 15289
 2  0 120448 25744  48300  109458000 060 141 78290
 2  0 120448 25752  48300  109458000   0   0 131 140326
 2  0 120448 25756  48300  109457600 040 122 140100
 2  0 120448 25764  48300  109458400 060 133 136595
 2  0 120448 24284  48300  109458400   0 200 138 135151

The jump in cs corresponds to starting the query in the second session.

Joe

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Wierd context-switching issue on Xeon
scott.marlowe wrote: On Mon, 19 Apr 2004, Bruce Momjian wrote: I have BSD on a SuperMicro dual Xeon, so if folks want another hardware/OS combination to test, I can give out logins to my machine. I can probably do some nighttime testing on a dual 2800MHz non-MP Xeon machine as well. It's a Dell 2600 series machine and very fast. It has the moderately fast 533MHz FSB so may not have as many problems as the MP type CPUs seem to be having. I've got a quad 2.8Ghz MP Xeon (IBM x445) that I could test on. Does anyone have a test set that can reliably reproduce the problem? Joe ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Wierd context-switching issue on Xeon
Tom Lane wrote:

> Here is a test case. To set up, run the test_setup.sql script once; then launch two copies of the test_run.sql script. (For those of you with more than two CPUs, see whether you need one per CPU to make trouble, or whether two test_runs are enough.) Check that you get a nestloops-with-index-scans plan shown by the EXPLAIN in test_run.

Check.

> In isolation, test_run.sql should do essentially no syscalls at all once it's past the initial ramp-up. On a machine that's functioning per expectations, multiple copies of test_run show a relatively low rate of semop() calls --- a few per second, at most --- and maybe a delaying select() here and there. What I actually see on Josh's client's machine is a context swap storm: "vmstat 1" shows CS rates around 170K/sec. strace'ing the backends shows a corresponding rate of semop() syscalls, with a few delaying select()s sprinkled in. top(1) shows system CPU percent of 25-30 and idle CPU percent of 16-20.

Your test case works perfectly. I ran 4 concurrent psql sessions, on a quad Xeon (IBM x445, 2.8GHz, 4GB RAM), hyperthreaded. Here's what 'top' looks like:

177 processes: 173 sleeping, 3 running, 1 zombie, 0 stopped

CPU states:          user    nice  system     irq  softirq  iowait    idle
          total     35.9%    0.0%    7.2%    0.0%     0.0%    0.0%   56.8%
          cpu00     19.6%    0.0%    4.9%    0.0%     0.0%    0.0%   75.4%
          cpu01     44.1%    0.0%    7.8%    0.0%     0.0%    0.0%   48.0%
          cpu02      0.0%    0.0%    0.0%    0.0%     0.0%    0.0%  100.0%
          cpu03     32.3%    0.0%   13.7%    0.0%     0.0%    0.0%   53.9%
          cpu04     21.5%    0.0%   10.7%    0.0%     0.0%    0.0%   67.6%
          cpu05     42.1%    0.0%    9.8%    0.0%     0.0%    0.0%   48.0%
          cpu06    100.0%    0.0%    0.0%    0.0%     0.0%    0.0%    0.0%
          cpu07     27.4%    0.0%   10.7%    0.0%     0.0%    0.0%   61.7%

Mem:  4123700k av, 3933896k used, 189804k free, 0k shrd, 221948k buff
      2492124k actv, 760612k in_d, 41416k in_c
Swap: 2040244k av, 5632k used, 2034612k free; 3113272k cached

Note that cpu06 is not a postgres process.

The output of vmstat looks like this:

# vmstat 1
   procs                      memory      swap          io     system          cpu
 r  b   swpd    free   buff    cache  si  so  bi  bo   in      cs  us  sy  id  wa
 4  0   5632  184264 221948 3113308   0   0   0   0    0       0   0   0   0   0
 3  0   5632  184264 221948 3113308   0   0   0   0  112  211894  36   9  55   0
 5  0   5632  184264 221948 3113308   0   0   0   0  125  222071  39   8  53   0
 4  0   5632  184264 221948 3113308   0   0   0   0  110  215097  39  10  52   0
 1  0   5632  184588 221948 3113308   0   0   0  96  139  187561  35  10  55   0
 3  0   5632  184588 221948 3113308   0   0   0   0  114  241731  38  10  52   0
 3  0   5632  184920 221948 3113308   0   0   0   0  132  257168  40   9  51   0
 1  0   5632  184912 221948 3113308   0   0   0   0  114  251802  38   9  54   0

> Note the test case assumes you've got shared_buffers set to at least 1000; with smaller values, you may get some I/O syscalls, which will probably skew the results.

 shared_buffers
----------------
 16384
(1 row)

I found that killing three of the four concurrent queries dropped context switches to about 70,000 to 100,000. Two or more sessions brings it up to 200K+.

Joe

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Toooo many context switches (maybe SLES8?)
Dirk Lutzebäck wrote:

> Joe, do you know where I should look in the 7.4.2 code to find this out?

I think I was wrong. I just looked in CVS and found the commit I was thinking about:

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/storage/lmgr/s_lock.c.diff?r1=1.22&r2=1.23
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/include/storage/s_lock.h.diff?r1=1.123&r2=1.124

=================================================================
Revision 1.23 / (download) - [select for diffs], Sat Dec 27 20:58:58 2003 UTC (3 months, 2 weeks ago) by tgl
Changes since 1.22: +5 -1 lines
Diff to previous 1.22

Improve spinlock code for recent x86 processors: insert a PAUSE
instruction in the s_lock() wait loop, and use test before test-and-set
in TAS() macro to avoid unnecessary bus traffic. Patch from Manfred
Spraul, reworked a bit by Tom.
=================================================================

I thought this had been committed to the 7.4 stable branch as well, but it appears not.

Joe

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] SETOF performance
Jeff wrote:

> I think it was on this list - someone posted a message about SETOF being slower. Tom replied saying it was because it needed to create an on-disk tuplestore. I was just looking for some clarification - a SETOF function will always write the resulting tuples to disk (not buffering in, say, a sort_mem sized buffer)?

I think at least part of what you're seeing is normal function call overhead. As far as tuplestores writing to disk, here's what the source says:

In src/backend/utils/sort/tuplestore.c:

8<--------------------------------------------------------------
 * maxKBytes: how much data to store in memory (any data beyond this
 *   amount is paged to disk). When in doubt, use work_mem.
 */
Tuplestorestate *
tuplestore_begin_heap(bool randomAccess, bool interXact, int maxKBytes)
8<--------------------------------------------------------------

In src/backend/executor/execQual.c:ExecMakeTableFunctionResult():

8<--------------------------------------------------------------
tupstore = tuplestore_begin_heap(true, false, work_mem);
8<--------------------------------------------------------------

So up to work_mem (sort_mem in 7.4 and earlier) should be stored in memory.

Joe

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
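For reference, the kind of set-returning function being discussed can be as simple as the following sketch (the function name is hypothetical; PL/pgSQL must be installed in the database):

```sql
-- Each RETURN NEXT appends a row to the function's tuplestore;
-- only once the accumulated rows exceed work_mem (sort_mem on 7.4
-- and earlier) does the tuplestore start paging to disk.
create or replace function small_series(int) returns setof int as '
begin
    for i in 1 .. $1 loop
        return next i;
    end loop;
    return;
end;
' language 'plpgsql';

select * from small_series(5);
```

For a small result set like this, the entire tuplestore stays in memory; the per-call overhead Joe mentions is then the dominant cost.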
Re: [PERFORM] rapid degradation after postmaster restart
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: Any idea where I can get my hands on the latest version. I found the original post from Tom, but I thought there was a later version with both number of pages and time to sleep as knobs. That was as far as I got. I think Jan posted a more complex version that would still be reasonable to apply to 7.4. I have tested Tom's original patch now. The good news -- it works great in terms of reducing the load imposed by vacuum -- almost to the level of being unnoticeable. The bad news -- in a simulation test which loads an hour's worth of data, even with delay set to 1 ms, vacuum of the large table exceeds two hours (vs 12-14 minutes with delay = 0). Since that hourly load is expected 7 x 24, this obviously isn't going to work. The problem with Jan's more complex version of the patch (at least the one I found - perhaps not the right one) is it includes a bunch of other experimental stuff that I'd not want to mess with at the moment. Would changing the input units (for the original patch) from milli-secs to micro-secs be a bad idea? If so, I guess I'll get to extracting what I need from Jan's patch. Thanks, Joe ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] rapid degradation after postmaster restart
Matthew T. O'Connor wrote:

> If memory serves, the problem is that you actually sleep 10ms even when you set it to 1. One of the things changed in Jan's later patch was the ability to specify how many pages to work on before sleeping, rather than how long to sleep in between every page. You might be able to do a quick hack and have it do 10 pages or so before sleeping.

I thought I remembered something about that. It turned out to be less difficult than I first thought to extract the vacuum delay stuff from Jan's performance patch. I haven't yet tried it out, but it's attached in case you are interested. I'll report back once I have some results.

Joe

Index: src/backend/access/nbtree/nbtree.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/access/nbtree/nbtree.c,v
retrieving revision 1.106
diff -c -b -r1.106 nbtree.c
*** src/backend/access/nbtree/nbtree.c	2003/09/29 23:40:26	1.106
--- src/backend/access/nbtree/nbtree.c	2003/11/03 17:56:54
***************
*** 18,23 ****
--- 18,25 ----
   */
  #include "postgres.h"
  
+ #include <unistd.h>
+ 
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/nbtree.h"
***************
*** 27,32 ****
--- 29,39 ----
  #include "storage/smgr.h"
  
+ extern int	vacuum_page_delay;
+ extern int	vacuum_page_groupsize;
+ extern int	vacuum_page_groupcount;
+ 
+ 
  /* Working state for btbuild and its callback */
  typedef struct
  {
***************
*** 610,615 ****
--- 617,631 ----
  	CHECK_FOR_INTERRUPTS();
  
+ 	if (vacuum_page_delay > 0)
+ 	{
+ 		if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 		{
+ 			vacuum_page_groupcount = 0;
+ 			usleep(vacuum_page_delay * 1000);
+ 		}
+ 	}
+ 
  	ndeletable = 0;
  	page = BufferGetPage(buf);
  	opaque = (BTPageOpaque) PageGetSpecialPointer(page);
***************
*** 736,741 ****
--- 752,768 ----
  	Buffer		buf;
  	Page		page;
  	BTPageOpaque opaque;
+ 
+ 	CHECK_FOR_INTERRUPTS();
+ 
+ 	if (vacuum_page_delay > 0)
+ 	{
+ 		if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 		{
+ 			vacuum_page_groupcount = 0;
+ 			usleep(vacuum_page_delay * 1000);
+ 		}
+ 	}
  
  	buf = _bt_getbuf(rel, blkno, BT_READ);
  	page = BufferGetPage(buf);
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/pgsql/CvsRoot/pgsql-server/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.32
diff -c -b -r1.32 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c	2003/09/25 06:57:59	1.32
--- src/backend/commands/vacuumlazy.c	2003/11/03 17:57:27
***************
*** 37,42 ****
--- 37,44 ----
   */
  #include "postgres.h"
  
+ #include <unistd.h>
+ 
  #include "access/genam.h"
  #include "access/heapam.h"
  #include "access/xlog.h"
***************
*** 88,93 ****
--- 90,99 ----
  static TransactionId OldestXmin;
  static TransactionId FreezeLimit;
  
+ int		vacuum_page_delay = 0;		/* milliseconds per page group */
+ int		vacuum_page_groupsize = 10;	/* group size */
+ int		vacuum_page_groupcount = 0;	/* current group size count */
+ 
  /* non-export function prototypes */
  static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
***************
*** 228,233 ****
--- 234,248 ----
  	CHECK_FOR_INTERRUPTS();
  
+ 	if (vacuum_page_delay > 0)
+ 	{
+ 		if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 		{
+ 			vacuum_page_groupcount = 0;
+ 			usleep(vacuum_page_delay * 1000);
+ 		}
+ 	}
+ 
  	/*
  	 * If we are close to overrunning the available space for
  	 * dead-tuple TIDs, pause and do a cycle of vacuuming before we
***************
*** 469,474 ****
--- 484,498 ----
  	CHECK_FOR_INTERRUPTS();
  
+ 	if (vacuum_page_delay > 0)
+ 	{
+ 		if (++vacuum_page_groupcount >= vacuum_page_groupsize)
+ 		{
+ 			vacuum_page_groupcount = 0;
+ 			usleep(vacuum_page_delay * 1000);
+ 		}
+ 	}
+ 
  	tblk = ItemPointerGetBlockNumber(vacrelstats->dead_tuples[tupindex]);
  	buf = ReadBuffer(onerel, tblk);
Re: [PERFORM] rapid degradation after postmaster restart
Matthew T. O'Connor wrote:

> I think you understand correctly. A table with 1,000,000 rows should get vacuumed approx every 2,000,000 changes (assuming default values for -V). FYI, an insert and a delete each count as one change, but an update counts as two. Unfortunately, only running with -d2 would show the numbers that pg_autovacuum is using to decide if/when it should vacuum or analyze. Also, are you sure that it vacuumed more than once and wasn't doing analyzes most of the time?

Yeah, I'm sure. Snippets from the log:

[...lots-o-tables...]
[2004-03-14 12:44:48 PM] added table: specdb.public.parametric_states
[2004-03-14 12:49:48 PM] Performing: VACUUM ANALYZE public.transaction_data
[2004-03-14 01:29:59 PM] Performing: VACUUM ANALYZE public.transaction_data
[2004-03-14 02:08:26 PM] Performing: ANALYZE public.out_of_spec
[2004-03-14 02:08:26 PM] Performing: VACUUM ANALYZE public.transaction_data
[2004-03-14 02:22:44 PM] Performing: VACUUM ANALYZE public.spc_graphs
[2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE public.out_of_spec
[2004-03-14 03:06:45 PM] Performing: VACUUM ANALYZE public.transaction_data
[2004-03-14 03:19:51 PM] Performing: VACUUM ANALYZE public.spc_graphs
[2004-03-14 03:21:09 PM] Performing: ANALYZE public.parametric_states
[2004-03-14 03:54:57 PM] Performing: ANALYZE public.out_of_spec
[2004-03-14 03:54:57 PM] Performing: VACUUM ANALYZE public.transaction_data
[2004-03-14 04:07:52 PM] Performing: VACUUM ANALYZE public.spc_graphs
[2004-03-14 04:09:33 PM] Performing: ANALYZE public.equip_status_history
[2004-03-14 04:09:33 PM] Performing: VACUUM ANALYZE public.parametric_states
[2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE public.out_of_spec
[2004-03-14 04:43:46 PM] Performing: VACUUM ANALYZE public.transaction_data
[2004-03-14 04:56:35 PM] Performing: VACUUM ANALYZE public.spc_graphs
[2004-03-14 04:58:32 PM] Performing: ANALYZE public.parametric_states
[2004-03-14 05:28:58 PM] added database: specdb

This is the entire period of the first test, with default autovac settings. The table public.transaction_data is the one with 28 million active rows. The entire test run inserts about 600 x 600 = 360,000 rows, out of which roughly two-thirds are later deleted.

> That's unfortunate, as that is the detail we need to see what pg_autovacuum thinks is really going on. We had a similar sounding crash on FreeBSD due to some uninitialized variables that were being printed out by the debug code; however, that was fixed a long time ago. Any chance you can look into this?

I can try. The server belongs to another department, and they are under the gun to get back on track with their testing. Also, they compiled without debug symbols, so I need to get permission to recompile.

> Yes, I would be very curious to see the results with the vacuum delay patch installed (is that patch applied to HEAD?)

Any idea where I can get my hands on the latest version? I found the original post from Tom, but I thought there was a later version with both number of pages and time to sleep as knobs.

Thanks,

Joe

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] rapid degradation after postmaster restart
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: Any idea where I can get my hands on the latest version. I found the original post from Tom, but I thought there was a later version with both number of pages and time to sleep as knobs. That was as far as I got. I think Jan posted a more complex version that would still be reasonable to apply to 7.4. I thought that too, but was having trouble finding it. I'll look again. Thanks, Joe ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] rapid degradation after postmaster restart
Matthew T. O'Connor wrote: Strange... I wonder if this is some integer overflow problem. There was one reported recently and fixed as of CVS head yesterday, you might try that, however without the -d2 output I'm only guessing at why pg_autovacuum is vacuuming so much / so often. I'll see what I can do tomorrow to track it down. I have already recommended to the program manager that they switch to 7.4.2 plus the autovacuum patch. Not sure they will be willing to make any changes at this stage in their release process though. If we can't find one, any chance you can do some testing with CVS HEAD just to see if that works any better. I know there has been a fair amount of work done to improve this situation (not just vacuum delay, but ARC etc...) I might do that, but not likely on Solaris. I can probably get a copy of the current database and testing scripts, and give it a try on one of my own machines (all Linux, either RHAS3, RH9, or Fedora). Joe ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] rapid degradation after postmaster restart
Marty Scholes wrote: I have seen similar results to what you are describing. I found that running a full vacuum: vacuumdb -fza followed by a checkpoint makes it run fast again. Try timing the update with and without a full vacuum. Will do. I'll let you know how it goes. Thanks for the reply. Joe ---(end of broadcast)--- TIP 8: explain analyze is your friend
[PERFORM] rapid degradation after postmaster restart
I'm trying to troubleshoot a performance issue on an application ported from Oracle to postgres. Now, I know the best way to get help is to post the schema, explain analyze output, etc, etc -- unfortunately I can't do that at the moment. However, maybe someone can point me in the right direction to figure this out on my own. That said, here are a few details... PostgreSQL 7.4.1 bash-2.03$ uname -a SunOS col65 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-280R The problem is this: the application runs an insert, that fires off a trigger, that cascades into a fairly complex series of functions, that do a bunch of calculations, inserts, updates, and deletes. Immediately after a postmaster restart, the first insert or two take about 1.5 minutes (undoubtedly this could be improved, but it isn't the main issue). However by the second or third insert, the time increases to 7 - 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e. the first one or two inserts are back to the 1.5 minute range. Any ideas spring to mind? I don't have much experience with Postgres on Solaris -- could it be related to that somehow? Thanks for any insights. Joe ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Maximum Possible Insert Performance?
William Yu wrote: My situation is this. We have a semi-production server where we pre-process data and then upload the finished data to our production servers. We need the fastest possible write performance. Having the DB go corrupt due to power loss/OS crash is acceptable because we can always restore from last night and re-run everything that was done since then. If you can, use COPY -- it is far faster than INSERT. See: http://www.postgresql.org/docs/current/static/sql-copy.html HTH, Joe ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
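To make the COPY suggestion concrete, a minimal sketch follows. The table and file names are hypothetical; server-side COPY FROM a file requires appropriate privileges, while psql's \copy reads the client's file instead.

```sql
-- Sketch: bulk-load pre-processed rows. COPY streams whole rows in
-- one command, avoiding per-INSERT parse/plan/round-trip overhead.
create table staging_data (id int, payload text);

-- server-side load (file must be readable by the server process);
-- input is tab-delimited by default
copy staging_data from '/tmp/staging_data.tsv';

-- or, from a psql client session (client-side file):
-- \copy staging_data from 'staging_data.tsv'
```

For the described workflow (corruption acceptable after a crash), options such as turning off fsync during the load would also trade durability for further write speed.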
Re: [PERFORM] Another weird one with an UPDATE
David Griffiths wrote:

> Yes, the query operates only on indexed columns (all numeric(10)'s).
>
>      Column      |     Type      | Modifiers
> -----------------+---------------+-----------
>  user_account_id | numeric(10,0) | not null
> [snip]
> Indexes: user_account_pkey primary key btree (user_account_id)
> Foreign Key constraints:
>     $1 FOREIGN KEY (lang_id) REFERENCES lang(lang_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
>     $2 FOREIGN KEY (source_id) REFERENCES source(source_id) ON UPDATE NO ACTION ON DELETE NO ACTION,
>     $3 FOREIGN KEY (user_role_id) REFERENCES user_role(user_role_id) ON UPDATE NO ACTION ON DELETE NO ACTION

And what about commercial_entity.user_account_id? Is it indexed, and what is its data type (i.e. does it match numeric(10,0))? Also, have you run VACUUM ANALYZE lately?

Joe

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
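The fixes implied by those questions would look something like this (a sketch; the index name is made up, and it assumes the join column really is unindexed and already of a matching numeric(10,0) type):

```sql
-- Inspect the column's type and existing indexes from psql:
--   \d commercial_entity

-- If the join column is unindexed, add an index, then refresh
-- planner statistics so the new index gets considered:
create index commercial_entity_uaid_idx
    on commercial_entity (user_account_id);

vacuum analyze commercial_entity;
```

If the column types differed (say numeric vs. integer), the planner of that era could refuse the index scan entirely, which is why the type match matters as much as the index itself.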
Re: [PERFORM] Another weird one with an UPDATE
David Griffiths wrote:

>> I think you want something like:
>>
>> UPDATE user_account SET last_name = 'abc'
>> WHERE EXISTS (SELECT 1 FROM commercial_entity ce, commercial_service cs
>>               WHERE user_account.user_account_id = ce.user_account_id
>>               AND ce.commercial_entity_id = cs.commercial_entity_id);
>
> Unfortunately, this is still taking a long time.
>
>  Seq Scan on user_account  (cost=0.00..748990.51 rows=36242 width=716)

Do you have an index on user_account.user_account_id?

Joe

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] Compare rows
Greg Spiegelberg wrote: The reason for my initial question was this. We save changes only. In other words, if system S has row T1 for day D1 and if on day D2 we have another row T1 (excluding our time column) we don't want to save it. It still isn't entirely clear to me what you are trying to do, but perhaps some sort of calculated checksum or hash would work to determine if the data has changed? Joe ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
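One way to realize the checksum/hash suggestion in SQL (a sketch only: history, today, system_id, payload, and captured_on are hypothetical names, and the time column is simply left out of the comparison so identical data hashes equal):

```sql
-- Save only today's rows whose content differs from what is already
-- stored for that system; md5() of the non-time columns acts as the
-- change-detection checksum.
insert into history (system_id, payload, captured_on)
select t.system_id, t.payload, t.captured_on
from today t
where not exists (
    select 1
    from history h
    where h.system_id = t.system_id
      and md5(h.payload) = md5(t.payload)
);
```

For multi-column rows, concatenating the relevant columns (or hashing the row's text form) inside md5() gives the same effect without comparing every column individually.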
Re: [SQL] [PERFORM] EXTERNAL storage and substring on long strings
Scott Cain wrote:
> Oh, and I forgot to mention: it's highly compressed (bzip2 -9) and is 109M.

Thanks. I'll grab a copy from home later today and see if I can find some time to poke at it.

Joe
Re: [PERFORM] EXTERNAL storage and substring on long strings
Tom Lane wrote:
> Scott Cain <[EMAIL PROTECTED]> writes:
>> A few days ago, I asked for advice on speeding up substring queries on the GENERAL mailing list. Joe Conway helpfully pointed out the ALTER TABLE STORAGE EXTERNAL documentation. After doing the alter, the queries got slower! Here is the background:
>
> Ah-hah, I've sussed it ... you didn't actually change the storage representation. You wrote:

Yeah, I came to the same conclusion this morning (update longdna set dna = dna || '';), but it still seems that the chunked table is very slightly faster than the substring on the externally stored column:

dna=# explain analyze select pdna from dna where foffset > 600 and foffset < 6024000;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using foffset_idx on dna  (cost=0.00..4.22 rows=14 width=32) (actual time=0.06..0.16 rows=11 loops=1)
   Index Cond: ((foffset > 600) AND (foffset < 6024000))
 Total runtime: 0.27 msec
(3 rows)

dna=# explain analyze select pdna from dna where foffset > 600 and foffset < 6024000;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using foffset_idx on dna  (cost=0.00..4.22 rows=14 width=32) (actual time=0.07..0.16 rows=11 loops=1)
   Index Cond: ((foffset > 600) AND (foffset < 6024000))
 Total runtime: 0.25 msec
(3 rows)

dna=# explain analyze select substr(dna,6002000,2) from longdna;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on longdna  (cost=0.00..1.01 rows=1 width=32) (actual time=0.46..0.47 rows=1 loops=1)
 Total runtime: 0.58 msec
(2 rows)

dna=# explain analyze select substr(dna,6002000,2) from longdna;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on longdna  (cost=0.00..1.01 rows=1 width=32) (actual time=0.23..0.24 rows=1 loops=1)
 Total runtime: 0.29 msec
(2 rows)

I ran each command twice after starting psql to observe the effects of caching. However, with the provided sample data, longdna has only one row, and dna has 122,540 rows, all of which are chunks of the one longdna row.
I would tend to think that if you had 1000 or so longdna records indexed on some id column, versus 122,540,000 dna chunks indexed on both an id and segment column, the substring from longdna would win.

Joe
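[The storage change under discussion can be sketched as follows; table and column names come from the thread, and the no-op UPDATE is the step that was originally missed -- existing values keep their old representation until the rows are rewritten:]

```sql
-- Keep long values out-of-line and uncompressed, so substr() can
-- fetch only the slices it needs instead of decompressing everything:
ALTER TABLE longdna ALTER COLUMN dna SET STORAGE EXTERNAL;

-- Force a rewrite so already-stored rows pick up the new representation:
UPDATE longdna SET dna = dna || '';
```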
Re: [PERFORM] Odd performance results - more info
Medora Schauer wrote:
> I would greatly appreciate it if someone could run this code in their environment and let me know if you get results similar to mine. The INT test results in execution times of 11 - 50+ secs, increasing each time the test is run. The FLOAT test execution times are consistently 3 secs regardless of how many times it is run.

Without actually trying the code, I'd bet that an index is getting used for the float8 case, but not in the int8 case:

   if (dtype == INT_TYPE) {
      snprintf(buffer, sizeof(buffer),
               "UPDATE shot_record SET trace_count = %d "
               "WHERE shot_line_num = %d "
               "AND shotpoint = %d "
               "AND index = %d",
               0, shotline, shotpoint + i, 0);
   } else {
      snprintf(buffer, sizeof(buffer),
               "UPDATE shot_record SET trace_count = %d "
               "WHERE shot_line_num = %f "
               "AND shotpoint = %f "
               "AND index = %d",
               0, (float)shotline, (float)shotpoint + (float)i, 0);
   }

Try running EXPLAIN ANALYZE on these update statements manually. It also might help to run VACUUM ANALYZE after populating the tables.

HTH,

Joe
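[A hedged sketch of the diagnosis above, assuming the columns are int8: in PostgreSQL releases of that era an unadorned integer literal was typed int4, which could keep an int8 index from being used. The literal values are made up:]

```sql
-- May show a seq scan if shot_line_num/shotpoint are int8:
EXPLAIN ANALYZE
UPDATE shot_record SET trace_count = 0
 WHERE shot_line_num = 42
   AND shotpoint = 7
   AND index = 0;

-- Casting (or quoting) the literals lets the planner match the index:
EXPLAIN ANALYZE
UPDATE shot_record SET trace_count = 0
 WHERE shot_line_num = 42::int8
   AND shotpoint = 7::int8
   AND index = 0;
```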
Re: [PERFORM] One table or many tables for data set
Castle, Lindsay wrote:
> The data structure looks like this:
>    element date num1 num2 num3 num4 units
> There are approx 12,000 distinct elements for a total of about 6 million rows of data.

Ahh, that helps! So are the elements evenly distributed, i.e. are there approx 500 rows of each element? If so, it should be plenty quick to put all the data in one table with an index on element (and maybe a multicolumn key, depending on other factors).

> The scanning technology I want to use may need a different number of rows and different columns depending on the scan formula; eg
>   scan1 may need num1, num2 and num3 from the last 200 rows for element x
>   scan2 may need num1, units from the last 10 rows for element y

When you say "last X rows", do you mean sorted by date? If so, you might want that index to be on (element, date). Then do:

  SELECT num1, num2, num3 FROM mytable
  WHERE element = 'an_element'
  ORDER BY date DESC LIMIT 20;

Replace num1, num2, num3 by whatever columns you want, and LIMIT X as the number of rows you want.

HTH,

Joe
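[A minimal sketch of the suggested multicolumn index; "mytable" is the placeholder name from the reply, and the index name is made up:]

```sql
CREATE INDEX mytable_element_date_idx ON mytable (element, date);

-- The "last 200 rows for element x" scan then becomes a bounded
-- index scan rather than a sort of all ~500 rows per element:
SELECT num1, num2, num3
  FROM mytable
 WHERE element = 'x'
 ORDER BY date DESC
 LIMIT 200;
```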
Re: [PERFORM] Hardware performance
Hannu Krosing wrote:
> How do you do RAID 1+0 with just two drives?

Hmm, good point -- I must have been tired last night ;-). With two drives you can do mirroring or striping, but not both.

Usually I've seen a pair of mirrored drives for the OS, and a RAID 1+0 array for data. But that requires 6 drives, not 5. On non-database servers usually the data array is RAID 5, and you could get away with 5 drives (as someone else pointed out). As I said, I've never personally found it necessary to move WAL off to a different physical drive.

> What do you think is the best configuration given the constraint of 5 drives? 1 drive for OS, and 4 for RAID 1+0 for data-plus-WAL?

I guess the ideal would be to find enough money for that 6th drive, and use the mirrored pair for both OS and WAL.

Joe
Re: [PERFORM] Hardware performance
Adam Witney wrote:
> I think the issue from the original poster's point of view is that the Dell PE2650 can only hold a maximum of 5 internal drives

True enough, but maybe that's a reason to be looking at other alternatives. I think he said the hardware hasn't been bought yet.

Joe
Re: [PERFORM] Hardware performance
Jean-Luc Lachance wrote:
> I am curious. How can you have RAID 1+0 with only 2 drives? If you are thinking about partitioning the drives, won't this defeat the purpose?

Yeah -- Hannu already pointed out that my mind was fuzzy when I made that statement :-(. See subsequent posts.

Joe
Re: [PERFORM] Hardware performance
Adam Witney wrote:
> Actually I am going through the same questions myself at the moment. I would like to have a 2 disk RAID1 and a 4 disk RAID5, so need at least 6 disks. Anybody have any suggestions or experience with other hardware manufacturers for this size of setup? (2U rack, up to 6 disks, 2 processors, ~2GB RAM, if possible)

I tend to use either 1U or 4U servers, depending on the application. But I've had good experiences with IBM recently, and a quick look on their site shows the x345 with these specs:

 - 2U, 2-way server delivers extreme performance and availability for demanding applications
 - Up to 2 Intel Xeon processors up to 3.06GHz with 533MHz front-side bus speed for outstanding performance
 - Features up to 8GB of DDR memory, 5 PCI (4 PCI-X) slots and up to 6 hard disk drives for robust expansion
 - Hot-swap redundant cooling, power and hard disk drives for high availability
 - Integrated dual Ultra320 SCSI with RAID-1 for data protection

This may not wrap well, but here is the url:
http://www-132.ibm.com/webapp/wcs/stores/servlet/CategoryDisplay?catalogId=-840storeId=1categoryId=2559454langId=-1dualCurrId=73

Handles 6 drives; maybe that fits the bill?

Joe
Re: [PERFORM] Sanity check requested
Nick Fankhauser wrote:
> Thanks for the correction -- it sounds like this is one where usage can't be precisely controlled in a dynamic user environment. I just need to get a feel for what works under a load that approximates my production system.

I think the most important point here is that if you set sort_mem too high, and you have a lot of simultaneous sorts, you can drive the server into swapping, which obviously is a very bad thing. You want it set as high as possible, but not so high given your usage patterns that you wind up swapping.

Joe
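[One hedged way to balance the two concerns above; the parameter name sort_mem matches releases of that era (newer releases call it work_mem), and the values are illustrative only:]

```sql
-- Keep the global default conservative in postgresql.conf:
--   sort_mem = 8192        # KB per sort, per backend

-- Raise it only in a session known to run one big sort at a time:
SET sort_mem = 131072;     -- ~128 MB, for this session only
-- ... run the large ORDER BY / CREATE INDEX here ...
RESET sort_mem;
```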