Re: [PERFORM] Benchmark Data requested
On Mon, 2008-02-04 at 17:33 -0500, Jignesh K. Shah wrote:
> First of all I think it is a violation of TPC rules to publish numbers
> without auditing them first. So even if I do the test to show the better
> performance of PostgreSQL 8.3, I cannot post it here or any public forum
> without going through the process.

I'm not interested in the final results, pricing etc., just query-by-query elapsed times. Can you show which part of the rules precludes this? I can't find it.

This is a developer list, so publishing things here is what we do for discussion; it's hardly breaking the spirit of the TPC rules to publish results here in the hope of focusing development effort.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message
can get through to the mailing list cleanly
Re: [PERFORM] Performance problems inside a stored procedure.
Thanks for your help, Андрей - your English is easily understandable and much better than my ... (Russian?). I managed to get the results of an ANALYZE, and this showed that an index was not being used correctly. It seems that I was passing in a varchar and not casting it to an int, and this stopped the index from being used. I suppose this is a change in the implicit casting rules between versions 7.4.7 and 8.x. Once I added the explicit cast, the function now uses the correct plan and returns in about 3 ms, which I suppose is the performance hit that a function call has. Anyway, thanks very much for your time.

Regards
Matthew

Андрей Репко wrote:
> Hello Matthew,
>
> Monday, January 28, 2008, 2:02:26 PM, you wrote:
>
> ML> I have a query which runs pretty quick (0.82ms) but when I put it
> ML> inside a stored procedure it takes 10 times as long (11.229ms). Is
> ML> this what you would expect, and is there any way that I can get
> ML> around this time delay?
>
> ML> postgresql.conf changes:
> ML> shared_buffers = 500MB
> ML> work_mem = 10MB
> ML> maintenance_work_mem = 100MB
> ML> effective_cache_size = 2048MB
> ML> default_statistics_target = 1000
>
> ML> Thanks for any help.
>
> When you run it outside a stored procedure the optimizer knows about your
> parameters, and knows what rows (estimated count) will be selected, so it
> can create a good plan. When you put it into an SP the optimizer doesn't
> know anything about the values of your parameters, but MUST create a plan
> for it. If the table is frequently updated, the plan that was created for
> the SP becomes bad and needs replanning.
>
> Here is a sample for obtaining the plan (LeXa NalBat):
>
> create function f1 ( integer, integer )
>   returns void language plpgsql as $body$
> declare
>   _rec record;
> begin
>   for _rec in explain
>     -- put your query here
>     select count(*) from t1 where id between $1 and $2
>   loop
>     raise info '%', _rec."QUERY PLAN";
>   end loop;
>   return;
> end;
> $body$;
>
> Sorry for bad English.
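For completeness, the two workarounds discussed above can be sketched as follows. This is only a sketch: the table `items` and function names are hypothetical stand-ins, not from the original thread. The first fix casts the mismatched parameter so the int index is usable; the second builds the statement with EXECUTE so it is re-planned with the actual value instead of a generic parameter.

```sql
-- Hypothetical table: items(id int PRIMARY KEY, ...).

-- Fix 1: explicit cast so the varchar parameter matches the int index.
CREATE FUNCTION get_item_count(p_id varchar) RETURNS bigint
LANGUAGE plpgsql AS $body$
DECLARE
  n bigint;
BEGIN
  -- Without the ::int cast, 7.4-era implicit casting made this work;
  -- on 8.x the index on id is skipped unless the types match.
  SELECT count(*) INTO n FROM items WHERE id = p_id::int;
  RETURN n;
END;
$body$;

-- Fix 2: EXECUTE builds the query at run time, so the planner sees the
-- concrete value rather than planning once for an unknown parameter.
CREATE FUNCTION get_item_count_dyn(p_id int) RETURNS bigint
LANGUAGE plpgsql AS $body$
DECLARE
  n bigint;
BEGIN
  EXECUTE 'SELECT count(*) FROM items WHERE id = ' || p_id INTO n;
  RETURN n;
END;
$body$;
```

The EXECUTE variant trades plan-caching away for per-call planning, which is exactly the point when the cached generic plan is bad.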
Re: [PERFORM] Benchmark Data requested
On Mon, 2008-02-04 at 17:55 -0500, Jignesh K. Shah wrote:
> Doing it at low scales is not attractive. Commercial databases are
> publishing at scale factor of 1000 (about 1TB) to 10000 (10TB), with one
> in 30TB space. So ideally right now tuning should start at the 1000 scale
> factor.

I don't understand this. Sun is currently publishing results at 100GB, 300GB etc. Why would we ignore those and go for much higher numbers? Especially when you explain why we wouldn't be able to. There isn't any currently valid result above 10 TB.

If anybody is going to run tests in response to my request, then *any* scale factor is interesting, on any hardware. If that means Scale Factor 1, 3, 10 or 30 then that's fine by me.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
[PERFORM] Performance issue using Tsearch2
Hi,

I'm having a performance problem on a request using Tsearch2: the request strangely takes several minutes. I've tried to follow the Tsearch tuning recommendations, I've searched through the archives, but I can't seem to find a solution to my problem.

The ts_vector field was created using dictionary fr_ispell only on types lword, lpart_hword and lhword. An index was created on this field. According to the stat() function, there are only 42,590 word stems indexed. I also did a VACUUM FULL ANALYZE.

Here's the result of EXPLAIN ANALYZE on a filtered version of my request (the non-filtered version takes so long I usually cancel it):

explain analyze
  SELECT idstruct, headline(zonetext, q), rank(zoneindex_test, q)
  FROM tab_ocr, tab_chemin, to_tsquery('partir') AS q
  WHERE tab_chemin.chemin like '%;2;%'
    AND tab_ocr.idstruct = tab_chemin.label
    AND zoneindex_test @@ q
  ORDER BY rank(zoneindex_test, q) DESC;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=2345.54..2345.58 rows=16 width=308) (actual time=270638.774..270643.142 rows=7106 loops=1)
   Sort Key: rank(tab_ocr.zoneindex_test, q.q)
   ->  Nested Loop  (cost=80.04..2345.22 rows=16 width=308) (actual time=40886.553..270619.730 rows=7106 loops=1)
         ->  Nested Loop  (cost=80.04..1465.76 rows=392 width=308) (actual time=38209.193..173932.313 rows=272414 loops=1)
               ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.006..0.007 rows=1 loops=1)
               ->  Bitmap Heap Scan on tab_ocr  (cost=80.04..1460.85 rows=392 width=276) (actual time=38209.180..173507.052 rows=272414 loops=1)
                     Filter: (tab_ocr.zoneindex_test @@ q.q)
                     ->  Bitmap Index Scan on zoneindex_test_idx  (cost=0.00..79.94 rows=392 width=0) (actual time=38204.261..38204.261 rows=283606 loops=1)
                           Index Cond: (tab_ocr.zoneindex_test @@ q.q)
         ->  Index Scan using tab_chemin_label_index on tab_chemin  (cost=0.00..2.23 rows=1 width=4) (actual time=0.036..0.036 rows=0 loops=272414)
               Index Cond: (tab_ocr.idstruct = tab_chemin.label)
               Filter: ((chemin)::text ~~ '%;2;%'::text)
 Total runtime: 270647.946 ms

Could someone help me analyze this problem? I can't work out whether the problem comes from bad tsearch tuning, the postgresql configuration, or something else...

Thanks.
Re: [PERFORM] Performance issue using Tsearch2
On 2008-02-05 Viviane Lestic wrote:
>                                    QUERY PLAN
> --------------------------------------------------------------------------------
>  Sort  (cost=2345.54..2345.58 rows=16 width=308) (actual time=270638.774..270643.142 rows=7106 loops=1)
>    Sort Key: rank(tab_ocr.zoneindex_test, q.q)
>    ->  Nested Loop  (cost=80.04..2345.22 rows=16 width=308) (actual time=40886.553..270619.730 rows=7106 loops=1)
>          ->  Nested Loop  (cost=80.04..1465.76 rows=392 width=308) (actual time=38209.193..173932.313 rows=272414 loops=1)
>                ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.006..0.007 rows=1 loops=1)
>                ->  Bitmap Heap Scan on tab_ocr  (cost=80.04..1460.85 rows=392 width=276) (actual time=38209.180..173507.052 rows=272414 loops=1)
>                      Filter: (tab_ocr.zoneindex_test @@ q.q)
>                      ->  Bitmap Index Scan on zoneindex_test_idx  (cost=0.00..79.94 rows=392 width=0) (actual time=38204.261..38204.261 rows=283606 loops=1)
>                            Index Cond: (tab_ocr.zoneindex_test @@ q.q)
>          ->  Index Scan using tab_chemin_label_index on tab_chemin  (cost=0.00..2.23 rows=1 width=4) (actual time=0.036..0.036 rows=0 loops=272414)
>                Index Cond: (tab_ocr.idstruct = tab_chemin.label)
>                Filter: ((chemin)::text ~~ '%;2;%'::text)
>  Total runtime: 270647.946 ms
>
> Could someone help me analyze this problem?

Your planner estimates are way off. Try increasing the statistics target for the columns used in this query and re-analyze the tables after doing so.

Regards
Ansgar Wiechers
-- 
The Mac OS X kernel should never panic because, when it does, it
seriously inconveniences the user.
--http://developer.apple.com/technotes/tn2004/tn2118.html
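Concretely, the per-column statistics target suggested here can be raised like this (a sketch using the identifiers from the quoted query):

```sql
-- Raise the statistics target for the tsvector column, then re-analyze
-- so the planner picks up the new, finer-grained statistics.
ALTER TABLE tab_ocr ALTER COLUMN zoneindex_test SET STATISTICS 1000;
ANALYZE tab_ocr;
```

SET STATISTICS only takes effect after the next ANALYZE (or VACUUM ANALYZE) on the table.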
Re: [PERFORM] Performance issue using Tsearch2
On Feb 5, 2008 12:47 PM, Viviane Lestic [EMAIL PROTECTED] wrote:
> Could someone help me analyze this problem? I don't manage to see if the
> problem comes from bad tsearch tuning, postgresql configuration, or
> something else...

Can you try to replace zoneindex_test @@ q with zoneindex_test @@ to_tsquery('partir')?

Increasing the statistics for zoneindex_test may be a good idea too (see the ALTER TABLE ... ALTER COLUMN doc).

I'm surprised you have the word "partir" in so many documents - do you use real data?

-- 
Guillaume
Re: [PERFORM] Benchmark Data requested
Hi,

Le lundi 04 février 2008, Jignesh K. Shah a écrit :
> Single stream loader of PostgreSQL takes hours to load data. (Single
> stream load... wasting all the extra cores out there)

I wanted to work on this at the pgloader level, so the CVS version of pgloader is now able to load data in parallel, with a python thread per configured section (1 section = 1 data file = 1 table is often the case). It's not configurable at the moment, but I plan on providing a "threads" knob which will default to 1, and could be -1 for as many threads as sections.

> Multiple table loads (1 per table) spawned via script is a bit better
> but hits wal problems.

pgloader will hit the WAL problem too, but it still may have its benefits, or at least we will soon (you already can if you take it from CVS) be able to measure whether parallel loading on the client side is a good idea perf-wise.

[...]
> I have not even started Partitioning of tables yet since with the current
> framework, you have to load the tables separately into each table, which
> means for the TPC-H data you need extra logic to take that table data and
> split it into each partition child table. Not stuff that many people want
> to do by hand.

I'm planning to add ddl-partitioning support to pgloader:
  http://archives.postgresql.org/pgsql-hackers/2007-12/msg00460.php

The basic idea is for pgloader to ask PostgreSQL about constraint_exclusion, pg_inherits and pg_constraint, and if pgloader recognizes both the CHECK expression and the datatypes involved, and if we can implement the CHECK in python without having to resort to querying PostgreSQL, then we can run a thread per partition, with as many COPY FROM running in parallel as there are partitions involved (when threads = -1).

I'm not sure this will be quicker than relying on PostgreSQL triggers or rules as used for partitioning currently, but ISTM the paragraph Jignesh quoted is just about that.

Comments?
-- 
dim
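For illustration, the kind of setup pgloader would have to recognize might look like this. A sketch only: the TPC-H-style table and file names are hypothetical; the point is the CHECK expression pgloader would read back from pg_constraint and re-implement client-side to route rows.

```sql
-- Parent table plus one date-range partition, 8.3-style inheritance
-- partitioning. pgloader would parse the CHECK expression and apply the
-- same test in python to pick the target partition for each input row.
CREATE TABLE orders (
    o_orderkey   bigint,
    o_orderdate  date
);

CREATE TABLE orders_1998 (
    CHECK (o_orderdate >= DATE '1998-01-01'
       AND o_orderdate <  DATE '1999-01-01')
) INHERITS (orders);

-- With the rows routed client-side, one COPY per partition can run in
-- parallel, bypassing the parent's triggers/rules entirely:
COPY orders_1998 FROM '/tmp/orders_1998.tbl' WITH DELIMITER '|';
```

The win over trigger- or rule-based routing is that each COPY targets its child table directly, so no per-row redirection happens in the server.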
Re: [PERFORM] Benchmark Data requested
On Tue, 2008-02-05 at 14:43 +0000, Richard Huxton wrote:
> Simon Riggs wrote:
>> On Tue, 2008-02-05 at 15:06 +0100, Dimitri Fontaine wrote:
>>> Le lundi 04 février 2008, Jignesh K. Shah a écrit :
>>>> Multiple table loads (1 per table) spawned via script is a bit better
>>>> but hits wal problems.
>>> pgloader will hit the WAL problem too, but it still may have its
>>> benefits, or at least we will soon (you already can if you take it from
>>> CVS) be able to measure if the parallel loading at the client side is a
>>> good idea perf-wise.
>> Should be able to reduce lock contention, but not overall WAL volume.
> In the case of a bulk upload to an empty table (or partition?) could you
> not optimise the WAL away? That is, shouldn't the WAL basically be a
> simple transformation of the on-disk blocks? You'd have to explicitly
> sync the file(s) for the table/indexes of course, and you'd need some
> work-around for WAL shipping, but it might be worth it for you chaps with
> large imports.

Only by locking the table, which serializes access, which then slows you down or at least restricts other options. Plus if you use pg_loader then you'll find only the first few rows optimized and all the rest not.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [PERFORM] Benchmark Data requested
On Tue, 5 Feb 2008, Richard Huxton wrote:
> In the case of a bulk upload to an empty table (or partition?) could you
> not optimise the WAL away?

Argh. If I hadn't had to retype my email, I would have suggested that before you. ;)

Matthew

-- 
Unfortunately, university regulations probably prohibit me from eating
small children in front of the lecture class.
-- Computer Science Lecturer
Re: [PERFORM] Benchmark Data requested
On Tue, 2008-02-05 at 15:06 +0100, Dimitri Fontaine wrote:
> Hi,
>
> Le lundi 04 février 2008, Jignesh K. Shah a écrit :
>> Single stream loader of PostgreSQL takes hours to load data. (Single
>> stream load... wasting all the extra cores out there)
>
> I wanted to work on this at the pgloader level, so the CVS version of
> pgloader is now able to load data in parallel, with a python thread per
> configured section (1 section = 1 data file = 1 table is often the case).
> It's not configurable at the moment, but I plan on providing a "threads"
> knob which will default to 1, and could be -1 for as many threads as
> sections.

That sounds great. I was just thinking of asking for that :-)

I'll look at COPY FROM internals to make this faster. I'm looking at this now to refresh my memory; I already had some plans on the shelf.

>> Multiple table loads (1 per table) spawned via script is a bit better
>> but hits wal problems.
>
> pgloader will hit the WAL problem too, but it still may have its benefits,
> or at least we will soon (you already can if you take it from CVS) be able
> to measure if the parallel loading at the client side is a good idea
> perf-wise.

Should be able to reduce lock contention, but not overall WAL volume.

> [...]
>> I have not even started Partitioning of tables yet since with the
>> current framework, you have to load the tables separately into each
>> table, which means for the TPC-H data you need extra logic to take that
>> table data and split it into each partition child table. Not stuff that
>> many people want to do by hand.
>
> I'm planning to add ddl-partitioning support to pgloader:
>   http://archives.postgresql.org/pgsql-hackers/2007-12/msg00460.php
>
> The basic idea is for pgloader to ask PostgreSQL about
> constraint_exclusion, pg_inherits and pg_constraint, and if pgloader
> recognizes both the CHECK expression and the datatypes involved, and if
> we can implement the CHECK in python without having to resort to querying
> PostgreSQL, then we can run a thread per partition, with as many COPY
> FROM running in parallel as there are partitions involved (when
> threads = -1).
>
> I'm not sure this will be quicker than relying on PostgreSQL triggers or
> rules as used for partitioning currently, but ISTM the paragraph Jignesh
> quoted is just about that.

Much better than triggers and rules, but it will be hard to get it to work.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [PERFORM] Benchmark Data requested
Simon Riggs wrote:
> On Tue, 2008-02-05 at 14:43 +0000, Richard Huxton wrote:
>> Simon Riggs wrote:
>>> On Tue, 2008-02-05 at 15:06 +0100, Dimitri Fontaine wrote:
>>>> Le lundi 04 février 2008, Jignesh K. Shah a écrit :
>>>>> Multiple table loads (1 per table) spawned via script is a bit better
>>>>> but hits wal problems.
>>>> pgloader will hit the WAL problem too, but it still may have its
>>>> benefits, or at least we will soon (you already can if you take it
>>>> from CVS) be able to measure if the parallel loading at the client
>>>> side is a good idea perf-wise.
>>> Should be able to reduce lock contention, but not overall WAL volume.
>> In the case of a bulk upload to an empty table (or partition?) could you
>> not optimise the WAL away? That is, shouldn't the WAL basically be a
>> simple transformation of the on-disk blocks? You'd have to explicitly
>> sync the file(s) for the table/indexes of course, and you'd need some
>> work-around for WAL shipping, but it might be worth it for you chaps
>> with large imports.
> Only by locking the table, which serializes access, which then slows you
> down or at least restricts other options. Plus if you use pg_loader then
> you'll find only the first few rows optimized and all the rest not.

Hmm - the table-locking requirement is true enough, but why would pg_loader cause problems after the first few rows?

-- 
Richard Huxton
Archonet Ltd
Re: [PERFORM] Benchmark Data requested
On Tue, 2008-02-05 at 15:05 +0000, Richard Huxton wrote:
>> Only by locking the table, which serializes access, which then slows you
>> down or at least restricts other options. Plus if you use pg_loader then
>> you'll find only the first few rows optimized and all the rest not.
>
> Hmm - the table-locking requirement is true enough, but why would
> pg_loader cause problems after the first few rows?

It runs a stream of COPY statements, so only the first would be optimized with the empty-table optimization.

-- 
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [PERFORM] Benchmark Data requested
On Tue, 5 Feb 2008, Simon Riggs wrote:
>> In the case of a bulk upload to an empty table (or partition?) could you
>> not optimise the WAL away? That is, shouldn't the WAL basically be a
>> simple transformation of the on-disk blocks? You'd have to explicitly
>> sync the file(s) for the table/indexes of course, and you'd need some
>> work-around for WAL shipping, but it might be worth it for you chaps
>> with large imports.
>
> Only by locking the table, which serializes access, which then slows you
> down or at least restricts other options. Plus if you use pg_loader then
> you'll find only the first few rows optimized and all the rest not.

Why would you need to lock the table?

Matthew

-- 
Picard: I was just paid a visit from Q.
Riker:  Q! Any idea what he's up to?
Picard: No. He said he wanted to be nice to me.
Riker:  I'll alert the crew.
Re: [PERFORM] Benchmark Data requested
Simon Riggs wrote:
> On Tue, 2008-02-05 at 15:05 +0000, Richard Huxton wrote:
>>> Only by locking the table, which serializes access, which then slows
>>> you down or at least restricts other options. Plus if you use pg_loader
>>> then you'll find only the first few rows optimized and all the rest
>>> not.
>> Hmm - the table-locking requirement is true enough, but why would
>> pg_loader cause problems after the first few rows?
>
> It runs a stream of COPY statements, so only the first would be optimized
> with the empty table optimization.

Ah, if you're allowing multiple commands during the process I can see how it could get fiddly.

-- 
Richard Huxton
Archonet Ltd
Re: [PERFORM] Benchmark Data requested
Re: [PERFORM] Benchmark Data requested
Simon Riggs wrote:
> On Tue, 2008-02-05 at 15:06 +0100, Dimitri Fontaine wrote:
>> Le lundi 04 février 2008, Jignesh K. Shah a écrit :
>>> Multiple table loads (1 per table) spawned via script is a bit better
>>> but hits wal problems.
>> pgloader will hit the WAL problem too, but it still may have its
>> benefits, or at least we will soon (you already can if you take it from
>> CVS) be able to measure if the parallel loading at the client side is a
>> good idea perf-wise.
>
> Should be able to reduce lock contention, but not overall WAL volume.

In the case of a bulk upload to an empty table (or partition?) could you not optimise the WAL away? That is, shouldn't the WAL basically be a simple transformation of the on-disk blocks? You'd have to explicitly sync the file(s) for the table/indexes of course, and you'd need some work-around for WAL shipping, but it might be worth it for you chaps with large imports.

-- 
Richard Huxton
Archonet Ltd
Re: [PERFORM] Performance issue using Tsearch2
2008/2/5, Ansgar -59cobalt- Wiechers wrote:
> Your planner estimates are way off. Try increasing the statistics target
> for the columns used in this query and re-analyze the tables after doing
> so.

I first set STATISTICS to 1000 for column zoneindex_test and saw no significant improvement (with a VACUUM FULL ANALYZE in between). Then I set default_statistics_target to 1000: there is now an improvement, but the overall time is still way too long... (and the estimated costs didn't change...)

Here are the results with default_statistics_target set to 1000:

explain analyze
  SELECT idstruct, headline(zonetext, q), rank(zoneindex_test, q)
  FROM tab_ocr, tab_chemin, to_tsquery('partir') AS q
  WHERE tab_chemin.chemin like '%;2;%'
    AND tab_ocr.idstruct = tab_chemin.label
    AND zoneindex_test @@ q
  ORDER BY rank(zoneindex_test, q) DESC;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=2345.30..2345.32 rows=8 width=327) (actual time=229913.715..229918.172 rows=7106 loops=1)
   Sort Key: rank(tab_ocr.zoneindex_test, q.q)
   ->  Nested Loop  (cost=80.04..2345.18 rows=8 width=327) (actual time=28159.626..229892.957 rows=7106 loops=1)
         ->  Nested Loop  (cost=80.04..1465.76 rows=392 width=327) (actual time=26084.558..130979.395 rows=272414 loops=1)
               ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.006..0.007 rows=1 loops=1)
               ->  Bitmap Heap Scan on tab_ocr  (cost=80.04..1460.85 rows=392 width=295) (actual time=26084.544..130562.220 rows=272414 loops=1)
                     Filter: (tab_ocr.zoneindex_test @@ q.q)
                     ->  Bitmap Index Scan on zoneindex_test_idx  (cost=0.00..79.94 rows=392 width=0) (actual time=26073.315..26073.315 rows=283606 loops=1)
                           Index Cond: (tab_ocr.zoneindex_test @@ q.q)
         ->  Index Scan using tab_chemin_label_index on tab_chemin  (cost=0.00..2.23 rows=1 width=4) (actual time=0.040..0.040 rows=0 loops=272414)
               Index Cond: (tab_ocr.idstruct = tab_chemin.label)
               Filter: ((chemin)::text ~~ '%;2;%'::text)
 Total runtime: 229922.864 ms

2008/2/5, Guillaume Smet wrote:
> Can you try to replace zoneindex_test @@ q with zoneindex_test @@
> to_tsquery('partir')?

The improvement seems negligible (with default_statistics_target back to 10, its default value):

explain analyze
  SELECT idstruct, headline(zonetext, q), rank(zoneindex_test, q)
  FROM tab_ocr, tab_chemin, to_tsquery('partir') AS q
  WHERE tab_chemin.chemin like '%;2;%'
    AND tab_ocr.idstruct = tab_chemin.label
    AND zoneindex_test @@ to_tsquery('partir')
  ORDER BY rank(zoneindex_test, q) DESC;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=4358.91..4358.95 rows=16 width=308) (actual time=266489.667..266494.132 rows=7106 loops=1)
   Sort Key: rank(tab_ocr.zoneindex_test, q.q)
   ->  Nested Loop  (cost=80.04..4358.59 rows=16 width=308) (actual time=42245.881..266469.644 rows=7106 loops=1)
         ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.007..0.008 rows=1 loops=1)
         ->  Nested Loop  (cost=80.04..4358.34 rows=16 width=276) (actual time=42239.570..178496.761 rows=7106 loops=1)
               ->  Bitmap Heap Scan on tab_ocr  (cost=80.04..1461.83 rows=392 width=276) (actual time=38317.423..174188.779 rows=272414 loops=1)
                     Filter: (zoneindex_test @@ '''partir'''::tsquery)
                     ->  Bitmap Index Scan on zoneindex_test_idx  (cost=0.00..79.94 rows=392 width=0) (actual time=38289.289..38289.289 rows=283606 loops=1)
                           Index Cond: (zoneindex_test @@ '''partir'''::tsquery)
               ->  Index Scan using tab_chemin_label_index on tab_chemin  (cost=0.00..7.38 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=272414)
                     Index Cond: (tab_ocr.idstruct = tab_chemin.label)
                     Filter: ((chemin)::text ~~ '%;2;%'::text)
 Total runtime: 266498.704 ms

> Increasing the statistics for zoneindex_test may be a good idea too (see
> ALTER TABLE ... ALTER COLUMN doc).

I posted the results above.

> I'm surprised you have the word partir in so many documents? Do you use
> real data?

I'm using real data. The indexed documents are extracted from newspapers, and "partir" (and its derivatives) is quite a common verb in the French language, so I'm not that surprised to see it show up in many documents.
Re: [PERFORM] Benchmark Data requested
Apologies for the blank email - mailer problems. I lost all my nicely typed stuff, too.

On Tue, 5 Feb 2008, Dimitri Fontaine wrote:
>> Multiple table loads (1 per table) spawned via script is a bit better
>> but hits wal problems.
>
> pgloader will hit the WAL problem too, but it still may have its benefits,
> or at least we will soon (you already can if you take it from CVS) be able
> to measure if the parallel loading at the client side is a good idea
> perf-wise.

You'll have to be careful here. Depending on the filesystem, writing large amounts of data to two files simultaneously can result in the blocks being interleaved to some degree on the disc, which can cause performance problems later on.

As for the WAL, I have a suggestion, but I'm aware that I don't know how PG actually does it, so you'll have to tell me if it is valid.

My impression is that the WAL is used to store writes in a transactional manner, for writes that can't be written transactionally directly to the data files. Hence the suggestion for restoring database dumps to run the whole restore in one transaction, which means that the table creation is in the same transaction as loading the data into it. Since the table is not visible to other backends, the writes to it do not need to go to the WAL, and PG is clever enough to do this.

My suggestion is to extend that slightly. If there is a large chunk of data to be written to a table, which will go entirely to empty pages or be appended to the end of the data file, then there is no risk of corruption of existing data, and that write could be made directly to the table. You would have to write a WAL entry reserving the space in the data file, and then write the data to the file. Then when that WAL entry is checkpointed, no work would be required. This would improve the performance of database restores and large writes which expand the table's data file.

So, would it work?

Matthew

-- 
If pro is the opposite of con, what is the opposite of progress?
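The existing WAL-skip case described above can already be exercised today. A sketch only (table definition and file path are hypothetical), and it only applies when WAL archiving is not in use:

```sql
-- When a table is created (or truncated) in the same transaction that
-- loads it, PostgreSQL can skip WAL-logging the COPY and instead fsync
-- the table file at commit. Not valid with WAL archiving/PITR enabled.
BEGIN;
CREATE TABLE lineitem_load (
    l_orderkey  bigint,
    l_quantity  numeric,
    l_shipdate  date
);
COPY lineitem_load FROM '/tmp/lineitem.tbl' WITH DELIMITER '|';
COMMIT;
```

This is the narrow version of Matthew's proposal: the new table is invisible to other backends, so no locking question arises; his suggestion is about extending the same idea to appends on existing, visible tables.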
Re: [PERFORM] Benchmark Data requested
Matthew wrote:
> On Tue, 5 Feb 2008, Richard Huxton wrote:
>>> Why would you need to lock the table?
>>
>> Because you're not really writing the WAL, which means you can't let
>> anyone else get their data into any of the blocks you are writing into.
>> You'd basically want to write the disk blocks then attach them in some
>> way.
>
> So what's wrong with reserving the space using the WAL? Then everyone
> else will know. After all, when you write the data to the WAL, you must
> have an idea of where it is meant to end up. My suggestion is that you go
> through all the motions of writing the data to the WAL, just without the
> data bit.

Well, now you're looking at page-level locking for the data blocks, or at least something very similar. Not sure what you'd do with indexes though - I don't see a simple way of avoiding a large lock on a btree index.

If you reserved the space in advance that could work. But you don't know how much to reserve until you've copied it in.

You could of course have a set of co-operating processes all bulk-loading while maintaining a table-lock outside of those. It feels like things are getting complicated then, though.

-- 
Richard Huxton
Archonet Ltd
Re: [PERFORM] Benchmark Data requested
One of the problems with the empty-table optimization is that if there are indexes created, the table is no longer considered empty.

Commercial databases have options like an IRRECOVERABLE clause along with DISK PARTITIONS and CPU PARTITIONS for their bulk loaders: one option turns off logging; disk partitions create multiple processes to read various lines/blocks from the input file, and others to flush the bufferpools to disk; and CPU partitions process the various blocks/lines read, checking their format and putting the rows in the bufferpool if successful.

Regards,
Jignesh

Simon Riggs wrote:
> On Tue, 2008-02-05 at 15:05 +0000, Richard Huxton wrote:
>>> Only by locking the table, which serializes access, which then slows
>>> you down or at least restricts other options. Plus if you use pg_loader
>>> then you'll find only the first few rows optimized and all the rest
>>> not.
>> Hmm - the table-locking requirement is true enough, but why would
>> pg_loader cause problems after the first few rows?
>
> It runs a stream of COPY statements, so only the first would be optimized
> with the empty table optimization.
Re: [PERFORM] Benchmark Data requested
On Tue, 5 Feb 2008, Richard Huxton wrote:
>> So what's wrong with reserving the space using the WAL? Then everyone
>> else will know. After all, when you write the data to the WAL, you must
>> have an idea of where it is meant to end up. My suggestion is that you
>> go through all the motions of writing the data to the WAL, just without
>> the data bit.
>
> Well, now you're looking at page-level locking for the data blocks, or at
> least something very similar. Not sure what you'd do with indexes though
> - I don't see a simple way of avoiding a large lock on a btree index.

Yeah, indexes would be a lot more difficult I guess, if writes to them involve changing lots of stuff around. We do most of our loads without the indexes present though.

> If you reserved the space in advance that could work. But you don't know
> how much to reserve until you've copied it in.

What does the WAL do? When do you allocate space in the file for written rows - when you write the WAL, or when you checkpoint it? If it's when you write the WAL, then you could just use the same algorithm.

> You could of course have a set of co-operating processes all bulk-loading
> while maintaining a table-lock outside of those. It feels like things are
> getting complicated then, though.

That does sound a bit evil. You could have different backends, each running a single transaction where they create one table and load the data for it. That wouldn't need any change to the backend, but it would only work for dump restores, and would require the client to be clever. I'm all for allowing this kind of optimisation while writing normally to the database, and for not requiring the client to think too hard.

Matthew

-- 
All of this sounds mildly turgid and messy and confusing... but what the
heck. That's what programming's all about, really
-- Computer Science Lecturer
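The "load without the indexes present" pattern mentioned above can be sketched as follows (table, index and file names are hypothetical stand-ins):

```sql
-- Drop secondary indexes before a large COPY, then rebuild them in one
-- pass afterwards; a single full-table index build is usually far cheaper
-- than maintaining the btree row by row during the load.
DROP INDEX IF EXISTS orders_custkey_idx;

COPY orders FROM '/tmp/orders.tbl' WITH DELIMITER '|';

CREATE INDEX orders_custkey_idx ON orders (o_custkey);
ANALYZE orders;  -- refresh planner statistics after the bulk change
```

It also neatly sidesteps the btree-locking problem discussed in this subthread, since there is no index to keep consistent while the rows stream in.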
Re: [PERFORM] Benchmark Data requested
Matthew wrote:
> On Tue, 5 Feb 2008, Simon Riggs wrote:
>>> In the case of a bulk upload to an empty table (or partition?) could
>>> you not optimise the WAL away? That is, shouldn't the WAL basically be
>>> a simple transformation of the on-disk blocks? You'd have to explicitly
>>> sync the file(s) for the table/indexes of course, and you'd need some
>>> work-around for WAL shipping, but it might be worth it for you chaps
>>> with large imports.
>>
>> Only by locking the table, which serializes access, which then slows you
>> down or at least restricts other options. Plus if you use pg_loader then
>> you'll find only the first few rows optimized and all the rest not.
>
> Why would you need to lock the table?

Because you're not really writing the WAL, which means you can't let anyone else get their data into any of the blocks you are writing into. You'd basically want to write the disk blocks then attach them in some way.

-- 
Richard Huxton
Archonet Ltd
Re: [PERFORM] Benchmark Data requested
On Tue, 5 Feb 2008, Richard Huxton wrote:
>> Why would you need to lock the table?
>
> Because you're not really writing the WAL, which means you can't let
> anyone else get their data into any of the blocks you are writing into.
> You'd basically want to write the disk blocks then attach them in some
> way.

So what's wrong with reserving the space using the WAL? Then everyone else will know. After all, when you write the data to the WAL, you must have an idea of where it is meant to end up. My suggestion is that you go through all the motions of writing the data to the WAL, just without the data bit.

Matthew

-- 
Failure is not an option. It comes bundled with your Microsoft product.
-- Ferenc Mantfeld
Re: [PERFORM] Benchmark Data requested
On Tuesday 05 February 2008, Simon Riggs wrote: It runs a stream of COPY statements, so only the first would be optimized with the empty-table optimization. The number of rows per COPY statement is configurable, so provided you have an estimate of the volume to import (wc -l), you could tune this number to shorten the stream (down to a single COPY, maybe)... But each COPY batch basically has to be kept in memory (and we're talking about high volumes here), and for error processing you'd want it not that huge after all... -- dim
Re: [PERFORM] Benchmark Data requested
On Tuesday 05 February 2008, Simon Riggs wrote: I'll look at COPY FROM internals to make this faster. I'm looking at this now to refresh my memory; I already had some plans on the shelf. Maybe stealing some ideas from pg_bulkload could somewhat help here? http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf IIRC it's mainly about how to optimize index updating while loading data, and I've heard complaints along the lines that this external tool has to know too much about PostgreSQL internals to be trustworthy as non-core code... so... The basic idea is for pgloader to ask PostgreSQL about constraint_exclusion, pg_inherits and pg_constraint, and if pgloader recognizes both the CHECK expression and the datatypes involved, and if we can implement the CHECK in Python without having to resort to querying PostgreSQL, then we can run a thread per partition, with as many COPY FROM commands running in parallel as there are partitions involved (when threads = -1). I'm not sure this will be quicker than relying on the PostgreSQL triggers or rules currently used for partitioning, but ISTM the paragraph Jignesh quoted is just about that. Much better than triggers and rules, but it will be hard to get it to work. Well, I'm thinking about providing a somewhat modular approach where the pgloader code is able to recognize CHECK constraints, load a module registered for the operator and data types, then use it. The modules and their registration would be handled at the configuration level; I'll provide some defaults and users will be able to add their own code, the same way on-the-fly reformat modules are handled now. This means that I'll (hopefully) be able to provide the basic cases quickly (CHECK on dates between x and y, numeric ranges, etc.), and users will be able to take care of more complex setups. 
When the constraint doesn't match any configured pgloader exclusion module, the trigger/rule code will get used (COPY will go to the main table), and if the Python CHECK implementation is wrong (worst case) PostgreSQL will reject the data and pgloader will fill your reject data and log files. And you're back to debugging your Python CHECK implementation... All of this is only a braindump as of now, and maybe quite an optimistic one... but barring any 'I know this can't work' objection, that's what I'm going to try to implement for the next pgloader version. Thanks for comments, input is really appreciated ! -- dim
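[To make the braindump above concrete, here is a hypothetical sketch of the routing step. All names here (the partitions, the check lambdas) are invented for illustration; in the real design a module would be registered per operator and datatype as described:]

```python
# Hypothetical sketch of the pgloader idea above: if a partition's CHECK
# constraint can be evaluated in Python, rows can be routed to one COPY
# stream per partition, and anything unrecognized falls back to the
# trigger/rule path against the main table.
from collections import defaultdict

# Invented partition layout: monthly date-range CHECK constraints,
# re-implemented as Python predicates (ISO date strings compare correctly).
partitions = {
    "t_2008_01": lambda row: "2008-01-01" <= row["d"] < "2008-02-01",
    "t_2008_02": lambda row: "2008-02-01" <= row["d"] < "2008-03-01",
}

def route(rows):
    batches = defaultdict(list)   # one batch per partition -> one COPY thread
    rejects = []                  # no matching CHECK -> main-table COPY path
    for row in rows:
        for part, check in partitions.items():
            if check(row):
                batches[part].append(row)
                break
        else:
            rejects.append(row)
    return batches, rejects

batches, rejects = route([{"d": "2008-01-15"}, {"d": "2008-02-03"}])
```

[Each batch would then be fed to its own COPY FROM in a separate thread; if the Python predicate is wrong, PostgreSQL's real CHECK still rejects the row, as noted above.]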
Re: [PERFORM] Benchmark Data requested
On Tue, 2008-02-05 at 18:15 +0100, Dimitri Fontaine wrote: On Tuesday 05 February 2008, Simon Riggs wrote: I'll look at COPY FROM internals to make this faster. I'm looking at this now to refresh my memory; I already had some plans on the shelf. Maybe stealing some ideas from pg_bulkload could somewhat help here? http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf IIRC it's mainly about how to optimize index updating while loading data, and I've heard complaints along the lines that this external tool has to know too much about PostgreSQL internals to be trustworthy as non-core code... so... Yeh, the batch index updates are a cool feature. Should be able to do that internally also. Not going to try the no-WAL route again though. If we can get it running efficiently and in parallel, then that will be OK. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [PERFORM] Benchmark Data requested
On Tue, 2008-02-05 at 13:47 -0500, Jignesh K. Shah wrote: That sounds cool to me too.. How much work is it to make pg_bulkload work on 8.3? An integrated version is certainly more beneficial. Especially, I think it will also help for other setups like TPC-E too where this is a problem. If you don't write WAL then you can lose all your writes in a crash. That issue is surmountable on a table with no indexes, or even conceivably with one monotonically ascending index. With other indexes, if we crash then we have a likely-corrupt index. For most production systems I'm aware of, losing an index on a huge table is not anything you'd want to trade for performance. Assuming you've ever been knee-deep in it on a real server. Maybe we can have a load mode for a table where we skip writing any WAL, but if we crash we just truncate the whole table to nothing? Issue a WARNING if we enable this mode while there is any data in the table. I'm nervous of it, but maybe people really want it? I don't really want to invent ext2 all over again, so that we have to run an fsck on a table if we crash while loading. My concern is that many people would choose that and then blame us for delivering unreliable software. e.g. the direct path loader on Oracle used to corrupt a PK index if you loaded duplicate rows with it (whether it still does I couldn't care). That kind of behaviour is simply incompatible with production usage, even if it does produce good benchmark numbers. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [PERFORM] Benchmark Data requested
Dimitri Fontaine wrote: On Tuesday 05 February 2008, Simon Riggs wrote: I'll look at COPY FROM internals to make this faster. I'm looking at this now to refresh my memory; I already had some plans on the shelf. Maybe stealing some ideas from pg_bulkload could somewhat help here? http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf IIRC it's mainly about how to optimize index updating while loading data, and I've heard complaints along the lines that this external tool has to know too much about PostgreSQL internals to be trustworthy as non-core code... so... I've been thinking of looking into that as well. The basic trick pg_bulkload is using is to populate the index as the data is being loaded. There's no fundamental reason why we couldn't do that internally in COPY. Triggers or constraints that access the table being loaded would make it impossible, but we should be able to detect that and fall back to what we have now. What I'm basically thinking about is to modify the indexam API of building a new index, so that COPY would feed the tuples to the indexam, instead of the indexam opening and scanning the heap. The b-tree indexam would spool the tuples into a tuplesort as the COPY progresses, and build the index from that at the end as usual. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
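[The shape of that change can be illustrated with a small sketch. This is plain Python, not the indexam API itself: COPY hands each tuple's key to the index build as it goes, and a single sort at the end stands in for the tuplesort, replacing the separate heap scan a normal CREATE INDEX would do:]

```python
# Rough illustration of the idea above: instead of scanning the heap
# after the load, COPY feeds each tuple's key to the index build, which
# spools (key, tid) pairs and sorts once at the end -- roughly what a
# tuplesort-backed b-tree build does internally.
def copy_with_index_spool(rows, key):
    heap = []    # stands in for the table's heap
    spool = []   # stands in for the tuplesort spool
    for tid, row in enumerate(rows):
        heap.append(row)                 # heap insertion, as COPY does now
        spool.append((key(row), tid))    # fed to the indexam as we go
    spool.sort()                         # one sort -> b-tree leaf order
    return heap, spool

heap, index = copy_with_index_spool(
    [{"id": 3}, {"id": 1}, {"id": 2}], key=lambda r: r["id"])
```

[The win is that the heap is read once (as it streams in) rather than written and then re-scanned per index.]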
Re: [PERFORM] Benchmark Data requested
Hi Heikki, Is there a way such an operation can be spawned as a worker process? Generally such loading is done during off-peak hours, when I expect additional CPU resources to be available. By delegating such additional work to worker processes, we should be able to capitalize on additional cores in the system. Even with a single core, the mere fact that the loading process will eventually wait on reads from the input file (which cannot be non-blocking) means the OS can timeslice it well, letting the second process use those wait times for the index population work. What do you think? Regards, Jignesh Heikki Linnakangas wrote: What I'm basically thinking about is to modify the indexam API of building a new index, so that COPY would feed the tuples to the indexam, instead of the indexam opening and scanning the heap. The b-tree indexam would spool the tuples into a tuplesort as the COPY progresses, and build the index from that at the end as usual.
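[Jignesh's overlap idea can be sketched as an ordinary producer/consumer pair. This is purely illustrative (the backend is not structured like this): the loader thread blocks on input reads while a worker thread absorbs the index-spool work during those waits:]

```python
# Sketch of the suggestion above: while the loader blocks reading the
# input file, a second thread uses the otherwise-idle CPU time to
# maintain the index spool.
import queue
import threading

def load(lines, index_spool):
    q = queue.Queue(maxsize=1000)

    def index_worker():
        # Consumes (value, lineno) items and does the index-side work.
        while True:
            item = q.get()
            if item is None:          # sentinel: load finished
                break
            index_spool.append(item)

    t = threading.Thread(target=index_worker)
    t.start()
    for lineno, line in enumerate(lines):  # stands in for blocking file reads
        q.put((line.strip(), lineno))
    q.put(None)
    t.join()

spool = []
load(["3\n", "1\n", "2\n"], spool)
```

[On a single core this only wins to the extent the loader really is I/O-bound, which is Jignesh's point; on multiple cores the two threads can run concurrently.]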
Re: [PERFORM] Benchmark Data requested
Commercial DB bulk loaders work the same way: they give you a fast-loader option provided that, in case of error, the whole table is truncated. I think this also has real-life advantages where PostgreSQL is used for datamarts which are recreated every now and then from other systems and which want fast loaders. So it's not just benchmarking folks like me who will take advantage of such features. In fact I have seen loaders that force a REPLACE TABLE clause, which will truncate the table before loading, so there is no confusion about what happens to the original data in the table, and only then do they avoid the logs. To be honest, it's not the WAL writes to the disk that I am worried about. According to my tests, async_commit comes pretty close to fsync=off and solves the WALWriteLock contention. We should maybe just focus on making it more efficient, which I think also involves WALInsertLock, which may not be entirely efficient. Also, all changes have to be add-on options and not a replacement for existing loads; I totally agree with that point. The guys in production support don't even like optimizer query plan changes, forget a corrupt index. (I have spent two days in a previous role trying to figure out why a particular query plan on another database changed in production.) Simon Riggs wrote: If you don't write WAL then you can lose all your writes in a crash. With other indexes, if we crash then we have a likely-corrupt index. For most production systems I'm aware of, losing an index on a huge table is not anything you'd want to trade for performance.
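[For reference, the two settings being contrasted here look like this in postgresql.conf on 8.3; the comments paraphrase the discussion above, not the official docs:]

```ini
# postgresql.conf (8.3) -- the two settings being compared above
synchronous_commit = off   # "async commit": a crash may lose the last few
                           # commits, but never corrupts data; relieves
                           # WALWriteLock waits at commit time
#fsync = off               # far riskier: a crash can corrupt the cluster;
                           # only defensible for throwaway bulk loads
```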
Re: [PERFORM] Benchmark Data requested
Jignesh K. Shah wrote: Is there a way such an operation can be spawned as a worker process? Generally during such loading - which most people will do during offpeak hours - I expect additional CPU resources available. By delegating such additional work to worker processes, we should be able to capitalize on additional cores in the system. Hmm. You do need access to shared memory, locks, catalogs, and to run functions etc, so I don't think it's significantly easier than using multiple cores for COPY itself. Even if it is a single core, the mere fact that the loading process will eventually wait for a read from the input file which cannot be non-blocking, the OS can timeslice it well for the second process to use those wait times for the index population work. That's an interesting point. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Benchmark Data requested
On Tue, 2008-02-05 at 15:50 -0500, Jignesh K. Shah wrote: Is there a way such an operation can be spawned as a worker process? Generally during such loading - which most people will do during offpeak hours - I expect additional CPU resources available. By delegating such additional work to worker processes, we should be able to capitalize on additional cores in the system. Even if it is a single core, the mere fact that the loading process will eventually wait for a read from the input file which cannot be non-blocking, the OS can timeslice it well for the second process to use those wait times for the index population work. If Dimitri is working on parallel load, why bother? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [PERFORM] Benchmark Data requested
On Tue, 5 Feb 2008, Simon Riggs wrote: On Tue, 2008-02-05 at 15:50 -0500, Jignesh K. Shah wrote: Even if it is a single core, the mere fact that the loading process will eventually wait for a read from the input file which cannot be non-blocking, the OS can timeslice it well for the second process to use those wait times for the index population work. If Dimitri is working on parallel load, why bother? pgloader is a great tool for a lot of things, particularly if there's any chance that some of your rows will get rejected. But the way things pass through the Python/psycopg layer made it uncompetitive (more than 50% slowdown) against the straight COPY path from a rows/second perspective the last time (V2.1.0?) I did what I thought was a fair test of it (the usual caveat: with the type of data I was loading). Maybe there's been some gigantic improvement since then, but it's hard to beat COPY when you've got an API layer or two in the middle. I suspect what will end up happening is that a parallel-loading pgloader will scale something like this: 1 CPU: Considerably slower than COPY 2-3 CPUs: Close to even with COPY 4+ CPUs: Faster than COPY Maybe I'm wrong, but I wouldn't abandon looking into another approach until that territory is mapped out a bit better. Given the very large number of dual-core systems out there now relative to those with more, optimizing the straight COPY path with any way to take advantage of even one more core for things like index building is well worth doing. Heikki's idea sounded good to me regardless, and if that can be separated out enough to get a second core into the index building at the same time, so much the better. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Benchmark Data requested
Hi, On Feb 6, 2008 9:05 AM, Greg Smith [EMAIL PROTECTED] wrote: pgloader is a great tool for a lot of things, particularly if there's any chance that some of your rows will get rejected. But the way things pass through the Python/psycopg layer made it uncompetitive (more than 50% slowdown) against the straight COPY path from a rows/second perspective the last time (V2.1.0?) I did what I thought was a fair test of it (the usual caveat: with the type of data I was loading). Maybe there's been some gigantic improvement since then, but it's hard to beat COPY when you've got an API layer or two in the middle. I think it's time now that we should jazz COPY up a bit to include all the discussed functionality. Heikki's batch-indexing idea is pretty useful too. Another thing that pg_bulkload does is directly load the tuples into the relation, by constructing the tuples and writing them directly to the physical file corresponding to the involved relation, bypassing the engine completely (of course the limitations that arise out of this include not supporting rules, triggers, constraints, default expression evaluation etc). ISTM we could optimize the COPY code to try to do direct loading too (not necessarily as done by pg_bulkload) to speed it up further in certain cases. Another thing that we should add to COPY is the ability to continue a data load across errors, as was discussed on hackers some time back. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com