Re: [PERFORM] Improving PostgreSQL insert performance
Vladimir Sitnikov wrote: > Alvaro>Something like > INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')>I > did not > Frits>try that, to be honest. > > pgjdbc does automatically rewrite insert values(); into insert ... > values(),(),(),() when reWriteBatchedInserts=true. I don't expect manual > multivalues to be noticeably faster there. Ahh, so that's what that option does :-) Nice to know -- great feature. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Improving PostgreSQL insert performance
Alvaro>Something like INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')>I did not Frits>try that, to be honest. pgjdbc does automatically rewrite insert values(); into insert ... values(),(),(),() when reWriteBatchedInserts=true. I don't expect manual multivalues to be noticeably faster there. Frits>https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests Do you really intend to measure just a single insert operation? It looks odd, as typical applications would execute inserts for quite a while before they terminate. You are including lots of warmup overheads (e.g. JIT-compilation), so your approach does not measure peak performance. On the other hand, you are not measuring enough time to catch things like "DB log switch". Would you please use JMH as a load driver? Here's an example: https://github.com/pgjdbc/pgjdbc/blob/master/ubenchmark/src/main/java/org/postgresql/benchmark/statement/InsertBatch.java Vladimir >
Re: [PERFORM] Improving PostgreSQL insert performance
Hi Alvaro, I did not try that, to be honest. I am using a single prepared statement so that the database needs to parse it only once. All executes then use the batched parameters. I will try this later on, but I wonder whether having to reparse the statement every time compared to one prepared statement would actually be faster. But thanks for the tip; I will take a look. Regards, Frits
Re: [PERFORM] Improving PostgreSQL insert performance
On 06/10/2017 07:32 PM, Alvaro Herrera wrote: > Frits Jalvingh wrote: > >> So, I am still very interested in getting normal inserts faster, because >> that will gain speed for all work.. If Oracle can do it, and Postgres is >> able to insert fast with copy- where lies the bottleneck with the insert >> command? There seems to be quite a performance hit with the JDBC driver >> itself (as the stored procedure is a lot faster), so I can look into that. >> But even after that there is quite a gap.. > > Did you try inserting multiple tuples in one command? Something like > INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2') > It's supposed to be faster than single-row inserts, though I don't > know by how much. When I did the testing of the patch originally I saw significant improvements, e.g. 8x in early versions. The thread is here: https://www.postgresql.org/message-id/flat/44C4451A.4010906%40joeconway.com#44c4451a.4010...@joeconway.com Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Re: [PERFORM] Improving PostgreSQL insert performance
Frits Jalvingh wrote: > So, I am still very interested in getting normal inserts faster, because > that will gain speed for all work.. If Oracle can do it, and Postgres is > able to insert fast with copy- where lies the bottleneck with the insert > command? There seems to be quite a performance hit with the JDBC driver > itself (as the stored procedure is a lot faster), so I can look into that. > But even after that there is quite a gap.. Did you try inserting multiple tuples in one command? Something like INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2') It's supposed to be faster than single-row inserts, though I don't know by how much. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
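For readers who want to try Alvaro's suggestion from JDBC, here is a small sketch of building such a multi-row statement. The table and column names come from the thread's test schema; the helper itself is illustrative, not code from the thread.

```java
// Sketch (not code from the thread) of building a multi-row
// INSERT ... VALUES (?,?), (?,?), ... with JDBC placeholders.
public class MultiRowInsert {
    static String buildInsertSql(String table, String[] columns, int rows) {
        // One "(?,...,?)" group per row.
        String group = "(" + "?,".repeat(columns.length - 1) + "?)";
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table)
                .append(" (").append(String.join(", ", columns)).append(") VALUES ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) sql.append(", ");
            sql.append(group);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // -> INSERT INTO h_test (id_h, boekingdetailid) VALUES (?,?), (?,?), (?,?)
        System.out.println(buildInsertSql("h_test",
                new String[] {"id_h", "boekingdetailid"}, 3));
    }
}
```

The PreparedStatement parameters are then bound left to right, row by row. Note that, as discussed later in the thread, pgjdbc's reWriteBatchedInserts=true performs effectively this rewrite automatically for batched single-row inserts.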
Re: [PERFORM] Improving PostgreSQL insert performance
I think binary is worse.. according to the postgres documentation: The binary format option causes all data to be stored/read as binary format rather than as text. It is somewhat faster than the text and CSV formats, but a binary-format file is less portable across machine architectures and PostgreSQL versions. Also, the binary format is very data type specific; for example it will not work to output binary data from a smallint column and read it into an integer column, even though that would work fine in text format. By itself it is similar in badness as both require completely different semantics than insert.. On Sat, 10 Jun 2017 at 22:12, Nicolas Paris wrote: > > I tried the copy command, and that indeed works quite brilliantly: > > Inserted 2400 rows in 22004 milliseconds, 1090710.7798582076 rows per > > second > > > > That's faster than Oracle. But with a very bad interface I have to say > for > > normal database work.. I will try to make this work in the tooling, but > it > > needs some very special code to format all possible values properly, and > to > > manage the end of the copy, so it is not usable in general which is a > pity, I > > think. > > Have you thought about the COPY with binary format ? That looks more > robust than the text format you used in your benchmarks. >
Re: [PERFORM] Improving PostgreSQL insert performance
> I tried the copy command, and that indeed works quite brilliantly: > Inserted 2400 rows in 22004 milliseconds, 1090710.7798582076 rows per > second > > That's faster than Oracle. But with a very bad interface I have to say for > normal database work.. I will try to make this work in the tooling, but it > needs some very special code to format all possible values properly, and to > manage the end of the copy, so it is not usable in general which is a pity, I > think. Have you thought about the COPY with binary format ? That looks more robust than the text format you used in your benchmarks.
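For reference, the COPY binary layout that the PostgreSQL documentation describes (an 11-byte signature, a 32-bit flags word, a 32-bit header-extension length, then per tuple a 16-bit field count with length-prefixed field data, ended by a 16-bit -1 trailer) can be sketched as below. A real loader would stream these bytes through pgjdbc's CopyManager, which is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of the COPY BINARY file layout per the PostgreSQL COPY docs,
// encoding a single row with one int4 column.
public class CopyBinarySketch {
    static byte[] encodeSingleIntRow(int value) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf); // big-endian, as COPY requires
        out.writeBytes("PGCOPY\n");                         // signature, first 7 bytes
        out.write(new byte[] {(byte) 0xFF, '\r', '\n', 0}); // signature, last 4 bytes
        out.writeInt(0);    // flags field: no OIDs
        out.writeInt(0);    // header extension area length
        out.writeShort(1);  // this tuple has one field
        out.writeInt(4);    // field length: int4 is 4 bytes
        out.writeInt(value);
        out.writeShort(-1); // end-of-data trailer
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(encodeSingleIntRow(42).length + " bytes"); // 31 bytes
    }
}
```

This illustrates the quoted warning: the field bytes are the server's internal representation per type, so the writer must know each column's exact type.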
Re: [PERFORM] Improving PostgreSQL insert performance
On Sat, Jun 10, 2017 at 12:08 AM Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > Would you mind sharing the source code of your benchmark? > The source code for the several tests, plus the numbers collected so far, can be found at: https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests Regards, Frits
Re: [PERFORM] Improving PostgreSQL insert performance
Frits, Would you mind sharing the source code of your benchmark? >BTW: It seems you need a recent driver for this; I'm using postgresql-42.1.1.jar Technically speaking, reWriteBatchedInserts was introduced in 9.4.1209 (2016-07-15) Vladimir
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 9, 2017 at 6:04 AM, Frits Jalvingh wrote: > > I already changed the following config parameters: > work_mem 512MB > synchronous_commit off > Since you are already batching up commits into large chunks, this setting is not very useful, but does risk you losing supposedly-committed data upon a crash. I would not do it. > shared_buffers 512mb > You might try increasing wal_buffers, but the default for this size of shared_buffers is 16MB, which is usually big enough. One thing you are missing is max_wal_size. The default value of that is probably too small for what you are doing. But if you are not using COPY, then maybe none of this matters as the bottleneck will be elsewhere. Cheers, Jeff
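As a postgresql.conf sketch of what Jeff suggests checking; the values below are illustrative assumptions, not figures from the thread:

```
# Illustrative starting points only -- tune for the workload and disk.
# wal_buffers already defaults to 16MB once shared_buffers >= 512MB.
wal_buffers = 16MB
# The 9.6 default of 1GB can force frequent checkpoints during bulk loads.
max_wal_size = 8GB
```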
Re: [PERFORM] Improving PostgreSQL insert performance
The parameter was reWriteBatchedInserts = true Either added in the connection properties or in the connection URL like jdbc:postgresql://localhost:5432/datavault_12_tst?reWriteBatchedInserts=true BTW: It seems you need a recent driver for this; I'm using postgresql-42.1.1.jar On Fri, Jun 9, 2017 at 5:33 PM Kenneth Marshall wrote: > On Fri, Jun 09, 2017 at 03:22:35PM +, Frits Jalvingh wrote: > > Hi Babu, > > > > That was all already done, as it is common practice for JDBC. Your > > parameter was added to the code that already did all that - and worked > > brilliantly there ;) > > > Hi Frits, > > What was the parameter? I did not see an Email in the thread from Babu. > > Regards, > Ken >
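For completeness, a sketch of both ways of setting the parameter; no connection is opened here, and the URL is the thread's example:

```java
import java.util.Properties;

// Two equivalent ways of enabling pgjdbc's insert rewriting.
public class RewriteBatchedConfig {
    static Properties rewriteProps() {
        Properties props = new Properties();
        props.setProperty("reWriteBatchedInserts", "true");
        return props; // pass to DriverManager.getConnection(url, props)
    }

    static String rewriteUrl(String baseUrl) {
        // Or carry the parameter directly in the JDBC URL:
        return baseUrl + "?reWriteBatchedInserts=true";
    }

    public static void main(String[] args) {
        System.out.println(rewriteUrl("jdbc:postgresql://localhost:5432/datavault_12_tst"));
    }
}
```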
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 03:22:35PM +, Frits Jalvingh wrote: > Hi Babu, > > That was all already done, as it is common practice for JDBC. Your > parameter was added to the code that already did all that - and worked > brilliantly there ;) > Hi Frits, What was the parameter? I did not see an Email in the thread from Babu. Regards, Ken
Re: [PERFORM] Improving PostgreSQL insert performance
Hi Babu, That was all already done, as it is common practice for JDBC. Your parameter was added to the code that already did all that - and worked brilliantly there ;) >
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 9, 2017 at 9:12 AM, Frits Jalvingh wrote: > Hi John, > > Yes, I was aware and amazed by that ;) It is actually the fetch size in > combination with autocommit being on; that dies the sweet OOM death as soon > as the table gets big. > > But Postgres read performance, with autocommit off and fetch size around 64K, > is quite OK. But it's good to get this mentioned a lot, because as you said > you can spend quite some time wondering about this! No production db server should have the oom killer enabled.
Re: [PERFORM] Improving PostgreSQL insert performance
Hi John, Yes, I was aware and amazed by that ;) It is actually the fetch size in combination with autocommit being on; that dies the sweet OOM death as soon as the table gets big. But Postgres read performance, with autocommit off and fetch size around 64K, is quite OK. But it's good to get this mentioned a lot, because as you said you can spend quite some time wondering about this! On Fri, Jun 9, 2017 at 5:08 PM John Gorman <jgor...@eldocomp.com> wrote: > You need to be careful with the setFetchSize we have tables with over 10 > million rows and many columns and the PostgreSQL JDBC driver silently > fails, ignores the fetch size and tries to read the entire table content > into memory. I spent many agonizing days on this. > > > > ps.setFetchSize(65536); > > > > Regards > > John > > > > > > *From:* pgsql-performance-ow...@postgresql.org [mailto: > pgsql-performance-ow...@postgresql.org] *On Behalf Of *Frits Jalvingh > *Sent:* Friday, June 09, 2017 7:55 AM > *To:* Sunkara, Amrutha; pgsql-performance@postgresql.org > > > *Subject:* Re: [PERFORM] Improving PostgreSQL insert performance > > > > I am not doing anything special I guess. I am adding the results of the > tests and the programs I'm using to the following page: > > > > https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests > > > > The copy example, in Java, is at the end. All of the examples use trivial > data and the same data. If you find fault please let me know ;) But the > copy does insert the records as they can be seen ;) > > On Fri, Jun 9, 2017 at 4:47 PM Sunkara, Amrutha <amru...@nytimes.com> > wrote: > > Frits, > > > > When you use the copy command, are you doing anything special to get the > run time that you are indicating? > >
Re: [PERFORM] Improving PostgreSQL insert performance
You need to be careful with setFetchSize: we have tables with over 10 million rows and many columns and the PostgreSQL JDBC driver silently fails, ignores the fetch size and tries to read the entire table content into memory. I spent many agonizing days on this. ps.setFetchSize(65536); Regards John From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Frits Jalvingh Sent: Friday, June 09, 2017 7:55 AM To: Sunkara, Amrutha; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Improving PostgreSQL insert performance I am not doing anything special I guess. I am adding the results of the tests and the programs I'm using to the following page: https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests The copy example, in Java, is at the end. All of the examples use trivial data and the same data. If you find fault please let me know ;) But the copy does insert the records as they can be seen ;) On Fri, Jun 9, 2017 at 4:47 PM Sunkara, Amrutha <amru...@nytimes.com<mailto:amru...@nytimes.com>> wrote: Frits, When you use the copy command, are you doing anything special to get the run time that you are indicating?
Re: [PERFORM] Improving PostgreSQL insert performance
Hi Babu, No, I did not, and the effect is quite great: Inserted 100 rows in 2535 milliseconds, 394477.3175542406 rows per second Inserted 100 rows in 2553 milliseconds, 391696.0438699569 rows per second compared to (without your parameter): Inserted 100 rows in 7643 milliseconds, 130838.67591259976 rows per second That is quite an increase!! Thanks a lot for the tip!! For those keeping score: we're now at 77% of Oracle's performance- without copy ;) >
Re: [PERFORM] Improving PostgreSQL insert performance
Hi Kenneth, I tried unlogged before, but as long as the commit interval is long it had no discernible effect that I could see.
Re: [PERFORM] Improving PostgreSQL insert performance
I am not doing anything special I guess. I am adding the results of the tests and the programs I'm using to the following page: https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests The copy example, in Java, is at the end. All of the examples use trivial data and the same data. If you find fault please let me know ;) But the copy does insert the records as they can be seen ;) On Fri, Jun 9, 2017 at 4:47 PM Sunkara, Amrutha wrote: > Frits, > > When you use the copy command, are you doing anything special to get the > run time that you are indicating? >
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 02:39:37PM +, Frits Jalvingh wrote: > Hi all, > > Thanks a lot for the many responses! > > About preparing statements: this is done properly in Java, and pgsql does > it by itself. So that cannot be done better ;) > > I tried the copy command, and that indeed works quite brilliantly: > Inserted 2400 rows in 22004 milliseconds, 1090710.7798582076 rows per > second > > That's faster than Oracle. But with a very bad interface I have to say for > normal database work.. I will try to make this work in the tooling, but it > needs some very special code to format all possible values properly, and to > manage the end of the copy, so it is not usable in general which is a pity, > I think. > > So, I am still very interested in getting normal inserts faster, because > that will gain speed for all work.. If Oracle can do it, and Postgres is > able to insert fast with copy- where lies the bottleneck with the insert > command? There seems to be quite a performance hit with the JDBC driver > itself (as the stored procedure is a lot faster), so I can look into that. > But even after that there is quite a gap.. > > Regards, > > Frits Hi Frits, Have you looked at UNLOGGED tables and also having more than one insert stream running at a time. Sometimes multiple parallel inserts can be faster. Regards, Ken
Re: [PERFORM] Improving PostgreSQL insert performance
Frits, When you use the copy command, are you doing anything special to get the run time that you are indicating? On Fri, Jun 9, 2017 at 10:39 AM, Frits Jalvingh wrote: > Hi all, > > Thanks a lot for the many responses! > > About preparing statements: this is done properly in Java, and pgsql does > it by itself. So that cannot be done better ;) > > I tried the copy command, and that indeed works quite brilliantly: > Inserted 2400 rows in 22004 milliseconds, 1090710.7798582076 rows per > second > > That's faster than Oracle. But with a very bad interface I have to say for > normal database work.. I will try to make this work in the tooling, but it > needs some very special code to format all possible values properly, and to > manage the end of the copy, so it is not usable in general which is a pity, > I think. > > So, I am still very interested in getting normal inserts faster, because > that will gain speed for all work.. If Oracle can do it, and Postgres is > able to insert fast with copy- where lies the bottleneck with the insert > command? There seems to be quite a performance hit with the JDBC driver > itself (as the stored procedure is a lot faster), so I can look into that. > But even after that there is quite a gap.. > > Regards, > > Frits > > On Fri, Jun 9, 2017 at 4:33 PM Scott Marlowe > wrote: > >> On Fri, Jun 9, 2017 at 7:56 AM, Frits Jalvingh wrote: >> > Hi Kenneth, Andreas, >> > >> > Thanks for your tips! >> > >> > I increased shared_buffers to 8GB but it has no measurable effect at >> all. I >> > think that is logical: shared buffers are important for querying but >> not for >> > inserting; for that the speed to write to disk seems most important- no >> big >> > reason to cache the data if the commit requires a full write anyway. >> > I also changed the code to do only one commit; this also has no effect >> I can >> > see. >> > >> > It is true that Oracle had more memory assigned to it (1.5G), but unlike >> > Postgres (which is completely on a fast SSD) Oracle runs on slower disk >> > (ZFS).. >> > >> > I will try copy, but I first need to investigate how to use it- its >> > interface seems odd to say the least ;) I'll report back on that once >> done. >> >> If you want an example of copy, just pg_dump a table: >> >> pg_dump -d smarlowe -t test >> >> (SNIP) >> COPY test (a, b) FROM stdin; >> 1 abc >> 2 xyz >> \. >> (SNIP) >>
Re: [PERFORM] Improving PostgreSQL insert performance
Hi all, Thanks a lot for the many responses! About preparing statements: this is done properly in Java, and pgsql does it by itself. So that cannot be done better ;) I tried the copy command, and that indeed works quite brilliantly: Inserted 2400 rows in 22004 milliseconds, 1090710.7798582076 rows per second That's faster than Oracle. But with a very bad interface I have to say for normal database work.. I will try to make this work in the tooling, but it needs some very special code to format all possible values properly, and to manage the end of the copy, so it is not usable in general which is a pity, I think. So, I am still very interested in getting normal inserts faster, because that will gain speed for all work.. If Oracle can do it, and Postgres is able to insert fast with copy- where lies the bottleneck with the insert command? There seems to be quite a performance hit with the JDBC driver itself (as the stored procedure is a lot faster), so I can look into that. But even after that there is quite a gap.. Regards, Frits On Fri, Jun 9, 2017 at 4:33 PM Scott Marlowe wrote: > On Fri, Jun 9, 2017 at 7:56 AM, Frits Jalvingh wrote: > > Hi Kenneth, Andreas, > > > > Thanks for your tips! > > > > I increased shared_buffers to 8GB but it has no measurable effect at > all. I > > think that is logical: shared buffers are important for querying but not > for > > inserting; for that the speed to write to disk seems most important- no > big > > reason to cache the data if the commit requires a full write anyway. > > I also changed the code to do only one commit; this also has no effect I > can > > see. > > > > It is true that Oracle had more memory assigned to it (1.5G), but unlike > > Postgres (which is completely on a fast SSD) Oracle runs on slower disk > > (ZFS).. > > > > I will try copy, but I first need to investigate how to use it- its > > interface seems odd to say the least ;) I'll report back on that once > done. > > If you want an example of copy, just pg_dump a table: > > pg_dump -d smarlowe -t test > > (SNIP) > COPY test (a, b) FROM stdin; > 1 abc > 2 xyz > \. > (SNIP) >
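The "very special code to format all possible values properly" that Frits mentions boils down, for string columns, to COPY's text-format quoting: NULL becomes \N, and backslash, tab (the default delimiter), newline and carriage return must be backslash-escaped. A minimal sketch follows; timestamps and numerics need their own formatting, which is not shown.

```java
// Minimal escaper for COPY's default text format, string columns only.
public class CopyTextEscape {
    static String escape(String value) {
        if (value == null) return "\\N";      // COPY text-format NULL marker
        StringBuilder sb = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break; // literal backslash
                case '\t': sb.append("\\t");  break; // default column delimiter
                case '\n': sb.append("\\n");  break; // row terminator
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("one\ttab"));
    }
}
```

Rows built this way (columns joined with tabs, rows with newlines) can then be streamed through pgjdbc's CopyManager.copyIn.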
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 01:56:58PM +, Frits Jalvingh wrote: > Hi Kenneth, Andreas, > > Thanks for your tips! > > I increased shared_buffers to 8GB but it has no measurable effect at all. I > think that is logical: shared buffers are important for querying but not > for inserting; for that the speed to write to disk seems most important- no > big reason to cache the data if the commit requires a full write anyway. > I also changed the code to do only one commit; this also has no effect I > can see. > > It is true that Oracle had more memory assigned to it (1.5G), but unlike > Postgres (which is completely on a fast SSD) Oracle runs on slower disk > (ZFS).. > > I will try copy, but I first need to investigate how to use it- its > interface seems odd to say the least ;) I'll report back on that once done. > > Any other tips would be welcome! > > Regards, > > Frits Hi Frits, Here is an article that is still valid: https://www.depesz.com/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/ Regards, Ken
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 9, 2017 at 7:56 AM, Frits Jalvingh wrote: > Hi Kenneth, Andreas, > > Thanks for your tips! > > I increased shared_buffers to 8GB but it has no measurable effect at all. I > think that is logical: shared buffers are important for querying but not for > inserting; for that the speed to write to disk seems most important- no big > reason to cache the data if the commit requires a full write anyway. > I also changed the code to do only one commit; this also has no effect I can > see. > > It is true that Oracle had more memory assigned to it (1.5G), but unlike > Postgres (which is completely on a fast SSD) Oracle runs on slower disk > (ZFS).. > > I will try copy, but I first need to investigate how to use it- its > interface seems odd to say the least ;) I'll report back on that once done. If you want an example of copy, just pg_dump a table: pg_dump -d smarlowe -t test (SNIP) COPY test (a, b) FROM stdin; 1 abc 2 xyz \. (SNIP)
Re: [PERFORM] Improving PostgreSQL insert performance
Hi Kenneth, Andreas, Thanks for your tips! I increased shared_buffers to 8GB but it has no measurable effect at all. I think that is logical: shared buffers are important for querying but not for inserting; for that the speed to write to disk seems most important- no big reason to cache the data if the commit requires a full write anyway. I also changed the code to do only one commit; this also has no effect I can see. It is true that Oracle had more memory assigned to it (1.5G), but unlike Postgres (which is completely on a fast SSD) Oracle runs on slower disk (ZFS).. I will try copy, but I first need to investigate how to use it- its interface seems odd to say the least ;) I'll report back on that once done. Any other tips would be welcome! Regards, Frits On Fri, Jun 9, 2017 at 3:30 PM Kenneth Marshall wrote: > On Fri, Jun 09, 2017 at 03:24:15PM +0200, Andreas Kretschmer wrote: > > > > > > On 09.06.2017 at 15:04, Frits Jalvingh wrote: > > >Hi all, > > > > > >I am trying to improve the runtime of a big data warehouse > > >application. One significant bottleneck found was insert > > >performance, so I am investigating ways of getting Postgresql to > > >insert data faster. > > > > * use COPY instead of Insert, it is much faster > > * bundle all Insert into one transaction > > * use a separate disk/spindle for the transaction log > > > > > > > > > > > >I already changed the following config parameters: > > >work_mem 512MB > > >synchronous_commit off > > >shared_buffers 512mb > > >commit_delay 10 > > >autovacuum_naptime 10min > > > > > >Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K > > >with 16GB memory and an Intel 750 SSD. JDBC driver is > > >postgresql-42.1.1. > > > > > > > increase shared_buffers, with 16gb ram i would suggest 8gb > > +1 Without even checking, I think Oracle is configured to use a LOT > more memory than 512mb. > > Regards, > Ken >
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 03:24:15PM +0200, Andreas Kretschmer wrote: > > > On 09.06.2017 at 15:04, Frits Jalvingh wrote: > >Hi all, > > > >I am trying to improve the runtime of a big data warehouse > >application. One significant bottleneck found was insert > >performance, so I am investigating ways of getting Postgresql to > >insert data faster. > > * use COPY instead of Insert, it is much faster > * bundle all Insert into one transaction > * use a separate disk/spindle for the transaction log > > > > > > >I already changed the following config parameters: > >work_mem 512MB > >synchronous_commit off > >shared_buffers 512mb > >commit_delay 10 > >autovacuum_naptime 10min > > > >Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K > >with 16GB memory and an Intel 750 SSD. JDBC driver is > >postgresql-42.1.1. > > > > increase shared_buffers, with 16gb ram i would suggest 8gb +1 Without even checking, I think Oracle is configured to use a LOT more memory than 512mb. Regards, Ken
Re: [PERFORM] Improving PostgreSQL insert performance
On 09.06.2017 at 15:04, Frits Jalvingh wrote: Hi all, I am trying to improve the runtime of a big data warehouse application. One significant bottleneck found was insert performance, so I am investigating ways of getting Postgresql to insert data faster. * use COPY instead of Insert, it is much faster * bundle all Insert into one transaction * use a separate disk/spindle for the transaction log I already changed the following config parameters: work_mem 512MB synchronous_commit off shared_buffers 512mb commit_delay 10 autovacuum_naptime 10min Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on an i7-4790K with 16GB memory and an Intel 750 SSD. JDBC driver is postgresql-42.1.1. increase shared_buffers, with 16gb ram i would suggest 8gb Regards, Andreas -- 2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
[PERFORM] Improving PostgreSQL insert performance
Hi all, I am trying to improve the runtime of a big data warehouse application. One significant bottleneck found was insert performance, so I am investigating ways of getting Postgresql to insert data faster. I ran several tests on a fast machine to find out what performs best, and compared the results with the same actions in Oracle on that same machine. So far I am finding that PostgreSQL insert performance is several times slower than Oracle performance, and I would be grateful for some help trying to decrease this gap... To test I wrote a simple Java program which inserts data into a simple table, using statement batching and delayed commits. The table looks as follows: create table h_test( id_h integer , source_system_id integer , organisation_id integer , load_dts timestamp without time zone , boekingdetailid text ); No constraints, no indexes. The java program and PostgreSQL run on the same machine. The best results I've got are: PostgreSQL inserts: Commit size 50.000 and batch size 10.000 Inserted 100 rows in 7500 milliseconds, 142857.14285714287 rows per second Inserted 100 rows in 7410 milliseconds, 142857.14285714287 rows per second The exact same test done on Oracle (on the same machine) reports: Inserted 100 rows in 1072 milliseconds, 100.0 rows per second Increasing the row count in Oracle decreases this number a bit, but it's still fast: Inserted 2400 rows in 47155 milliseconds, 510638.2978723404 rows per second (oracle) compared with: Inserted 2400 rows in 159929 milliseconds, 150943.3962264151 rows per second (postgresql) I also created a small pg/sql stored procedure to insert the same 1 million rows, which runs in about 4 seconds, resulting in 250.000 rows a second. 
This is in the DB itself, but it still is twice as slow as Oracle with JDBC:

CREATE or replace function test() returns void AS $$
DECLARE
    count integer;
BEGIN
    for count in 1..100 loop
        insert into h_test(id_h, source_system_id, organisation_id, load_dts, boekingdetailid)
        values(count, 1, 12, now(), 'boe' || count || 'king' || count);
    end loop;
END;
$$ LANGUAGE plpgsql;

I already changed the following config parameters: work_mem 512MB synchronous_commit off shared_buffers 512mb commit_delay 10 autovacuum_naptime 10min Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on an i7-4790K with 16GB memory and an Intel 750 SSD. JDBC driver is postgresql-42.1.1. (btw: the actual load I'm trying to improve will load more than 132 million rows, and will grow). Any help is greatly appreciated! Regards, Frits
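The figures above ("Commit size 50.000 and batch size 10.000") imply a particular interleaving of executeBatch() and commit() calls on the JDBC side. The sketch below simulates that plan without a database; the real JDBC calls appear only in comments, and the constants are the thread's figures.

```java
import java.util.Arrays;

// Simulation of the batching pattern: flush the JDBC batch every
// BATCH_SIZE rows and commit every COMMIT_SIZE rows.
public class BatchingPlan {
    static final int BATCH_SIZE = 10_000;   // rows per ps.executeBatch()
    static final int COMMIT_SIZE = 50_000;  // rows per conn.commit()

    // Returns {executeBatch calls, commit calls} for totalRows rows.
    static int[] plan(int totalRows) {
        int batches = 0, commits = 0, inBatch = 0, sinceCommit = 0;
        for (int row = 1; row <= totalRows; row++) {
            inBatch++;                        // ps.setInt(...); ps.addBatch();
            sinceCommit++;
            if (inBatch == BATCH_SIZE)      { batches++; inBatch = 0; }     // ps.executeBatch();
            if (sinceCommit == COMMIT_SIZE) { commits++; sinceCommit = 0; } // conn.commit();
        }
        if (inBatch > 0) batches++;           // flush the final partial batch
        if (sinceCommit > 0) commits++;       // final commit
        return new int[] {batches, commits};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(plan(1_000_000))); // [100, 20]
    }
}
```

With autocommit off and reWriteBatchedInserts=true, each of those executeBatch() calls is what pgjdbc rewrites into a multi-row insert on the wire.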