Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-11 Thread Alvaro Herrera
Vladimir Sitnikov wrote:
> Alvaro> Something like
> Alvaro> INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')
> Frits> I did not try that, to be honest.
> 
> pgjdbc automatically rewrites insert ... values() into insert ...
> values(),(),(),() when reWriteBatchedInserts=true. I don't expect manual
> multi-values to be noticeably faster there.

Ahh, so that's what that option does :-)  Nice to know -- great feature.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-11 Thread Vladimir Sitnikov
Alvaro> Something like
Alvaro> INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')
Frits> I did not try that, to be honest.

pgjdbc automatically rewrites insert ... values() into insert ...
values(),(),(),() when reWriteBatchedInserts=true. I don't expect manual
multi-values to be noticeably faster there.
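
For reference, a minimal sketch of enabling the rewrite from plain JDBC
(URL and credentials are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

Properties props = new Properties();
props.setProperty("user", "postgres");              // illustrative credentials
props.setProperty("password", "secret");
props.setProperty("reWriteBatchedInserts", "true");
// batched executes on this connection are now sent as multi-row inserts
Connection conn = DriverManager.getConnection(
    "jdbc:postgresql://localhost:5432/test", props);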


Frits>https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests

Do you really intend to measure just a single insert operation?
It looks odd, as typical applications would execute inserts for quite a
while before they terminate.

You are including lots of warmup overheads (e.g. JIT-compilation), so your
approach does not measure peak performance.
On the other hand, you are not measuring enough time to catch things like
"DB log switch".

Would you please use JMH as a load driver?
Here's an example:
https://github.com/pgjdbc/pgjdbc/blob/master/ubenchmark/src/main/java/org/postgresql/benchmark/statement/InsertBatch.java
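
A stripped-down harness in that style might look like the sketch below
(class name, credentials and batch size are illustrative, not taken from
the linked benchmark):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class InsertBatchBench {
    Connection conn;
    PreparedStatement ps;

    @Setup
    public void setup() throws SQLException {
        conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost:5432/test?reWriteBatchedInserts=true",
            "postgres", "secret");
        conn.setAutoCommit(false);
        ps = conn.prepareStatement(
            "insert into h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid)"
            + " values (?,?,?,now(),?)");
    }

    @Benchmark
    public int[] insertBatchOf1000() throws SQLException {
        // JMH handles warmup and measurement iterations around this method
        for (int i = 0; i < 1000; i++) {
            ps.setInt(1, i);
            ps.setInt(2, 1);
            ps.setInt(3, 12);
            ps.setString(4, "boe" + i + "king" + i);
            ps.addBatch();
        }
        int[] counts = ps.executeBatch();
        conn.commit();
        return counts;
    }

    @TearDown
    public void tearDown() throws SQLException {
        ps.close();
        conn.close();
    }
}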


Vladimir



Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-11 Thread Frits Jalvingh
Hi Alvaro,

I did not try that, to be honest. I am using a single prepared statement so
that the database needs to parse it only once. All executes then use the
batched parameters.
I will try this later on, but I wonder whether having to reparse the
statement every time would actually be faster than a single prepared
statement.
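
For what it's worth, the two can be combined: a multi-row VALUES statement
can itself be prepared once and re-executed, so there is no reparse per
execute. A rough sketch, assuming an open Connection conn and the h_test
table from earlier in the thread:

int groups = 100; // rows per INSERT statement
StringBuilder sql = new StringBuilder(
    "insert into h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid) values ");
for (int g = 0; g < groups; g++) {
    sql.append(g == 0 ? "(?,?,?,now(),?)" : ",(?,?,?,now(),?)");
}
try (java.sql.PreparedStatement ps = conn.prepareStatement(sql.toString())) {
    for (int row = 0; row < 1_000_000; row += groups) {
        int p = 1;
        for (int g = 0; g < groups; g++) {
            int id = row + g;
            ps.setInt(p++, id);
            ps.setInt(p++, 1);
            ps.setInt(p++, 12);
            ps.setString(p++, "boe" + id + "king" + id);
        }
        ps.executeUpdate(); // same statement every time, parsed only once
    }
    conn.commit();
}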

But thanks for the tip; I will take a look.

Regards,

Frits


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-10 Thread Joe Conway
On 06/10/2017 07:32 PM, Alvaro Herrera wrote:
> Frits Jalvingh wrote:
> 
>> So, I am still very interested in getting normal inserts faster, because
>> that will gain speed for all work.. If Oracle can do it, and Postgres is
>> able to insert fast with copy- where lies the bottleneck with the insert
>> command? There seems to be quite a performance hit with the JDBC driver
>> itself (as the stored procedure is a lot faster), so I can look into that.
>> But even after that there is quite a gap..
> 
> Did you try inserting multiple tuples in one command?  Something like
> INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')
> It's supposed to be faster than single-row inserts, though I don't
> know by how much.

When I did the testing of the patch originally I saw significant
improvements, e.g. 8x in early versions. The thread is here:
https://www.postgresql.org/message-id/flat/44C4451A.4010906%40joeconway.com#44c4451a.4010...@joeconway.com

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-10 Thread Alvaro Herrera
Frits Jalvingh wrote:

> So, I am still very interested in getting normal inserts faster, because
> that will gain speed for all work.. If Oracle can do it, and Postgres is
> able to insert fast with copy- where lies the bottleneck with the insert
> command? There seems to be quite a performance hit with the JDBC driver
> itself (as the stored procedure is a lot faster), so I can look into that.
> But even after that there is quite a gap..

Did you try inserting multiple tuples in one command?  Something like
INSERT INTO .. VALUES ('col1', 'col2'), ('col1', 'col2'), ('col1', 'col2')
It's supposed to be faster than single-row inserts, though I don't
know by how much.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-10 Thread Frits Jalvingh
I think binary is worse, according to the Postgres documentation:

The binary format option causes all data to be stored/read as binary format
rather than as text. It is somewhat faster than the text and CSV formats,
but a binary-format file is less portable across machine architectures and
PostgreSQL versions. Also, the binary format is very data type specific;
for example it will not work to output binary data from a smallint column
and read it into an integer column, even though that would work fine in
text format.

By itself it is similarly bad, as both require completely different
semantics than insert.
On Sat, 10 Jun 2017 at 22:12, Nicolas Paris  wrote:

> > I tried the copy command, and that indeed works quite brilliantly:
> > Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per
> > second
> >
> > That's faster than Oracle. But with a very bad interface I have to say
> for
> > normal database work.. I will try to make this work in the tooling, but
> it
> > needs some very special code to format all possible values properly, and
> to
> > manage the end of the copy, so it is not usable in general which is a
> pity, I
> > think.
>
> Have you thought about COPY with binary format? That looks more robust
> than the text format you used in your benchmarks.
>


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-10 Thread Nicolas Paris
> I tried the copy command, and that indeed works quite brilliantly:
> Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per
> second
> 
> That's faster than Oracle. But with a very bad interface I have to say for
> normal database work.. I will try to make this work in the tooling, but it
> needs some very special code to format all possible values properly, and to
> manage the end of the copy, so it is not usable in general which is a pity, I
> think.

Have you thought about COPY with binary format? That looks more robust
than the text format you used in your benchmarks.




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-10 Thread Frits Jalvingh
On Sat, Jun 10, 2017 at 12:08 AM Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Would you mind sharing the source code of your benchmark?
>

The source code for the several tests, plus the numbers collected so far,
can be found at:

https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests

Regards,

Frits


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Vladimir Sitnikov
Frits,

Would you mind sharing the source code of your benchmark?

>BTW: It seems you need a recent driver for this; I'm
>using postgresql-42.1.1.jar

Technically speaking, reWriteBatchedInserts was introduced in 9.4.1209
(2016-07-15)

Vladimir


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Jeff Janes
On Fri, Jun 9, 2017 at 6:04 AM, Frits Jalvingh  wrote:

>
> I already changed the following config parameters:
> work_mem 512MB
> synchronous_commit off
>

Since you are already batching up commits into large chunks, this setting
is not very useful, but does risk you losing supposedly-committed data upon
a crash.  I would not do it.


> shared_buffers 512mb
>

You might try increasing wal_buffers, but the default for this size of
shared_buffers is 16MB, which is usually big enough.

One thing you are missing is max_wal_size.  The default value of that is
probably too small for what you are doing.

But if you are not using COPY, then maybe none of this matters as the
bottleneck will be elsewhere.
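
For reference, those settings would go in postgresql.conf in the same style
as the list above (the 8GB figure is an illustrative guess, not a value from
this thread):

max_wal_size 8GB
wal_buffers 16MB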

Cheers,

Jeff


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
The parameter was
reWriteBatchedInserts = true

Either added in the connection properties or in the connection URL, like

jdbc:postgresql://localhost:5432/datavault_12_tst?reWriteBatchedInserts=true

BTW: It seems you need a recent driver for this; I'm
using postgresql-42.1.1.jar

On Fri, Jun 9, 2017 at 5:33 PM Kenneth Marshall  wrote:

> On Fri, Jun 09, 2017 at 03:22:35PM +, Frits Jalvingh wrote:
> > Hi Babu,
> >
> > That was all already done, as it is common practice for JDBC. Your
> > parameter was added to the code that already did all that - and worked
> > brilliantly there ;)
> >
> Hi Frits,
>
> What was the parameter? I did not see an Email in the thread from Babu.
>
> Regards,
> Ken
>


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Kenneth Marshall
On Fri, Jun 09, 2017 at 03:22:35PM +, Frits Jalvingh wrote:
> Hi Babu,
> 
> That was all already done, as it is common practice for JDBC. Your
> parameter was added to the code that already did all that - and worked
> brilliantly there ;)
> 
Hi Frits,

What was the parameter? I did not see an Email in the thread from Babu.

Regards,
Ken




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi Babu,

That was all already done, as it is common practice for JDBC. Your
parameter was added to the code that already did all that - and worked
brilliantly there ;)




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Scott Marlowe
On Fri, Jun 9, 2017 at 9:12 AM, Frits Jalvingh  wrote:
> Hi John,
>
> Yes, I was aware and amazed by that ;) It is actually the fetch size in
> combination with autocommit being on; that dies the sweet OOM death as soon
> as the table gets big.
>
> But Postgres read performance, with autocommit off and fetch size around 64K,
> is quite OK. But it's good to get this mentioned a lot, because as you said
> you can spend quite some time wondering about this!

No production db server should have the oom killer enabled.
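
On Linux that typically means turning off memory overcommit, per the
PostgreSQL manual's section on Linux memory overcommit (sketch; goes in
/etc/sysctl.conf):

# strict accounting instead of overcommit, so backends fail with an
# out-of-memory error rather than being killed by the OOM killer
vm.overcommit_memory = 2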




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi John,

Yes, I was aware and amazed by that ;) It is actually the fetch size in
combination with autocommit being on; that dies the sweet OOM death as soon
as the table gets big.

But Postgres read performance, with autocommit off and fetch size around
64K, is quite OK. But it's good to get this mentioned a lot, because as you
said you can spend quite some time wondering about this!

On Fri, Jun 9, 2017 at 5:08 PM John Gorman <jgor...@eldocomp.com> wrote:

> You need to be careful with setFetchSize: we have tables with over 10
> million rows and many columns, and the PostgreSQL JDBC driver silently
> fails, ignores the fetch size and tries to read the entire table content
> into memory. I spent many agonizing days on this.
>
> ps.setFetchSize(65536);
>
> Regards
>
> John


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread John Gorman
You need to be careful with setFetchSize: we have tables with over 10
million rows and many columns, and the PostgreSQL JDBC driver silently
fails, ignores the fetch size and tries to read the entire table content
into memory. I spent many agonizing days on this.

ps.setFetchSize(65536);
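
For completeness, a sketch of cursor-based fetching (pgjdbc only honors the
fetch size when autocommit is off and the result set is forward-only;
assumes an open Connection conn and the usual java.sql imports):

conn.setAutoCommit(false); // required, otherwise the fetch size is ignored
try (PreparedStatement ps = conn.prepareStatement("select * from h_test")) {
    ps.setFetchSize(65536);  // rows per round trip instead of the whole table
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            // process one row at a time
        }
    }
}
conn.commit();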

Regards
John




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi Babu,

No, I did not, and the effect is quite great:

Inserted 1000000 rows in 2535 milliseconds, 394477.3175542406 rows per second
Inserted 1000000 rows in 2553 milliseconds, 391696.0438699569 rows per second

compared to (without your parameter):
Inserted 1000000 rows in 7643 milliseconds, 130838.67591259976 rows per second

That is quite an increase!! Thanks a lot for the tip!!

For those keeping score: we're now at 77% of Oracle's performance- without
copy ;)






Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi Kenneth,

I tried unlogged before, but as long as the commit interval is long it had
no discernible effect that I could see.


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
I am not doing anything special I guess. I am adding the results of the
tests and the programs I'm using to the following page:

https://etc.to/confluence/display/~admjal/PostgreSQL+performance+tests

The copy example, in Java, is at the end. All of the examples use trivial
data and the same data. If you find fault please let me know ;) But the
copy does insert the records as they can be seen ;)
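
For the archive, a rough sketch of that style of copy loader using pgjdbc's
CopyManager (COPY text format; trivial values only, so no escaping is
handled; assumes an open Connection conn):

import java.io.StringReader;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

CopyManager copier = conn.unwrap(PGConnection.class).getCopyAPI();
StringBuilder buf = new StringBuilder();
for (int i = 1; i <= 100_000; i++) {
    // COPY text format: tab-separated columns, one row per line
    buf.append(i).append('\t').append(1).append('\t').append(12)
       .append('\t').append("2017-06-09 00:00:00")
       .append('\t').append("boe").append(i).append("king").append(i)
       .append('\n');
}
long rows = copier.copyIn(
    "COPY h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid) FROM STDIN",
    new StringReader(buf.toString()));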

On Fri, Jun 9, 2017 at 4:47 PM Sunkara, Amrutha  wrote:

> Frits,
>
> When you use the copy command, are you doing anything special to get the
> run time that you are indicating?
>


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Kenneth Marshall
On Fri, Jun 09, 2017 at 02:39:37PM +, Frits Jalvingh wrote:
> Hi all,
> 
> Thanks a lot for the many responses!
> 
> About preparing statements: this is done properly in Java, and pgsql does
> it by itself. So that cannot be done better ;)
> 
> I tried the copy command, and that indeed works quite brilliantly:
> Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per
> second
> 
> That's faster than Oracle. But with a very bad interface I have to say for
> normal database work.. I will try to make this work in the tooling, but it
> needs some very special code to format all possible values properly, and to
> manage the end of the copy, so it is not usable in general which is a pity,
> I think.
> 
> So, I am still very interested in getting normal inserts faster, because
> that will gain speed for all work.. If Oracle can do it, and Postgres is
> able to insert fast with copy- where lies the bottleneck with the insert
> command? There seems to be quite a performance hit with the JDBC driver
> itself (as the stored procedure is a lot faster), so I can look into that.
> But even after that there is quite a gap..
> 
> Regards,
> 
> Frits

Hi Frits,

Have you looked at UNLOGGED tables and also having more than one insert
stream running at a time? Sometimes multiple parallel inserts can be
faster.
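
A rough sketch of the multi-stream idea, one connection per worker (URL,
credentials and worker count are illustrative):

import java.sql.*;
import java.util.concurrent.*;

final String url = "jdbc:postgresql://localhost:5432/test?reWriteBatchedInserts=true";
final int workers = 4;
ExecutorService pool = Executors.newFixedThreadPool(workers);
for (int w = 0; w < workers; w++) {
    final int worker = w;
    pool.submit(() -> {
        // each stream gets its own connection and its own slice of the ids
        try (Connection c = DriverManager.getConnection(url, "postgres", "secret")) {
            c.setAutoCommit(false);
            try (PreparedStatement ps = c.prepareStatement(
                    "insert into h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid)"
                    + " values (?,?,?,now(),?)")) {
                int n = 0;
                for (int id = worker; id < 1_000_000; id += workers) {
                    ps.setInt(1, id);
                    ps.setInt(2, 1);
                    ps.setInt(3, 12);
                    ps.setString(4, "boe" + id + "king" + id);
                    ps.addBatch();
                    if (++n % 10_000 == 0) ps.executeBatch();
                }
                ps.executeBatch();
            }
            c.commit();
        }
        return null;
    });
}
pool.shutdown();
pool.awaitTermination(1, TimeUnit.HOURS); // caller must handle InterruptedException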

Regards,
Ken




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Sunkara, Amrutha
Frits,

When you use the copy command, are you doing anything special to get the
run time that you are indicating?

On Fri, Jun 9, 2017 at 10:39 AM, Frits Jalvingh  wrote:

> Hi all,
>
> Thanks a lot for the many responses!
>
> About preparing statements: this is done properly in Java, and pgsql does
> it by itself. So that cannot be done better ;)
>
> I tried the copy command, and that indeed works quite brilliantly:
> Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per
> second
>
> That's faster than Oracle. But with a very bad interface I have to say for
> normal database work.. I will try to make this work in the tooling, but it
> needs some very special code to format all possible values properly, and to
> manage the end of the copy, so it is not usable in general which is a pity,
> I think.
>
> So, I am still very interested in getting normal inserts faster, because
> that will gain speed for all work.. If Oracle can do it, and Postgres is
> able to insert fast with copy- where lies the bottleneck with the insert
> command? There seems to be quite a performance hit with the JDBC driver
> itself (as the stored procedure is a lot faster), so I can look into that.
> But even after that there is quite a gap..
>
> Regards,
>
> Frits
>
> On Fri, Jun 9, 2017 at 4:33 PM Scott Marlowe 
> wrote:
>
>> On Fri, Jun 9, 2017 at 7:56 AM, Frits Jalvingh  wrote:
>> > Hi Kenneth, Andreas,
>> >
>> > Thanks for your tips!
>> >
>> > I increased shared_buffers to 8GB but it has no measurable effect at
>> all. I
>> > think that is logical: shared buffers are important for querying but
>> not for
>> > inserting; for that the speed to write to disk seems most important- no
>> big
>> > reason to cache the data if the commit requires a full write anyway.
>> > I also changed the code to do only one commit; this also has no effect
>> I can
>> > see.
>> >
>> > It is true that Oracle had more memory assigned to it (1.5G), but unlike
>> > Postgres (which is completely on a fast SSD) Oracle runs on slower disk
>> > (ZFS)..
>> >
>> > I will try copy, but I first need to investigate how to use it- its
>> > interface seems odd to say the least ;) I'll report back on that once
>> done.
>>
>> If you want an example of copy, just pg_dump a table:
>>
>> pg_dump -d smarlowe -t test
>>
>> (SNIP)
>> COPY test (a, b) FROM stdin;
>> 1 abc
>> 2 xyz
>> \.
>> (SNIP)
>>
>


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi all,

Thanks a lot for the many responses!

About preparing statements: this is done properly in Java, and pgsql does
it by itself. So that cannot be done better ;)

I tried the copy command, and that indeed works quite brilliantly:
Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per
second

That's faster than Oracle, but with a very bad interface for normal
database work, I have to say. I will try to make this work in the tooling,
but it needs some very special code to format all possible values properly
and to manage the end of the copy, so it is not usable in general, which is
a pity, I think.

So, I am still very interested in getting normal inserts faster, because
that will gain speed for all work.. If Oracle can do it, and Postgres is
able to insert fast with copy- where lies the bottleneck with the insert
command? There seems to be quite a performance hit with the JDBC driver
itself (as the stored procedure is a lot faster), so I can look into that.
But even after that there is quite a gap..

Regards,

Frits

On Fri, Jun 9, 2017 at 4:33 PM Scott Marlowe 
wrote:

> On Fri, Jun 9, 2017 at 7:56 AM, Frits Jalvingh  wrote:
> > Hi Kenneth, Andreas,
> >
> > Thanks for your tips!
> >
> > I increased shared_buffers to 8GB but it has no measurable effect at
> all. I
> > think that is logical: shared buffers are important for querying but not
> for
> > inserting; for that the speed to write to disk seems most important- no
> big
> > reason to cache the data if the commit requires a full write anyway.
> > I also changed the code to do only one commit; this also has no effect I
> can
> > see.
> >
> > It is true that Oracle had more memory assigned to it (1.5G), but unlike
> > Postgres (which is completely on a fast SSD) Oracle runs on slower disk
> > (ZFS)..
> >
> > I will try copy, but I first need to investigate how to use it- its
> > interface seems odd to say the least ;) I'll report back on that once
> done.
>
> If you want an example of copy, just pg_dump a table:
>
> pg_dump -d smarlowe -t test
>
> (SNIP)
> COPY test (a, b) FROM stdin;
> 1 abc
> 2 xyz
> \.
> (SNIP)
>


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Kenneth Marshall
On Fri, Jun 09, 2017 at 01:56:58PM +, Frits Jalvingh wrote:
> Hi Kenneth, Andreas,
> 
> Thanks for your tips!
> 
> I increased shared_buffers to 8GB but it has no measurable effect at all. I
> think that is logical: shared buffers are important for querying but not
> for inserting; for that the speed to write to disk seems most important- no
> big reason to cache the data if the commit requires a full write anyway.
> I also changed the code to do only one commit; this also has no effect I
> can see.
> 
> It is true that Oracle had more memory assigned to it (1.5G), but unlike
> Postgres (which is completely on a fast SSD) Oracle runs on slower disk
> (ZFS)..
> 
> I will try copy, but I first need to investigate how to use it- its
> interface seems odd to say the least ;) I'll report back on that once done.
> 
> Any other tips would be welcome!
> 
> Regards,
> 
> Frits

Hi Frits,

Here is an article that is still valid:

https://www.depesz.com/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/

Regards,
Ken




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Scott Marlowe
On Fri, Jun 9, 2017 at 7:56 AM, Frits Jalvingh  wrote:
> Hi Kenneth, Andreas,
>
> Thanks for your tips!
>
> I increased shared_buffers to 8GB but it has no measurable effect at all. I
> think that is logical: shared buffers are important for querying but not for
> inserting; for that the speed to write to disk seems most important- no big
> reason to cache the data if the commit requires a full write anyway.
> I also changed the code to do only one commit; this also has no effect I can
> see.
>
> It is true that Oracle had more memory assigned to it (1.5G), but unlike
> Postgres (which is completely on a fast SSD) Oracle runs on slower disk
> (ZFS)..
>
> I will try copy, but I first need to investigate how to use it- its
> interface seems odd to say the least ;) I'll report back on that once done.

If you want an example of copy, just pg_dump a table:

pg_dump -d smarlowe -t test

(SNIP)
COPY test (a, b) FROM stdin;
1 abc
2 xyz
\.
(SNIP)




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi Kenneth, Andreas,

Thanks for your tips!

I increased shared_buffers to 8GB but it has no measurable effect at all. I
think that is logical: shared buffers are important for querying but not
for inserting; for that the speed to write to disk seems most important- no
big reason to cache the data if the commit requires a full write anyway.
I also changed the code to do only one commit; this also has no effect I
can see.

It is true that Oracle had more memory assigned to it (1.5G), but unlike
Postgres (which is completely on a fast SSD) Oracle runs on slower disk
(ZFS)..

I will try copy, but I first need to investigate how to use it- its
interface seems odd to say the least ;) I'll report back on that once done.

Any other tips would be welcome!

Regards,

Frits

On Fri, Jun 9, 2017 at 3:30 PM Kenneth Marshall  wrote:

> On Fri, Jun 09, 2017 at 03:24:15PM +0200, Andreas Kretschmer wrote:
> >
> >
> > On 09.06.2017 15:04, Frits Jalvingh wrote:
> > >Hi all,
> > >
> > >I am trying to improve the runtime of a big data warehouse
> > >application. One significant bottleneck found was insert
> > >performance, so I am investigating ways of getting Postgresql to
> > >insert data faster.
> >
> > * use COPY instead of Insert, it is much faster
> > * bundle all Inserts into one transaction
> > * use a separate disk/spindle for the transaction log
> >
> >
> >
> > >
> > >I already changed the following config parameters:
> > >work_mem 512MB
> > >synchronous_commit off
> > >shared_buffers 512mb
> > >commit_delay 10
> > >autovacuum_naptime 10min
> > >
> > >Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K
> > >with 16GB memory and an Intel 750 SSD. JDBC driver is
> > >postgresql-42.1.1.
> > >
> >
> > increase shared_buffers, with 16gb ram i would suggest 8gb
>
> +1 Without even checking, I think Oracle is configured to use a LOT
> more memory than 512mb.
>
> Regards,
> Ken


Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Kenneth Marshall
On Fri, Jun 09, 2017 at 03:24:15PM +0200, Andreas Kretschmer wrote:
> 
> 
> On 09.06.2017 15:04, Frits Jalvingh wrote:
> >Hi all,
> >
> >I am trying to improve the runtime of a big data warehouse
> >application. One significant bottleneck found was insert
> >performance, so I am investigating ways of getting Postgresql to
> >insert data faster.
> 
> * use COPY instead of Insert, it is much faster
> * bundle all Inserts into one transaction
> * use a separate disk/spindle for the transaction log
> 
> 
> 
> >
> >I already changed the following config parameters:
> >work_mem 512MB
> >synchronous_commit off
> >shared_buffers 512mb
> >commit_delay 10
> >autovacuum_naptime 10min
> >
> >Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K
> >with 16GB memory and an Intel 750 SSD. JDBC driver is
> >postgresql-42.1.1.
> >
> 
> increase shared_buffers, with 16gb ram i would suggest 8gb

+1 Without even checking, I think Oracle is configured to use a LOT
more memory than 512mb.

Regards,
Ken




Re: [PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Andreas Kretschmer



On 09.06.2017 15:04, Frits Jalvingh wrote:

Hi all,

I am trying to improve the runtime of a big data warehouse 
application. One significant bottleneck found was insert performance, 
so I am investigating ways of getting Postgresql to insert data faster.


* use COPY instead of Insert, it is much faster
* bundle all Inserts into one transaction
* use a separate disk/spindle for the transaction log





I already changed the following config parameters:
work_mem 512MB
synchronous_commit off
shared_buffers 512mb
commit_delay 10
autovacuum_naptime 10min

Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K with 
16GB memory and an Intel 750 SSD. JDBC driver is postgresql-42.1.1.




increase shared_buffers; with 16GB RAM I would suggest 8GB


Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





[PERFORM] Improving PostgreSQL insert performance

2017-06-09 Thread Frits Jalvingh
Hi all,

I am trying to improve the runtime of a big data warehouse application. One
significant bottleneck found was insert performance, so I am investigating
ways of getting Postgresql to insert data faster. I ran several tests on a
fast machine to find out what performs best, and compared the results with
the same actions in Oracle on that same machine.

So far I am finding that PostgreSQL insert performance is several times
slower than Oracle performance, and I would be grateful for some help
trying to decrease this gap...

To test I wrote a simple Java program which inserts data into a simple
table, using statement batching and delayed commits. The table looks as
follows:

create table h_test(
id_h integer
, source_system_id integer
, organisation_id integer
, load_dts timestamp without time zone
, boekingdetailid text
);
No constraints, no indexes.
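
The test program itself is not shown here; a minimal sketch of its shape
(URL and credentials are illustrative; batch size 10.000 and commit size
50.000 as reported below) might be:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertTest {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/test", "postgres", "secret")) {
            conn.setAutoCommit(false);
            long start = System.currentTimeMillis();
            try (PreparedStatement ps = conn.prepareStatement(
                    "insert into h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid)"
                    + " values (?,?,?,now(),?)")) {
                for (int i = 1; i <= 1_000_000; i++) {
                    ps.setInt(1, i);
                    ps.setInt(2, 1);
                    ps.setInt(3, 12);
                    ps.setString(4, "boe" + i + "king" + i);
                    ps.addBatch();
                    if (i % 10_000 == 0) ps.executeBatch(); // batch size
                    if (i % 50_000 == 0) conn.commit();     // commit size
                }
                ps.executeBatch();
                conn.commit();
            }
            long millis = System.currentTimeMillis() - start;
            System.out.println("Inserted 1000000 rows in " + millis
                    + " milliseconds, " + (1_000_000.0 / millis * 1000.0)
                    + " rows per second");
        }
    }
}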

The java program and PostgreSQL run on the same machine. The best results
I've got are:

PostgreSQL inserts:

Commit size 50.000 and batch size 10.000
Inserted 1000000 rows in 7500 milliseconds, 142857.14285714287 rows per second
Inserted 1000000 rows in 7410 milliseconds, 142857.14285714287 rows per second

The exact same test done on Oracle (on the same machine) reports:

Inserted 1000000 rows in 1072 milliseconds, 1000000.0 rows per second

Increasing the row count in Oracle decreases this number a bit, but it's
still fast:
Inserted 24000000 rows in 47155 milliseconds, 510638.2978723404 rows per
second (oracle)

compared with:
Inserted 24000000 rows in 159929 milliseconds, 150943.3962264151 rows per
second (postgresql)

I also created a small PL/pgSQL stored procedure to insert the same 1 million
rows, which runs in about 4 seconds, resulting in 250.000 rows a second.
This is in the DB itself, but it still is twice as slow as Oracle with JDBC:
CREATE or replace function test() returns void AS $$
DECLARE
  count integer;
BEGIN
  for count in 1..1000000 loop
    insert into
    h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid)
    values(count, 1, 12, now(), 'boe' || count || 'king' || count);
  end loop;
END;
$$ LANGUAGE plpgsql;


I already changed the following config parameters:
work_mem 512MB
synchronous_commit off
shared_buffers 512mb
commit_delay 10
autovacuum_naptime 10min

Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K with 16GB
memory and an Intel 750 SSD. JDBC driver is postgresql-42.1.1.

(btw: the actual load I'm trying to improve will load more than 132 million
rows, and will grow).

Any help is greatly appreciated!

Regards,

Frits