Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Divakar Singh
Can you try writing a libpq program using the COPY functions?
I hope it will perform better than prepared statements.
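
A minimal sketch of that approach (the connection string is a placeholder; the table is the pg_perf_test(id, text) from the original post below):

#include <libpq-fe.h>
#include <stdio.h>

int main()
{
    PGconn* conn = PQconnectdb("dbname=test");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    // switch the connection into copy-in mode
    PGresult* res = PQexec(conn, "COPY pg_perf_test (id, text) FROM STDIN");
    if (PQresultStatus(res) != PGRES_COPY_IN) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }
    PQclear(res);

    for (int i = 0; i < 1000; ++i) {
        char line[64];
        // text COPY format: tab-separated columns, newline-terminated rows
        int n = snprintf(line, sizeof(line), "%d\trow %d\n", i, i);
        PQputCopyData(conn, line, n);
    }
    PQputCopyEnd(conn, NULL);   // NULL = no error, finish the COPY
    res = PQgetResult(conn);    // final status of the COPY command
    PQclear(res);
    PQfinish(conn);
    return 0;
}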

 Best Regards,
Divakar





From: Werner Scholtes <werner.schol...@heuboe.de>
To: pgsql-performance@postgresql.org
Sent: Wed, December 15, 2010 8:21:55 PM
Subject: [PERFORM] performance libpq vs JDBC


I wrote a test program in C++ using libpq. It works as follows (pseudo code):
 
for ( int loop = 0; loop < 1000; ++loop ) {
   PQexec(m_conn, "BEGIN");
   const char* sql = "INSERT INTO pg_perf_test (id, text) VALUES ($1,$2)";
   PQprepare(m_conn, "stmtid", sql, 0, NULL);
   for ( int i = 0; i < 1000; ++i ) {
      // Set values etc.
      PQexecPrepared(m_conn, …);
   }
   PQexec(m_conn, "DEALLOCATE stmtid");
   PQexec(m_conn, "COMMIT");
}
 
I measured the duration of each iteration of the outer for-loop, resulting in an 
average of 450 ms (per 1000 rows inserted).
 
After that, I wrote a test program in Java using JDBC. It works as follows:
 
for ( int loops = 0; loops < 1000; ++loops ) {
   String sql = "INSERT INTO pg_perf_test (id, text) VALUES (?,?)";
   PreparedStatement stmt = con.prepareStatement(sql);
   for ( int i = 0; i < 1000; ++i ) {
      // Set values etc.
      stmt.addBatch();
   }
   stmt.executeBatch();
   con.commit();
   stmt.close();
}
 
I measured the duration of each iteration of the outer for-loop, resulting in an 
average of 100 ms (per 1000 rows inserted).
 
This means that accessing PostgreSQL via JDBC is about 4-5 times faster than 
using libpq.

 
Comparable results have been measured with analogous UPDATE and DELETE 
statements.

 
I need to improve the performance of my C++ code. Is there any way in 
libpq to match the performance of JDBC for INSERT, UPDATE, and DELETE statements 
(I have no way to use COPY statements)? I didn't find anything comparable to 
PreparedStatement.executeBatch() in libpq.
 
Best regards,
Werner Scholtes


  

Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Werner Scholtes
Unfortunately I cannot use the COPY function, since I need JDBC-level performance 
for UPDATE and DELETE statements in my C++ libpq program as well.

I wonder how JDBC's PreparedStatement.addBatch() and 
PreparedStatement.executeBatch() work. They must use a more efficient 
protocol that sends many parameter sets for one prepared statement as a batch in 
one network transmission to the server. As far as I can see, PQexecPrepared 
does not allow sending more than one parameter set (the parameters for one row) 
per call. So libpq sends a single row to the server 1000 times, where JDBC 
sends 1000 rows once, which is much more efficient.

I assume that the wire protocol of PostgreSQL allows transmitting multiple rows 
at once, but libpq doesn't have an interface to access that. Is that right?


Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Divakar Singh
If you have all the records before issuing the INSERT, you can do it like this: 
INSERT INTO xxx VALUES (a,b,c), (d,e,f), ...;
An example: http://kaiv.wordpress.com/2007/07/19/faster-insert-for-multiple-rows
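
A minimal libpq sketch of building such a multi-row INSERT (table xxx and the integer values are illustrative):

#include <libpq-fe.h>
#include <string>
#include <cstdio>

int main()
{
    PGconn* conn = PQconnectdb("dbname=test");  // placeholder connection string
    std::string sql = "INSERT INTO xxx VALUES ";
    for (int i = 0; i < 1000; ++i) {
        char row[64];
        // integer literals only; string values would need PQescapeLiteral() or similar
        snprintf(row, sizeof(row), "%s(%d,%d,%d)", i ? "," : "", i, i + 1, i + 2);
        sql += row;
    }
    PGresult* res = PQexec(conn, sql.c_str());  // all 1000 rows in one round trip
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQclear(res);
    PQfinish(conn);
    return 0;
}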


 Best Regards,
Divakar






Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Werner Scholtes
What about UPDATE and DELETE? In the case of an update I have all the records to be 
updated, and in the case of a delete I have all the primary key values of the records 
to be deleted.



Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Divakar Singh
Update and delete generally affect more than one row. The only requirement is that 
the criteria be the same for all rows.
If you have different criteria for different rows in an update or delete, 
you will have to issue separate queries.

I mean, if you want to do
1. DELETE FROM xyz WHERE a = 1
and
2. DELETE FROM xyz WHERE a = 2
then you will have to run the query twice.

 Best Regards,
Divakar






Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Richard Huxton

On 16/12/10 09:21, Werner Scholtes wrote:

> I assume that the wire protocol of PostgreSQL allows transmitting
> multiple rows at once, but libpq doesn't have an interface to access that.
> Is that right?


Sounds wrong to me. The libpq client is the default reference 
implementation of the protocol. If there were large efficiencies that 
could be copied, they would be.


Anyway - you don't need to assume what's in the protocol. It's 
documented here:

  http://www.postgresql.org/docs/9.0/static/protocol.html

I'd stick wireshark or some other network analyser on the two sessions - 
see exactly what is different.


--
  Richard Huxton
  Archonet Ltd



[PERFORM] How to get FK to use new index without restarting the database

2010-12-16 Thread Eric Comeau


Is there a way to force the DB to re-evaluate its execution plan for a FK 
without bouncing the DB?


 PostgreSQL 8.1.17

In our latest release our developers have implemented some new foreign 
keys but forgot to create indexes on these keys.


The problem surfaced at one of our client installs, where a maintenance 
DELETE query was running for over 24 hrs. We have since identified 
the missing indexes and have sent the client a script to create them, 
but in our testing we have not been able to get Postgres to use the new 
index for the FK cascade delete without bouncing the database.


Here is an example of an added FK with a missing index:

ALTER TABLE scheduled_job_arg ADD CONSTRAINT sjr_scheduled_job_id_fk
  FOREIGN KEY (scheduled_job_id) REFERENCES scheduled_job (id)
 ON UPDATE CASCADE ON DELETE CASCADE;
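
A sketch of the index that would cover this FK (the index name is illustrative):

CREATE INDEX scheduled_job_arg_scheduled_job_id_idx
    ON scheduled_job_arg (scheduled_job_id);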

Thanks in Advance,
Eric





Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Werner Scholtes
Thanks a lot for your advice. I found the difference: my Java program sends one 
huge SQL string containing 1000 INSERT statements separated by ';' (without 
using prepared statements at all!), whereas my C++ program sends one INSERT 
statement with parameters to be prepared, and after that the parameters, 1000 times. 
Now I have refactored my C++ program to also send 1000 INSERT statements in one call 
to PQexec, and it reaches the same performance as my Java program.
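
A sketch of that refactoring, continuing the pseudo code from the original post (literal values are baked into the SQL string here, so anything untrusted would need escaping, e.g. with PQescapeStringConn):

std::string sql = "BEGIN;";
for (int i = 0; i < 1000; ++i) {
    char stmt[128];
    snprintf(stmt, sizeof(stmt),
             "INSERT INTO pg_perf_test (id, text) VALUES (%d, 'row %d');", i, i);
    sql += stmt;
}
sql += "COMMIT;";
// one network transmission for all 1000 INSERTs
PGresult* res = PQexec(m_conn, sql.c_str());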

I just wonder why anyone should use prepared statements at all?



Re: [PERFORM] How to get FK to use new index without restarting the database

2010-12-16 Thread Jayadevan M
Hello,
> Is there a way to force the DB to re-evaluate its execution plan for a FK
> without bouncing the DB?
>
> PostgreSQL 8.1.17
Did you try ANALYZE? Maybe it will help.
http://www.postgresql.org/docs/9.0/static/sql-analyze.html

Regards,
Jayadevan













Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Richard Huxton

On 16/12/10 12:28, Werner Scholtes wrote:

> Thanks a lot for your advice. I found the difference: my Java program
> sends one huge SQL string containing 1000 INSERT statements separated
> by ';' (without using prepared statements at all!), whereas my C++
> program sends one INSERT statement with parameters to be prepared,
> and after that the parameters, 1000 times. Now I have refactored my C++
> program to also send 1000 INSERT statements in one call to PQexec, and
> it reaches the same performance as my Java program.


So - it was the network round-trip overhead. Like Divakar suggested, 
COPY or VALUES (),(),() would work too.


You mention multiple updates/deletes too. Perhaps the cleanest and 
fastest method would be to build a TEMP table containing the IDs/values 
required and join against that for your updates/deletes.
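
For example (a sketch; table and column names are illustrative):

-- collect the target rows in a temp table, filled via one multi-row INSERT or COPY
CREATE TEMP TABLE pending_changes (id int PRIMARY KEY, new_text varchar);

-- one statement updates all matching rows
UPDATE pg_perf_test t
   SET text = c.new_text
  FROM pending_changes c
 WHERE t.id = c.id;

-- likewise, one statement deletes every row whose primary key was collected
DELETE FROM pg_perf_test t
 USING pending_changes c
 WHERE t.id = c.id;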



> I just wonder why anyone should use prepared statements at all?


Not everything is a simple INSERT. Preparing saves planning time on 
repeated SELECTs. It also provides some SQL-injection safety, since you 
provide parameters rather than building a SQL string.
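
At the SQL level, for example (statement name and table are illustrative):

PREPARE get_row(int) AS SELECT * FROM pg_perf_test WHERE id = $1;
EXECUTE get_row(42);  -- planned once, executed many times with different parameters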


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] How to get FK to use new index without restarting the database

2010-12-16 Thread Richard Huxton

On 16/12/10 12:12, Eric Comeau wrote:


> The problem surfaced at one of our client installs, where a maintenance
> DELETE query was running for over 24 hrs. We have since identified
> the missing indexes and have sent the client a script to create them,
> but in our testing we have not been able to get Postgres to use the new
> index for the FK cascade delete without bouncing the database.


Well, an ongoing DELETE isn't going to see a new index. I'd have thought 
a new connection should though.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] How to get FK to use new index without restarting the database

2010-12-16 Thread Eric Comeau

On 10-12-16 07:34 AM, Jayadevan M wrote:

> Did you try ANALYZE? Maybe it will help.
> http://www.postgresql.org/docs/9.0/static/sql-analyze.html


Yes we did. Thanks for the suggestion.





Re: [PERFORM] Help with bulk read performance

2010-12-16 Thread Krzysztof Nienartowicz
Hello Daniel,
We have the same scenario for native Java arrays, so we are storing bytea 
and doing the conversion on the client side; but for server-side SQL, PL/Java 
comes in very handy.

Not sure how you want to create the stored procedures to convert internally, but 
this is how we do it:

One has to define conversion routines in Java and then deploy them to PL/Java. 
Scanning through this field would still be CPU bound, around 2x slower than with 
native arrays and 6x slower than with blobs, but at least one has this 
ability. It's even possible to pass them to PL/R to do some statistical 
processing directly, so depending on the operations you do it may still be 
cheaper than streaming out over the wire to a regular JDBC client.

1. Deploy a class like this within PL/Java (null handling left out for brevity):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class myArrayConversion
{
    public myArrayConversion() {}

    /** Serialize double array to bytea */
    public static byte[] convertDoubleArrayToBytea(double[] obj) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(obj);
        oos.flush();
        return baos.toByteArray();
    }

    /** Serialize int array to bytea */
    public static byte[] convertIntToBytea(int[] obj) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(obj);
        oos.flush();
        return baos.toByteArray();
    }

    /** Deserialize bytea to double array */
    public static double[] convertToDoubleArray(byte[] obj) throws IOException,
                                                                   ClassNotFoundException {
        // Deserialize from a byte array
        ObjectInputStream ios = new ObjectInputStream(new ByteArrayInputStream(obj));
        return (double[]) ios.readObject();
    }

    /** Deserialize bytea to int array */
    public static int[] convertIntToArray(byte[] obj) throws IOException,
                                                             ClassNotFoundException {
        // Deserialize from a byte array
        ObjectInputStream ios = new ObjectInputStream(new ByteArrayInputStream(obj));
        return (int[]) ios.readObject();
    }

    // other array types' streaming...
}

2. Then create mapping functions as the DB owner:

CREATE OR REPLACE FUNCTION public.convertDoubleArrayToBytea(double precision[])
  RETURNS bytea AS
'mappingPkg.convertDoubleArrayToBytea(double[])'
  LANGUAGE 'javau' IMMUTABLE
  COST 50;

GRANT EXECUTE ON FUNCTION public.convertDoubleArrayToBytea(double precision[]) 
TO public;



CREATE OR REPLACE FUNCTION public.convertToDoubleArray(bytea)
  RETURNS double precision[] AS
'mappingPkg.convertToDoubleArray(byte[])'
  LANGUAGE 'javau' IMMUTABLE
  COST 50;

GRANT EXECUTE ON FUNCTION public.convertToDoubleArray(bytea) TO public;


Then you can convert in either direction:

select convertToDoubleArray(convertDoubleArrayToBytea(
         array[i::float8, 1.1, 100.1, i*0.1]::float8[]))
  from generate_series(1,100) i;

So you are also able to create bytea objects from native SQL arrays within SQL.

PL/Java seems to be enjoying a revival these days thanks to Johann 'Myrkraverk' 
Oskarsson, who fixed several long-standing bugs. Check out the PL/Java list for 
details.


Krzysztof



On Dec 16, 2010, at 10:22 AM, pgsql-performance-ow...@postgresql.org wrote:

From: Dan Schaffer <daniel.s.schaf...@noaa.gov>
Date: December 15, 2010 9:15:14 PM GMT+01:00
To: Andy Colson <a...@squeakycode.net>
Cc: Jim Nasby <j...@nasby.net>, pgsql-performance@postgresql.org, Nick Matheson <nick.d.mathe...@noaa.gov>
Subject: Re: Help with bulk read performance
Reply-To: daniel.s.schaf...@noaa.gov
 
 
 Hi,
My name is Dan and I'm a co-worker of Nick Matheson, who initially submitted 
this question (because the mail group had me blacklisted for a while for some 
reason).
 
 
Thank you for all of the suggestions. We were able to improve our bulk read 
performance from 3 MB/s to 60 MB/s (assuming the data are NOT in cache in 
both cases) by doing the following:

1. Storing the data in a bytea column instead of an array column.
2. Retrieving the data via the Postgres 9 CopyManager#copyOut(String sql, 
OutputStream stream) method (a sketch follows below).
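
A sketch of step 2 (the table samples and its bytea column payload are hypothetical; connection details are placeholders):

import java.io.ByteArrayOutputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class BulkRead {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "pass");
        CopyManager copy = ((PGConnection) con).getCopyAPI();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // stream the bytea column out in one COPY instead of building a ResultSet;
        // the binary COPY stream still has to be decoded by hand afterwards
        long rows = copy.copyOut(
                "COPY (SELECT payload FROM samples) TO STDOUT (FORMAT binary)", buf);
        System.out.println(rows + " rows, " + buf.size() + " bytes");
        con.close();
    }
}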
 
The key to the dramatic improvement appears to be the reduction in packing 
and unpacking time on the server and client, respectively. The server-side 
packing occurs when the retrieved data are packed into a bytestream for 
sending across the network; storing the data as a simple byte array reduces 
this time substantially. The client-side unpacking time is spent generating 
a ResultSet object; by unpacking the bytestream into the desired arrays of 
floats by hand instead, this time became close to 

Re: [PERFORM] performance libpq vs JDBC

2010-12-16 Thread Merlin Moncure
On Thu, Dec 16, 2010 at 7:14 AM, Richard Huxton <d...@archonet.com> wrote:
> On 16/12/10 09:21, Werner Scholtes wrote:
>> I assume that the wire protocol of PostgreSQL allows transmitting
>> multiple rows at once, but libpq doesn't have an interface to access that.
>> Is that right?
>
> Sounds wrong to me. The libpq client is the default reference implementation
> of the protocol. If there were large efficiencies that could be copied, they
> would be.
>
> Anyway - you don't need to assume what's in the protocol. It's documented
> here:
>   http://www.postgresql.org/docs/9.0/static/protocol.html
>
> I'd stick wireshark or some other network analyser on the two sessions - see
> exactly what is different.

There is only one explanation for the difference: they are slamming
data across the wire without waiting for the result. libpq queries
are synchronous: you send a query and wait for the result. This means
that for very simple queries like the above you can become network bound.

In C/C++ you can work around this using a couple of different methods.
COPY of course is the fastest, but extremely limiting in what it can
do. We developed libpqtypes (I love talking about libpqtypes) to deal
with this problem. In the attached example, it stacks data into an
array on the client, sends it to the server, which unnests and inserts
it. The attached example inserts a million rows in about 11 seconds
on my workstation (client-side prepare could knock that down to 8 or
so).

If you need to do something fancy, we typically create a receiving
function on the server in plpgsql which unnest()s the result and makes
decisions, etc. This is extremely powerful: you can compose and
send very rich data to/from Postgres in a single query.
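
A sketch of such a receiving function, reusing the ins_test table from the attached example (the decision logic is elided):

create or replace function ins_test_receive(vals ins_test[]) returns void as $$
begin
  -- unnest the array back into rows; real versions can filter or branch here
  insert into ins_test select * from unnest(vals);
end;
$$ language plpgsql;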

merlin

#include <libpq-fe.h>
#include <libpqtypes.h>
#include <stdio.h>

#define INS_COUNT 1000000

int main()
{
  int i;

  PGconn *conn = PQconnectdb("dbname=pg9");
  PGresult *res;
  if(PQstatus(conn) != CONNECTION_OK)
  {
    printf("bad connection");
    return -1;
  }

  PQtypesRegister(conn);

  PGregisterType type = {"ins_test", NULL, NULL};
  PQregisterComposites(conn, &type, 1);

  PGparam *p = PQparamCreate(conn);
  PGarray arr;
  arr.param = PQparamCreate(conn);
  arr.ndims = 0;

  PGparam *t = PQparamCreate(conn);

  for(i = 0; i < INS_COUNT; i++)
  {
    PGint4 a = i;
    PGtext b = "some_text";
    PGtimestamp c;
    PGbytea d;

    d.len = 8;
    d.data = b;

    c.date.isbc   = 0;
    c.date.year   = 2000;
    c.date.mon    = 0;
    c.date.mday   = 19;
    c.time.hour   = 10;
    c.time.min    = 41;
    c.time.sec    = 6;
    c.time.usec   = 0;
    c.time.gmtoff = -18000;

    PQputf(t, "%int4 %text %timestamptz %bytea", a, b, &c, &d);
    PQputf(arr.param, "%ins_test", t);
    PQparamReset(t);
  }

  if(!PQputf(p, "%ins_test[]", &arr))
  {
    printf("putf failed: %s\n", PQgeterror());
    return -1;
  }
  res = PQparamExec(conn, p, "insert into ins_test select * from unnest($1) r(a, b, c, d)", 1);

  if(!res)
  {
    printf("got %s\n", PQgeterror());
    return -1;
  }
  PQclear(res);
  PQparamClear(p);
  PQfinish(conn);
  return 0;
}



Re: [PERFORM] Help with bulk read performance

2010-12-16 Thread Pierre C


> If the data are stored as a byte array but retrieved into a ResultSet,
> the unpacking time goes up by an order of magnitude and the
> observed total throughput is 25 MB/s. If the data are stored in a
> Postgres float array and unpacked into a byte stream, the
> observed throughput is 20 MB/s.



float to text conversions are very slow; this is in fact due to the 
mismatch between the base-2 (IEEE 754) and base-10 (text) floating-point 
representations, which requires very complex calculations.




Re: [PERFORM] How to get FK to use new index without restarting the database

2010-12-16 Thread Tom Lane
Eric Comeau <ecom...@signiant.com> writes:
> Is there a way to force the DB to re-evaluate its execution plan for a FK
> without bouncing the DB?
>
> PostgreSQL 8.1.17

You don't need to bounce the whole DB, but you will need to start fresh
sessions.  We didn't add automatic invalidation of those plans until 8.3.

regards, tom lane



Re: [PERFORM] Index Bloat - how to tell?

2010-12-16 Thread Mladen Gogala

Dave Crooke wrote:
> There is a plugin called pgstattuple which can be quite informative;
> however, it actually does a full scan of the table / index files,
> which may be a bit invasive depending on your environment and load.
>
> http://www.postgresql.org/docs/current/static/pgstattuple.html
>
> It's in the contrib (at least for 8.4), and so you have to import its
> functions into your schema using the script in the contrib directory.
>
> Cheers
> Dave

I tried it with one of my databases:


testtrack=# select * from pgstatindex('public.defects_pkey');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          1 |     827392 |             3 |              0 |        100 |           0 |             0 |            70.12 |                 22
(1 row)


What is leaf_fragmentation? How is it defined? I wasn't able to find any 
definition of that number or how it is calculated. I verified that running 
REINDEX makes it 0:



testtrack=# reindex table public.defects;
REINDEX
testtrack=# select * from pgstatindex('public.defects_pkey');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          1 |     647168 |             3 |              0 |         78 |           0 |             0 |            89.67 |                  0
(1 row)


--

Mladen Gogala 
Sr. Oracle DBA

1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com 
The Leader in Integrated Media Intelligence Solutions







Re: [PERFORM] How to get FK to use new index without restarting the database

2010-12-16 Thread Eric Comeau

On 10-12-16 11:27 AM, Tom Lane wrote:

> Eric Comeau <ecom...@signiant.com> writes:
>> Is there a way to force the DB to re-evaluate its execution plan for a FK
>> without bouncing the DB?
>>
>> PostgreSQL 8.1.17
>
> You don't need to bounce the whole DB, but you will need to start fresh
> sessions.  We didn't add automatic invalidation of those plans until 8.3.
>
> regards, tom lane



We confirmed that disconnecting and reconnecting resolves the issue.

Thanks to all that helped.

I replied to Tom and the list yesterday from my e-mail, but I don't see 
my reply here, so it must be stuck in the ether somewhere.


Eric
