Re: [HACKERS] [BUGS] BUG #3244: problem with PREPARE

2007-04-24 Thread William Lawrance

In the modified version of ECPG that we used for our benchmark, we
PREPARE'd all statements.


-Original Message-
From: Michael Meskes [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 24, 2007 12:45 AM
To: William Lawrance
Cc: Tom Lane; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [BUGS] BUG #3244: problem with PREPARE


On Mon, Apr 23, 2007 at 02:02:04PM -0700, William Lawrance wrote:
 Our first attempt to use the ECPG prepare interface revealed that ECPG
 doesn't use the PQlib prepare function. The ECPG prepare replaces any
 parameters with their values and presents a new SQL statement to the

This is true and should also be documented. The reason for this
behaviour is simply that ECPG prepare feature was added before the
backend had its own prepare feature. And no one changed it so far.

 There are several difficulties to be encountered when attempting to use
 this within a program using the ECPG interface. For example, the
 connection structure for PQlib isn't readily available, and the
 transaction semantics must be synchronized with ECPG's state. This did
 work, but it was fairly clumsy.

Right, that's what makes it non trivial.

 Since we wanted to do this in a cleaner manner, and also wished to avoid
 changing the applications if possible, we used the following approach:
 
 Within the execute.c module, we added routines to manage a cache
 of prepared statements. These routines are able to search, insert,
 and delete entries in the cache. The key for these cache entries is
 the text of the SQL statement as passed by ECPG from the application
 program.
 
 Within the same module, we replaced the ECPGexecute function.
 This is the function that is called to execute a statement after
 some preliminary housekeeping is done. The original ECPGexecute
 function constructs an ASCII string by replacing each host variable
 with its current value and then calling PQexec. The new
 ECPGexecute function does the following:
 
   - build an array of the current values of the host variables.
 
   - search the cache for an entry indicating that this statement
 has already been prepare'd, via  PQprepare
 
   - If no entry was found in the previous step, call PQprepare
 for the statement and then insert an entry for it into the
 cache. If this requires an entry to be re-used, execute a
 DEALLOCATE PREPARE.. for the previous contents.
 
   - At this point, the SQL statement has been prepare'd by PQlib,
 either when the statement was executed in the past, or in
 the previous step.
 
   - call PQexecPrepared, using the array of parameters built
 in the first step above.

Does this mean you prepare ALL statements? Or where you only talking
about statements that are prepared in the application?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [BUGS] BUG #3244: problem with PREPARE

2007-04-23 Thread William Lawrance

We were about to submit a patch to ECPG to improve the performance
of embedded SQL, when we discovered that PREPARE had quit working.

Background:

We have a benchmark for a very large customer that consists of several
hundred programs containing several thousand embedded SQL statements.
In a three hour execution, millions of calls are made to the DBMS server.
This benchmark has been successfully executed using Oracle, DB2, and
PostgreSQL. In the benchmark, Postgres is shown to be slower, by far, than
the other DBMS systems.

While searching for ways to improve the PostgreSQL performance,
we noticed that query plans are not saved and re-used in the server
if accessed from the ECPG interface. In the customer benchmark this is key.
The program processes a large input file of work. For each item in the file
there is a large sequence of the application and SQL that must be executed.
For each subsequent item of the input file, many of the same SQL statements
are processed.

What we found was that each SQL statement was constructed by ECPG as
an ASCII string and presented to the postmaster to be re-optimized each
time the SQL was received. We found that using the 'prepare' interface
to save the query plan after the first execution would be a
significant savings - approximately 30% of the elapsed time of the
application.  The reduction in elapsed time was over 1 hour for the
benchmark.

Details:

Our first attempt to use the ECPG prepare interface revealed that ECPG
doesn't use the PQlib prepare function. The ECPG prepare replaces any
parameters with their values and presents a new SQL statement to the
postmaster each time. We then tried to use the PQlib prepare interface.
There are several difficulties to be encountered when attempting to use
this within a program using the ECPG interface. For example, the
connection structure for PQlib isn't readily available, and the
transaction semantics must be synchronized with ECPG's state. This did
work, but it was fairly clumsy.

Since we wanted to do this in a cleaner manner, and also wished to avoid
changing the applications if possible, we used the following approach:

Within the execute.c module, we added routines to manage a cache
of prepared statements. These routines are able to search, insert,
and delete entries in the cache. The key for these cache entries is
the text of the SQL statement as passed by ECPG from the application
program.

Within the same module, we replaced the ECPGexecute function.
This is the function that is called to execute a statement after
some preliminary housekeeping is done. The original ECPGexecute
function constructs an ASCII string by replacing each host variable
with its current value and then calling PQexec. The new
ECPGexecute function does the following:

  - build an array of the current values of the host variables.

  - search the cache for an entry indicating that this statement
has already been prepare'd, via  PQprepare

  - If no entry was found in the previous step, call PQprepare
for the statement and then insert an entry for it into the
cache. If this requires an entry to be re-used, execute a
DEALLOCATE PREPARE.. for the previous contents.

  - At this point, the SQL statement has been prepare'd by PQlib,
either when the statement was executed in the past, or in
the previous step.

  - call PQexecPrepared, using the array of parameters built
in the first step above.






-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 22, 2007 7:53 PM
To: William Lawrance
Cc: [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
Subject: Re: [BUGS] BUG #3244: problem with PREPARE


William Lawrance [EMAIL PROTECTED] writes:
 This program that does   PQprepare and then
 PQexecPrepared has worked previously, but doesn't
 work now.
 ...
 strcpy(openStmt, declare C1 cursor for select cola
 from tprep
 where cola = $1);
 res = PQprepare(conn, stmtopen, openStmt, 0, 0);

I looked into this a bit and found that the issue comes from my recent
changes in support of plan caching.  To simplify matters, I instituted
a rule that utility statements don't have any interesting
transformations done at parse analysis time; see this new comment in
analyze.c:

 * For optimizable statements, we are careful to obtain a suitable lock on
 * each referenced table, and other modules of the backend preserve or
 * re-obtain these locks before depending on the results.  It is therefore
 * okay to do significant semantic analysis of these statements.  For
 * utility commands, no locks are obtained here (and if they were, we could
 * not be sure we'd still have them at execution).  Hence the general rule
 * for utility commands is to just dump them into a Query node
untransformed.
 * parse_analyze does do some purely syntactic transformations on CREATE
TABLE