[HACKERS] Updateable cursors

FAST PostgreSQL Mon, 22 Jan 2007 20:43:09 -0800

We are trying to add updatable cursor functionality to PostgreSQL. The 
details of the design and the issues we are facing are given below. We look 
forward to advice on how to proceed with these issues.


Rgds,
Arul Shaji




 
1. Introduction
--------------
This is a combined proposal and design document for adding updatable 
(insensitive) cursor capability to the PostgreSQL database. 
Several proposals for implementing this feature have been made since 2003, 
so there appears to be community interest in it. The feature will enable the 
following constructs to be processed:


UPDATE <table_name> SET value_list WHERE CURRENT OF <cursor_name>
DELETE FROM <table_name> WHERE CURRENT OF <cursor_name>

This lets users update or delete a specific row of a table, namely the row 
most recently fetched from the cursor.


2. Overall Conceptual Design
-----------------------------
The design follows a command through the stages of processing, from the 
grammar changes in the file gram.y through to the changes in the Executor 
portion of the database architecture.

2.1 Changes to the Grammar
------------------------------
The following changes will be done to the PostgreSQL grammar:

UPDATE statement has the option WHERE CURRENT OF <cursor_name> added
DELETE statement has the option WHERE CURRENT OF <cursor_name> added

The cursor name is held in the UpdateStmt and DeleteStmt structures. Only 
the name of the cursor is stored there.

The PL/pgSQL grammar changes in the same manner.

The word CURRENT will be added to the ScanKeywords array in keywords.c.
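
As an illustration of the keyword change, here is a minimal sketch of a 
sorted keyword table searched by binary search, which is how the keyword 
list in keywords.c is looked up. Everything prefixed demo_ or TOK_ is 
hypothetical and stands in for the real ScanKeywords entries and the token 
codes generated from gram.y. The point is only that the new entry has to be 
placed in sorted position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; the real ones are generated from gram.y. */
enum { TOK_CREATE = 1, TOK_CURRENT, TOK_CURSOR, TOK_DELETE, TOK_UPDATE };

typedef struct DemoKeyword
{
    const char *name;   /* lower-case keyword text */
    int         value;  /* grammar token code */
} DemoKeyword;

/* Must stay sorted by name because the lookup below is a binary search. */
static const DemoKeyword demo_keywords[] = {
    {"create", TOK_CREATE},
    {"current", TOK_CURRENT},   /* new entry, inserted in sorted position */
    {"cursor", TOK_CURSOR},
    {"delete", TOK_DELETE},
    {"update", TOK_UPDATE},
};

static int
demo_keyword_cmp(const void *key, const void *elem)
{
    return strcmp((const char *) key, ((const DemoKeyword *) elem)->name);
}

/* Returns the matching entry, or NULL if text is not a keyword. */
static const DemoKeyword *
demo_keyword_lookup(const char *text)
{
    assert(text != NULL);
    return bsearch(text, demo_keywords,
                   sizeof(demo_keywords) / sizeof(demo_keywords[0]),
                   sizeof(demo_keywords[0]), demo_keyword_cmp);
}
```

The sketch expects the caller to pass lower-case text and does no case 
folding of its own.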


2.2 Changes to Affected Data Structures
------------------------------------------
The following data structures are affected by this change: the Portal 
structure, the QueryDesc structure, and the UpdateStmt and DeleteStmt 
structures.

The Portal will contain a list of (relation id, tuple id) pairs describing 
the tuple currently held in the QueryDesc structure. The list will have one 
entry for each entry in the relation-list of the statement below: 

DECLARE <cursor_name> CURSOR [WITH HOLD] FOR SELECT ... FOR UPDATE OF <relation-list> 

The QueryDesc structure will contain the relation id and the tuple id 
relating to the tuple obtained via the FETCH command so that it can be 
propagated back to the Portal for storage in the list described above.

The UpdateStmt and DeleteStmt structures have the cursor name added so that 
the information is available for use in obtaining the portal structure 
related to the cursor previously opened via the DECLARE CURSOR request.
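
The per-relation position list described above could be sketched roughly as 
follows. This is not PostgreSQL code: DemoTid stands in for the backend's 
tuple id type (block number plus line offset), relation ids are plain 
integers, and all Demo/demo_ names, the fixed-size array and its cap are 
assumptions made for the sake of a small example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RELS 8     /* arbitrary cap for this sketch */

/* Stand-in for a tuple id: block number plus line offset. */
typedef struct DemoTid
{
    uint32_t block;
    uint16_t offset;
} DemoTid;

/* One entry per relation named in FOR UPDATE OF <relation-list>. */
typedef struct DemoCursorPos
{
    uint32_t relid;     /* relation id */
    DemoTid  tid;       /* tuple most recently fetched from that relation */
} DemoCursorPos;

/* Hypothetical addition to the Portal: the position list itself. */
typedef struct DemoPortal
{
    const char   *name;
    int           npos;
    DemoCursorPos pos[DEMO_MAX_RELS];
} DemoPortal;

/* Called after FETCH: remember which tuple of relid the cursor is on. */
static bool
demo_portal_set_pos(DemoPortal *portal, uint32_t relid, DemoTid tid)
{
    for (int i = 0; i < portal->npos; i++)
    {
        if (portal->pos[i].relid == relid)
        {
            portal->pos[i].tid = tid;   /* cursor moved to a new row */
            return true;
        }
    }
    if (portal->npos == DEMO_MAX_RELS)
        return false;                   /* list is full */
    portal->pos[portal->npos].relid = relid;
    portal->pos[portal->npos].tid = tid;
    portal->npos++;
    return true;
}

/* Called for WHERE CURRENT OF: NULL means no row has been fetched yet. */
static const DemoTid *
demo_portal_get_pos(const DemoPortal *portal, uint32_t relid)
{
    for (int i = 0; i < portal->npos; i++)
    {
        if (portal->pos[i].relid == relid)
            return &portal->pos[i].tid;
    }
    return NULL;
}
```

In this sketch a second FETCH overwrites the stored tuple id for the same 
relation, which matches the assumption that the cursor is positioned on one 
row at a time.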


2.3 Changes to the SQL Parser
------------------------------------
At present, the FOR UPDATE clause of the DECLARE CURSOR command is accepted 
by the grammar but causes an error message later in the processing, since 
cursors are currently not updatable. This now needs to change. The FOR 
UPDATE clause has to be valid, but not the FOR SHARE clause. 

The relation names that follow the FOR UPDATE clause will be added to the 
rtable in the Query structure and identified by means of the rowMarks array. 
In the case of an updatable cursor the FOR SHARE option is not allowed; 
therefore all entries in the rtable that are identified by the rowMarks array 
must relate to tables that are FOR UPDATE.

In an UPDATE or DELETE statement, the WHERE CURRENT OF <cursor_name> clause 
results in the cursor name being placed in the UpdateStmt or DeleteStmt 
structure. In transformUpdateStmt() and transformDeleteStmt(), the cursor 
name is used to obtain a pointer to the related Portal structure, and the 
tuple affected by the current UPDATE or DELETE statement is extracted from 
the Portal, where it was placed by a previous FETCH request. At this point 
all the information for the statement is available, so it can be transformed 
into a standard UPDATE or DELETE statement and sent for 
rewrite/planning/execution as usual.
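
One way to picture the "standard" statement that results is a qualifier on 
the ctid system column, whose text form is (block,offset). The sketch below 
only formats such a qualifier as a string. That is an assumption made for 
illustration: the transform functions work on parse trees, not text, and 
demo_format_tid_qual is a hypothetical helper, not an existing function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes "ctid = '(block,offset)'" into buf.
 * Returns 0 on success, -1 if the text does not fit in buflen bytes.
 */
static int
demo_format_tid_qual(char *buf, size_t buflen,
                     uint32_t block, uint16_t offset)
{
    int n = snprintf(buf, buflen, "ctid = '(%lu,%u)'",
                     (unsigned long) block, (unsigned) offset);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

With a stored position of block 3, offset 7, the helper produces 
ctid = '(3,7)', so UPDATE t SET ... WHERE CURRENT OF c would be handled much 
like UPDATE t SET ... WHERE ctid = '(3,7)'.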

2.4 Changes to the Optimizer
------------------------------
A TidScan node needs to be added when planning positioned UPDATE / DELETE 
statements (those using WHERE CURRENT OF). This is to enable the tuple ids 
of the tuples in the tables relating to the query to be obtained. A new 
mechanism is needed to achieve this because, at present, a Tid scan is done 
only if a standard WHERE condition on the update or delete statement 
provides Tid qualifier data.


2.5 Changes to the Executor
-------------------------------
There are various options that have been considered for this part of the 
enhancement. These are described in the sections below.

We would like to hear opinions on which option is the best way to go or, if 
none of these is acceptable, any alternative ideas.

Option 1: MVCC Via Continuous Searching of Database

The Executor is to be changed in the following ways:
1)      When the FETCH statement is executed, the id of the resulting tuple 
is extracted and passed back to the Portal structure, where it is saved to 
indicate that the cursor is currently positioned on a tuple.
2)      When the UPDATE or DELETE request is executed, the tuple id 
previously FETCHed is held in the QueryDesc structure so that it can be 
compared with the tuple ids returned from the TidScan node processed prior 
to the actual UPDATE / DELETE node in the plan. This enables a decision to 
be made as to whether the tuple held in the cursor is visible to the 
UPDATE / DELETE request according to the rules of concurrency.

The database is searched again at each UPDATE / DELETE command, so the hash 
table described in Options 3 and 4 is not required. With no hash table held 
in memory or on disk, this approach is not memory intensive, but it is 
processing intensive. 

This is a good one-off solution to the problem and, taken in isolation, is 
probably the best approach. However, considering the methods used in other 
areas of PostgreSQL, it is probably not the best solution overall. This 
option will probably not be pursued further.
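
The comparison in step 2 of Option 1 amounts to checking whether the fetched 
tuple id is among those the scan returns for the current statement. A 
minimal sketch, assuming the scan results are available as a plain array 
(the Demo/demo_ names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a tuple id: block number plus line offset. */
typedef struct DemoTid
{
    uint32_t block;
    uint16_t offset;
} DemoTid;

static bool
demo_tid_equal(DemoTid a, DemoTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/*
 * fetched: tuple id saved in the cursor by the last FETCH.
 * scan:    tuple ids returned by the scan run for this UPDATE / DELETE.
 * Returns true if the cursor's tuple is among them.
 */
static bool
demo_cursor_tuple_visible(DemoTid fetched, const DemoTid *scan, size_t nscan)
{
    for (size_t i = 0; i < nscan; i++)
    {
        if (demo_tid_equal(fetched, scan[i]))
            return true;
    }
    return false;
}
```

Nothing is kept between statements, which is why this option needs no extra 
memory but repeats the search on every UPDATE / DELETE.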

Option 2: MVCC via New Snapshot

The executor can be changed by adding a new kind of snapshot, used 
specifically to decide whether a given tuple, retrieved from the database 
during an update or delete statement, should be visible during the current 
transaction.

This idea was used by Gavin for a previous updatable cursor patch, but 
objections were raised.

Option 3: MVCC Via Hash Table in Memory

The executor can be changed to save the tuples of the table being checked 
into a hash table and to compare each tuple in the cursor with that set to 
check whether the tuple should be visible. 
This approach has the advantage that it will be quick. It has the 
disadvantage that, since the hash table will contain all the tuples of the 
table being checked, it may use all local memory for a large table.
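
To make the memory trade-off concrete, here is a small sketch of an 
in-memory hash set. It stores tuple ids rather than whole tuples, which is 
an assumption borrowed from Option 4, and it uses a fixed-capacity 
open-addressing table purely to keep the example short. All Demo/demo_ names 
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct DemoTidSet
{
    uint64_t *slots;     /* packed tuple ids; 0 marks an empty slot */
    size_t    capacity;  /* number of slots, must be a power of two */
    size_t    count;     /* number of stored tuple ids */
} DemoTidSet;

/* Pack block and offset into one key; nonzero because offsets start at 1. */
static uint64_t
demo_tid_key(uint32_t block, uint16_t offset)
{
    assert(offset != 0);
    return ((uint64_t) block << 16) | offset;
}

static bool
demo_tidset_init(DemoTidSet *set, size_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    set->slots = calloc(capacity, sizeof(uint64_t));
    set->capacity = capacity;
    set->count = 0;
    return set->slots != NULL;
}

/* Multiplicative hash reduced to a slot index. */
static size_t
demo_tid_slot(const DemoTidSet *set, uint64_t key)
{
    return (size_t) ((key * UINT64_C(0x9E3779B97F4A7C15)) >> 32)
           & (set->capacity - 1);
}

/* Returns false only when the table is too full to take a new entry. */
static bool
demo_tidset_add(DemoTidSet *set, uint32_t block, uint16_t offset)
{
    uint64_t key = demo_tid_key(block, offset);
    size_t   i = demo_tid_slot(set, key);

    while (set->slots[i] != 0)
    {
        if (set->slots[i] == key)
            return true;                    /* already present */
        i = (i + 1) & (set->capacity - 1);  /* linear probing */
    }
    if (set->count + 1 >= set->capacity)
        return false;       /* keep one slot empty so probes terminate */
    set->slots[i] = key;
    set->count++;
    return true;
}

static bool
demo_tidset_contains(const DemoTidSet *set, uint32_t block, uint16_t offset)
{
    uint64_t key = demo_tid_key(block, offset);
    size_t   i = demo_tid_slot(set, key);

    while (set->slots[i] != 0)
    {
        if (set->slots[i] == key)
            return true;
        i = (i + 1) & (set->capacity - 1);
    }
    return false;
}
```

Even at eight bytes per entry the set grows with the number of rows in the 
table, which is the memory concern raised above. Storing whole tuples would 
cost correspondingly more.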

Option 4: MVCC Via Hash Table on Disk

The first time the UPDATE or DELETE request is executed, the Tid scan 
retrieval from the database is done first. At this point the tuple id of 
each row in the table to be updated by the request is available in the 
executor. These tuple ids need to be stored in a hash table held on disk 
because, if the table is large, there could be a huge number of them. This 
data is then available for comparison with the individual tuple to be 
updated or deleted to check whether it should be processed. The hash table 
will exist for the duration of the transaction, from BEGIN to END (or 
ABORT). 

The hash table is then used to identify whether the tuple should be visible 
during the current transaction. If the tuple should be visible then the 
update or delete proceeds as usual.

This approach has the advantage that it will use little memory, but it will 
be relatively slow as the data has to be accessed from disk. 

Option 5: Store Tuple Id in Snapshot

The Snapshot structure can be changed to include the tuple id. This enables 
the current state of the tuple to be identified with respect to the current 
transaction.
For the tuple id identified in the cursor at the point where the 
DELETE/UPDATE statement is being processed, the snapshot can then be used 
to determine whether the tuple should be visible in the context of the 
current transaction.


2.6 Changes to the Catalog 
----------------------------
The Catalog needs to reflect changes introduced by the updatable cursor 
implementation. A boolean attribute is_for_update is to be added to 
pg_cursors. It indicates whether the cursor is for update (value is TRUE) 
or for share (value is FALSE, the default value).
 

3. Design Assumptions
----------------------------
The following design assumptions are made:

As PostgreSQL 8.2 does not support the SENSITIVE cursor option, the tuples 
contained in a cursor can never be updated, so these tuples will always 
appear in their original form as at the start of the transaction. This is in 
breach of the SQL:2003 Standard as described in 
5WD-02-Foundation-2003-09.pdf, p 810. The standard requires the updatable 
cursor to be declared as sensitive.

With respect to nested transactions: in PostgreSQL, nested transactions are 
implemented by defining savepoints via the keyword SAVEPOINT. A ROLLBACK 
TO SAVEPOINT rolls back the database contents to the last savepoint in this 
transaction or to the BEGIN statement, whichever is closer.

It is assumed that the FETCH statement is used to return only a single row 
into the cursor with each command when the cursor is updatable.

According to the SQL:2003 Standard, UPDATE and DELETE statements may contain 
only a single base table.

The DECLARE CURSOR statement is supposed to use column level locking, but 
PostgreSQL supports only row level locking. As a result, the column list 
that the standard requires in DECLARE <cursor_name> CURSOR FOR SELECT ... 
FOR UPDATE OF column-list becomes a relation (table) list.


