RFC 130 (v4) Transaction-enabled variables for Perl6

Perl6 RFC Librarian Thu, 24 Aug 2000 08:45:44 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Transaction-enabled variables for Perl6

=head1 VERSION

  Maintainer: Szabó, Balázs <[EMAIL PROTECTED]>
  Date: 17 Aug 2000
  Last Modified: 24 Aug 2000
  Version: 4
  Mailing List: [EMAIL PROTECTED]
  Number: 130

=head1 ABSTRACT

Transactions are quite important in database-enabled applications.
Professional database systems have transaction handling built in, but
only a few languages support transactions at the level of individual
variables.

In this RFC we look at what such variables could look like in Perl6.

=head1 CHANGES

=head2 Version 4

=over 4

=item *

TIE interface callbacks are renamed to TIECOMMIT, TIEROLLBACK

=item *

2 phase commit described

=item *

PREPARE, TIEPREPARE added to support two phase commits

=item *

Object interface described

=item *

"safe" renamed to "trans"

=back

=head2 Version 3

=over 4

=item * 

Added a tie interface change request: COMMIT and ROLLBACK callbacks, and
a new package global

=item *

Fixed some formatting errors in this POD.

=item *

'use varlock' renamed to 'use transaction'.

=back

=head2 Version 2

=over 4

=item *

Detailed implementation description

=item *

Added a new pragma, 'varlock', for concurrency control.

=back

=head1 DESCRIPTION

In short, we have the "local" keyword, which changes the value of a
variable only for the runtime of the current scope. A transaction-enabled
variable should start out like a "local" one, but IF the current scope
is left normally (without an exception), its value is then copied back
into the global one.

We need a new keyword for declaring such a variable. I think
"transaction" is too long; we could use "trans" for this.

Preferred syntax:

  sub trans_test {
    my ($self,@params)=@_;
    trans $self->{value}=$new_value;
  
    # ...
  
    die "Error occurred" if $some_error;
  
    function_call(...);
  
    # ...
  
  }

Meaning (in semi perl5 syntax):

  sub trans_test {
    my ($self,@params)=@_;
    local $self->{value}=$new_value;
  
    # ...
  
    die "Error occurred" if $some_error;
  
    function_call(...);
  
    # ...
  
    # "global" is pseudo-syntax: copy the localized value to the outer one
    global $self->{value}=$self->{value};
  }
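
The intended behaviour can be approximated in today's perl5. The
following is only a rough sketch: the helper name with_trans is
hypothetical and not part of this proposal. It takes a reference to the
variable, the tentative value and a code block, keeps the new value if
the block ends normally, and restores the old one if the block dies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the proposal): emulates "trans" for one
# scalar slot. $ref is a reference to the variable, $new its tentative value.
sub with_trans {
    my ($ref, $new, $body) = @_;
    my $saved = $$ref;          # remember the committed value
    $$ref = $new;               # tentative value, visible inside the block
    my $ok = eval { $body->(); 1 };
    return 1 if $ok;            # normal exit: the new value stays (commit)
    my $err = $@;
    $$ref = $saved;             # exception: put the old value back (rollback)
    die $err;                   # re-throw so the caller still sees the error
}

my $self = { value => 'old' };

# Commit: the block finishes normally.
with_trans(\$self->{value}, 'new', sub { });
print "after commit:   $self->{value}\n";

# Rollback: the block dies, so the previously committed value is restored.
eval { with_trans(\$self->{value}, 'newer', sub { die "Error occurred\n" }) };
print "after rollback: $self->{value}\n";
```

Unlike a real "trans" keyword, this helper needs the protected code to be
passed in as a closure, which is exactly the clumsiness the proposed
syntax is meant to remove.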

If we want more control while keeping the syntax simple, we can use a
pragma that sets the attributes of locking.  I think "transaction" would
be a good name for it:

  use transaction (mode => 'lock', timeout=>6000);

Parameters for transaction:

=over 4

=item mode

can be:

=over 4

=item simple

No blocking and no concurrency control (default).

In a non-threaded environment this causes minimal overhead, and no
locking overhead at all.

=item lock

Explicitly lock the accessed variables (shared and exclusive locks are
used).

=item version

This is similar to PostgreSQL's multi-version concurrency control. It
requires more memory, but is less likely to run into a deadlock.

=back 

=item timeout

Maximum time in milliseconds to wait for a lock. 0 means non-blocking.
If the timeout is reached, the system throws an exception.

=item isolation

Transaction isolation level. This can be: 

=over 4

=item 0

Read uncommitted

=item 1

Read committed (default)

=item 2

Repeatable read

=item 3

Serializable.

=back

PostgreSQL implements only 1 and 3 AFAIK, so I think we could implement
only those levels. Then 0 and 2 would be synonyms for 1 and 3, but we
would keep room for a future implementation.

See the PostgreSQL documentation for the details.

=back
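
As an illustration of the parameters above, here is a minimal perl5
sketch of how such options might be checked. The function name
normalize_trans_options and the returned key effective_isolation are
assumptions made for this example only. The sketch fills in the defaults
named in the text and treats isolation levels 0 and 2 as 1 and 3. No
default is assumed for timeout, because none is stated above.

```perl
use strict;
use warnings;

# Hypothetical helper: checks the options that would be passed to
# "use transaction (...)" and fills in the defaults named in the text.
sub normalize_trans_options {
    my (%opt) = @_;
    my %known_mode = map { $_ => 1 } qw(simple lock version);

    my $mode = defined $opt{mode} ? $opt{mode} : 'simple';
    die "unknown mode '$mode'\n" unless $known_mode{$mode};

    my $isolation = defined $opt{isolation} ? $opt{isolation} : 1;
    die "isolation must be 0..3\n" unless $isolation =~ /\A[0-3]\z/;
    # Only levels 1 and 3 would be implemented at first, so 0 is treated
    # as 1 and 2 as 3, as suggested in the text.
    my $effective = $isolation <= 1 ? 1 : 3;

    my $timeout = $opt{timeout};    # milliseconds; 0 means non-blocking
    die "timeout must be a non-negative integer\n"
        if defined $timeout && $timeout !~ /\A\d+\z/;

    return (mode => $mode, timeout => $timeout,
            isolation => $isolation, effective_isolation => $effective);
}

my %o = normalize_trans_options(mode => 'lock', timeout => 6000, isolation => 2);
print "$_ = $o{$_}\n" for sort keys %o;
```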

=head2 Two phase commit 

Two phase commit is the common way to deal with distributed
transactions. Perl needs an interface to objects (and tied variables)
for this in order to become a reliable transaction handler. You can
choose to implement these features in your object or your tied variable.
If you don't, perl will give you a rough default.

At the end of the transaction, two different things can happen: rollback
or commit. On rollback, all the transaction variables must be rolled
back. On commit, a two-phase commit procedure is started.

The first phase is preparing for the commit: each participant checks and
allocates the resources it needs for the commit, flushes caches, etc.
After that it can decide whether it is able to commit or not. If all
participants answer "yes", then the commit phase begins: the coordinator
sends "commit" messages to the participants, and the transaction
finishes. If any of the participants returns a false value in the
"prepare" phase, then the whole transaction needs to be rolled back.

What does this look like in perl?

You have objects. Objects can be transaction-enabled, and if you want
that, you need to define the following methods: COMMIT, ROLLBACK,
PREPARE. If you have a tied variable, then you can define three
callbacks for it: TIECOMMIT, TIEROLLBACK, TIEPREPARE. These can be
used to make an object or a tied variable transaction-safe. If you
don't define PREPARE or TIEPREPARE, then it will be only a one phase
commit. If you don't define COMMIT (or TIECOMMIT) and ROLLBACK (or
TIEROLLBACK), then perl will use the simple "copy back the old value on
rollback" mechanism, which works well when there is no multithreading
and the data needs no special handling.
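
The two phases can be sketched in perl5 as follows. This is only an
illustration under assumptions: the coordinator function
finish_transaction and the toy class Account are hypothetical, and only
the method names PREPARE, COMMIT and ROLLBACK come from this proposal.
The default "copy back the old value" behaviour for objects without
these methods is left out to keep the sketch short.

```perl
use strict;
use warnings;

# Hypothetical coordinator: runs the two phases over a list of participant
# objects and returns 1 on commit, 0 on rollback.
sub finish_transaction {
    my (@participants) = @_;

    # Phase 1: every participant that defines PREPARE gets a vote.
    # A participant without PREPARE cannot veto the commit.
    for my $p (@participants) {
        next unless $p->can('PREPARE');
        next if $p->PREPARE;
        # One "no" vote rolls back the whole transaction.
        for my $q (@participants) {
            $q->ROLLBACK if $q->can('ROLLBACK');
        }
        return 0;
    }

    # Phase 2: everybody voted "yes", so everybody is told to commit.
    for my $p (@participants) {
        $p->COMMIT if $p->can('COMMIT');
    }
    return 1;
}

{
    package Account;    # toy participant, for illustration only
    sub new {
        my ($class, %arg) = @_;
        return bless { healthy => $arg{healthy}, log => [] }, $class;
    }
    sub PREPARE  { my $self = shift; push @{ $self->{log} }, 'prepare'; $self->{healthy} }
    sub COMMIT   { my $self = shift; push @{ $self->{log} }, 'commit' }
    sub ROLLBACK { my $self = shift; push @{ $self->{log} }, 'rollback' }
}

my $good = Account->new(healthy => 1);
my $bad  = Account->new(healthy => 0);

print "both healthy: ", finish_transaction($good, Account->new(healthy => 1)), "\n";
print "one failing:  ", finish_transaction($good, $bad), "\n";
print "calls seen by the healthy account: @{ $good->{log} }\n";
```

In the second call the healthy account has already voted "yes" when the
other participant refuses, so it receives ROLLBACK as well.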

=head2 Tie interface

Making a tied variable transaction-enabled is not straightforward.
Imagine you have tied a hash to a (not transaction-enabled) dbm file.
When you fetch, you need to put a shared lock (or version control) on
the dbm file or key; when you store, you need to put an exclusive lock
on it; and when the transaction ends, you need to release the lock. For
this reason, we can add two callbacks: TIECOMMIT and TIEROLLBACK.

If we don't want to use locking, or want to do advanced transaction
management, we can provide a transaction id to the callbacks. This can
be done with a new package global variable (which is localized in every
call); its name can be $Package::TRANSACTION_ID. A new parameter is not
a good idea, because some of the callbacks (PUSH, UNSHIFT, PRINT,
PRINTF, etc.) take a LIST as their argument, so an extra parameter would
force an unnecessary rewrite of the tie interface.

Following is the description of the modifications of the tie interface:

=over 4

=item New package global

$Package::TRANSACTION_ID will be a unique identifier of the current
transaction (if any).

=item New Callbacks

Two (or three) new callbacks are required:

=over 4

=item TIEPREPARE $this

This is the first part of the two-phase commit. It must return
true if the variable is prepared for the COMMIT, false otherwise. If
this callback is not defined, then the variable loses the right to abort
the transaction, and perl implicitly assumes a return value of 1 in
this case.

=item TIECOMMIT $this

If it is defined, then it is called after TIEPREPARE returns 1 for all
the transaction-enabled variables in the current scope. This must be
used to commit the transaction.

=item TIEROLLBACK $this

If it is defined, then it is called at the end of a failed transaction.
If NOT defined, then STORE will be called with the old value of the
variable.

=back

=back
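
A rough perl5 sketch of how the callbacks described above could be
dispatched for a tied scalar follows. Everything except the callback
names TIEPREPARE, TIECOMMIT and TIEROLLBACK is an assumption made for
the example: the driver end_of_trans stands in for what perl itself
would do at the end of the block, and BufferedScalar and PlainScalar are
toy classes.

```perl
use strict;
use warnings;

# Hypothetical driver: what perl might do for one tied scalar when the
# enclosing block ends. $ok is true if the block ended without an
# exception; $old is the value the variable had when "trans" was applied.
sub end_of_trans {
    my ($ref, $old, $ok) = @_;
    my $obj = tied $$ref or die "not a tied scalar\n";

    # A missing TIEPREPARE counts as an implicit "yes".
    my $prepared = $ok && (!$obj->can('TIEPREPARE') || $obj->TIEPREPARE);

    if ($prepared) {
        $obj->TIECOMMIT if $obj->can('TIECOMMIT');
        return 1;
    }
    if ($obj->can('TIEROLLBACK')) { $obj->TIEROLLBACK }
    else                          { $$ref = $old }   # goes through STORE
    return 0;
}

{
    package BufferedScalar;     # toy class: writes are buffered until commit
    sub TIESCALAR   { my ($class, $v) = @_; bless { value => $v }, $class }
    sub FETCH       { my $s = shift; exists $s->{pending} ? $s->{pending} : $s->{value} }
    sub STORE       { my ($s, $v) = @_; $s->{pending} = $v }
    sub TIEPREPARE  { 1 }       # always votes "yes" in this toy
    sub TIECOMMIT   { my $s = shift; $s->{value} = delete $s->{pending} if exists $s->{pending} }
    sub TIEROLLBACK { my $s = shift; delete $s->{pending} }
}
{
    package PlainScalar;        # defines none of the new callbacks
    sub TIESCALAR { my ($class, $v) = @_; bless { value => $v }, $class }
    sub FETCH     { $_[0]{value} }
    sub STORE     { $_[0]{value} = $_[1] }
}

tie my $buffered, 'BufferedScalar', 'old';
tie my $plain,    'PlainScalar',    'old';

for my $ref (\$buffered, \$plain) {
    my $old = $$ref;
    $$ref = 'new';
    end_of_trans($ref, $old, 0);        # pretend the block died
    print ref(tied $$ref), " after rollback: $$ref\n";
}
```

The second class shows the fallback: without TIEROLLBACK the old value is
simply written back through STORE.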

=head2 Object interface

The object interface is similar to the tie interface: you will need
three callbacks: PREPARE, COMMIT and ROLLBACK. These do the same as
described for the tie interface. $Package::TRANSACTION_ID will be set
in this case too.

Note that declaring an object as "trans" means that it is localized for
the runtime of the transaction and that PREPARE, COMMIT and ROLLBACK
will be called at the end of the block containing the declaration. It
does not mean that the whole data structure underneath it is transaction
safe. That cannot be guaranteed; you need to declare those parts
explicitly as "trans" variables.

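The limitation just described for objects can be demonstrated in perl5.
This is a sketch under the assumption that "trans" on an object behaves
like the simple save-and-restore of the variable itself; the variable
names are made up for the example. Restoring the variable brings back
the old top-level value, but a change made to nested data reachable from
it is not undone.

```perl
use strict;
use warnings;

# Emulates "trans $obj" in the simple mode: only the variable itself
# (here a reference) is saved and restored.
my $obj   = { name => 'a', tags => ['x'] };
my $saved = $obj;                      # what "trans" would remember

my $ok = eval {
    $obj = { %$obj, name => 'b' };     # replaces the top-level value
    push @{ $obj->{tags} }, 'y';       # changes shared, nested data
    die "Error occurred\n";
};
$obj = $saved unless $ok;              # rollback of the variable itself

print "name: $obj->{name}\n";          # the old name is back
print "tags: @{ $obj->{tags} }\n";     # the nested change is still there
```
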
=head1 IMPLEMENTATION

=head2 Transaction handling methods

=over 4

=item simple

This is the default method. It needs no magic; the implementation is
straightforward:

When you use "trans", the following happens:

  $save_value=$value;

When you reach the end of the block normally, the saved value is simply
dropped. If instead an exception caused the termination of the block,
then the old value must be copied back to the global space:

  $value=$save_value;

This solution is tie-safe, and this is very important.

=item lock

We need to maintain locks (mutexes) on variables. We assume this will be
used in threaded applications.

When we use "trans", perl puts a shared lock on the variable.

When we read the variable, we also put a shared lock on it.

When we write the variable, we check whether it is already locked. If we
hold the lock ourselves, or no exclusive lock is present, then we write
the value and lock the variable with LOCK_EX. If another exclusive lock
is present on the variable, we need to wait until it is released.

When the "trans" scope ends, we free the shared (or exclusive) lock.
If the scope ends with a die, we put the original value back if we had
locked the variable with an exclusive lock.

=item version

This is the mechanism of making multiple versioned copies of the
variable every time somebody accesses it. It needs timestamping and
PostgreSQL-like concurrency control. I don't know more details.

=back
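
The bookkeeping of the "lock" method can be sketched without threads as
a plain lock table. All names here (%locks, lock_shared, lock_exclusive,
unlock_all) are hypothetical. The sketch is non-blocking, as if timeout
were 0, and it assumes the usual rule that a writer is also blocked by
shared locks held by other transactions, which is stricter than the
description above.

```perl
use strict;
use warnings;

# Hypothetical lock table, keyed by variable name. Each entry maps a
# transaction id to 'S' (shared) or 'X' (exclusive).
my %locks;

sub lock_shared {
    my ($var, $tid) = @_;
    my $held = $locks{$var} ||= {};
    # A shared lock is refused only if somebody else holds an exclusive one.
    return 0 if grep { $_ ne $tid && $held->{$_} eq 'X' } keys %$held;
    $held->{$tid} ||= 'S';      # keep an exclusive lock we already own
    return 1;
}

sub lock_exclusive {
    my ($var, $tid) = @_;
    my $held = $locks{$var} ||= {};
    # Assumption: any lock held by another transaction blocks a writer.
    return 0 if grep { $_ ne $tid } keys %$held;
    $held->{$tid} = 'X';        # new lock, or upgrade of our shared lock
    return 1;
}

sub unlock_all {                # called when the "trans" scope ends
    my ($tid) = @_;
    delete $_->{$tid} for values %locks;
}

print "t1 shared:    ", lock_shared('x', 't1'), "\n";
print "t2 shared:    ", lock_shared('x', 't2'), "\n";
print "t1 exclusive: ", lock_exclusive('x', 't1'), "\n";   # t2 still reads
unlock_all('t2');
print "t1 exclusive: ", lock_exclusive('x', 't1'), "\n";   # upgrade works now
print "t2 shared:    ", lock_shared('x', 't2'), "\n";      # blocked by t1
```

A real implementation would wait up to the configured timeout instead of
returning 0 immediately, and would throw an exception when the timeout
is reached.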

=head1 REFERENCES

PostgreSQL Multi-version concurrency control
  http://www.postgresql.org/docs/postgres/mvcc.htm

Two phase commit: (Google found that :-)
  http://oradoc.photo.net/ora8doc/DOC/server803/A54653_01/ds_ch3.htm

RFC 1: Implementation of Threads in Perl

RFC 19: Rename the C<local> operator

RFC 63: Exception handling syntax

RFC 64: New pragma 'scope' to change Perl's default scoping

RFC 80: Exception objects and classes for builtins

RFC 88: Structured Exception Handling Mechanism (Try)

RFC 106: Yet another lexical variable proposal: lexical variables
  made default without requiring strict 'vars'

RFC 119: object neutral error handling via exceptions

perldoc perlthread: the perl5 threading interface

perldoc perltie: the perl5 tie interface