Author: timbo
Date: Mon Aug  9 15:22:59 2004
New Revision: 427

Added:
   dbi/trunk/Roadmap
Log:
Add DBI Roadmap document (newly written, draft)


Added: dbi/trunk/Roadmap
==============================================================================
--- (empty file)
+++ dbi/trunk/Roadmap   Mon Aug  9 15:22:59 2004
@@ -0,0 +1,304 @@
+=head1 DBI ROAD-MAP
+
+9th August 2004
+
+This document aims to provide a high level overview of the future direction of the DBI.
+
+It outlines the broad categories of changes, along with some rationale,
+but does not go into implementation details and ignores many more
+minor planned enhancements.  More details can be found in:
+
+  http://svn.perl.org/modules/dbi/trunk/ToDo
+
+(username guest, password guest)
+
+
+=head2 Unicode
+
+Use of Unicode with the DBI is growing rapidly. The DBI could do more
+to help drivers support Unicode and help applications work with drivers
+that don't yet support Unicode directly.
+
+* Define expected behavior for fetching data and binding parameters.
+
+* Fix 'leaking' of UTF8 flag from one row to the next.
+
+* Provide interfaces to support Unicode issues for XS and pure Perl drivers
+and applications.
+
+
+=head2 Testing
+
+The DBI has a test suite. Every driver has a test suite.  Each is
+limited in its scope.  A driver's test suite tests for the behavior
+that the driver author believes the DBI specifies, and that belief
+may be subtly incorrect.  These test suites are poorly maintained
+because the development cost is relatively high compared to the
+"return" from a single driver.
+
+A common test suite that can be reused by all the drivers is needed.
+The benefits include:
+
+* Ensuring all drivers conform to the DBI specification,
+easing the porting of applications between databases and the
+implementation of database-independent reusable code modules
+layered over the DBI.
+
+* Improving the coverage of the DBI and driver code tested by the
+test suite.  Driver authors and others will be more motivated to
+contribute to the common test suite as the gains are multiplied by
+the number of drivers in use.
+
+* Improving the DBI specification by prompting the clarification
+of fuzzy issues in order to implement test cases.
+
+* Automatic documentation about driver functionality can be generated
+by the testing process.  Areas of missing functionality can be
+highlighted to encourage enhancements.
+
+* Improving the testing of DBI subclassing, DBI::PurePerl and the
+various "plumbing" drivers, such as DBD::Proxy and DBD::Multiplex,
+by automatically running the test suite through them.
+
+
+=head2 Performance
+
+The DBI has always treated performance as a priority. However some
+parts of the implementation remain unoptimized, especially in
+relation to threads.
+
+* The mechanism by which drivers access the core "DBI State" structure
+(DBIS) is very inefficient when perl is built to support threads
+(including mod_perl 2).
+
+* The PERL_NO_GET_CONTEXT mechanism is not used by the DBI or drivers
+so their use of Perl API functions is significantly less efficient.
+
+* The majority of the handle creation code, including TIEHASH, is
+implemented in Perl.  Moving most of this to C will speed up handle
+creation significantly.
+
+* The popular fetchrow_hashref() method is many times slower than
+fetchrow_arrayref() because it has to re-get the names of the fields
+each time. A $h->{FetchHashReuse} attribute would allow the same
+hash to be reused each time making fetchrow_hashref() about the
+same speed as fetchrow_arrayref().
+
+
+=head2 Introspection
+
+* The methods of the DBI API are installed dynamically when the DBI
+is loaded.  The data structure used to define the methods and their
+dispatch behavior should be made part of the DBI API. This would
+enable more flexible and correct behavior by both modules subclassing
+the DBI and especially dynamic drivers such as DBD::Proxy and
+DBD::Multiplex.
+
+* All the handle attributes and related 'metadata' should also be
+made available for the same reasons. It's common for DBD::Proxy,
+for example, to not treat new attributes correctly because it's not
+been taught about them.
+
+* Currently it is not possible to discover all the child statement
+handles that belong to a database handle (or all database handles
+that belong to a driver handle).  This makes certain tasks more
+difficult, especially some debugging scenarios.  A cache of
+weak-references to child handles would solve the problem without
+creating reference loops.
+
+* A DBI handle is a reference to a tied hash and so has an 'outer'
+hash that the handle reference points to and an 'inner' hash holding
+the DBI data.  By allowing the inner handle to be changed, for
+example swapped with a different handle, many new behaviors become
+possible. For example, a database handle to a database that's crashed
+could have its inner handle changed to a new connection to a replica.
+
+* It is often useful to know what handle attributes have been changed
+since the handle was created (e.g., in mod_perl where a handle needs
+to be reset or cloned). This will become more important as developers
+start exploring the ability to change the inner handle.
+
+
+=head2 High Availability and Load Balancing
+
+* The DBD::Multiplex driver is intended to enable a wide range of
+dynamic functionality including support for various high-availability
+and load-balancing scenarios.  The old version has been used
+successfully but was limited. It's being rewritten to greatly
+increase its flexibility and has great potential, but development
+has stalled.
+
+* The DBD::Proxy module is complex and relatively inefficient because
+it's trying to be a complete proxy for most DBI method calls at
+both the database handle and statement handle levels.  For many
+applications a simpler proxy architecture that operates with a
+single round-trip to the server would be sufficient (result rows
+of SELECT statements would be serialized into the response).
+
+Apart from the efficiency gains, that would also enable the use of
+stateless servers, which in turn enables the use of a pool of servers
+for high-availability and load balancing.
+
+I envisage a driver base class that implements everything except
+the 'transport' mechanism and then multiple drivers using the base
+class with specific transports.  For example, one such transport
+could be the Spread::Queue module.
+
+
+=head2 Extensibility
+
+The DBI can be extended in three main dimensions: subclassing the
+DBI, subclassing a driver, and callback hooks. Each has different
+pros and cons and each is most applicable in different situations.
+
+* Subclassing the DBI is functional but not well defined and some
+key elements are incomplete, particularly the DbTypeSubclass mechanism
+(that automatically subclasses to a class tree corresponding to the
+type of database being used).  It also needs more thorough testing.
+
+* Subclassing a driver is undocumented, poorly tested and very
+probably incomplete. However it's a powerful way to embed certain
+kinds of functionality 'below' applications while avoiding some of
+the side-effects of subclassing the DBI (especially in relation to
+error handling).
+
+* Callbacks are currently limited to error handling (the HandleError
+and HandleSetError attributes).  Providing callback hooks for more
+events, such as a row being fetched, would enable utility modules,
+for example, to add functionality independent of any subclassing
+in use.
+
+
+=head2 Database Portability
+
+* The DBI has not yet addressed the issue of portability among SQL
+dialects.  This is the main hurdle in the way of database portability
+for the DBI.
+
+The goal is not to fully parse the SQL and rewrite it in a different
+dialect.  That's well beyond the scope of the DBI and should be
+left to layered modules.  However, a simple token rewriting mechanism
+for five comment styles, two quoting styles, four placeholder styles,
+plus the ODBC "{foo ...}" escape syntax is sufficient to significantly
+raise the level of SQL portability.
+
+* Another major problem area is date/time formatting.  Since version 1.41
+the DBI has defined a way to express that dates should be fetched
+in SQL standard date format (YYYY-MM-DD).  However it requires the
+bind_col() method to be called on applicable columns.  This is one
+example of the more general case where bind_col() needs to be called
+with particular attributes on all columns of a particular type.
+
+A mechanism is needed whereby an application can specify default bind_col()
+attributes for each column type. So with a single step all DATE type
+columns, for example, can be set to be returned in the standard format.
+
+
+=head2 Debug-ability
+
+* Reduce the "noise" when the trace level is set high by moving more trace
+output to be enabled by the new named-topic trace mechanism.
+
+* Calls to XS functions (such as many DBI and driver methods) don't
+normally appear in the call stack.  Optionally enabling that would
+enable more useful diagnostics to be produced.
+
+* Integration with the Perl debugger would make it simpler to perform
+actions on a per-handle basis (such as breakpoint on execute,
+breakpoint on error).
+
+
+=head2 Other Enhancements
+
+* Support non-blocking mode for drivers that can enable it in their
+client API.
+
+* Scroll-cursor support
+
+
+=head2 Parrot and Perl 6
+
+The current DBI implementation in C code is very unlikely to run
+on Perl 6.  Perl 6 will target the Parrot virtual machine and so
+the internal architecture will be radically different from Perl 5.
+
+The most natural language to implement Perl 6 extensions will be
+Parrot Intermediate Representation (PIR). Since Parrot includes a
+Native Call Interface, extensions implemented in PIR should not
+need a compiler in order to interface to database client API shared
+libraries.
+
+It is a goal of the Parrot project to be a suitable target for many
+dynamic languages (including Python, PHP, Ruby, etc) and to enable
+those languages to reuse each other's modules. So a database interface
+for Parrot is also a database interface for all those languages.
+
+The Perl DBI is more mature and featureful than the database
+interfaces of the other languages and so would make an excellent
+base for the Parrot Database interface.
+
+My plan is to better define the API between the DBI and the drivers and
+use that API as the primary API for the 'raw' Parrot database interface.
+This project is known as Parrot DBDI.  Here's my announcement:
+
+  http://groups.google.com/[EMAIL PROTECTED]
+
+(The project stalled, due to Parrot not having key functionality
+at the time, and has yet to be restarted.)
+
+The bulk of the DBI code actually exists in base classes 'behind'
+the driver API.  The method dispatcher code that Perl applications
+interface with is relatively small.
+
+Each language targeting Parrot would implement its own small
+language-specific dispatcher over the Parrot DBDI interface.
+
+A "big win" here is that a much wider community of developers would share
+the same database drivers and so the benefits of the Open Source
+model are magnified.
+
+The bulk of the work will be translating the C and Perl base class
+code into Parrot PIR or a suitable language that generates PIR.
+
+
+=head1 PRIORITIES
+
+The foundations of many of the changes described above require
+changes to the interface between the DBI and drivers. To clearly
+define the transition point the source code will be forked into a
+DBI v1 branch and the mainline bumped to v2.
+
+DBI v1 will continue to be maintained for bug fixes and any
+enhancements that ease the transition to DBI v2.
+
+=head2 Transition Drivers
+
+The first priority is to make all the infrastructure changes that
+impact drivers and make an alpha release available that driver
+authors can target.  As far as possible the changes will be implemented
+in a way that enables driver authors to use the same code base for DBI
+v1 and DBI v2.
+
+The main changes required by driver authors are:
+
+* Code changes for PERL_NO_GET_CONTEXT, plus removing PERL_POLLUTE
+and DBIS
+
+* Code changes in DBI/DBD interface (new way to create handles, new
+callbacks etc)
+
+* Common test suite infrastructure (driver-specific test base class)
+
+=head2 Transition Applications
+
+At the same time a small set of incompatible changes that may impact
+some applications will also be made. See
+http://svn.perl.org/modules/dbi/trunk/ToDo (login guest/guest).
+
+=head2 Incremental Developments
+
+Once DBI v2.0 is available the other enhancements can be implemented
+incrementally on the updated foundations. The priorities of those
+changes can be set in the light of then-present circumstances.
+
+=cut
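
The Introspection section of the Roadmap proposes a cache of weak references to child handles, so that a parent handle can list its children without creating reference loops. The sketch below illustrates only that idea in plain Perl. `Sketch::Handle` and `child_handles` are hypothetical names invented for this illustration; they are not part of the DBI, and the real implementation may look quite different.

```perl
use strict;
use warnings;

package Sketch::Handle;   # hypothetical stand-in, not a real DBI class
use Scalar::Util qw(weaken);

sub new {
    my ($class, %args) = @_;
    my $self = bless {
        name     => $args{name},
        parent   => $args{parent},   # child holds a strong ref to its parent
        children => [],
    }, $class;
    if (my $parent = $args{parent}) {
        # The parent records the child, then weakens its copy so the
        # record does not keep the child alive (no reference loop).
        push @{ $parent->{children} }, $self;
        weaken($parent->{children}[-1]);
    }
    return $self;
}

# Return the children that still exist; a weak slot becomes undef
# once its child has been destroyed.
sub child_handles {
    my ($self) = @_;
    return grep { defined } @{ $self->{children} };
}

package main;

my $dbh  = Sketch::Handle->new(name => 'dbh');
my $sth1 = Sketch::Handle->new(name => 'sth1', parent => $dbh);
{
    my $sth2 = Sketch::Handle->new(name => 'sth2', parent => $dbh);
    # prints "live: sth1 sth2"
    print "live: ", join(' ', map { $_->{name} } $dbh->child_handles), "\n";
}
# $sth2 has gone out of scope, so only sth1 remains.
# prints "live: sth1"
print "live: ", join(' ', map { $_->{name} } $dbh->child_handles), "\n";
```

Because only the parent's copy is weakened, destroying the last application-held reference to a child frees it immediately, and the parent never needs to be told.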

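The Database Portability section of the Roadmap suggests a simple token rewriting mechanism rather than full SQL parsing. As a rough illustration of what that could mean, the sketch below rewrites one placeholder style (`:name`) into another (`?`) while leaving quoted strings and comments untouched. The function `rewrite_named_placeholders` is a hypothetical helper written for this example, not a DBI API, and it covers only a small part of what the Roadmap lists (it ignores the ODBC escape syntax and the other placeholder and comment styles).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the DBI: rewrite ":name" placeholders
# to "?" and return the new SQL plus the placeholder names in order.
sub rewrite_named_placeholders {
    my ($sql) = @_;
    my @names;
    $sql =~ s{
        ( '(?:[^']|'')*'        # single-quoted string ('' is an escaped quote)
        | "(?:[^"]|"")*"        # double-quoted identifier
        | --[^\n]*              # SQL line comment
        | /\*.*?\*/             # C-style block comment
        )
      | (?<![:\w]) : ([A-Za-z_]\w*)   # :name placeholder, but not ::cast
    }{
        # Tokens captured in $1 are copied through unchanged;
        # otherwise $2 holds a placeholder name to record and replace.
        defined $1 ? $1 : do { push @names, $2; '?' }
    }gsex;
    return ($sql, \@names);
}

my ($sql, $names) = rewrite_named_placeholders(
    "SELECT * FROM t WHERE id = :id AND note = ':literal' AND owner = :owner"
);
print "$sql\n";       # SELECT * FROM t WHERE id = ? AND note = ':literal' AND owner = ?
print "@$names\n";    # id owner
```

The returned name list lets a caller map a hash of named values onto the positional bind values that a `?`-style driver expects.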