[RDBO] ANNOUNCE: first post-rewrite Rosetta release (v0.720.0)

Darren Duncan Thu, 02 Feb 2006 22:29:02 -0800

P.S. Since the probe wasn't rejected by sourceforge, here's the real message.
-----------


2006-02-01   Darren Duncan <[EMAIL PROTECTED]>
--------------------------------------------------

I am pleased to announce the first CPAN release of the second major code base (started on 2005-10) of the Rosetta database access framework, v0.720.0, which is available now in synchronized native Perl 5 and Perl 6 versions.

This is a complete rewrite, including very different detail designs, implementations, and documentation, though it still retains the same high-level design and purpose.


------------

The Perl 5 version is composed of these 2 distributions (more will come later):

 * Rosetta-v0.720.0.tar.gz
 * Rosetta-Engine-Native-v0.1.0.tar.gz

These have Locale-KeyedText-v1.72.1.tar.gz (released at the same time) as an external dependency.

The Perl 6 versions of all 3 of the above items are bundled with Perl6-Pugs-6.2.11.tar.gz (released a half-day earlier) in its ext/ subdirectory.

The Perl 6 versions don't depend on anything outside the Perl6-Pugs distro that they live in. But the Perl 5 versions also have external dependencies on Perl 5.8.1+ and these Perl 5 packages, which add features that Perl 6 and Pugs already have built-in: 'version', 'only', 'Readonly', Class::Std, Class::Std::Utils, Scalar::Util, Test::More; the latter 2 are bundled with Perl 5.


------------

Following is both a reintroduction to the remade Rosetta as it is and will soon be, and a summary of the main changes from before the rewrite (first major code base of 2002 thru 2005-09).

For various reasons, as will be laid out below, it should be more apparent than ever that Rosetta is "not just another DBI wrapper" and really stands out as something different from any existing tool on CPAN.

Note that many of these details aren't yet in Rosetta's own documentation (they will be later), so for now they appear only in this email.

* Locale::KeyedText is officially not part of the Rosetta framework anymore, being a distinct external dependency instead of its localization component.

* Anything that was in the SQL::Routine name space has been renamed into the 'Rosetta' name space.

* Briefly comparing DBI to Rosetta, DBI provides users with database driver independence; Rosetta provides them with database language independence, which is a higher abstraction, but it should still work quickly.

* Rosetta is now officially a federated relational database of its own that just happens to be good with cross-database-manager portability issues, and be good as a toolkit on which to build ORMs and persistence tools, rather than being mainly about portable SQL generation.

* The native query and schema design language of Rosetta is now based mainly on Tutorial D (by Christopher J. Date and Hugh Darwen) and closely resembles relational algebra, rather than being based on SQL as it was before (note that some current documentation suggests otherwise, but that will be rewritten).

* Note, see http://www.oreilly.com/catalog/databaseid/ , the book by Date named "Database in Depth", which is one of the best references on database design I have ever seen. Everyone who works with databases should read it. It's not dry and has practical stuff you can apply right now. I am.

* The native language of Rosetta is presently called "Intermediate Relational Language" ("IRL", pronounced "earl", or "girl" without the "g"); it is inspired by Pugs' "PIL", which serves a similar purpose for Perl 6 as what IRL does for Tutorial D and SQL and other languages.

* IRL is strongly typed, where every value and container is of a single type, and permits user data type definitions to be arbitrarily complex (such as temporal and spatial data) but non-recursive. Aside from forbidding "references", it includes the features of so-called "object-relational" databases which are actually part of the true plain "relational" data model. Values of each distinct data type cannot be substituted as operator arguments for others, or stored in containers for others, but they can be explicitly cross-converted in some circumstances (e.g. num to str or str to num).

* Despite actually being strongly typed, IRL has facilities to simulate weak data types over strong ones; for example, you can define an SV type that has numerical and character string components. More broadly speaking, you can define multi-part "disjunctive" types whose members are each of a different other type, where only one member has a significant value at once, and the others have their type's concept of an "empty" value; actually, these have a single extra member that says which of the others holds the significant value.

* IRL natively uses 2-valued-logic (2VL) like Tutorial D, and not 3-valued-logic (3VL) like SQL, so every boolean valued expression always evaluates to true or false, not true or false or unknown (a SQL NULL). But it does simulate 3-valued-logic using disjunctive data types, one of whose members is the system defined "Unknown" strong data type, which can only ever hold the same single value; by definition, a disjunctive data type value whose member A is the significant one will never match with another whose significant member is B, and hence we can distinguish between "Unknown" and zero or the empty string when a number or string can't actually be set to Unknown (null).

* IRL has distinct data types for what are commonly referred to as "relations" (like a SQL table with a key, which may be over all of its columns) and "bags" (like a SQL table that lacks a key), where the former forbids duplicates and the latter allows them. Given Rosetta's strong typing, a relation and a bag cannot be substituted for each other (except that they can be cross-converted, as numbers and character strings can be cross-converted), but rather have their own operators, which never output duplicates or can output duplicates, respectively. A bag can be implemented over a relation where the relation has one extra attribute which stores a count of occurrences for the otherwise distinct combination of other attributes, and operators do the right thing with that count.

* There is no inherent order of the attributes/columns of relations/bags/tables, and there is no inherent order to their tuples/rows, unlike SQL where at least the order of columns is significant. IRL does all references by names rather than by position; all operator parameters are named, as are relation attributes.

* Besides relations and bags, IRL has a distinct array data type, which is what you get when using an order-by; usually it only makes sense to use this as the last step in a query when fetching data, if the order is important.

* All typical joins between relations/bags/tables are natural joins, where attributes/columns of each joined item implicitly correspond and match when they have the same names and data types (and if none match, you have a Cartesian product). You never specify join conditions explicitly by using "foo = bar" or any such thing; rather, if you want to match on dissimilar names, you first rename (like SQL's "AS") one or both source columns. This also means that you can join an arbitrary number of relations/tables in a single operation, and they will just work, with the combined output relation/table having distinct attribute/column names already.

* Instead of saying "select <attr-list> from <relation> where <condition>", you nest arbitrary relational algebra expressions like "project( restrict( <relation>, <condition> ), <attr-list> )" or "restrict( project( <relation>, <attr-list> ), <condition> )"; both of those latter 2 happen to give the exact same result (provided <condition> refers only to attributes in <attr-list>).

* The finer grained IRL should be easier to write non-trivial queries in than SQL, especially when adding things like groups and havings and such, since you can more reliably know what pieces you have to work with, and exactly what will happen when you say certain things, and you don't have to needlessly duplicate expressions. Writing queries in IRL should be more reliable than SQL since you don't have to worry about getting different results from 2 logically identical queries and you don't have to deal with ambiguous syntax.

* IRL should also be a lot easier to optimize for speed given the lack of ambiguity that plagues attempts to optimize SQL.

* Rosetta is designed to be very componentized, where you can substitute back-ends and front-ends at will, so it can work over both SQL based and non-SQL based database engines, and its user interface can resemble anything you want. It is also reasonably easy to map SQL to IRL and back, so you can still query Rosetta databases using various SQL dialects or other languages if you don't want to see the IRL, and this can help with migrating older applications.

* It is likely to be the ideal case for most Rosetta users to have an alternate front end, such as some adapted from current DBI wrappers, object persistence or relational mapping tools, and so on, rather than using IRL directly. Using Rosetta rather than DBI should make the tasks of people making such wrappers and tools easier, since they have a more reliable language to work against and they don't have to maintain a multiplicity of back ends for each storage engine; Rosetta does the latter for them.

* A typical Rosetta back-end that operates over an existing database engine will take care of optimizing the queries for the native database so they perform best. When using Rosetta, you just say *what* you want to happen, not so much how, and Rosetta will take care of getting it done quickly and correctly.

* A self contained back-end named Rosetta::Engine::Native implements a relational database in Perl, so you can have that functionality without straying outside Perl if you want. Of course, Rosetta::Engine::Native is only meant to be a correct example, not fast, so it should only be used for testing. Other backends can be used for production.

* Genezzo is an already existing fast third party database, implemented in Perl, which will be adapted to use Rosetta as its interface, so you do have a Perl option besides the for-testing-only Native.

* The license of Rosetta has changed, such that my GPL exception granted to allow linked code to retain its own license has changed; it is no longer based on technicalities like how the linking is done, but rather on what kind of license the linked code has. This should make things a lot easier for developers of all stripes.


* See the Changes file included with 'Rosetta' for more details on some aspects.
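
As a rough illustration of the restrict, project, rename and natural join points in the list above, here is a small Python sketch. It is not IRL syntax and not Rosetta's API; every function name is hypothetical, and for brevity the join matches on attribute names only, not on names plus data types.

```python
# Hypothetical sketch of relational algebra over sets; not Rosetta code.
# A tuple is a frozenset of (attribute name, value) pairs, so attribute order
# is irrelevant; a relation is a set of such tuples, so duplicates cannot occur.

def relation(*rows):
    """Build a relation from dicts keyed by attribute name."""
    return {frozenset(row.items()) for row in rows}

def restrict(rel, condition):
    """Keep only the tuples for which condition(tuple-as-dict) is true."""
    return {t for t in rel if condition(dict(t))}

def project(rel, attrs):
    """Keep only the named attributes; the set collapses any duplicates."""
    return {frozenset((k, v) for k, v in t if k in attrs) for t in rel}

def rename(rel, mapping):
    """Rename attributes according to an old-name -> new-name mapping."""
    return {frozenset((mapping.get(k, k), v) for k, v in t) for t in rel}

def natural_join(left, right):
    """Combine tuples that agree on every attribute name they share.

    With no shared names the test is vacuously true for every pair,
    which yields the Cartesian product.
    """
    joined = set()
    for lt in left:
        for rt in right:
            l, r = dict(lt), dict(rt)
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                joined.add(frozenset({**l, **r}.items()))
    return joined

suppliers = relation(
    {"sno": 1, "city": "London"},
    {"sno": 2, "city": "Paris"},
)
shipments = relation(
    {"supplier": 1, "qty": 300},
    {"supplier": 2, "qty": 100},
)

# Match on dissimilar names by renaming first, then joining naturally.
joined = natural_join(suppliers, rename(shipments, {"supplier": "sno"}))

# The two nestings agree here because the condition only uses "qty",
# which survives the projection.
big = lambda t: t["qty"] > 200
a = project(restrict(joined, big), {"sno", "qty"})
b = restrict(project(joined, {"sno", "qty"}), big)
```

In this sketch `a` and `b` are the same one-tuple relation, and joining `suppliers` with the un-renamed `shipments` gives all four combinations, since the two share no attribute names.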
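
The disjunctive-type and 2-valued-logic points in the list above can be sketched in Python as follows. This is only an illustration under my own assumptions: the class name `NumOrUnknown`, the member names, and the choice of 0 as the "empty" number are all hypothetical, and nothing here is Rosetta or IRL code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NumOrUnknown:
    """Hypothetical disjunctive type: a number or Unknown.

    'which' is the extra member that says which member is significant;
    'num' holds the "empty" number 0 whenever Unknown is significant.
    """
    which: str
    num: int = 0

    def __post_init__(self):
        if self.which not in ("num", "unknown"):
            raise ValueError("which must be 'num' or 'unknown'")
        if self.which == "unknown" and self.num != 0:
            raise ValueError("num must be empty (0) when Unknown is significant")

def num(n):
    """A value whose significant member is the number."""
    return NumOrUnknown("num", n)

# Unknown can only ever hold this one value.
UNKNOWN = NumOrUnknown("unknown")

# Equality compares (which, num), so it always yields True or False:
# a known zero never matches Unknown, and Unknown matches itself.
zero_is_unknown = (num(0) == UNKNOWN)
unknown_is_unknown = (UNKNOWN == NumOrUnknown("unknown"))
```

Here `zero_is_unknown` is `False` and `unknown_is_unknown` is `True`; there is no third outcome, unlike comparing against a SQL NULL.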
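
The idea of implementing a bag over a relation with an extra count attribute, mentioned in the list above, might look roughly like this in Python. It is a sketch only: the function names are hypothetical, and it uses positional tuples with the count in the last position purely for brevity, whereas IRL refers to attributes by name.

```python
from collections import Counter

def bag_from_rows(rows):
    """Store a bag as a relation of (*row, count): each distinct row once."""
    return {(*row, n) for row, n in Counter(rows).items()}

def bag_union(a, b):
    """Bag union (like SQL's UNION ALL): counts of matching rows are added."""
    totals = Counter()
    for rel in (a, b):
        for *row, n in rel:
            totals[tuple(row)] += n
    return {(*row, n) for row, n in totals.items()}

def bag_to_relation(bag):
    """Cross-convert to a plain relation by dropping the count attribute."""
    return {tuple(row) for *row, _ in bag}

cities = bag_from_rows([("London",), ("Paris",), ("London",)])
more = bag_from_rows([("London",)])
combined = bag_union(cities, more)
distinct = bag_to_relation(combined)
```

The stored relation never contains duplicate tuples; the duplicates exist only as the count, which the bag operators maintain.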

------------

Note that the current Rosetta framework on CPAN is mostly documentation (incomplete and partly out of date), and has little in the way of executable code right now.

I recommend looking, in particular, at the pod in these files: Rosetta.pm, Model.pm, Language.pod, Overview.pod, TODO.pod.

Over the next month or so, hopefully coinciding with the Pugs 6.28.0 release (that is refactored over the new PIL2 and Perl 6 object model), I should have more code such that you can actually start playing with Rosetta in your code.

I welcome any kind of assistance that you can provide with Rosetta, and I hope that it will have a huge positive impact on the community. Really, assistance would be appreciated.


Thank you and have a good day. -- Darren Duncan

