P.S. Since the probe wasn't rejected by SourceForge, here's the real message.
-----------

2006-02-01   Darren Duncan <[EMAIL PROTECTED]>
--------------------------------------------------

I am pleased to announce the first CPAN release of the second major code base (started on 2005-10) of the Rosetta database access framework, v0.720.0, which is available now in synchronized native Perl 5 and Perl 6 versions.

This is a complete rewrite, including very different detail designs, implementations, and documentation, though it still retains the same high level design and purpose.

------------

The Perl 5 version is composed of these 2 distributions (more will come later):

 * Rosetta-v0.720.0.tar.gz
 * Rosetta-Engine-Native-v0.1.0.tar.gz

These have Locale-KeyedText-v1.72.1.tar.gz (released at the same time) as an external dependency.

The Perl 6 versions of all 3 of the above items are bundled with Perl6-Pugs-6.2.11.tar.gz (released a half-day earlier) in its ext/ subdirectory.

The Perl 6 versions don't depend on anything outside the Perl6-Pugs distro that they live in. But the Perl 5 versions also have external dependencies on Perl 5.8.1+ and these Perl 5 packages, which add features that Perl 6 and Pugs already have built-in: 'version', 'only', 'Readonly', Class::Std, Class::Std::Utils, Scalar::Util, Test::More; the latter 2 are bundled with Perl 5.

------------

Following is both a reintroduction to the remade Rosetta as it is and will soon be, and a summary of the main changes from before the rewrite (first major code base of 2002 thru 2005-09).

For various reasons that will be laid out below, it should be more apparent than ever that Rosetta is "not just another DBI wrapper" and really stands out as something different from any existing tools on CPAN.

Note that many of these details aren't yet in Rosetta's own documentation (they will be later), so they are distinct to this email.

* Locale::KeyedText is officially not part of the Rosetta framework anymore, being a distinct external dependency instead of its localization component.

* Anything that was in the SQL::Routine name space has been renamed into the 'Rosetta' name space.

* Briefly comparing DBI to Rosetta, DBI provides users with database driver independence; Rosetta provides them with database language independence, which is a higher abstraction, but it should still work quickly.

* Rosetta is now officially a federated relational database of its own that just happens to be good with cross-database-manager portability issues, and be good as a toolkit on which to build ORMs and persistence tools, rather than being mainly about portable SQL generation.

* The native query and schema design language of Rosetta is now based mainly on Tutorial D (by Christopher J. Date and Hugh Darwen) and closely resembles relational algebra, rather than being based on SQL as it was before (note that some current documentation suggests otherwise, but that will be rewritten).

* Note, see http://www.oreilly.com/catalog/databaseid/ , the book by Date named "Database in Depth", which is one of the best references on database design I have ever seen. Everyone who works with databases should read it. It's not dry and has practical stuff you can apply right now. I am.

* The native language of Rosetta is presently called "Intermediate Relational Language" ("IRL", pronounced "earl", or "girl" without the "g"); it is inspired by Pugs' "PIL", which serves a similar purpose for Perl 6 as what IRL does for Tutorial D and SQL and other languages.

* IRL is strongly typed, where every value and container is of a single type, and permits user data type definitions to be arbitrarily complex (such as temporal and spatial data) but non-recursive. Aside from forbidding "references", it includes the features of so-called "object-relational" databases which are actually part of the true plain "relational" data model. Values of each distinct data type can not be substituted as operator arguments for others, or stored in containers for others, but they can be explicitly cross-converted in some circumstances (eg num to str or str to num).

* Despite actually being strongly typed, IRL has facilities to simulate weak data types over strong ones; for example, you can define an SV type that has numerical and character string components. More broadly speaking, you can define multi-part "disjunctive" types, each of a different other type, where only one member has a significant value at once, and the others have their type's concept of an "empty" value; actually, these have a single extra member that says which of the others holds the significant value.

* IRL natively uses 2-valued-logic (2VL) like Tutorial D, and not 3-valued-logic (3VL) like SQL, so every boolean valued expression always evaluates to true or false, not true or false or unknown (a SQL NULL). But it does simulate 3-valued-logic using disjunctive data types, one of whose members is the system defined "Unknown" strong data type, which can only ever hold the same single value; by definition, a disjunctive data type value whose member A is the significant one will never match with another whose significant member is B, and hence we can distinguish between "Unknown" and zero or the empty string when a number or string can't actually be set to Unknown (null).

* IRL has distinct data types for what are commonly referred to as "relations" (like a SQL table with a key, which may be over all of its columns) and "bags" (like a SQL table that lacks a key), where the former forbids duplicates and the latter allows them. Given Rosetta's hard typing, a relation and a bag can not be substituted for each other (except that they can be cross-converted, as numbers and character strings can be cross-converted), but rather have their own operators which either never output or can output duplicates respectively. A bag can be implemented over a relation where the relation has one extra attribute which stores a count of occurrences for the otherwise distinct combination of other attributes, and operators do the right thing with that count.

* There is no inherent order of the attributes/columns of relations/bags/tables, and there is no inherent order to their tuples/rows, unlike SQL where at least the order of columns is significant. IRL does all references by names rather than by position; all operator parameters are named, as are relation attributes.

* Besides relations and bags, IRL has a distinct array data type, which is what you get when using an order-by; usually it only makes sense to use this as the last step in a query when fetching data, if the order is important.

* All typical joins between relations/bags/tables are natural joins, where attributes/columns of each joined item implicitly correspond and match when they have the same names and data types (and if none match, you have a Cartesian product). You never specify join conditions explicitly by using "foo = bar" or any such thing; rather, if you want to match on dissimilar names, you first rename (like SQL's "AS") one or both source columns. This also means that you can join an arbitrary number of relations/tables in a single operation, and they will just work, with the combined output relation/table having distinct attribute/column names already.

* Instead of saying "select <attr-list> from <relation> where <condition>", you nest arbitrary relational algebra expressions like "project( restrict( <relation>, <condition> ), <attr-list> )" or "restrict( project( <relation>, <attr-list> ), <condition> )"; those latter 2 give the exact same result whenever <condition> refers only to attributes in <attr-list>.

* The finer grained IRL should be easier to write non-trivial queries in than SQL, especially when adding things like groups and havings and such, since you can more reliably know what pieces you have to work with, and exactly what will happen when you say certain things, and you don't have to needlessly duplicate expressions. Writing queries in IRL should be more reliable than SQL since you don't have to worry about getting different results from 2 logically identical queries and you don't have to deal with ambiguous syntax.

* IRL should also be a lot easier to optimize for speed given the lack of ambiguity that plagues attempts to optimize SQL.

* Rosetta is designed to be very componentized, where you can substitute back-ends and front-ends at will, so it can work over both SQL based and non-SQL based database engines, and its user interface can resemble anything you want. It is also reasonably easy to map SQL to IRL and back, so you can still query Rosetta databases using various SQL dialects or other languages if you don't want to see the IRL, and this can help with migrating older applications.

* It is likely to be the ideal case for most Rosetta users to have an alternate front end, such as some adapted from current DBI wrappers, object persistence or relational mapping tools, and so on, rather than using IRL directly. Using Rosetta rather than DBI should make the tasks of people making such wrappers and tools easier, since they have a more reliable language to work against and they don't have to maintain a multiplicity of back ends for each storage engine; Rosetta does the latter for them.

* A typical Rosetta back-end that operates over an existing database engine will take care of optimizing the queries for the native database so they perform best. When using Rosetta, you just say *what* you want to happen, not so much how, and Rosetta will take care of getting it done quickly and correctly.

* A self contained back-end named Rosetta::Engine::Native implements a relational database in Perl, so you can have that functionality without straying outside Perl if you want. Of course, Rosetta::Engine::Native is only meant to be a correct example, not fast, so it should only be used for testing. Other backends can be used for production.

* Genezzo is an already existing fast third party database, implemented in Perl, which will be adapted to use Rosetta as its interface, so you do have a Perl option besides the for-testing-only Native.

* The license of Rosetta has changed, such that my GPL exception granted to allow linked code to retain its own license has changed; it is no longer based on technicalities like how the linking is done, but rather on what kind of license the linked code has. This should make things a lot easier for developers of all stripes.

* See the Changes file with 'Rosetta' for more details on some aspects.
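
To make the join, project, restrict, and rename bullets above a little more concrete, here is a minimal sketch written in Python rather than IRL. Every name in it (relation, restrict, project, rename, join) is made up for illustration and is not Rosetta's API or IRL syntax. It also simplifies by matching join attributes on name only, not on name and data type.

```python
# Hypothetical sketch only: none of these names come from Rosetta or IRL.
# A relation is a set of tuples; each tuple is an unordered set of
# (attribute name, value) pairs, so neither rows nor columns have an order.

def relation(*rows):
    """Build a relation from dicts; duplicate rows collapse because it is a set."""
    return frozenset(frozenset(row.items()) for row in rows)

def restrict(rel, condition):
    """Keep only the tuples for which condition(tuple_as_dict) is true."""
    return frozenset(t for t in rel if condition(dict(t)))

def project(rel, attrs):
    """Keep only the named attributes of every tuple."""
    attrs = set(attrs)
    return frozenset(frozenset((k, v) for k, v in t if k in attrs) for t in rel)

def rename(rel, mapping):
    """Rename attributes, e.g. so that two relations share a name to join on."""
    return frozenset(frozenset((mapping.get(k, k), v) for k, v in t) for t in rel)

def join(*rels):
    """Natural join of any number of relations, matching on shared names."""
    result = frozenset([frozenset()])  # identity for join: one empty tuple
    for rel in rels:
        joined = set()
        for left in result:
            l = dict(left)
            for right in rel:
                r = dict(right)
                # Tuples combine only if they agree on every shared attribute;
                # with no shared attributes this is a Cartesian product.
                if all(l[k] == r[k] for k in l.keys() & r.keys()):
                    joined.add(frozenset({**l, **r}.items()))
        result = frozenset(joined)
    return result

suppliers = relation({"sno": 1, "city": "London"}, {"sno": 2, "city": "Paris"})
shipments = relation({"supplier": 1, "pno": 10},
                     {"supplier": 2, "pno": 20},
                     {"supplier": 2, "pno": 30})

# Rename first, then join; no "foo = bar" condition is ever written.
joined = join(suppliers, rename(shipments, {"supplier": "sno"}))

in_paris = lambda t: t["city"] == "Paris"
a = project(restrict(joined, in_paris), ["city"])
b = restrict(project(joined, ["city"]), in_paris)
print(a == b)                            # True
print(len(join(suppliers, shipments)))   # 6: no shared names, so a product
```

The two nestings agree here because the condition only looks at "city", which survives the projection.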

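The disjunctive type and bag bullets above can be sketched in the same spirit. This is again Python, not IRL, and the names (MaybeInt, bag, bag_union, bag_to_relation) are hypothetical ones chosen for the example. It assumes a disjunctive type can be modelled as a tag plus one slot per member, and a bag as a mapping from each distinct tuple to its count.

```python
# Hypothetical sketch only: these names are not part of Rosetta or IRL.
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class MaybeInt:
    """Disjunctive type with two members, an int and the single Unknown value.

    'which' is the extra member saying which of the others is significant;
    the int member holds its "empty" value (0) when it is not significant.
    """
    which: str          # "int" or "unknown"
    int_value: int = 0

    @staticmethod
    def known(n):
        return MaybeInt("int", n)

    @staticmethod
    def unknown():
        return MaybeInt("unknown")

def bag(*rows):
    """A bag stored as a relation plus a count of occurrences per tuple."""
    return Counter(frozenset(row.items()) for row in rows)

def bag_union(a, b):
    """Bag union keeps duplicates by adding the counts together."""
    return a + b

def bag_to_relation(b):
    """Cross-convert a bag to a relation by dropping the counts."""
    return frozenset(b)

# Comparisons stay 2-valued: always True or False, never a third answer.
print(MaybeInt.known(0) == MaybeInt.unknown())    # False: zero is not Unknown
print(MaybeInt.unknown() == MaybeInt.unknown())   # True: Unknown has one value

colors = bag({"color": "red"}, {"color": "red"}, {"color": "blue"})
red = frozenset({"color": "red"}.items())
print(colors[red])                         # 2
print(bag_union(colors, colors)[red])      # 4
print(len(bag_to_relation(colors)))        # 2
```
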
------------

Note that the current Rosetta framework on CPAN is mostly documentation (incomplete and partly out of date), and has little in the way of executable code right now.

I recommend looking, in particular, at the pod in these files: Rosetta.pm, Model.pm, Language.pod, Overview.pod, TODO.pod.

Over the next month or so, hopefully coinciding with the Pugs 6.28.0 release (that is refactored over the new PIL2 and Perl 6 object model), I should have more code such that you can actually start playing with Rosetta in your code.

I welcome any kind of assistance that you can provide with Rosetta, and I hope that it will have a huge positive impact on the community. Really, assistance would be appreciated.

Thank you and have a good day. -- Darren Duncan


_______________________________________________
Rose-db-object mailing list
Rose-db-object@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rose-db-object