[mdb-users] status in 2015 of Muldis database projects

Darren Duncan Sat, 24 Jan 2015 18:35:33 -0800

On 2015 Jan 1 I had announced here that I had updated my Perl module Set::Relation to reduce its dependency load without cutting functionality.

Now on 2015 Jan 24 I will briefly update you on the status of the more central project, the Muldis D language itself.

For the last few years I had been doing a lot of brainstorming and note-taking and made a lot of improvements to the design of the Muldis D language and its ecosystem, a lot of this being put unorganized in a TODO_DRAFT file on GitHub and a lot not making it online; the notes were mostly to help me remember.

Starting a couple of years ago I switched to an implementation-driven process, which saw a few periods of movement and also long periods of none, as I was quite busy with some other concerns.


So in 2015 ...

For various reasons, I am now making an explicit push to actually get things implemented, where I try to spend an hour each day on something tangible.

I feel that with continued focus, people should be able to execute Muldis D code within maybe 2 months of now, a milestone, and things can then snowball.


======

The first "for real" implementation is https://github.com/muldis/Muldis-D-Ref-Perl5 or the "Muldis D Reference Engine", in terms of the public-facing Perl 5 module Muldis::D::RefEng (not yet released on CPAN).


This has a few main components:

1. Low_Level, which is a Perl library corresponding directly to the Muldis_D::Low_Level system-defined package. That package defines the standard foundation of types and routines which is easily implementable in or mappable to typical general purpose languages, Perl or otherwise. The rest of the higher level Muldis_D system-defined package, say 90% of the built-ins that user code can invoke, would be written in Muldis D itself and can be reused, while just the Low_Level part is expected to be rewritten from Perl for each implementation language. This is the primary extent of the bootstrapping. The 9 mutually disjoint low-level types are: Boolean, Integer, Array, String, Bag, (named) Tuple, Capsule, Identifier, External. Almost all other types one is likely to see in Muldis D would be subtypes of Capsule, which corresponds to the concept of generic objects each belonging to a named class, making it easy to partition types from each other in the face of polymorphism. In contrast, Tuple is analogous to a classless object; or put another way, a Capsule is a tagged Tuple. The most complicated type implementation-wise by far is Bag, which is the homogeneous collection type over which Set and Relation are defined; the complexity relates to all the infrastructure for element indexing and the identity/sameness determination involved. Low_Level also provides some functionality related to memory management. A Perl ::Value object is used to represent each Muldis D value regardless of type, and is value-immutable.

2. Plain_Text implements the Muldis D Plain Text (MDPT) parser, which takes Muldis D Plain Text source code, typically in .mdpt text files (as per .pl or .java or .sql etc files), and maps it into Muldis D's homoiconic native representation. The latter is analogous to the "information schema" of SQL except that all routine and type definitions are completely broken down, analogous to an abstract syntax tree that also remembers most concrete syntax details as metadata, for lossless round-tripping of all the details users would consider important, including in translation to other languages. The internal representation is composed of a tree of types/::Value provided by Low_Level, typically Capsule subtypes corresponding to different types of expression/etc nodes. (Due to how Muldis D works, we don't actually have to declare a higher level type to use it, as higher level types are just constraints on or unions of the Low_Level ones, so we can use all the AST types without them having to be "declared" first, so Low_Level is kept simple.) This parser is procedural and stream-oriented, so it works with very little RAM no matter how big the source file is, which is particularly important for files that are database dumps.

3. A compiler component will take the homoiconic native Muldis D code and generate equivalent Perl 5 code from it. Generally there is a 1:1 correspondence between each Muldis D routine and a generated Perl 5 routine, as well as between a Muldis D package and a Perl 5 package. The generated Perl 5 code will include calls to any other glue code provided by RefEng so that it has the desired runtime semantics such as the dispatch mechanism. It is expected in practice that the "compiled" (Perl 5 source) modules will then be cached, at least in memory but often also in the filesystem, as Perl 5 packages, maybe under the Muldis::D::RefEng::Compiled::* namespace, so that as long as further data-definition activities aren't happening, runtime performance will be more reasonable due to avoiding unnecessary repeated work.

4. A runtime and/or memory component will implement a virtual machine supporting multiple concurrent processes and transactional memory. (The multi-process model is virtual, and all exists within a single actual Perl process.) Each process would correspond to either a "database connection" or an "autonomous transaction" or other things in SQL DBMSs or other programming languages. This would provide the main event loop(s) in which all the regular compiling and runtime activities happen. Shared access between processes to possibly-common resources such as "databases" is managed here; processes can start and end others, and they can communicate using message passing.

5. Fundamentally Muldis::D::RefEng is an embeddable library, but a binary "muldisdre" would be included that wraps it. One can treat that binary in exactly the same manner as one treats "perl" or "python" or whatever to run Muldis D Plain Text source files; users don't even have to know it is written in Perl. This is part of a model of portability between Muldis::D::RefEng and say ports written in NQP or Python or Ruby or C or Go or Haskell or whatever; users can just run it without knowing what it's written in. Another precedent for this is Perl 6 and its user-interchangeable Rakudo backends, MoarVM, JVM, and Parrot.

6. A wrapper/SDK for Muldis::D::RefEng, or more likely that would be the Perl 5 package name of the wrapper, would be invokable from a general Perl 5 program in the same way that DBI et al are. It would implement API-level input checking from user code as well as provide Perl users a second way to write Muldis D code, which is as nested Perl data structures analogous to what various Perl ORMs or abstractions take, so in that context Perl programs don't have to generate string form (Plain Text) Muldis D at all. It would be particularly suited for a backend of DBIx::Class or the like, where they can just map Perl data structures and no messy details of string serialization are involved like with DBI+SQL today.
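
As a rough illustration of the Low_Level ideas in component 1 (a Tuple as a classless value, a Capsule as a tagged Tuple, and a Bag that counts elements by value identity), here is a minimal Python sketch. It is not RefEng code: the names make_tuple, Capsule and Bag, and the use of Python equality and hashing to stand in for Muldis D's sameness determination, are all assumptions made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


def make_tuple(**attrs):
    """Hypothetical (named) Tuple: an immutable, unordered set of name/value pairs."""
    return frozenset(attrs.items())


class Capsule(NamedTuple):
    """Hypothetical Capsule: a Tuple tagged with the name of its type."""
    type_name: str
    attrs: frozenset


class Bag:
    """Hypothetical Bag: a multiset that tracks how often each value occurs.

    Sameness is delegated to Python equality and hashing, which is only a
    stand-in for whatever identity rules the real engine applies.
    """

    def __init__(self, elements=()):
        self._counts = Counter(elements)

    def count(self, value):
        """Number of occurrences of a value (0 if absent)."""
        return self._counts[value]

    def unique(self):
        """A Set-like Bag in which every distinct element occurs exactly once."""
        return Bag(self._counts.keys())

    def __len__(self):
        return sum(self._counts.values())


# Two Tuples with the same attributes are the same value, whatever the order.
point_a = make_tuple(x=1, y=2)
point_b = make_tuple(y=2, x=1)

# The tag keeps otherwise identical Tuples in separate types.
as_point = Capsule("Point", point_a)
as_vector = Capsule("Vector", point_a)

bag = Bag([point_a, point_b, make_tuple(x=0, y=0)])
```

In this sketch the tag is what partitions types from each other, and a Set falls out of Bag as the special case where every count is one.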


For status, the above RefEng 6 components are all loosely designed, and:

1. Component 1 / Low_Level is about 40-80% coded depending how you measure it, and the working proof-of-concept Set::Relation Perl module will have a lot of its guts adapted to flesh this out more.

2. Component 2 / Plain_Text is about 10% coded, currently under file names like StreamDecoder/StreamLexer/more.

3. Component 3 / the compiler is pending under RefEng but some precursors were released incomplete on CPAN years ago in the form of modules like "Rosetta" et al, those having generated a mixture of Perl and SQL. Likewise for component 6 / the SDK for embedding in Perl. Components 4 and 5 don't have enough written yet to commit.


======

Besides Muldis::D::RefEng, also on GitHub is https://github.com/muldis/Muldis-D-Standard which is the repository for the self-hosted portion of Muldis D, intended to house the 90% of the system library code written in Muldis D itself, as well as the standard test suite for implementations, and example code, and so on. This would be commonly shared by Muldis::D::RefEng and any other implementations. Currently this is maybe 1-5% written depending on how you measure it. A precedent is that a majority of the Perl 6 standard library is written in Perl 6, that having its own shared repository.

Note that while Muldis::D::RefEng releases outside GitHub conceptually bundle Muldis-D-Standard, what it will actually ship to CPAN with is pre-compiled-as-Perl versions of the standard library, so Perl users can just install that Perl module and go. A loose analogy is that the DBD::SQLite Perl module ships to CPAN with the sqlite.h and sqlite.c but the latter don't canonically live in the former's version control.


======

Finally, I have started rewriting/revising the Muldis D language specification, at https://github.com/muldis/Muldis-D and started making periodic CPAN releases (where it is formatted) at http://search.cpan.org/dist/Muldis-D/ for the first time since 2011. This rewrite both considerably simplifies the spec and incorporates my last few years of brainstorming. In particular this is also partially implementation-driven and kept in sync with RefEng developments.

I jumped the spec version ahead to 0.200, from 0.148. The root file Muldis::D was brought up to date and most of the other files were renamed into Muldis::D::Outdated::* to more clearly indicate their status; over time replacements for the latter will appear under typically new Muldis::D::* names.

One of the more significant changes is that the new version will have concrete Muldis D Plain Text code examples, that is actual source code examples, throughout it just like typical programming language documentation. Learning by example is generally what users want and is more effective. Previously almost all the files had no code examples and just described things. At the time a reason for this was partly that I was trying to de-emphasize the plain text syntax as just 1 of a variety of forms, unlike now where it is the canonical form, and another reason is that originally I had conceived the features long before actually figuring out the syntax, unlike now.


======

So my current plan is to co-develop Muldis::D::RefEng and the spec up until the point that a reasonable starter subset of the language works, and then release RefEng to CPAN for the first time.

It is expected that the first release will be just enough that you can invoke "muldisdre" and compile/run a hello world program as well as do some basic math or string operations, as well as a relational operation (probably relational join), basically something that differentiates it from all the other general purpose languages. The goalpost may be moved for the first release, but it should have a substantially complete RefEng infrastructure in place, with the ability to parse and run at least the simplest Muldis D programs.
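
For readers unfamiliar with the relational join mentioned above, here is a minimal Python sketch of a natural join over relations modelled as sets of tuples. It is only an illustration of the operation, not Muldis D syntax or RefEng code; the helper names relation and natural_join and the sample supplier/shipment data are assumptions.

```python
def relation(*rows):
    """Hypothetical helper: build a relation (a set of tuples) from dicts."""
    return frozenset(frozenset(row.items()) for row in rows)


def natural_join(left, right):
    """Combine every pair of tuples that agree on all attribute names they share.

    A simple nested loop is used for clarity; a real engine would index.
    """
    result = set()
    for left_tuple in left:
        left_attrs = dict(left_tuple)
        for right_tuple in right:
            right_attrs = dict(right_tuple)
            common = left_attrs.keys() & right_attrs.keys()
            if all(left_attrs[name] == right_attrs[name] for name in common):
                result.add(frozenset({**left_attrs, **right_attrs}.items()))
    return frozenset(result)


# Sample data, invented for this sketch.
suppliers = relation(
    {"sno": 1, "city": "London"},
    {"sno": 2, "city": "Paris"},
)
shipments = relation(
    {"sno": 1, "pno": 10},
    {"sno": 1, "pno": 11},
    {"sno": 3, "pno": 12},
)

joined = natural_join(suppliers, shipments)
```

Because the result is itself a set of tuples, duplicates cannot occur and the output can be fed straight into another relational operation.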


This will maybe come out by the end of March if I'm lucky.

After that, subsequent releases would be mainly about implementing more system types and routines, adding examples and tests. This includes a lot of Low_Level routines needed for a normal language but not for a "get something running" release. Then come other projects in the ecosystem such as ports to other languages or other (e.g. SQL) backends.


Until then or afterwards, you can follow some of my progress on GitHub.

Thank you again for your interest and I'm sorry for taking so long to get to an executable.


-- Darren Duncan

_______________________________________________
muldis-db-users mailing list
muldis-db-users@mm.darrenduncan.net
http://mm.darrenduncan.net/mailman/listinfo/muldis-db-users
