Re: Database/DBD Bridging?

Darren Duncan Fri, 23 Sep 2011 14:01:26 -0700

I only got a copy of this message directly and not also via the list as expected, since you addressed it to the list, but anyway ...


Brendan Byrd wrote on 2011 Sep 22 at 6:25am PST/UTC-8:

The problem with PostgreSQL's SQL/MED is that it's not Perl, and it won't work for some of the more abstract objects available as DBD.

You may want to look into PL/Perl then, using Perl inside Postgres, to bring together some of these things, if it will work for you.

I would like to tie this DBD::FederatedDB into DBIC, so that it can search and insert everything on-the-fly. Shoving everything into RAM isn't right, either, since DBD::AnyData can already do that. The whole point of having the databases process the rows one at a time is so that it can handle 10 million row tables without a full wasteful dump.

Another thing to ask is whether what you're doing here is a batch process where some performance matters are less of an issue, or whether it is more on demand or more performance sensitive.


It looks like Set::Relation can work out great for sucking in table_info/row_info data, and can be used as the temp cache as fractured rows come in.

Perhaps, although Set::Relation is more about making database operations like join etc available in Perl, so you'll want to be using such various tools to take advantage of it. But then no one besides myself has used it yet that I know of, and others often think of tool uses beyond the creator.
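
To make "database operations like join available in application code" concrete, here is a minimal sketch of a natural join over in-memory rows. It is written in Python for brevity and does not use or imitate the Set::Relation API; the function name natural_join and the sample table/column metadata are hypothetical, chosen only to resemble table_info-style data.

```python
def natural_join(left, right):
    """Join two relations (lists of dicts) on every attribute name they share.

    Assumes each relation has a uniform heading, so the first row of each
    is enough to discover the attribute names.
    """
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))

    # Index the right-hand rows by their values for the shared attributes.
    index = {}
    for row in right:
        key = tuple(row[name] for name in shared)
        index.setdefault(key, []).append(row)

    # Pair every left-hand row with each right-hand row that has the same key.
    joined = []
    for row in left:
        key = tuple(row[name] for name in shared)
        for match in index.get(key, []):
            joined.append({**row, **match})
    return joined


# Hypothetical metadata gathered from two different back ends.
tables = [
    {"table": "users", "source": "pg"},
    {"table": "logs", "source": "csv"},
]
columns = [
    {"table": "users", "column": "id"},
    {"table": "users", "column": "name"},
    {"table": "logs", "column": "line"},
]

result = natural_join(tables, columns)
```

Here result holds one row per column, each carrying the source of its table. A real relation type would also remove duplicate rows and check that headings match, which this sketch leaves out.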

I would be highly interested in developing this with you. I'm spread pretty thin with several other Perl modules, so I otherwise wouldn't tackle it right now. But, if you already have something started, we can try to finish it, and that's much better than starting from scratch alone.
Do you have a repository for this new module yet? What are you calling it? I take it the module is building off of SQL::Statement?

<snip>

If you mean the more robust/scalable solution, then that has 2 main parts, which is a standard query language specification, Muldis D, plus multiple implementations. It corresponds to but is distinct from the ecosystem of there being an ISO SQL standard and its implementations in various DBMSs.

The query language, Muldis D, is not SQL but it is relevant here because it is designed to correspond to SQL and to be an intermediary form for generating/parsing SQL or translating between SQL dialects, or between SQL and other languages like Perl. (This means all SQL, including stored procedures.)

This essentially is exactly what you want to do, have a common query syntax where behind the scenes some is turned into SQL that is pushed to back-end DBMSs, and some of which is turned into Perl to do local processing. The great thing is as a user you don't have to know where it executes, but just that the implementation will pick the best way to handle particular code. I think of an analogy like LLVM that can compile selectively to a CPU or a GPU. Automatically, more capable DBMSs like Postgres get more work pushed to them to do natively, and less capable things like DBD::CSV or whatever have less pushed to them and more done in Perl.
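
As a rough illustration of pushing more work to capable back ends and less to limited ones, the sketch below splits a linear query plan into a part the back end runs natively and a part that runs locally. It is in Python for brevity and is not Muldis D or any planned implementation of it; the operation names, the CAPABILITIES table, and split_plan are all hypothetical.

```python
def split_plan(operations, backend_capabilities):
    """Split an ordered plan into (pushed, local) parts.

    The longest leading run of operations that the back end supports is
    pushed to it; everything from the first unsupported operation onward
    is processed locally on the rows the back end returns.
    """
    pushed = []
    for position, operation in enumerate(operations):
        if operation not in backend_capabilities:
            return pushed, list(operations[position:])
        pushed.append(operation)
    return pushed, []


# Hypothetical capability sets; real drivers would have to report their own.
CAPABILITIES = {
    "postgres": {"scan", "filter", "join", "aggregate"},
    "csv": {"scan"},
}

plan = ["scan", "filter", "join", "aggregate"]

pg_pushed, pg_local = split_plan(plan, CAPABILITIES["postgres"])
csv_pushed, csv_local = split_plan(plan, CAPABILITIES["csv"])
```

With these assumed capabilities the Postgres back end receives the whole plan, while the CSV back end only scans and the filter, join, and aggregate steps run locally. The caller issues the same plan in both cases.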

The language spec is in github at https://github.com/muldis/Muldis-D and it is also published on CPAN in the pure-pod distribution Muldis-D, but the CPAN copy has fallen behind at the moment.

The implementations I haven't started yet, or I did but canceled those efforts so to do it differently, so you can't run anything yet. But I know in my head exactly how I intend to do it.

I intend to make a few more large updates to the Muldis D spec before starting in earnest on the implementation, so to make that simpler and easier to do (it is substantially complete other than some large refinements); some clues to this direction are in the file TODO_DRAFT in github.

For timetable, if I could focus on this project I could have something usable in a few months; however, I also have a separate paying job that I'm currently focusing on which doesn't leave much time for the new project, though I hope to get more time to work on it maybe in mid-late October.

If you are still interested in working on this, or you just want to follow it, please join the (low traffic) discussion list muldis-db-us...@mm.darrenduncan.net .

FYI, this project is quite serious, not pie in the sky, and it has interest from some significant people in the industry, such as C.J. Date (well known for "An Introduction to Database Systems" that sold over 800K copies), and one of his latest co-authored books in 2010 explicitly covers part of my project with a chapter.


-- Darren Duncan
