I only got a copy of this message directly and not also via the list as expected, since you addressed it to the list, but anyway ...

Brendan Byrd wrote on 2011 Sep 22 at 6:25am PST/UTC-8:
The problem with PostgreSQL's SQL/MED is that it's not Perl, and it won't work for some of the more abstract objects available as DBD.

You may want to look into PL/Perl then, using Perl inside Postgres, to bring together some of these things, if it will work for you.

I would like to tie this DBD::FederatedDB into DBIC, so that it can search and insert everything on-the-fly. Shoving everything into RAM isn't right, either, since DBD::AnyData can already do that. The whole point of having the databases process the rows one at a time is so that it can handle 10 million row tables without a full wasteful dump.

Another thing to ask is whether what you're doing here is a batch process where some performance matters are less of an issue, or whether it is more on demand or more performance sensitive.

It looks like Set::Relation can work out great for sucking in table_info/row_info data, and can be used as the temp cache as fractured rows come in.

Perhaps, although Set::Relation is more about making database operations like join available in Perl, so you'd want to actually be using those operations to take advantage of it. But then no one besides myself has used it yet that I know of, and other people often think of uses for a tool beyond what its creator had in mind.
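
To illustrate what "database operations like join available in the host language" means, here is a minimal sketch of a relational natural join over in-memory rows. It is written in Python purely for illustration; it is not Set::Relation's API, and the function name `natural_join` and the sample data are made up for this example.

```python
def natural_join(left, right):
    """Natural join of two relations, each given as a list of dicts.

    Rows are matched on every attribute name the two relations share.
    Assumes all rows within one relation have the same attribute names
    and that attribute values are hashable.
    """
    left, right = list(left), list(right)
    if not left or not right:
        return []
    common = sorted(set(left[0]) & set(right[0]))

    # Index the right-hand rows by their values for the shared attributes.
    index = {}
    for r in right:
        index.setdefault(tuple(r[a] for a in common), []).append(r)

    result, seen = [], set()
    for l in left:
        for r in index.get(tuple(l[a] for a in common), []):
            merged = {**l, **r}
            key = tuple(sorted(merged.items()))
            if key not in seen:  # a relation is a set: no duplicate rows
                seen.add(key)
                result.append(merged)
    return result


# Hypothetical sample data
suppliers = [
    {"sno": "S1", "city": "London"},
    {"sno": "S2", "city": "Paris"},
]
shipments = [
    {"sno": "S1", "pno": "P1"},
    {"sno": "S1", "pno": "P2"},
    {"sno": "S3", "pno": "P1"},
]
joined = natural_join(suppliers, shipments)
```

Here `joined` holds only the two rows for supplier S1, each carrying `sno`, `city` and `pno`.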

I would be highly interested in developing this with you. I'm spread pretty thin with several other Perl modules, so I otherwise wouldn't tackle it right now. But, if you already have something started, we can try to finish it, and that's much better than starting from scratch alone.

Do you have a repository for this new module yet? What are you calling it? I take it the module is building off of SQL::Statement?
<snip>

If you mean the more robust/scalable solution, then that has 2 main parts, which are a standard query language specification, Muldis D, plus multiple implementations. This corresponds to, but is distinct from, the ecosystem of the ISO SQL standard and its implementations in various DBMSs.

The query language, Muldis D, is not SQL but it is relevant here because it is designed to correspond to SQL and to be an intermediary form for generating/parsing SQL or translating between SQL dialects, or between SQL and other languages like Perl. (This means all SQL, including stored procedures.)

This is essentially what you want to do: have a common query syntax where, behind the scenes, some of it is turned into SQL that is pushed to back-end DBMSs, and some of it is turned into Perl to do local processing. The great thing is that as a user you don't have to know where the code executes, just that the implementation will pick the best way to handle particular code. I think of an analogy like LLVM that can compile selectively to a CPU or a GPU. Automatically, more capable DBMSs like Postgres get more work pushed to them to do natively, and less capable things like DBD::CSV or whatever have less pushed to them and more done in Perl.
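
As a rough sketch of that idea only, and not how Muldis D or any implementation of it actually works, the following Python splits an ordered list of query operations into the part a back end can run natively and the part left for local processing. The function `split_plan`, the operation names and the capability sets are all hypothetical.

```python
def split_plan(ops, capabilities):
    """Split an ordered list of operation names into (pushed, local).

    Operations are pushed to the back end until the first one it cannot
    handle; that operation and everything after it run locally, so the
    original order of operations is preserved.
    """
    pushed = []
    for i, op in enumerate(ops):
        if op not in capabilities:
            return pushed, list(ops[i:])
        pushed.append(op)
    return pushed, []


# Hypothetical capability sets, for illustration only.
CAPABILITIES = {
    "postgres": {"scan", "filter", "join", "aggregate"},
    "csv": {"scan"},
}

query = ["scan", "filter", "aggregate"]
plans = {name: split_plan(query, caps) for name, caps in CAPABILITIES.items()}

for name, (pushed, local) in plans.items():
    print(f"{name}: pushed={pushed} local={local}")
```

With these made-up capability sets, the whole query is pushed to the "postgres" back end, while the "csv" back end only does the scan and the filter and aggregate steps are done locally.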

The language spec is on GitHub at https://github.com/muldis/Muldis-D and it is also published on CPAN in the pure-POD distribution Muldis-D, but the CPAN copy has fallen behind at the moment.

I haven't started the implementations yet, or rather I did but canceled those efforts in order to do it differently, so you can't run anything yet. But I know in my head exactly how I intend to do it.

I intend to make a few more large updates to the Muldis D spec before starting in earnest on the implementation, in order to make the implementation simpler and easier to do (the spec is substantially complete other than some large refinements); some clues to this direction are in the file TODO_DRAFT on GitHub.

As for a timetable, if I could focus on this project I could have something usable in a few months; however, I also have a separate paying job that I'm currently focusing on, which doesn't leave much time for the new project, though I hope to get more time to work on it maybe in mid-to-late October.

If you are still interested in working on this, or you just want to follow it, please join the (low traffic) discussion list muldis-db-us...@mm.darrenduncan.net .

FYI, this project is quite serious, not pie in the sky, and it has interest from some significant people in the industry, such as C.J. Date (well known for "An Introduction to Database Systems" that sold over 800K copies), and one of his latest co-authored books in 2010 explicitly covers part of my project with a chapter.

-- Darren Duncan
