Hi there, today I gathered some people in irc://irc.perl.org/#dbi to discuss the planned extensions to DBD::File.
There were several goals to reach:

1) Add a readonly mode to DBI::DBD::SqlEngine
   ==> will be solved by adding a new attribute sql_readonly to DBI::DBD::SqlEngine and, if required, fixing DBI::SQL::Nano and SQL::Statement and bumping the SQL::Statement requirement in *::Nano

2) Add support for other I/O layers to DBD::File
   a) add support for streams (PerlIO)
   b) add support for other kinds of fetch_row/push_row processing

Goal (a) came from recent projects; goal (b) is a longer-standing one I first identified two years ago when refactoring DBD::File to its current version. During that refactoring task I noticed the complexity of dealing with DBD::AnyData. Sven Dowideit now maintains AnyData and DBD::AnyData and ran into the same problem.

Proposed solution: DBD::File will implement an abstract I/O strategy which will access concrete implementations for directory scanning ($dbh->get_tables()) and table I/O (fetch_row, push_row, ...).

==> DBD::File will get two new (default) attributes:
    * f_dir_backend (or: f_dir_strategy)
    * f_stream_backend (or: f_stream_strategy)
==> Backends will provide the required methods ('perlio :via', get_line, tell, seek, ...) for data parsers.
==> Data parsers might have requirements not satisfied by every backend (think of DBD::DBM *gg*)

Additional improvements along this way:
( ) We could easily re-implement DBD::RAM (provides RAM tables)
( ) We could do the groundwork for a future where one $dbh mixes CSV tables with DBM tables with AnyData tables with Sys tables ...
( ) We could add a "clone" of DBD::ExampleP doing what it does using the default Dir/Stream backends

I would prefer to add the default backends below the DBD::File namespace, e.g. DBD::File::Backend::PerlIO or DBD::File::Backend::Filesystem.

Any comments?

Best regards,
Jens
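To make the proposal a bit more concrete, here is a rough pure-Perl sketch of the strategy split described above: a stream backend exposing the listed methods (get_line, tell, seek), with a data parser coded only against that interface so another backend (tar, RAM, DBM, ...) could be dropped in. All class and method names here are illustrative assumptions, not a committed DBD::File API.

```perl
#!/usr/bin/perl
# Illustrative sketch of the proposed backend split. The class name
# StreamBackend::PerlIO and its methods are assumptions, not real
# DBD::File API; a real default backend might live below
# DBD::File::Backend:: as suggested above.
use strict;
use warnings;

package StreamBackend::PerlIO;
sub new      { my ($class, $fh) = @_; bless { fh => $fh }, $class }
sub get_line { my $fh = $_[0]{fh}; scalar <$fh> }
sub tell     { my $fh = $_[0]{fh}; CORE::tell $fh }
sub seek     { my ($self, $pos, $whence) = @_; CORE::seek $self->{fh}, $pos, $whence }

package main;
# A parser coded only against the backend interface, never against
# a file name - so the storage behind it becomes swappable.
open my $fh, "<", \"a,1\nb,2\n" or die $!;   # in-memory file for the demo
my $be = StreamBackend::PerlIO->new($fh);
while (defined(my $line = $be->get_line)) {
    chomp $line;
    my @fields = split /,/, $line;
    print "row: @fields\n";
}
```

The point of the sketch is only the shape of the interface: the parser calls get_line/tell/seek on an object and never open()s anything itself.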
<@[Sno]> well, I invited mst to discuss the stream support/API for DBD::File (and DBD::CSV ...) for f_file and f_dir
<@Tux> guess it was caused by the number of netsplits last week
<@Tux> I'll try to stay alive when I can. must visit a customer tomorrow
<@[Sno]> what's already in the queue: improve DBI::DBD::SqlEngine with an attribute named 'sql_readonly' ==> throw an exception when open_table is called for write access
<@Tux> DBD::File also has f_readonly, causing an early fail on open for writing, but that has not (yet) been committed
<@Tux> I have been playing with the next step, but ran out of time
<@[Sno]> Tux: I moved that f_readonly to SqlEngine
<@[Sno]> because we can support more pure-perl DBDs that way
<@Tux> then it won't work in Nano
<@[Sno]> and it's not released
<@[Sno]> why shouldn't it work in Nano?
<@[Sno]> Nano is bundled and I just have to fix it there
<@Tux> and I/we have to change DBD::CSV to recognize sql_* as valid options too
<@Tux> "fix it there" is ok for me
<@[Sno]> doesn't it already? we have a lot of sql_* options already active
<@Tux> I would have to check
<@[Sno]> I expect it does - I use them in SQL::Statement tests :D
<@[Sno]> e.g. sql_quoted_identifier_case and sql_identifier_case
<@[Sno]> try setting $dbh->{sql_identifier_case} = SQL_IC_UPPER and see what happens
<@[Sno]> ok, back to stream (or alike) support
<@[Sno]> currently the simplest way would be to check with Scalar::Util (isvstring, reftype) or something like that whether the given f_file attribute is a string or a file handle
<@[Sno]> same for f_dir
<@[Sno]> but this is probably a bit short-sighted from several perspectives
<@[Sno]> 1) maybe we want to set the f_dir attribute to a tar archive (or an Archive::Tar instance?) and the tables to open shall come from the tar archive (readonly, of course)
<@[Sno]> 2) SvenDowideit now works on AnyData / DBD::AnyData which supports more storage backends than simple files
<@[Sno]> 3) DBD::DBM could be improved by not hacking around the open() call in DBD::File
<@[Sno]> the OO programmers' hammer here seems to cry: use roles
<@[Sno]> but timbunce_ will say: no additional dependencies for DBI (and I agree with that statement), which means role management has to be implemented by hand
<@[Sno]> Tux, mst, SvenDowideit - did I cover it so far?
<SvenDowideit> are you guys only thinking about using archives as input?
<@Tux> a) I don't want *any* new dep in DBD::File (that would add deps to DBI)
<@Tux> so we'd need a backward compat way to pass a "stream" to DBD::File
<SvenDowideit> i find the idea of sql querying data that comes in on a tcp stream (for eg) interesting
<@Tux> what Sno and I thought of was to set f_dir to undef and pass the "stream" in f_file or somesuch
<@Tux> that was what I worked on, but I found a snakepit
<@[Sno]> Tux: that was our first quick shot
<@[Sno]> the idea with Archive::Tar and Archive::Zip came later
<SvenDowideit> but wrt archives, it's complicated when you have a zip of dirs and files - though I like that too
<@[Sno]> and then came SvenDowideit - and he's doing it again ;)
<SvenDowideit> grin, i'm 95% trouble
<@Tux> Archive::* should be used from DBD::CSV and other backends
<@[Sno]> but better trouble now than when finished
<@Tux> sure
<@[Sno]> I don't want a reimplementation of I/O backends in any derived DBD!
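The "simplest way" mentioned above - checking with Scalar::Util whether an f_file-style attribute holds a plain path string or an already-open handle - can be sketched like this. The function name classify_f_file is invented for illustration; this is not the actual DBD::File code.

```perl
#!/usr/bin/perl
# Sketch (hypothetical helper, not DBD::File internals): distinguish a
# plain path string from a file handle using only core Scalar::Util,
# as suggested in the discussion above.
use strict;
use warnings;
use Scalar::Util qw(reftype);

sub classify_f_file {
    my ($f_file) = @_;
    return "undef" unless defined $f_file;
    my $rt = reftype $f_file;              # undef for plain strings
    return "path"      unless $rt;         # ordinary file name
    return "handle"    if $rt eq "GLOB";   # \*FH, lexical $fh, IO::Handle objects
    return "scalarref" if $rt eq "SCALAR"; # in-memory file via open \$buf
    return "unknown";
}

open my $fh, "<", \"demo data\n" or die $!;
print classify_f_file("table.csv"), "\n";  # path
print classify_f_file($fh), "\n";          # handle
```

As the chat notes, this quick check is too short-sighted for archives (Archive::Tar instances etc.), which is what motivates the backend objects instead.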
<@Tux> the only diff in DBD::File is that in the guts it does not need to *open* a file, but just use the stream passed
<@[Sno]> for f_file being a stream, yes
<@Tux> which proved harder to do than I thought it would be
<@[Sno]> but for f_dir being something different?
<@[Sno]> which is business as usual for SvenDowideit in AnyData
<SvenDowideit> darnit :)
<@[Sno]> SvenDowideit: I have no objections when an I/O role provides abilities to write data to anything ...
<SvenDowideit> tbh - defining a demarcation between DBD::File stuff and DBD::Other would be wise
<SvenDowideit> that will stop someone like me bikeshedding
<@[Sno]> SvenDowideit: DBD::File is an abstract base class which handles basic I/O for derived DBDs like DBD::CSV, DBD::DBM etc.
<SvenDowideit> DBD::Other - like DBD::Dir
<SvenDowideit> or DBD::TCPStream
<SvenDowideit> or DBD::Stream
<@[Sno]> and it looks to me like it should become a wrapper for "external" I/O implementations plus 1..n default implementations for string and file handle
<@[Sno]> SvenDowideit: not 10 additional DBD::* base classes like in g_object - that sucks
<@[Sno]> billions of cross dependencies and in the end you need them all
<SvenDowideit> you want to implement all those io types in one magical place?
<@[Sno]> I want a DBD::File - as now
<@[Sno]> that's required for backward compatibility anyway
<@[Sno]> I might be open for a discussion about a class between DBI::DBD::SqlEngine and DBD::File which is more complex, where DBD::File instruments that class to behave as it does now
<@[Sno]> but that would restrict the benefits to DBD::CSV
<@[Sno]> so I'd prefer a DBD::File with some more attributes (f_dir_backend, f_stream_backend, ...)
<@[Sno]> and instead of doing an opendir() - it calls its $self->{dir_backend}->open()
<@[Sno]> similar for f_file: open and f_stream_backend
<SvenDowideit> gotcha
<SvenDowideit> roles, but homemade
<@[Sno]> that's why I asked mst to join - he might have ideas ... ;)
<@[Sno]> the gang of four named that the "facade pattern" or so
<SvenDowideit> facade - gosh, what's the german
<@[Sno]> :P
<@[Sno]> anyway - typical roles won't work either, 'cause they're injected once and we need it per $dbh/$table instance
<SvenDowideit> ok, so as I'm reading code still (well done indeed)
<@[Sno]> Tux: what do you think about those f_dir_strategy / f_stream_strategy attributes, instantiated with some DBD::File intelligence as *::File (opendir, open, tell, seek, ...) or *::Stream?
<SvenDowideit> can you explain why one might need a different dir_backend from file_backend for any one dbh/table ?
<@[Sno]> to use Archive::Tar ...
<@[Sno]> SvenDowideit: dir_backend is currently per $dbh
<SvenDowideit> if i'm using Archive::Tar, then one class that gives both dir and file info would work right?
<@[Sno]> we planned a future for DBD::File where it will be possible to mix between DBD::File and DBD::DBM etc.
<@[Sno]> SvenDowideit: probably - depends on the finally decided implementation
<@[Sno]> but the dir_backend can return a default stream_backend - and then: yes
<@Tux> mje: http://pasta.test-smoke.org/329
<SvenDowideit> mmm, ok, there's a point: if you have 2 zips, you need a way to combine the streams
<@mst> $dbh has some sort of object that returns an object representing a table perhaps?
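The delegation described above - opendir() in DBD::File replaced by a call on a dir backend object - could look roughly like this. DirBackend::Filesystem and its list_tables() method are invented names for illustration, not DBD::File API; the point is that an Archive::Tar- or RAM-based class with the same methods would be swappable.

```perl
#!/usr/bin/perl
# Hypothetical default dir backend: does exactly what DBD::File does
# inline today (opendir + readdir), but behind a facade so another
# backend (tar archive, RAM, ...) could replace it per-$dbh.
use strict;
use warnings;

package DirBackend::Filesystem;

sub new { my ($class, %args) = @_; bless { dir => $args{dir} }, $class }

# Return the candidate table files in the directory, as
# $dbh->get_tables() would need them.
sub list_tables {
    my ($self) = @_;
    opendir my $dh, $self->{dir} or die "opendir $self->{dir}: $!";
    my @tables = grep { !/^\./ && -f "$self->{dir}/$_" } readdir $dh;
    closedir $dh;
    return sort @tables;
}

package main;
my $backend = DirBackend::Filesystem->new(dir => ".");
print "$_\n" for $backend->list_tables;
```

A tar-backed sibling class would implement the same two methods against an Archive::Tar object instead of opendir, which is the whole point of the facade.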
<@Tux> Sno, sane up to tell/seek, as those prove extremely unreliable in XS
<@mje> Tux, I've tried asking Yanick to fix that TODO test - I will try again - thanks again
<@Tux> as you might have no idea what the underlying mechanism is: perlio, standard IO, scalario etc etc
<@[Sno]> mst: yes, the dbh has that sort of object
<@Tux> I tried really hard to fix that in Text::CSV_XS but after looking deeper with leont, I gave up and reverted
<@[Sno]> so, would everyone be happy with f_dir_strategy/f_dir_backend and f_stream_strategy/f_stream_backend?
<SvenDowideit> mmm, so these backends are basically parallel to something Jeff was doing in AnyData - though he only began to extract the code to show it
<@[Sno]> 'cause $dbh->get_tables() is not restricted to a $sth, it might be difficult (not clever?) to have a dir backend per table
<SvenDowideit> and then he separated it further to make the parser of the f_stream_backend pluggable
<@[Sno]> SvenDowideit: yes, that's the intention
<SvenDowideit> excellent, that addresses my niggling feeling that AnyData should be redundant
<@[Sno]> SvenDowideit: probably it provides exactly the missing backends
<SvenDowideit> only in a proto-mess form
<@[Sno]> or it will be a bundle of separately available backends
<@[Sno]> SvenDowideit: see http://search.cpan.org/~timb/DBI-1.622/lib/DBD/File.pm#f_schema for how f_dir_backend would be added, like f_dir or similar
<SvenDowideit> so, to take things to a maddening extreme
<SvenDowideit> what should happen when i point DBD::File at a dir containing 12 zips
<SvenDowideit> that each contain a mix of csv, and other file types
<@[Sno]> that's what happens now, too (when using the default backend)
<SvenDowideit> when thinking about the future when you can mix DBD::File, DBD::DBM and more
<@[Sno]> see f_ext for details ;)
<SvenDowideit> i'm thinking of the bold future when the facades allow magic
<@[Sno]> the future is unwritten - we thought about some kind of data dictionary
<@[Sno]> and of course, you're right, DBDs would be "reduced" to configuration providers
<SvenDowideit> i'm kind of wondering if having DBD::File contain the dir code is suboptimal
<SvenDowideit> compared to having a DBD::Dir that you mix with a DBD::File and a DBD::Parser (for want of a better name)
<@[Sno]> I don't get your point
<SvenDowideit> ignore that - I'm hung up on the name of the class
<SvenDowideit> i keep thinking that DBD::File is about files, and really, it's not (just)
<@[Sno]> not anymore ;)
<SvenDowideit> and the 2 strategies provide that separation - it ought to be possible for an f_dir_backend to return either another f_dir_backend or an f_stream
<SvenDowideit> if it has the smarts to do it (which would be unlikely to be coded in the DBI core backends, except for trivial cases)
<@[Sno]> another f_dir_backend? what should be done with that other instance?
<@[Sno]> I thought about both - as the table objects are now - as a flyweight
<SvenDowideit> keep iterating until you get nothing, or something useful
<@[Sno]> like DBD::ExampleP ?
* SvenDowideit goes look :)
<SvenDowideit> dunno :) there's no docco! :p
<@[Sno]> Tux: can you re-read mst's statement with the idea of multiple table types in one $dbh?
<@[Sno]> Tux: in preparation of the data dictionary
<@Tux> I see only one line from mst and that does not deal with your question
<@[Sno]> Tux: tables are Flyweights meanwhile (it's all held in the shared f_meta structure)
<@[Sno]> yeah - but it gave me an idea :D
<@Tux> I like the opportunity of having a mixed env: no restrictions to what /might/ be useful, but within doable bounds
<@[Sno]> Tux: instead of $class =~ s/::Statement/::Table/; we could use f_meta->class //= ...
<@Tux> yes
<@[Sno]> which would allow basic DBD mixing ...
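The idea at the end - replacing the hard-coded $class =~ s/::Statement/::Table/ rename with a per-table class kept in the shared f_meta structure - can be sketched as follows. The f_meta layout and the "table_class" key are assumptions for illustration, not the committed design.

```perl
#!/usr/bin/perl
# Sketch of per-table class selection via a shared meta structure,
# instead of a fixed s/::Statement/::Table/ rename. The %f_meta layout
# and the "table_class" key are hypothetical.
use strict;
use warnings;

my %f_meta = (
    planes => { table_class => "DBD::CSV::Table" },  # explicit override
    counts => {},                                    # no override
);

sub table_class_for {
    my ($statement_class, $table) = @_;
    # fall back to the old name-mangling convention ...
    my $default = $statement_class;
    $default =~ s/::Statement/::Table/;
    # ... unless the meta structure already names a class (//= is the
    # defined-or assignment [Sno] refers to), enabling basic DBD mixing
    return $f_meta{$table}{table_class} //= $default;
}

print table_class_for("DBD::File::Statement", "planes"), "\n"; # DBD::CSV::Table
print table_class_for("DBD::File::Statement", "counts"), "\n"; # DBD::File::Table
```

With this shape, one $dbh could hold tables served by different table classes, which is the "basic DBD mixing" mentioned above.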
<@[Sno]> we need to do more - because of mixing "dbm_*" and "csv_*" attributes in $dbh
<@[Sno]> timbunce_: any wishes about the namespace for the default f_dir/f_stream backends of DBD::File?
<SvenDowideit> ooo, you mean you're going to implement it?
<@[Sno]> not today, probably tomorrow or Wednesday
<@timbunce_> [Sno] I've not been following along. I'd need a summary and I'm heading out for a while now. So I'll pull my usual trick of asking for an email to dbi-dev :)
<@[Sno]> hehe
<@[Sno]> I would attach the chat log if no one objects
<@timbunce_> Edited heavily *please*!
<@[Sno]> mst: you're dismissed - you can reduce the number of your channels
<@[Sno]> timbunce_: I'll write a summary - attaching the log for those who want to know all the details
<@timbunce_> cool, thanks.