Bring AnyData / DBD::AnyData back to work with modern DBI

2014-11-12 Thread Jens Rehsack
Hi,

for a recent project we identified DBD::AnyData as the best concept to do a 
client's job ;)
So there is a tuit to do the groundwork for bringing AnyData back to the modern 
DBI interface around DBI::DBD::SqlEngine based drivers.

Over the last weeks I spent some time digging into AnyData itself to identify 
the interfaces to touch for harmonization with the data_sources concept in 
DBI::DBD::SqlEngine (DBD::File). This mail is intended to share the results and 
present a concept for the resurrection of the (more or less) dead module. For 
that reason, I CC'ed some people who got in touch with me over the last years 
regarding AnyData and/or DBD::AnyData - to give them a chance to contribute.

First, the situation as it is: We (dbd-file-team, in this special case more 
or less Merijn and myself) identified most of AnyData and DBD::AnyData as being 
dead and nearly unusable in environments with modern Perl and up-to-date CPAN 
modules. The module has grown organically and is bloated (no judgement on the 
time of writing), inconsistent and kind of self-contained (no reasonable API to 
the outside). AnyData::Storage::TieHash is not a storage class, it's a miniature 
Tie::Hash::DBD with its own query processing (parallel to DBI's SqlEngines and 
weird automatisms). I stop here to avoid starting a flame-war - the intention 
is to improve, not to blame.

So where is the future of AnyData?

From my point of view, the upcoming AnyData / DBD::AnyData shall be reduced to 
the max. That means: no embedded adTie, no complex logic in the frontend to deal 
with grown backends. A clean API for format parsers and a clean API for storage, 
harmonized with DBI::DBD::SqlEngine::TableSource and 
DBI::DBD::SqlEngine::DataSource.

Consequently, upcoming releases of AnyData will depend on DBI. Format parsers 
will be written using DBD::CSV and DBD::DBM as a guide (simple get_record, 
put_record etc. wrappers). To provide a tied hash again, DBD::AnyData will be 
bundled with AnyData (instead of two distributions as in the past) and 
Tie::Hash::DBD or Tie::DBI will be used.
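
To make the "simple get_record, put_record wrapper" idea a bit more concrete, 
here is a minimal sketch in core Perl. The package name My::Format::Colon, the 
constructor argument "sep" and the method signatures are hypothetical, chosen 
only for illustration; they are not the API of AnyData, DBD::CSV or 
DBI::DBD::SqlEngine.

```perl
use strict;
use warnings;

package My::Format::Colon;    # hypothetical name, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { sep => defined $args{sep} ? $args{sep} : ':' }, $class;
}

# Read one line from $fh and return its fields as an array ref.
# Returns undef (in scalar context) at end of file.
sub get_record {
    my ($self, $fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    # Limit -1 keeps trailing empty fields.
    return [ split /\Q$self->{sep}\E/, $line, -1 ];
}

# Write one record (array ref of fields) as a single line to $fh.
sub put_record {
    my ($self, $fh, $fields) = @_;
    print {$fh} join($self->{sep}, @$fields), "\n";
    return 1;
}

package main;

# Round trip through an in-memory file handle.
my $fmt    = My::Format::Colon->new;
my $buffer = '';
open my $out, '>', \$buffer or die "cannot open in-memory file: $!";
$fmt->put_record($out, [ 'alice', '', 'shell=/bin/sh' ]);
close $out;

open my $in, '<', \$buffer or die "cannot open in-memory file: $!";
my $record = $fmt->get_record($in);
close $in;
```

The point of such a small interface would be that the parser only knows about 
lines and fields, while file handling and SQL processing stay elsewhere.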

adConvert, adDump and adExport are special cases of features already provided 
by SQL::Statement and will be re-implemented by using that functionality.

All of that means the future API might be assembled from roles to avoid heavy 
requirements (whether Moo or Role::Tiny isn't decided yet), and unfortunately 
most existing format parsers in DarkPAN might require a rewrite. I hope the 
AnyData resurrection will help to reduce maintenance costs in the future, and I 
apologize here and now for the resulting extra effort when updating.

Best regards
-- 
Jens Rehsack
pkgsrc, Perl5
rehs...@cpan.org
cpanid: REHSACK






Re: Bring AnyData / DBD::AnyData back to work with modern DBI

2014-11-12 Thread Tim Bunce
Perhaps the module name should be changed.

Tim.



Re: Bring AnyData / DBD::AnyData back to work with modern DBI

2014-11-12 Thread Jens Rehsack

On 12.11.2014 at 16:14, Tim Bunce tim.bu...@pobox.com wrote:

 Perhaps the module name should be changed.

Why? Let me just explain the reason to keep it and the consequence of changing 
it.

Reason to keep: People continuously ask me about AnyData / DBD::AnyData, send 
me fiddly patches hacking on the module, complain about working around 
compatibility issues etc.
So what I currently know: every AnyData / DBD::AnyData user makes adaptations 
for their projects.

Consequence of changing:
My requirements are simple: I need a DBD that scans a directory for files, opens 
the files and returns the file name plus the :, separated fields (some optional, 
some key-value pairs) for each line. Hacking a new DBD would mean for me: clone 
DBD::CSV and adapt the parser.
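
The requirement above can be sketched without any DBD at all, which may help to 
see how small the parser part is. This is only an illustration in core Perl 
under explicit assumptions: the function name scan_dir and the row layout are 
hypothetical, ":" is assumed to be the field separator, and key-value fields are 
assumed to look like "key=value". The sample data is made up.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Return one row per non-empty line of every plain file in $dir:
#   { file => NAME, fields => [ positional fields ], attrs => { key => value } }
# Assumptions: ":" separates fields, key-value fields look like "key=value".
sub scan_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "cannot open directory $dir: $!";
    my @names = sort grep { -f File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;

    my @rows;
    for my $name (@names) {
        my $path = File::Spec->catfile($dir, $name);
        open my $fh, '<', $path or die "cannot open $path: $!";
        while (my $line = <$fh>) {
            chomp $line;
            next if $line eq '';
            my (@fields, %attrs);
            for my $field (split /:/, $line, -1) {
                if ($field =~ /\A([^=]+)=(.*)\z/s) {
                    $attrs{$1} = $2;          # key-value pair
                }
                else {
                    push @fields, $field;     # positional, may be empty
                }
            }
            push @rows, { file => $name, fields => \@fields, attrs => \%attrs };
        }
        close $fh;
    }
    return \@rows;
}

# Demo with made-up data in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
open my $tmp, '>', File::Spec->catfile($dir, 'hosts.txt')
    or die "cannot create sample file: $!";
print {$tmp} "web01:10.0.0.1:role=frontend\n", "db01::role=backend:rack=7\n";
close $tmp;

my $rows = scan_dir($dir);
printf "%s: %s\n", $_->{file}, join(',', @{ $_->{fields} }) for @$rows;
```

In a real driver the same per-line logic would sit behind the table and data 
source classes of the DBD instead of a plain function.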

Why did I decide on AnyData? Because the idea behind AnyData matches the 
requirements I have. It will be easier to find another consultant with 
knowledge of DBI and (new) AnyData than of DBI, a private DBD and the 
patch-supporting pkg-manager we'd use to add a private prefix :)
Once we start patching CPAN modules locally, the motivation to contribute 
instead of hacking locally drops significantly (typical project flow - once a 
direction is chosen, it's fixed ...)

Besides the argument of keeping an existing (working!) API for the people using 
it (which practically doesn't exist for AnyData/DBD::AnyData), why should I do 
a new DBD?

Jens


-- 
Jens Rehsack
rehs...@gmail.com