Re: Namespace suggestions for new module submission (record-level transaction howto?)

2004-01-04 Thread Arthur Corliss
On Sat, 3 Jan 2004, david nicol wrote:

 I am not certain how big Sleepycat's release is any more, but
 I think a DB::Inline done with Inline.pm wrapping Sleepycat code
 would be an interesting project.  That might just move the problem
 from library synchronization to making sure that everyone has access
 to a compiler though.

Agreed, that's another concern that I have, which is one of my lesser reasons
for doing a Pure Perl solution.

 I would like to hear more about the record level locking and
 transactions.  Perltie does not support these features: what will
 the interface look like?  My own efforts to do record level locking
 with DirDB from above the perltie are done by:

[snip]

Keep in mind that I've got a great deal of performance tuning to do, along
with some serious regression tests to make sure everything works the way that
I'm intending.  This isn't stable code yet.  In a nutshell:  I'm cheating the
system through the use of the transaction log (something that will be in use
for *every* write to the db), and I still want to preserve concurrent writes.

Outside the nutshell:  This system is nothing more than an AVL binary tree
implementation, using four files (index, key values, associative values, and
the transaction log).  The write process goes something like this:  check the
transaction log for any open transactions for the same record, then write-lock
the log and add the entry (concurrently executing transactions can ignore the
advisory lock to mark their transaction complete).  Update the applicable
blocks in the relevant files, write-locking only if the files are to be
extended (and as before, other writes that aren't extending the files can
ignore the lock).
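
A minimal sketch of the log handling described above, assuming a fixed-length
log entry and plain flock() advisory locking.  The entry layout and the helper
names (log_open, log_close, has_open) are hypothetical and only illustrate the
idea; they are not the module's actual format or API.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical fixed-length log entry: record id and state (1 = open,
# 0 = complete), both 32-bit in network byte order.
my $ENTRY_FMT = 'N N';
my $ENTRY_LEN = length pack($ENTRY_FMT, 0, 0);

# Append an "open" entry for $record_id; returns the entry's byte offset.
# Appending extends the file, so this step takes the exclusive lock.
sub log_open {
    my ($path, $record_id) = @_;
    open(my $fh, '>>', $path) or die "open $path: $!";
    binmode $fh;
    flock($fh, LOCK_EX) or die "flock $path: $!";
    seek($fh, 0, SEEK_END) or die "seek: $!";
    my $offset = tell($fh);
    print {$fh} pack($ENTRY_FMT, $record_id, 1) or die "write: $!";
    close($fh) or die "close: $!";    # flushes and releases the lock
    return $offset;
}

# Mark an entry complete by overwriting it in place.  The file is not
# extended, so this sketch ignores the advisory lock, as described above.
sub log_close {
    my ($path, $offset, $record_id) = @_;
    open(my $fh, '+<', $path) or die "open $path: $!";
    binmode $fh;
    seek($fh, $offset, SEEK_SET) or die "seek: $!";
    print {$fh} pack($ENTRY_FMT, $record_id, 0) or die "write: $!";
    close($fh) or die "close: $!";
}

# True if the log still holds an open entry for $record_id.
sub has_open {
    my ($path, $record_id) = @_;
    open(my $fh, '<', $path) or return 0;
    binmode $fh;
    my $buf = '';
    while ((read($fh, $buf, $ENTRY_LEN) || 0) == $ENTRY_LEN) {
        my ($id, $state) = unpack($ENTRY_FMT, $buf);
        return 1 if $id == $record_id && $state == 1;
    }
    return 0;
}
```

A writer would call has_open() first, then log_open(), update the data files,
and finish with log_close() on the offset it was given.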

As for transactions:  what I've described above is all I have at the moment.
Atomic record updates.  What I'd like to do at some point is add support in
the log format definition for multiple record updates, but that isn't done
yet.

Another FYI, before someone asks:  I chose four files for storage for a reason
(all reasons are influenced by my feeble-mindedness, of course).  First, I
wanted to be able to crawl/rebalance the binary tree with fixed length records
for performance reasons.  Second, having separate files for the actual values
of the keys and associative values allows me to have full binary storage
capability without worrying about special encoding tricks, etc.  Outside of my
method of tracking available slots of storage (i.e., deleted records) for
reuse, there's nothing but data in those two files, not even record
demarcation.  The transaction log, of course, speaks for itself.
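
To illustrate the fixed-length index idea, here is a rough sketch, assuming a
hypothetical node layout packed in network byte order.  The field list, the
32-bit widths and the function names are assumptions made for the example,
not the actual on-disk format.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical index node: left child, right child, key offset, key length,
# value offset, value length (32-bit, network byte order), then a signed
# balance factor padded out to a 4-byte boundary.
my $NODE_FMT = 'N6 c x3';
my $NODE_LEN = length pack($NODE_FMT, (0) x 7);    # 28 bytes

# Write node number $n at its fixed offset in the index file.
sub write_node {
    my ($fh, $n, @fields) = @_;
    seek($fh, $n * $NODE_LEN, SEEK_SET) or die "seek: $!";
    print {$fh} pack($NODE_FMT, @fields) or die "write: $!";
}

# Read node number $n back as a list of seven fields.
sub read_node {
    my ($fh, $n) = @_;
    seek($fh, $n * $NODE_LEN, SEEK_SET) or die "seek: $!";
    my $buf = '';
    my $got = read($fh, $buf, $NODE_LEN);
    die "short read for node $n" unless defined $got && $got == $NODE_LEN;
    return unpack($NODE_FMT, $buf);
}

# Fetch $len raw bytes at $offset from a key or value file.  The index
# supplies both numbers, so the data file needs no delimiters or encoding.
sub read_blob {
    my ($fh, $offset, $len) = @_;
    seek($fh, $offset, SEEK_SET) or die "seek: $!";
    my $buf = '';
    my $got = read($fh, $buf, $len);
    die "short read at offset $offset" unless defined $got && $got == $len;
    return $buf;
}
```

Because every node has the same length, node N always lives at byte offset
N * 28, so walking or rebalancing the tree never has to scan for record
boundaries.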

Now, if someone knows a better way, I'm all ears.  :-)

 If you're making up your own file format, how about CorlissDB?

The only problem I have with that is I don't want to give the impression that
this is just another wrapper for yet another C implementation.  Many people
will assume that they'll need some libraries and end up ignoring it.

 You said support tied hashes -- Did you mean support for storing
 hash references?

Nope.  It will support the hash binding via the tie() function.  That's the
primary method of use I have for it right now.
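
For readers unfamiliar with the mechanism, binding a hash via tie() only
requires a class that implements the perltie methods.  The sketch below uses
an in-memory hash as a stand-in for the disk files, and the package name
My::TinyStore is hypothetical; it shows the shape of the interface, not this
module's implementation.

```perl
use strict;
use warnings;

package My::TinyStore;    # hypothetical name, for illustration only

# Each method below is called by perl itself when the tied hash is used.
sub TIEHASH { my ($class) = @_; return bless { data => {} }, $class }
sub STORE   { my ($self, $k, $v) = @_; $self->{data}{$k} = $v }
sub FETCH   { my ($self, $k) = @_; return $self->{data}{$k} }
sub EXISTS  { my ($self, $k) = @_; return exists $self->{data}{$k} }
sub DELETE  { my ($self, $k) = @_; return delete $self->{data}{$k} }

sub FIRSTKEY {
    my ($self) = @_;
    my $reset = keys %{ $self->{data} };    # reset the each() iterator
    return each %{ $self->{data} };
}
sub NEXTKEY { my ($self) = @_; return each %{ $self->{data} } }

package main;

tie my %db, 'My::TinyStore';
$db{answer} = 42;            # calls STORE
print "$db{answer}\n";       # calls FETCH
```

A disk-backed engine would replace the inner hash with reads and writes
against its files, while callers keep using ordinary hash syntax.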

 I added support for hash references to my DirDB (and DirDB::FTP)
 modules and would appreciate your feedback on the semantics of the
 interface.  They are as follows:

   When you store a reference to a native perl hash to DirDB,
 the hash becomes blessed so that further manipulation of the referenced
 hash manipulates the persistent store as well.

   When you store a reference to a tied hash to DirDB, you get
 a deep copy.

   When you store anything other than a scalar or an unblessed hash
 reference, the module throws a croak without overwriting or corrupting

 These semantics make it possible to do multi-level autovivification
 inside a DirDB data structure, even over the network (by FTP.)

Sounds interesting.  I haven't used that module before, but I think I'll go
download it and check it out.  I can imagine a few uses for it.  As to the
semantics, I can't speak intelligently on that until I get a fuller feel of
how the module will typically be used.

--Arthur Corliss
  Bolverk's Lair -- http://arthur.corlissfamily.org/
  Digital Mages -- http://www.digitalmages.com/
  Live Free or Die, the Only Way to Live -- NH State Motto


Re: Namespace suggestions for new module submission

2004-01-02 Thread Mark Stosberg
On Thu, Jan 01, 2004 at 06:15:49PM -0900, Arthur Corliss wrote:
 Greetings:
 
 In the near future I'd like to submit a module for inclusion on CPAN.  I need
 some advice on the appropriate namespace, however, since I don't want to
 pollute top-level namespace.
 
 Unofficial module name (as it's being developed):  PerlDBM
 Synopsis:  Pure-perl implementation of a dbm engine.  Supported only on
    platforms with 64-bit filesystems.  Database files are
    portable (all data is stored in network-byte order), with
    record-level locking and transactions.  Has its own API for
    low-level control, but also will support tied hashes.
 
 I did notice that most of the XS wrappers for C-based implementations were all
 in top-level namespace, though.  Any suggestions/preferences?

Will this be implemented with the DBI interface? Then DBD::YourProject
seems appropriate. 

DBD::SQLite seems to be a related case, although it's not Pure Perl,
it just allows you to install it as a standard DBI driver.

Mark


Re: Namespace suggestions for new module submission

2004-01-02 Thread Arthur Corliss
On Fri, 2 Jan 2004, Mark Stosberg wrote:

 Will this be implemented with the DBI interface? Then DBD::YourProject
 seems appropriate.

 DBD::SQLite seems to be a related case, although it's not Pure Perl,
 it just allows you to install it as a standard DBI driver.

I don't think it does enough to warrant inclusion in DBD::*, nor have I
planned to make it accessible via DBI.  It's just another method for
disk-based stateful hashes, like all the *DBM_File modules.  Modules like
AnyDBM_File and DB_File are causing some unpredictable results in some of my
code, depending on the version and implementation of the dbm libs they're
linked against.  This is just my way of getting predictable results without
requiring admins to upgrade or install new system libs, along with the
requisite Perl modules.

--Arthur Corliss


Namespace suggestions for new module submission

2004-01-01 Thread Arthur Corliss
Greetings:

In the near future I'd like to submit a module for inclusion on CPAN.  I need
some advice on the appropriate namespace, however, since I don't want to
pollute top-level namespace.

Unofficial module name (as it's being developed):  PerlDBM
Synopsis:  Pure-perl implementation of a dbm engine.  Supported only on
   platforms with 64-bit filesystems.  Database files are
   portable (all data is stored in network-byte order), with
   record-level locking and transactions.  Has its own API for
   low-level control, but also will support tied hashes.

I did notice that most of the XS wrappers for C-based implementations were all
in top-level namespace, though.  Any suggestions/preferences?

--Arthur Corliss