Re: [RDBO] insert speed questions with RDBO

Ted Zlatanov Tue, 22 Jan 2008 08:42:11 -0800

On Mon, 07 Jan 2008 12:34:16 -0600 Ted Zlatanov <[EMAIL PROTECTED]> wrote:

TZ> On Sat, 5 Jan 2008 15:27:34 -0500 "John Siracusa" <[EMAIL PROTECTED]> 
wrote: 
JS> On Jan 5, 2008 1:25 PM, Ted Zlatanov <[EMAIL PROTECTED]> wrote:
>>> I realize much of this question belongs to database-specific domains,
>>> but here goes: I want RDBO insert operations to be faster.  I need to
>>> load thousands of records per second into a database.

JS> The first thing to investigate is using plain DBI.  If that's not fast
JS> enough, then RDBO will never be any faster so you'll have to look
JS> elsewhere.

JS> The second thing to consider is bulk loading rather than INSERT
JS> statements.  You can perhaps use RDBO to produce the (usually
JS> db-specific) bulk-load files themselves, then point  the database at
JS> the files using DBI and whatever the "load from file" syntax is for
JS> your db.

TZ> I'll try this.  The RDBO code will be read-only to let me know,
TZ> essentially, if a row already exists (using RDBO::Cached).  Each run
TZ> will generate a tab-separated file per table affected.  At the end of a
TZ> "run" of new objects, I'll just exit the agent process and run COPY FROM
TZ> rather than trying to flush all the caches; this is also necessary
TZ> because of Perl's habit of releasing memory whenever it feels like doing
TZ> it.  So I'll have to repopulate the cache occasionally, but that's not
TZ> as bad as the churn I was seeing before with a 100% RDBO-driven insert
TZ> process.

I can definitely keep up with the data flow when using COPY FROM.  As
soon as RDBO comes into the picture, though, I'm 3-4 times slower than
the incoming data.  I can write a customized loader that will extract
the unique foreign keys from the raw data and insert them before
the actual rows that use those foreign keys, but that's dancing around
the problem.

I'd like to try without autocommit; I tried the built-in RDBO
transactions but those didn't work well for me.  begin_work was fine,
but if an error happened anywhere in the transaction, rollback() didn't
do anything and the DB rejected any further transactions.  I could post
my code (I tried about 6 different ways), but I'd really like to know if
there is an example in the docs anywhere of using transactions.  My loop
is like this:

# @items comes from the outside
foreach my $item (@items)
{
 db_process($item);
}

Ideally I'd do, in pseudo-Perl:

begin_work
db_process(@items[0..200]);
commit

if (error)
{
 # maybe retry here
}
else
{
 delete @items[0..200];
}

Can anyonw show me how to do this properly, including the Rose::DB
initialization parameters please?

Thanks
Ted

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Rose-db-object mailing list
Rose-db-object@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rose-db-object

Re: [RDBO] insert speed questions with RDBO

Reply via email to