thx Brendan,
the source is a text file. I guess I will use MySQL tools to load it into the 
database everyday, deleting the table completely before recreating.....
 
i guess thats the best way? 


________________________________
From: Brendan Byrd <p...@resonatorsoft.org>
To: Rajeev Prasad <rp.ne...@yahoo.com>; DBIx::Class user and developer list 
<dbix-class@lists.scsys.co.uk> 
Sent: Friday, April 5, 2013 10:01 PM
Subject: Re: [Dbix-class] update a row only if any column has changed


On Fri, Apr 5, 2013 at 7:45 PM, Rajeev Prasad <rp.ne...@yahoo.com> wrote:

hello,
>
>I have a table with around 2,000,000 records (15 columns). I have to sync this 
>from an outside source once everyday. not all records are changed/removed 
>/new-added everyday. so what is the best way to update only those which have 
>changed/added/or deleted?
>
>i can use update_or_create but that will update (re-write the row) even if 
>nothing has changed in the row/record. wont that be an overhead? how can i 
>escape that? what would be the fastest and least resources consuming way to do 
>this table update?
>
>I also have another table with 500,000 rows and i wish to implement the same 
>solution to that too.
>
Geesh, when your data loads are hitting 6 and 7 digits rows figures, you should 
seriously consider the tools that come with the RDBMS, like replication.  Is 
your outside source a modern RDBMS that features master/slave replication?

Otherwise, without some sort of field indicator on the source side saying that 
this was a modified row, you can't tell if it's been changed or not.  That 
means DBIC, or whatever other ETL middleware you choose, will have to compare 
every single row to see if it exists or not.  With that many rows, it's better 
off to use the "UPSERT" SQL statements for mass updating.  In MySQL, it's 
commands like "REPLACE" and "INSERT ... ON DUPLICATE KEY UPDATE ...".  In 
Oracle, it's MERGE.

You also mentioned "removed".  So you want an exact copy of the source table?  
Yeah, screw all of the tip-toeing around the existing data and just TRUNCATE 
the table, followed by the best BULK INSERT command your RDBMS recommends.  
(For example, MySQL recommends LOAD DATA INFILE.)  And make sure you don't do 
something crazy, like reading the entire source table into memory via Perl 
variables. 

Keep in mind that almost none of this recommends using DBIC.  DBIC is for 
making changes in smaller quantities than what you're asking.  YMMV, so consult 
your DBA folks first before you start dumping massive quantities of data into 
their servers.
-- 
Brendan Byrd <p...@resonatorsoft.org>
Brendan Byrd <bb...@cpan.org> 
_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@lists.scsys.co.uk

Reply via email to