On Monday, March 16, 2015 at 4:14:20 PM UTC-7, LoveWeb2py wrote: > > Thank you for the feedback everyone. > > The main reason I fetch them all first is to make sure I'm not inserting > duplicate records. We have a lot of files that have thousands of records > and sometimes they're duplicates. I hash a few columns from each record and > if the value is the same then I don't insert the record. If there is a more > efficient way to do this please let me know. > > Are you doing bulk inserts from files generated elsewhere? Merging two (or more) databases?
If you are doing individual inserts, I'd get the maybe-new record in, do a select based on one or more fields in the record, and then do the column hash comparison on the results that were returned. For bulk inserts, I'd be inclined to do the import with the DB Engine's management tools, and maybe lean on having the DBE enforce "must be unique" rules. /dps > On Monday, March 16, 2015 at 6:26:50 PM UTC-4, Niphlod wrote: >> >> I have 300m+ records too, but I don't need to fetch all of those in a >> single query. >> It's not a matter of indexes or anything else: the "system hang" you're >> incurring to is not the one that the database takes from the moment you >> send the query to the first row returned, but the time you take to >> "accumulate" 1m row into memory. >> >> On Monday, March 16, 2015 at 10:46:39 PM UTC+1, Dane Wright wrote: >>> >>> I have a table which currently contains 10m+ records (local government >>> spending in www.appgov.org). A native SQL count(*) will take 3-5 secs >>> but reading all the records like this via the DAL will result in the >>> system hanging. I try not to read too many of these records in any one >>> online transaction by allowing only selections and calculating subtotals >>> offline. I also create MySQL indexes outside of Web2py to speed things up >>> where necessary. Of course I also can't ever say db.laspend.id>0 in the >>> Database Admin facility either! >>> >>>> >>>> -- Resources: - http://web2py.com - http://web2py.com/book (Documentation) - http://github.com/web2py/web2py (Source code) - https://code.google.com/p/web2py/issues/list (Report Issues) --- You received this message because you are subscribed to the Google Groups "web2py-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.

