Hi all,

This is more of a logic/efficiency question than a purely technical one.
I am in the process of migrating two similar sites into one (as far as user data is concerned). It is possible for some user data to be present on both sites. I have written a CLI script that cycles through records in various tables from site A and checks for matching records in one site B table. If a match is present, the data from site A is updated into the site B table. If no match is present, the data from site A is inserted into the site B table.

Here are the record counts:

Site A
  Table 1 - 350,000 records
  Table 2 - 1.3 million records

Site B
  Table 1 - 172,000 records

Running SQL queries and looping through such large data sets obviously runs into memory exhaustion. In the case of Site A, Table 1, a memory limit of 500M is exhausted at around 220,000 records. And no, the nature of the data does not allow me to reverse the import procedure and just run it over Site B's records instead. I am trying to come up with a way of effectively releasing memory as the script processes.

1. I could insert the update/insert queries into another table to be run individually by a cron (like a queue). I don't see why this would help, however, as the exhaustion above occurs even with the insert/update queries disabled - all the script is doing is outputting a short message on the console (one per record).

2. I could use a start/limit (i.e. LIMIT offset, count) in my queries when selecting data from the Site A tables. This seems to be the most obvious choice; there's a rough sketch of what I mean below my sig.

Does anyone have any other ideas? This is a rather tricky thing to find on Google relating to pure PHP.

Cheers
Aaron
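P.S. To make option 2 a bit more concrete, here is roughly the shape I have in mind. This is only a minimal sketch - PDO, the site_a_users/site_b_users table names, the id/email/name/phone columns and the "match on email" rule are all made up for illustration; the real schema is different.

<?php
// Sketch of option 2: process Site A in fixed-size chunks via LIMIT offset, count,
// so only one chunk of rows is ever held in memory at a time.
// All table and column names here are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=migration', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$batchSize = 1000;
$offset    = 0;

$findMatch = $pdo->prepare('SELECT id FROM site_b_users WHERE email = ?');
$update    = $pdo->prepare('UPDATE site_b_users SET name = ?, phone = ? WHERE id = ?');
$insert    = $pdo->prepare('INSERT INTO site_b_users (email, name, phone) VALUES (?, ?, ?)');

do {
    // Fetch one chunk of Site A records instead of the whole table at once.
    // Assumes an id column to give the paging a stable ordering.
    $sql  = sprintf('SELECT email, name, phone FROM site_a_users ORDER BY id LIMIT %d, %d',
                    $offset, $batchSize);
    $rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        $findMatch->execute(array($row['email']));
        $existingId = $findMatch->fetchColumn();
        $findMatch->closeCursor();

        if ($existingId !== false) {
            // Match found in Site B: update it with the Site A data.
            $update->execute(array($row['name'], $row['phone'], $existingId));
        } else {
            // No match: bring the Site A record across as a new row.
            $insert->execute(array($row['email'], $row['name'], $row['phone']));
        }

        echo "Processed {$row['email']}\n";
    }

    $count   = count($rows);
    $offset += $count;

    // Drop the chunk before fetching the next one so memory stays flat.
    unset($rows);
} while ($count === $batchSize);

The point is that only $batchSize rows from Site A are ever in memory at once, and each chunk is thrown away before the next SELECT, so memory use should stay roughly constant no matter how big Table 1 or Table 2 is.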
