On Fri, 19 Jun 2015 15:20:39 +0200, "liviuslivius [email protected] [firebird-support]" <[email protected]> wrote: > what is the proper way for copy data from big table to another database > if table is bigger then avaiable RAM and we need to do this in one > transaction? > You know if we do > SELECT * FROM SOURCE_TABLE - and table have e.g. 400 000 000 records > then retrive it is impossible because of RAM > > but if we do this in steps > > SELECT FIRST 1000000 SKIP 0 * FROM SOURCE_TABLE <-we got 1000000 reads -> > ok > SELECT FIRST 1000000 SKIP 1000000 * FROM SOURCE_TABLE <-we got 2000000 > reads -> not ok > SELECT FIRST 1000000 SKIP 2000000 * FROM SOURCE_TABLE <-we got 3000000 > reads -> not ok worser and worser (slower and slower)
You have two problems: * Increasing number of reads: Firebird will need to read to know what to skip * No order: this may yield incorrect results (duplicates, or missing records) if the optimizer chooses a different plan for a different combination of first/skip values (I believe this is a theoretical concern right now for Firebird, but not for other databases). The proper way to do this is to execute a single query, and interleave fetching of rows from the source with inserting rows into the target. That should work without consuming all memory as long as the component you use for querying doesn't fetch all rows at once or doesn't discard them. For example in Java the default result set is forward only. If it doesn't have rows, it will fetch a number (default is 400 in Jaybird), this is done through a fetch. It will then serve requests for rows from the application from this retrieved set of rows. When it needs more, it discards the old set rows and fetches the next set. This ensures that memory is not exhausted (assuming the application itself doesn't hold on to the retrieved info). Mark
