Are you bulk importing into an empty table?

Derby has a built-in optimization that can often be applied when bulk
importing into an empty table.  If the database is not in incremental
backup mode and the target table is empty, Derby does not have to log
the individual row inserts.  Instead, it optimizes the abort action to
simply empty the table, so no log records are needed.  This is not
possible if the table already contains rows.
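
For instance, here is a minimal JDBC sketch of that fast path: a single
import with the REPLACE flag set to 1, so the call itself empties the
table and the load starts on an empty table.  The connection URL, table
name, and file name are placeholders for your own setup:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class SingleShotImport {
    public static void main(String[] args) throws Exception {
        // "jdbc:derby:mydb" and MYTABLE are placeholders, not real names.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:mydb");
             CallableStatement cs = conn.prepareCall(
                     "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE(?, ?, ?, ?, ?, ?, ?)")) {
            cs.setNull(1, Types.VARCHAR);  // schema name: use the default
            cs.setString(2, "MYTABLE");    // target table
            cs.setString(3, "c.txt");      // one concatenated input file
            cs.setString(4, " ");          // column delimiter, as in the example
            cs.setNull(5, Types.VARCHAR);  // character delimiter: default
            cs.setNull(6, Types.VARCHAR);  // codeset: default
            cs.setShort(7, (short) 1);     // REPLACE=1 empties the table first,
                                           // so the import sees an empty table
            cs.execute();
        }
    }
}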

/mikem

Mike Andrews wrote:
Dear Derby developers,

If I bulk import data into a table, I get much better performance when
I do it in a single SYSCS_UTIL.SYSCS_IMPORT_TABLE statement rather than
in multiple shots.

For example, if there are three large files "a.txt", "b.txt", and
"c.txt", where "c.txt" is just the concatenation of "a.txt" and
"b.txt", then

CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'c.txt', ' ', null, null, 1)

takes much less time than the sum of:

CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'a.txt', ' ', null, null, 1)
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'b.txt', ' ', null, null, 0)

even though they result in exactly the same set of data in the table.

Any ideas why? Is there a way to get better performance doing it in
multiple shots? Currently my data is in several text files, so I
concatenate them all and run a single SYSCS_UTIL.SYSCS_IMPORT_TABLE
call for best performance.
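
For illustration, that concatenation step might look like the following
Java sketch.  The file names are just the ones from the example above,
and each input file is assumed to end with a newline:

import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ConcatInputs {
    public static void main(String[] args) throws Exception {
        // Merge the inputs into one file so a single import call can
        // load everything at once.
        try (OutputStream out = Files.newOutputStream(Paths.get("c.txt"))) {
            for (String name : new String[] {"a.txt", "b.txt"}) {
                Files.copy(Paths.get(name), out);
            }
        }
    }
}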

Best regards,
Mike

