Hi Mike,
I would expect the single import to take a little less time than the
two-phase import, the difference being the tiny cost of compiling the
second import statement. What you report sounds like a bug. It would
help if you could script the problem and attach the repro to a JIRA issue.
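
A repro along these lines might be enough. This is only a minimal sketch
under several assumptions that are not in the original report: the class
name ImportRepro, the two-column INT table layout, the row count, and the
generated file contents are all made up for illustration, and it assumes
the embedded Derby jar is on the classpath. The table name is passed in
upper case because Derby expects that for identifiers created without
quotes.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class ImportRepro {

    // Writes `rows` space-delimited lines "id value", with ids starting at firstId.
    // The data is synthetic; the real files may look quite different.
    static void writeRows(Path file, int firstId, int rows) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int i = 0; i < rows; i++) {
                int id = firstId + i;
                w.write(id + " " + (id * 2) + "\n");
            }
        }
    }

    // Builds c as the concatenation of a followed by b.
    static void concat(Path a, Path b, Path c) throws IOException {
        Files.write(c, Files.readAllBytes(a));
        Files.write(c, Files.readAllBytes(b), StandardOpenOption.APPEND);
    }

    // Runs one SYSCS_IMPORT_TABLE call and returns the elapsed time in milliseconds.
    // replace = 1 replaces the table contents, replace = 0 appends to them.
    static long timedImport(Connection conn, Path file, int replace) throws SQLException {
        long start = System.nanoTime();
        try (CallableStatement cs = conn.prepareCall(
                "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE(null, 'MYTABLE', ?, ' ', null, null, ?)")) {
            cs.setString(1, file.toAbsolutePath().toString());
            cs.setShort(2, (short) replace);
            cs.execute();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("import-repro");
        Path a = dir.resolve("a.txt");
        Path b = dir.resolve("b.txt");
        Path c = dir.resolve("c.txt");
        int rows = 500_000; // arbitrary size, adjust until the gap is visible
        writeRows(a, 0, rows);
        writeRows(b, rows, rows);
        concat(a, b, c);

        // Assumes the embedded Derby driver is available on the classpath.
        String url = "jdbc:derby:" + dir.resolve("db") + ";create=true";
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE MYTABLE (ID INT, VAL INT)");
            long single = timedImport(conn, c, 1);
            long first = timedImport(conn, a, 1);
            long second = timedImport(conn, b, 0);
            System.out.println("single import of c.txt: " + single + " ms");
            System.out.println("a.txt then b.txt: " + first + " + " + second
                    + " = " + (first + second) + " ms");
        }
    }
}
```

Running it a few times, and swapping the order of the single and two-step
runs, should help rule out warm-up effects before the numbers go into the
issue.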
Thanks,
-Rick
Mike Andrews wrote:
dear derby developers,
if i bulk import data into a table, i get much better performance if i
do it in a single SYSCS_UTIL.SYSCS_IMPORT_TABLE statement rather than
in multiple shots.
for example, if there are three large files "a.txt", "b.txt", and
"c.txt", where "c.txt" is just the concatenation of "a.txt" and
"b.txt", then
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'c.txt', ' ', null, null, 1)
takes much less time than the sum of:
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'a.txt', ' ', null, null, 1)
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'b.txt', ' ', null, null, 0)
even though they result in exactly the same set of data in the table.
any ideas why? is there a way to get better performance doing it in
multiple shots? currently my data is in several text files, and so i
concatenate them all and run a single SYSCS_UTIL.SYSCS_IMPORT_TABLE
for best performance.
best regards,
mike