Re: bulk import issue

Rick Hillegas Tue, 01 Dec 2009 05:40:03 -0800

Hi Mike,

I would expect the single import to take a little less time than the two-phase import, the difference being the tiny cost of compiling the second import statement. What you report sounds like a bug. It would help if you could script the problem and attach your repro to a JIRA.
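
A scripted repro could look roughly like the sketch below. It is only an illustration under stated assumptions: the database name reproDB, the two-column INT layout of MYTABLE, the row count, and the generated file contents are all hypothetical, since the original table definition and data are not given in this thread. It generates a.txt and b.txt, concatenates them into c.txt, and times the single import against the two-phase import using the same SYSCS_UTIL.SYSCS_IMPORT_TABLE arguments as in the report.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

public class ImportRepro {

    // Writes `count` space-delimited rows "id value" (hypothetical data layout).
    static void writeRows(Path file, int firstId, int count) throws IOException {
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(file))) {
            for (int i = 0; i < count; i++) {
                out.println((firstId + i) + " " + i);
            }
        }
    }

    // Appends the parts, in order, into a single target file.
    static void concatenate(List<Path> parts, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    // Runs one import with the arguments from the report and returns elapsed milliseconds.
    static long timedImport(Connection conn, Path file, int replace) throws SQLException {
        long start = System.nanoTime();
        try (CallableStatement cs = conn.prepareCall(
                "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE(?, ?, ?, ?, ?, ?, ?)")) {
            cs.setString(1, null);             // schema: default
            cs.setString(2, "MYTABLE");        // table name (assumed upper case)
            cs.setString(3, file.toString());  // input file
            cs.setString(4, " ");              // column delimiter: a space
            cs.setString(5, null);             // character delimiter: default
            cs.setString(6, null);             // codeset: default
            cs.setShort(7, (short) replace);   // 1 = replace, 0 = append
            cs.execute();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        int rows = 500_000; // arbitrary size, adjust until the difference shows
        Path a = Paths.get("a.txt").toAbsolutePath();
        Path b = Paths.get("b.txt").toAbsolutePath();
        Path c = Paths.get("c.txt").toAbsolutePath();
        writeRows(a, 0, rows);
        writeRows(b, rows, rows);
        concatenate(Arrays.asList(a, b), c);

        // Needs the Derby embedded driver on the classpath.
        // Delete the reproDB directory between runs, or CREATE TABLE will fail.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:reproDB;create=true")) {
            try (Statement st = conn.createStatement()) {
                st.execute("CREATE TABLE MYTABLE (ID INT, VAL INT)");
            }
            long single = timedImport(conn, c, 1);
            long first = timedImport(conn, a, 1);
            long second = timedImport(conn, b, 0);
            System.out.println("single import:    " + single + " ms");
            System.out.println("two-phase import: " + (first + second) + " ms");
        }
    }
}
```

Both measurements start with a replace-mode import, so each begins from the same table contents. The timings will vary with hardware and Derby version, so the JIRA should include the real table definition and data characteristics alongside any such script.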


Thanks,
-Rick

Mike Andrews wrote:

dear derby developers,

if i bulk import data into a table, i get much better performance if i
do it in a single SYSCS_UTIL.SYSCS_IMPORT_TABLE statement rather than
in multiple shots.

for example, if there are three large files "a.txt", "b.txt", and
"c.txt", where "c.txt" is just the concatenation of "a.txt" and
"b.txt", then

CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'c.txt', ' ', null, null, 1)

takes much less time than the sum of:

CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'a.txt', ' ', null, null, 1)
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'b.txt', ' ', null, null, 0)

even though they result in exactly the same set of data in the table.

any ideas why? is there a way to get better performance doing it in
multiple shots? currently my data is in several text files, and so i
concatenate them all and run a single SYSCS_UTIL.SYSCS_IMPORT_TABLE
for best performance.

best regards,
mike
