Stavros Macrakis wrote:
Hi, I have an application whose output is about 500,000 pairs (string,
integer) -- this is the result of some fairly fancy text processing.
I'd like to put this data into a (new) Derby table. Using an individual
INSERT for each row takes over an hour, which seems much too long.
Using the bulk import feature involves writing out to a file and then
importing from the file, which seems rather roundabout.
So... What is the recommended way to insert a large number of rows
from an application? Is the answer the same for 10^3 or 10^8 rows? Do
the data types involved (e.g. large text field with newlines) make any
difference to the answer?
Thanks,
-s
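For concreteness, the per-row approach described above presumably looks
something like the following sketch; the table name, column names, and
database URL are made up for illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class SlowLoader
{
    // One INSERT per row. With autocommit on (the JDBC default), every
    // row is committed as its own transaction, which is typically what
    // makes a loop like this so slow.
    public static void load( Map<String, Integer> pairs ) throws SQLException
    {
        Connection conn = DriverManager.getConnection( "jdbc:derby:myDB" );
        PreparedStatement ps = conn.prepareStatement(
            "insert into MY_TABLE( word, tally ) values ( ?, ? )" );

        for ( Map.Entry<String, Integer> pair : pairs.entrySet() )
        {
            ps.setString( 1, pair.getKey() );
            ps.setInt( 2, pair.getValue() );
            ps.executeUpdate();
        }

        ps.close();
        conn.close();
    }
}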
Hi Stavros,
If you can wait a couple of months before you deploy this to production,
then you can use the Table Function feature, which is implemented in the
development trunk and will be exposed in the 10.4 release in the first
quarter of next year.
Table Functions let you make arbitrary external data sources look like
tables to Derby. Basically, any data source which you can wrap in a
ResultSet can then be treated by Derby as though it were a table. This
feature is described in the 10.4 Developer's Guide in the section titled
"Programming Derby-style Table Functions":

  http://db.apache.org/derby/docs/dev/devguide/

A slide deck explaining how to use this feature can be found at:

  http://people.apache.org/~rhillegas/vtiDemo/doc/saucerSeparation.html

A useful toolkit is checked into the development trunk in the
java/demo/vtis subdirectory. Please read the README file. I think you
can solve your problem easily by extending StringColumnVTI.
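As a rough sketch of what that subclass might look like: the code below
assumes StringColumnVTI has the shape described in the toolkit's javadoc
(a constructor taking the column names, an abstract getRawColumn( int )
returning each value as a string, and next()/close() supplied by the
subclass), and TextProcessor.results() stands in for whatever produces
the 500,000 pairs:

import java.sql.SQLException;
import java.util.Iterator;
import java.util.Map;

// import the toolkit's StringColumnVTI here; its package depends on
// the toolkit version you check out.
public class WordCountVTI extends StringColumnVTI
{
    private Iterator<Map.Entry<String, Integer>> rows;
    private Map.Entry<String, Integer> current;

    public WordCountVTI( Map<String, Integer> pairs )
    {
        super( new String[] { "WORD", "TALLY" } );
        rows = pairs.entrySet().iterator();
    }

    // Advance the cursor to the next (string, integer) pair.
    public boolean next() throws SQLException
    {
        if ( !rows.hasNext() ) { return false; }
        current = rows.next();
        return true;
    }

    public void close() throws SQLException {}

    // Return the current row's value for the given 1-based column;
    // StringColumnVTI converts it to the declared SQL type.
    protected String getRawColumn( int columnNumber ) throws SQLException
    {
        return ( columnNumber == 1 )
            ? current.getKey()
            : current.getValue().toString();
    }

    // Public static method which the table function declaration binds to.
    public static WordCountVTI read()
    {
        // TextProcessor.results() is a hypothetical source of the pairs.
        return new WordCountVTI( TextProcessor.results() );
    }
}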
Once you have written your table function and declared it to Derby, you
can then bulk import your external data stream as follows:
insert into MY_TABLE
select * from table( MY_TABLE_FUNCTION() ) s
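The declaration step, for reference, is a CREATE FUNCTION statement along
these lines; the function name and column types are illustrative, and the
external name points at the hypothetical WordCountVTI class sketched above:

create function MY_TABLE_FUNCTION()
returns table
(
    WORD  varchar( 100 ),
    TALLY int
)
language java
parameter style DERBY_JDBC_RESULT_SET
no sql
external name 'WordCountVTI.read'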
Hope this helps,
-Rick