Hello,

I have an embedded database application that typically involves inserting somewhere between 50k and 1000k rows of data into a database and then analyzing and querying that data afterwards. The data goes into about ten tables, but the bulk of it ends up in just a few of them. I run my database with "test" durability, and I add all of the primary key and foreign key constraints only after the data has been inserted. Currently all of the data is inserted using prepared statements executed in batches, which gives me between 10 and 20 row inserts per millisecond on average.
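
Here is roughly what my insert loop looks like (simplified to a single table; the table name, columns, and batch size below are placeholders rather than my real schema):

    // Simplified sketch of my current approach: batched PreparedStatement
    // inserts over an embedded Derby connection. MYTABLE and its columns
    // stand in for my real schema; the batch size is just illustrative.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Timestamp;

    public class BatchInsertSketch {
        private static final int BATCH_SIZE = 1000;

        public static void insert(Connection conn, Iterable<Object[]> rows)
                throws SQLException {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO MYTABLE (ID, CREATED_AT, PAYLOAD) VALUES (?, ?, ?)")) {
                int count = 0;
                for (Object[] row : rows) {
                    ps.setLong(1, (Long) row[0]);
                    // java.sql.Timestamp carries the nanosecond precision
                    // that my data needs.
                    ps.setTimestamp(2, (Timestamp) row[1]);
                    ps.setString(3, (String) row[2]);
                    ps.addBatch();
                    if (++count % BATCH_SIZE == 0) {
                        ps.executeBatch();   // flush a full batch
                    }
                }
                ps.executeBatch();           // flush the final partial batch
                conn.commit();
            }
        }
    }
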
I have spent quite a bit of time optimizing the insert step, and while I was doing research I came across this discussion from last year:

http://www.mail-archive.com/[email protected]/msg10194.html

The discussion suggests using bulk import as a way to speed up this initial insert step. Unfortunately, I cannot use the built-in import procedures, as my data includes timestamps with nanosecond granularity, and as far as I can tell there is no way to convince Derby to parse a timestamp specified down to the nanosecond. In one of the emails, someone suggested that you can get bulk import performance by using a table function and then running a query like "INSERT INTO MYTABLE (...) SELECT S.* FROM TABLE (MYFUNC ()) S". In my tests, however, this doesn't perform the insert any faster than simply inserting the rows one at a time with a prepared statement (I have put a simplified sketch of what I tried at the end of this message). I think this may be because I don't have a way to set the 'insertMode=bulkImport' property, which the bulk import system procedure is allowed to set.

Does anyone know of a way to work around this, or of a better way to get my data into the database as quickly as possible? Thanks in advance for your time.

Cheers,
Nathan Boy
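
P.S. For reference, this is roughly the shape of the table function experiment described above (again simplified: MYFUNC, MYTABLE, the column list, and com.example.RowReader are placeholders for my real schema and code; RowReader.read() is my own public static method that returns a java.sql.ResultSet over the data I want to load):

    // Roughly what I tried, following the suggestion from the thread above:
    // declare a Derby-style table function and then INSERT ... SELECT from it.
    // In my tests this was no faster than the batched inserts shown earlier.
    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class TableFunctionSketch {
        public static void load(Connection conn) throws SQLException {
            try (Statement s = conn.createStatement()) {
                s.execute(
                    "CREATE FUNCTION MYFUNC () "
                    + "RETURNS TABLE (ID BIGINT, CREATED_AT TIMESTAMP, PAYLOAD VARCHAR(255)) "
                    + "LANGUAGE JAVA PARAMETER STYLE DERBY_JDBC_RESULT_SET "
                    + "READS SQL DATA "
                    + "EXTERNAL NAME 'com.example.RowReader.read'");

                s.execute(
                    "INSERT INTO MYTABLE (ID, CREATED_AT, PAYLOAD) "
                    + "SELECT S.* FROM TABLE (MYFUNC ()) S");
            }
        }
    }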
