PhilCope wrote:
I have a large Cloudscape 5 database (over 5 million records) and I have found
that when a ResultSet includes all or many of these records, the initial
call to ResultSet.next() takes a very long (but finite) time (I would
estimate about 10-15 minutes). I can arrange (indeed, have arranged) the code
so that this call occurs in a separate Java thread which can be interrupted
from the UI thread. BUT, as you may know, Java threads that are flagged as
interrupted continue to execute until either application code or some
"system" call actually checks the interrupted state of the running thread.
So, given this background info, my questions on Derby are:
1. Have any significant performance improvements been made such that, for
databases of this size, migrating from Cloudscape to Derby would provide a
significantly better response time?
2. If not, has the time-consuming call to .next() been made any more
responsive to the interrupted flag being set on the current thread in
Derby?
Thanks
Phil Cope
The issue is not really the time-consuming call to .next(), nor
necessarily the size of the result set. It is just that when you call
next() it has to wait until query processing is ready
to return the 1st row. In some cases Derby can return the 1st row
before it has finished processing the entire query. For instance, I
believe if you just did a simple select of all the rows from your
5 million row table you would see that the 1st row comes back very
quickly. In other cases it may do a lot of processing before it even
gets to the 1st row (imagine a query with no usable key that requires the
db to examine every row in the table, where only the last row in the table
actually qualifies). In other cases the semantics of the query
require the db to pr
Can you post the query? It may help people give you suggestions. Where
possible, Derby tries to stream results out as it gets them, but there
are queries where all the rows have to be seen and processed before the
first row can be returned. The simplest example is a query with an
ORDER BY at the end. If there is no index that provides the ordering
required by the ORDER BY, Derby will process the whole query, put all the
rows in the sorter, sort them all, and only then give you the first row
back. Sometimes this ORDER BY behavior can be worked around by creating
an index on the exact keys, in the same order, as the ORDER BY. Also note
that, although it is not strictly necessary, the current Derby/Cloudscape
sorter algorithm will not return the 1st row of the sort before it has
finished sorting all the rows.
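
To make the difference concrete, here is a small self-contained sketch. It
is not Derby code: the "table" is just an in-memory list, and CountingSource,
firstRowStreaming and firstRowSorted are made-up names for illustration. It
only counts how many rows have to be read before the first result can be
handed back in each style of plan.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class FirstRowLatency {

    /** Hypothetical row source that counts how many rows have been read from it. */
    static final class CountingSource implements Iterator<Integer> {
        private final Iterator<Integer> rows;
        int rowsRead = 0;

        CountingSource(List<Integer> table) {
            this.rows = table.iterator();
        }

        @Override
        public boolean hasNext() {
            return rows.hasNext();
        }

        @Override
        public Integer next() {
            rowsRead++;
            return rows.next();
        }
    }

    /** Streaming plan: hand back the first row as soon as the scan produces it. */
    static int firstRowStreaming(CountingSource source) {
        return source.next();
    }

    /** Sort plan: every row goes into the sorter before the first row comes out. */
    static int firstRowSorted(CountingSource source) {
        List<Integer> sorter = new ArrayList<>();
        while (source.hasNext()) {
            sorter.add(source.next());
        }
        Collections.sort(sorter);
        return sorter.get(0);
    }

    public static void main(String[] args) {
        // Stand-in table of 1000 rows stored in descending order.
        List<Integer> table = new ArrayList<>();
        for (int i = 1000; i > 0; i--) {
            table.add(i);
        }

        CountingSource streaming = new CountingSource(table);
        firstRowStreaming(streaming);

        CountingSource sorting = new CountingSource(table);
        firstRowSorted(sorting);

        System.out.println("streaming: rows read before first result = " + streaming.rowsRead);
        System.out.println("sorting:   rows read before first result = " + sorting.rowsRead);
    }
}
```

With 5 million rows instead of 1000, the second number is where the wait
inside the first next() call comes from.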
As queries get more complicated, it becomes harder and harder for Derby to
return a row "early".