Hi Johann,
Some responses inline...
On 11/7/16, 7:16 AM, Johann Petrak wrote:
What I would like to do is this: use Derby in embedded mode to create a
very large table with fields "key" and "value":
create table mytable (
key varchar(100) not null,
value varchar(32672) not null)
and a non-unique index on field key.
This table would get filled with about 100 million rows where each value
could be several thousand characters.
Then, I would need to query the database to select all rows in order
of key, like this:
SELECT key, value from mytable order by key
The result set would get processed by repeatedly calling next.
Now here are my questions:
* can Derby do this without putting the complete result into memory
first (like other embedded Java databases do)?
Since you are creating an index on the ORDER BY column, I would expect
that a dataset of this size would cause Derby to read the index from
start to finish, probing, as it goes, into the heap to retrieve the
value column. I would not expect that Derby would toss any rows into a
sorter. You might fill up Derby's page cache, but you would not need to
throw extra heap at Derby in order to get this experiment to succeed.
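To make the setup concrete, here is a minimal sketch of creating that table and index with the embedded driver. It assumes derby.jar is on the classpath; the database name, index name, and the in-memory "memory:" subprotocol are just placeholders for the experiment (use "jdbc:derby:mydb;create=true" for an on-disk database). Note that KEY and VALUE are SQL reserved words, so the identifiers are quoted, and Derby's VARCHAR needs an explicit length (the maximum is 32672):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DerbySchemaSketch {
    // Creates the table and index from the question in an embedded Derby
    // database.  Returns true once both DDL statements have succeeded.
    public static boolean createSchema(String jdbcUrl) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement st = conn.createStatement()) {
            // KEY and VALUE are reserved words, so quote the identifiers.
            // Derby VARCHAR requires an explicit length (max 32672).
            st.executeUpdate("create table mytable ("
                    + "\"key\" varchar(100) not null, "
                    + "\"value\" varchar(32672) not null)");
            // Non-unique index on the ORDER BY column, so Derby can walk
            // the index from start to finish instead of sorting the table.
            st.executeUpdate("create index mytable_key_idx on mytable (\"key\")");
            return true;
        }
    }

    public static void main(String[] args) throws SQLException {
        // "memory:" keeps this demo database entirely in the JVM heap.
        createSchema("jdbc:derby:memory:schemademo;create=true");
    }
}
```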
* Can Derby deal with the Statement.setFetchSize(n) method to actually
limit the number of rows fetched at once?
The embedded Derby JDBC driver implements both Statement.setFetchSize()
and ResultSet.setFetchSize().
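Here is a small sketch of the read loop, again assuming derby.jar is on the classpath and using an in-memory database with a few hand-inserted rows as a stand-in for your 100 million. The fetch size of 1000 is just an illustrative value; it is a hint to the driver, not a hard limit:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DerbyStreamSketch {
    // Streams rows in key order, processing one row per call to next().
    // Returns the number of rows seen; throws if keys arrive out of order.
    public static int streamInKeyOrder(Connection conn) throws SQLException {
        int rows = 0;
        String prev = null;
        try (Statement st = conn.createStatement()) {
            // Hint: fetch roughly 1000 rows per batch instead of
            // materializing the whole result set in memory.
            st.setFetchSize(1000);
            try (ResultSet rs = st.executeQuery(
                    "select \"key\", \"value\" from mytable order by \"key\"")) {
                while (rs.next()) {
                    String key = rs.getString(1);
                    String value = rs.getString(2); // process one row at a time
                    if (prev != null && key.compareTo(prev) < 0)
                        throw new IllegalStateException("keys out of order");
                    prev = key;
                    rows++;
                }
            }
        }
        return rows;
    }

    // Self-contained demo: build the schema, insert three rows, stream them.
    public static int demo() throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:derby:memory:streamdemo;create=true")) {
            try (Statement st = conn.createStatement()) {
                st.executeUpdate("create table mytable ("
                        + "\"key\" varchar(100) not null, "
                        + "\"value\" varchar(32672) not null)");
                st.executeUpdate(
                        "create index mytable_key_idx on mytable (\"key\")");
            }
            try (PreparedStatement ps = conn.prepareStatement(
                    "insert into mytable values (?, ?)")) {
                for (String k : new String[] {"b", "a", "c"}) {
                    ps.setString(1, k);
                    ps.setString(2, "value-" + k);
                    ps.executeUpdate();
                }
            }
            return streamInKeyOrder(conn);
        }
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(demo());
    }
}
```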
Hope this helps,
-Rick
My main concern is whether it is possible to process the result set of
that query, where I have 100 million rows and each row has considerable
size, without needing a lot of heap space or RAM: ideally I would only
need enough RAM to hold a few hundred or thousand result rows in memory
at a time.
Thanks a lot!