Hi Johann,

Some responses inline...

On 11/7/16, 7:16 AM, Johann Petrak wrote:
What I would like to do is this: use Derby in embedded mode to create a
very large table with fields "key" and "value":
create table mytable (
  "key" varchar(100) not null,       -- KEY is a reserved word in Derby, so quote it
  "value" varchar(32672) not null)   -- Derby's VARCHAR requires a length; 32672 is the maximum
and a non-unique index on field key.

This table would get filled with about 100 million rows where each value
could be several thousand characters.

Then, I would need to query the database to select all rows in order of keys,
like this:
  SELECT "key", "value" FROM mytable ORDER BY "key"

The result set would get processed by repeatedly calling next().

Now here are my questions:
* Can Derby do this without putting the complete result into memory first (like other embedded Java databases do)?

Since you are creating an index on the ORDER BY column, I would expect that a dataset of this size would cause Derby to read the index from start to finish, probing into the heap as it goes to retrieve the value column. I would not expect that Derby would toss any rows into a sorter. You might fill up Derby's page cache, but you would not need to throw extra heap at Derby in order to get this experiment to succeed.
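One way to confirm that the optimizer chose the index scan (and did not add a sort step) is Derby's runtime statistics facility. A minimal sketch, assuming the table above and a placeholder embedded database named "mydb":

```java
import java.sql.*;

// Sketch: check the query plan with Derby's runtime statistics facility.
public class CheckPlan {
    // SYSCS_SET_RUNTIMESTATISTICS / SYSCS_GET_RUNTIMESTATISTICS are built-in Derby routines.
    static final String STATS_ON = "CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1)";
    static final String QUERY = "SELECT \"key\", \"value\" FROM mytable ORDER BY \"key\"";

    public static void main(String[] args) throws SQLException {
        // "jdbc:derby:mydb" is a placeholder embedded-database URL.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:mydb");
             Statement st = conn.createStatement()) {
            st.execute(STATS_ON); // collect statistics on this connection

            try (ResultSet rs = st.executeQuery(QUERY)) {
                while (rs.next()) { /* drain (or just fetch a few rows) */ }
            }

            // The statistics describe the most recently closed statement; look
            // for an index scan node and the absence of a sort node in the text.
            try (ResultSet rs = st.executeQuery(
                    "VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS()")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}
```

If the plan text shows a sort, you can double-check that the index on "key" actually exists and is usable for the ORDER BY.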
* Can Derby deal with the Statement.setFetchSize(n) method to actually limit the number of rows getting fetched at once?

The embedded Derby JDBC driver implements both Statement.setFetchSize() and ResultSet.setFetchSize().
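Putting the two answers together, the processing loop might look like the sketch below. The database name "mydb", the fetch size of 1000, and the process() helper are all placeholders, not part of your setup:

```java
import java.sql.*;

// Sketch: stream through the ordered rows with a bounded fetch size.
public class StreamRows {
    static final int FETCH_SIZE = 1000; // rows buffered per fetch (a hint to the driver)
    static final String QUERY = "SELECT \"key\", \"value\" FROM mytable ORDER BY \"key\"";

    public static void main(String[] args) throws SQLException {
        // "jdbc:derby:mydb" is a placeholder embedded-database URL.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:mydb")) {
            conn.setAutoCommit(false); // hold the cursor open while iterating
            try (Statement st = conn.createStatement(
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
                st.setFetchSize(FETCH_SIZE);
                try (ResultSet rs = st.executeQuery(QUERY)) {
                    while (rs.next()) {
                        process(rs.getString(1), rs.getString(2)); // one row at a time
                    }
                }
            }
        }
    }

    // Placeholder for the application's per-row processing.
    static void process(String key, String value) { }
}
```

A forward-only, read-only result set drained row by row like this keeps only the driver's fetch buffer in the application heap, which matches the "couple of hundred or thousand rows in memory" goal.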

Hope this helps,
-Rick

My main concern is whether it is possible to process the result set of that query, with 100 million rows each of considerable size, without needing a lot of heap space or RAM: ideally I would only need enough RAM to hold a few hundred or thousand result rows in memory.

Thanks a lot!
