Michael Segel wrote:
Let me clarify.
My suggestion was that you are essentially creating a poor man's MPP, with the local temp table on a 
"control" node. The temp table holds the result set returned from each query. 
This assumes that while the table may exceed the 4 GB limit, a query's result should not. True, if 
this assumption is wrong, then clearly Derby is not the right tool in the first place.

Using multiple copies of the same table doesn't really solve the problem.

What you end up having is a set of tables foo_n, where n is the table number. Then, to do your query, you would have to create a view foo_v that you query from to hide the detail from the app and user. This would be an implementation of a poor man's table partitioning.
It could be done; however, you now have to write a pre-insert trigger to 
determine which table the data should be stored in. Then what happens when you overload 
a table? (Suppose you partition on a time period and the amount of data 
exceeds 4 GB?)
(Talk about a set of complex stored procedures.)
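For what it's worth, here is a minimal sketch of the view half of that idea over JDBC. The foo_1/foo_2 tables, their columns, and the database name are all made up for illustration, and the insert-routing trigger is deliberately left out, because that is where the real complexity lives:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class PoorMansPartition {
    public static void main(String[] args) throws Exception {
        // Embedded Derby database; ";create=true" creates it on first use.
        Connection conn = DriverManager.getConnection("jdbc:derby:demoDB;create=true");
        try (Statement s = conn.createStatement()) {
            // Two identical partition tables; something (app code or a trigger)
            // still has to decide which one each row goes into.
            s.executeUpdate("CREATE TABLE foo_1 (id BIGINT PRIMARY KEY, payload VARCHAR(200))");
            s.executeUpdate("CREATE TABLE foo_2 (id BIGINT PRIMARY KEY, payload VARCHAR(200))");
            // The view hides the partitioning from the app and the user.
            s.executeUpdate("CREATE VIEW foo_v AS "
                    + "SELECT id, payload FROM foo_1 "
                    + "UNION ALL "
                    + "SELECT id, payload FROM foo_2");
        }
        conn.close();
    }
}

Queries then go against foo_v; inserts still have to target the right foo_n, which is exactly the routing problem described above.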
Yes, it could be done using Derby today.
So too could you do a poor man's MPP version. (Although it would be a 
bit more work on the controlling app's side.)

The point of my post is that while these solutions are possible, it would be better to 
use a database more suited to the task, or to actually enhance Derby to improve its 
storage options. However, this would mean that you would have to rethink Derby, and the 
design might change radically enough that you would end up with a new product, aka 
"stormcloud".

Note: while this isn't rocket science, it would require a concerted effort that goes beyond what you could do under Apache. Hint: in theory, Sun could start with Derby, create their own database aka "stormcloud", and then run with it.
(Look at Apache's license.)

Actually, anyone could do this, but it would take some deep pockets to support a 
multi-man-year effort.

The original poster is trying to use a Yugo to pull a loaded semi up a steep 
hill.


Sent via BlackBerry.

-Mike Segel
Principal
MSCC
312 952 8175


-----Original Message-----
From: [EMAIL PROTECTED]
Date: Fri, 10 Nov 2006 16:16:55
To: "Derby Discussion" <[email protected]>, [EMAIL PROTECTED]
Subject: Re: maximum file size

If the OS won't allow a large enough DB file then a local temp file won't work 
either.

I am curious about one thing, though: why would it not be a good idea to use 
multiple tables? Other than the fact that it would be a practical solution that 
could actually be implemented with the current version of Derby, that is.

It doesn't seem like it would be very helpful to dismiss a possible solution 
out of hand without at least some sort of vague and reasonable explanation.

Donald
 -------------- Original message ----------------------
From: "Michael Segel" <[EMAIL PROTECTED]>
Not a good idea to use multiple files.
Why not go all the way and make Derby into an MPP DB? All you would have to do is preprocess the inbound query, send it off to all of the nodes, take the result set(s) into a local temp file, and then post-process and return the results... :-)
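Purely as a sketch of the shape of the thing (the node URLs, the temp table, and the query are all made up; none of this machinery exists in Derby itself):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.List;

public class PoorMansMpp {
    // Hypothetical Derby Network Server instances, each holding a slice of the data.
    static final List<String> NODES = List.of(
            "jdbc:derby://node1:1527/sliceDB",
            "jdbc:derby://node2:1527/sliceDB");

    public static void main(String[] args) throws Exception {
        // The "control" node: a local embedded database holding the temp table.
        Connection control = DriverManager.getConnection("jdbc:derby:controlDB;create=true");
        control.createStatement().executeUpdate(
                "CREATE TABLE results_tmp (id BIGINT, payload VARCHAR(200))");

        // Scatter the (already preprocessed) query, gather each node's rows locally.
        try (PreparedStatement ins = control.prepareStatement(
                "INSERT INTO results_tmp VALUES (?, ?)")) {
            for (String url : NODES) {
                try (Connection node = DriverManager.getConnection(url);
                     ResultSet rs = node.createStatement().executeQuery(
                             "SELECT id, payload FROM foo WHERE payload LIKE 'x%'")) {
                    while (rs.next()) {
                        ins.setLong(1, rs.getLong(1));
                        ins.setString(2, rs.getString(2));
                        ins.executeUpdate();
                    }
                }
            }
        }

        // Post-process locally and return the combined result.
        ResultSet combined = control.createStatement().executeQuery(
                "SELECT id, payload FROM results_tmp ORDER BY id");
        while (combined.next()) {
            System.out.println(combined.getLong(1) + "\t" + combined.getString(2));
        }
        control.close();
    }
}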

-----Original Message-----
From: [EMAIL PROTECTED]
Date: Fri, 10 Nov 2006 14:12:35
To: "Derby Discussion" <[email protected]>
Subject: Re: maximum file size

You could use multiple tables to get around the file size limit.

Decide how many rows would go in each table and then use some mechanism for assigning a unique ID to each row. You would then be able to determine which table a specific row is in with integer division, and its position within that table with a modulo:

table # = <global row #> / <rows per table>

local row # = <global row #> % <rows per table>
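
In code the routing might look something like this (the ROWS_PER_TABLE value and the foo_n naming are just for illustration):

public class RowRouter {
    // Chosen so each table stays safely under the file-size limit; pick to taste.
    static final long ROWS_PER_TABLE = 10_000_000L;

    // Which physical table (foo_0, foo_1, ...) holds the row.
    static int tableNumber(long globalRowId) {
        return (int) (globalRowId / ROWS_PER_TABLE);
    }

    // The row's position within that table.
    static long localRowNumber(long globalRowId) {
        return globalRowId % ROWS_PER_TABLE;
    }

    public static void main(String[] args) {
        long globalRowId = 123_456_789L;
        // Global row 123,456,789 lands in foo_12 at local row 3,456,789.
        System.out.println("table foo_" + tableNumber(globalRowId)
                + ", local row " + localRowNumber(globalRowId));
    }
}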

The Telemetry Data Warehouse for the Hubble Space Telescope divides the data up by time - each data table contains one calendar year's worth of telemetry data.

Just a couple of ideas.

 -------------- Original message ----------------------
From: Suresh Thalamati <[EMAIL PROTECTED]>
redcloud wrote:
Hi! I need to build a SQL table containing 1000000000 (!!!) rows. But I filled up a table with 20000000 rows (file size of the table: 4 GB) and my filesystem refused to let the table grow any further. My question is: can Derby build an "infinite"-size table by chunking it into multiple files?
No. Currently a table maps to a single file in Derby. Table size is limited by the size of the file that can be created on a file system.

I agree with Michael on this one. It's not a good idea to duplicate table structures. I think the original poster needs to analyze the data: can it be split up in other ways? Are there any particular columns that take up more space than others and could reside in separate tables? Are there any columns containing duplicate data that could be normalized out in other ways? Looking at the figures, it appears that each row occupies less than 250 bytes (if my math is right), which leads us back to the question "Is Derby the right solution?" I don't think it is.

On a side note, I am very curious about what the data is. Maybe a database is just not the right place for it altogether.

Paul


