Michael Segel wrote:
Let me clarify.
My suggestion was that you are essentially creating a poor man's MPP, with the local temp table on a 
"control" node. The temp table holds the result set returned from each query. 
This assumes that while the table may exceed the 4 GB limit, a query's result should not. True, if 
this assumption is wrong, then clearly Derby is not the right tool in the first place.

Using multiple copies of the same table doesn't really solve the problem.

What you end up having is a set of tables foo_n, where n is the table number. Then, to do your query, you would have to create a view foo_v that you query from to hide the detail from the app and user. This would be an implementation of a poor man's table partitioning.
It could be done; however, you now have to write a pre-insert trigger to 
determine which table the data should be stored in. Then what happens when you overload 
a table? (Suppose you partition on a time period and the amount of data 
exceeds 4 GB?)
(Talk about a set of complex stored procedures.)
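For what it's worth, here is a minimal sketch of the view half of that idea over JDBC. The foo_1/foo_2 tables, their columns, and the database name are all made up for illustration, and the insert-routing trigger is deliberately left out, because that is where the real complexity lives:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class PoorMansPartition {
    public static void main(String[] args) throws Exception {
        // Embedded Derby database; ";create=true" creates it on first use.
        Connection conn = DriverManager.getConnection("jdbc:derby:demoDB;create=true");
        try (Statement s = conn.createStatement()) {
            // Two identical partition tables; something (app code or a trigger)
            // still has to decide which one each row goes into.
            s.executeUpdate("CREATE TABLE foo_1 (id BIGINT PRIMARY KEY, payload VARCHAR(200))");
            s.executeUpdate("CREATE TABLE foo_2 (id BIGINT PRIMARY KEY, payload VARCHAR(200))");
            // The view hides the partitioning from the app and the user.
            s.executeUpdate("CREATE VIEW foo_v AS "
                    + "SELECT id, payload FROM foo_1 "
                    + "UNION ALL "
                    + "SELECT id, payload FROM foo_2");
        }
        conn.close();
    }
}

Queries then go against foo_v; inserts still have to target the right foo_n, which is exactly the routing problem described above.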
Yes, it could be done using Derby today.
So too could you do a poor man's MPP version. (Although it would be a 
bit more work on the controlling app's side.)

The point of my post is that while these solutions are possible, it would be better to 
use a database more suited to the task, or to actually enhance Derby to improve its 
storage options. However, this would mean that you would have to rethink Derby, and the 
design might change radically enough that you would end up with a new product, aka 
"stormcloud".

Note: while this isn't rocket science, it would require a concerted effort that goes beyond what you could do under Apache. Hint: in theory, Sun could start with Derby, create their own database aka "stormcloud", and then run with it.
(Look at Apache's license.)

Actually, anyone could do this, but it would take some deep pockets to support a 
multi-man-year effort.

The original poster is trying to use a Yugo to pull a loaded semi up a steep 
hill.


Sent via BlackBerry.

-Mike Segel
Principal
MSCC
312 952 8175


-----Original Message-----
From: [EMAIL PROTECTED]
Date: Fri, 10 Nov 2006 16:16:55
To: "Derby Discussion" <[email protected]>, [EMAIL PROTECTED]
Subject: Re: maximum file size

If the OS won't allow a large enough DB file then a local temp file won't work 
either.

I am curious about one thing, though: why would it not be a good idea to use 
multiple tables? Other than the fact that it would be a practical solution that 
could actually be implemented with the current version of Derby, that is.

It doesn't seem like it would be very helpful to dismiss a possible solution 
out of hand without at least some sort of vague and reasonable explanation.

Donald
 -------------- Original message ----------------------
From: "Michael Segel" <[EMAIL PROTECTED]>
Not a good idea to use multiple files.
Why not go all the way and make Derby into an MPP DB? All you would have to do is preprocess the inbound query, send it off to all of the nodes, take the result set(s) into a local temp file, and then post-process and return the results... :-)
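Purely as a sketch of the shape of the thing (the node URLs, the temp table, and the query are all made up; none of this machinery exists in Derby itself):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.List;

public class PoorMansMpp {
    // Hypothetical Derby Network Server instances, each holding a slice of the data.
    static final List<String> NODES = List.of(
            "jdbc:derby://node1:1527/sliceDB",
            "jdbc:derby://node2:1527/sliceDB");

    public static void main(String[] args) throws Exception {
        // The "control" node: a local embedded database holding the temp table.
        Connection control = DriverManager.getConnection("jdbc:derby:controlDB;create=true");
        control.createStatement().executeUpdate(
                "CREATE TABLE results_tmp (id BIGINT, payload VARCHAR(200))");

        // Scatter the (already preprocessed) query, gather each node's rows locally.
        try (PreparedStatement ins = control.prepareStatement(
                "INSERT INTO results_tmp VALUES (?, ?)")) {
            for (String url : NODES) {
                try (Connection node = DriverManager.getConnection(url);
                     ResultSet rs = node.createStatement().executeQuery(
                             "SELECT id, payload FROM foo WHERE payload LIKE 'x%'")) {
                    while (rs.next()) {
                        ins.setLong(1, rs.getLong(1));
                        ins.setString(2, rs.getString(2));
                        ins.executeUpdate();
                    }
                }
            }
        }

        // Post-process locally and return the combined result.
        ResultSet combined = control.createStatement().executeQuery(
                "SELECT id, payload FROM results_tmp ORDER BY id");
        while (combined.next()) {
            System.out.println(combined.getLong(1) + "\t" + combined.getString(2));
        }
        control.close();
    }
}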

-----Original Message-----
From: [EMAIL PROTECTED]
Date: Fri, 10 Nov 2006 14:12:35
To: "Derby Discussion" <[email protected]>
Subject: Re: maximum file size

You could use multiple tables to get around the file size limit.

Decide how many rows would go in each table and then use some mechanism for assigning a unique ID to each row. You would then be able to determine which table a specific row is in with integer division, and its position within that table with a modulo:

table # = <global row #> / <rows per table>

local row # = <global row #> % <rows per table>
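
In code the routing might look something like this (the ROWS_PER_TABLE value and the foo_n naming are just for illustration):

public class RowRouter {
    // Chosen so each table stays safely under the file-size limit; pick to taste.
    static final long ROWS_PER_TABLE = 10_000_000L;

    // Which physical table (foo_0, foo_1, ...) holds the row.
    static int tableNumber(long globalRowId) {
        return (int) (globalRowId / ROWS_PER_TABLE);
    }

    // The row's position within that table.
    static long localRowNumber(long globalRowId) {
        return globalRowId % ROWS_PER_TABLE;
    }

    public static void main(String[] args) {
        long globalRowId = 123_456_789L;
        // Global row 123,456,789 lands in foo_12 at local row 3,456,789.
        System.out.println("table foo_" + tableNumber(globalRowId)
                + ", local row " + localRowNumber(globalRowId));
    }
}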

The Telemetry Data Warehouse for the Hubble Space Telescope divides the data up by time - each data table contains one calendar year's worth of telemetry data.

Just a couple of ideas.

 -------------- Original message ----------------------
From: Suresh Thalamati <[EMAIL PROTECTED]>
redcloud wrote:
Hi! I need to build a SQL table containing 1000000000 (!!!) rows. But I filled up a table with 20000000 rows (file size of the table: 4 GB) and my filesystem refused to let the table grow any further. My question is: can Derby build an "infinite"-size table by chunking it into multiple files?
No. Currently a table maps to a single file in Derby. Table size is limited by the size of the file that can be created on a file system.

I agree with Michael on this one. It's not a good idea to duplicate table structures. I think the original poster needs to analyze the data: can it be split up in other ways? Are there any particular columns that take up more space than others and could reside in separate tables? Are there any columns containing duplicate data that could be normalized out in other ways? Looking at the figures, it appears that each row occupies less than 250 bytes (if my math is right), which leads us back to the question "Is Derby the right solution?" I don't think it is.

On a side note, I am very curious about what the data is. Maybe a database is just not the right place for it altogether.

Paul


