The place to start is with a clear, well-understood, and agreed set of requirements. Here is a suggested starting point.

First, some definitions. A "data space" is the collection of pointer and data pages that hold the data from a single table. An "index" (in this context) is the set of index pages that comprise a single index on a table. A "database object" is either a data space or an index. A "table space" consists of a page-structured file, a collection of management pages, and a page number space that can hold zero or more database objects.

The minimal requirements:

1. To extend the theoretical number of database pages in a database
   beyond 2^32.
2. To be as unobtrusive as possible so that no explicit configuration
   is required for databases that don't use table spaces.  In other
   words, a default (root) table space.
3. To allow a user to define and manage additional table spaces.
4. To allow a user to define table spaces for individual tables and
   individual indexes.  In other words, while indexes may default to
   the same table space as the table data space, indexes may reside in
   different table spaces.
5. A user should be able to move a table data space or index from one
   table space to another.  Whether this is an on-line or off-line
   operation is deferred to specific proposals.
6. It should be possible to bring up a database with some or all table
   spaces other than the root unavailable.
7. A table space should have sufficient internal integrity to tolerate
   operational problems resulting from improper management of database
   files.
8. It must be possible to change the filename of an offline table
   space.

Personally, I think the following are also important, but not requirements:

1. There should be minimal changes to the on-disk structure. More
   specifically, page references in the ODS should remain 32 bits.
   This precludes allowing a database object to straddle table spaces.
2. There should be as few changes as possible to internal page, index,
   and data handling mechanisms other than tracking the table space for
   database objects for the CCH interface.
3. It should be anticipated that table spaces will be abused, for
   example replacing table space files with older versions. To the
   degree feasible, this should be supported, and where not supported,
   detected.

Non-goals:

1. Change the size of record numbers.
2. Store blobs in other than the table's data space.
3. Implement any ODS change other than those required to support table
   spaces (other projects are fine, but should be layered on, not part
   of, table spaces).

Thoughts?


On 3/2/2016 4:23 PM, Dmitry Yemanov wrote:
Historically, Firebird databases consist of a sequential set of pages of
a fixed size (4-16KB currently). This page set is distributed across
one (usually) or multiple files (*). The page number initially was SLONG,
now it's ULONG. So the theoretically possible maximum database size is
currently limited to 2^32 * 16KB.
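
As a quick check of that arithmetic, here is a minimal Python sketch. The
function and constant names are illustrative only and do not come from the
Firebird source; the sole assumption is a 32-bit unsigned page number.

```python
# Illustrative arithmetic only: assumes page numbers are 32-bit unsigned.
PAGE_NUMBER_BITS = 32
KIB = 1024
TIB = 1024 ** 4


def max_database_bytes(page_size_bytes: int) -> int:
    """Upper bound on database size for a given page size."""
    return (2 ** PAGE_NUMBER_BITS) * page_size_bytes


for page_size_kib in (4, 8, 16):
    limit = max_database_bytes(page_size_kib * KIB)
    # With 16 KB pages this gives the 64 TB figure discussed in this thread.
    print(f"{page_size_kib:>2} KB pages -> {limit // TIB} TB")
```

Each doubling of the page size doubles the ceiling, which is why larger
pages would shift the limit without removing it.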

When we speak about tablespaces, it usually means that the database
consists of multiple files and different database objects are stored in
different files. Each such file is named within a database and called a
tablespace. And each tablespace has its own page set and page numbering.

A typical usage pattern is that tablespaces are used to separate table
data from indices (and logs from the rest of the database) and thus
allow better concurrent performance due to parallel I/O. Often it's
argued that RAIDs now handle the same job and maybe even better. For
many usage cases - maybe. But I'm pretty sure that the opposite cases
are also possible, when a carefully designed partitioning could
outperform automatic RAID data management.

Another usage case could be extending the database size beyond the
current limits. The limit is 64TB; the biggest FB database I know of is
7TB. Not that far, I'd say. The limit may be shifted with even larger
page sizes, but it has its drawbacks as well.

Someone may think about per-tablespace physical backups and other
possible usage cases. So I'm sure this feature is something to be at
least considered. On the other hand, tablespaces complicate maintenance,
so it's something more for enterprise users rather than for common FB users.

Now back to the code. During the Firebird development, we have
introduced a concept of "page spaces", represented with a PageSpace
class. It implements a two-level numbering for database pages: pagespace
ID + page number. The whole engine is aware of that. Default pagespace
(ID == 0, IIRC) is reserved to the database file(s). Non-zero pagespace
IDs are currently used for GTTs (global temporary tables) that have
their data/indices stored in temporary files.
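
To make the two-level numbering concrete, here is a rough Python model of
a page address as a (pagespace ID, page number) pair. It is only a sketch
of the idea: PageAddress, DB_PAGE_SPACE and in_default_space are
hypothetical names, not the engine's actual PageSpace API, and the default
pagespace ID of 0 follows the "IIRC" above.

```python
from dataclasses import dataclass

MAX_PAGE_NUMBER = 2 ** 32 - 1  # page numbers stay 32-bit within a pagespace
DB_PAGE_SPACE = 0              # assumed ID of the default pagespace


@dataclass(frozen=True, order=True)
class PageAddress:
    """Hypothetical two-level page address: pagespace ID + page number."""

    page_space_id: int
    page_number: int

    def __post_init__(self) -> None:
        if self.page_space_id < 0:
            raise ValueError("pagespace ID must be non-negative")
        if not 0 <= self.page_number <= MAX_PAGE_NUMBER:
            raise ValueError("page number must fit in 32 bits")

    def in_default_space(self) -> bool:
        """True if the page lives in the main database file(s)."""
        return self.page_space_id == DB_PAGE_SPACE


main_page = PageAddress(DB_PAGE_SPACE, 42)
temp_page = PageAddress(1, 42)  # e.g. a page in a GTT's temporary file

# Same page number, different pagespaces: the addresses are distinct.
print(main_page == temp_page)
print(main_page.in_default_space(), temp_page.in_default_space())
```

Because the pagespace ID is part of the address, each pagespace can reuse
the full 32-bit page number range independently.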

Technically, nothing prevents us from declaring named tablespaces via
DDL (CREATE/ALTER/DROP TABLESPACE?), storing their definitions inside
the metadata (RDB$TABLESPACES table?), allocating a pagespace ID to
every tablespace, and allowing a tablespace to be specified when creating
database objects (tables, indices, what else?).
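
The bookkeeping this implies can be sketched as a toy in-memory catalog
in Python. Everything here is an assumption for illustration: the class,
its methods, the "ROOT" name and the sequential ID allocation are
hypothetical, and a real implementation would also have to avoid the
non-zero pagespace IDs already used for GTTs.

```python
class TablespaceCatalog:
    """Hypothetical stand-in for tablespace metadata (not engine code)."""

    ROOT = "ROOT"  # assumed name for the default tablespace (pagespace 0)

    def __init__(self) -> None:
        self._space_ids = {self.ROOT: 0}  # tablespace name -> pagespace ID
        self._files = {}                  # tablespace name -> filename
        self._placement = {}              # object name -> tablespace name
        self._next_id = 1

    def create_tablespace(self, name: str, filename: str) -> int:
        """Register a named tablespace and allocate it a pagespace ID."""
        if name in self._space_ids:
            raise ValueError(f"tablespace {name!r} already exists")
        space_id = self._next_id
        self._next_id += 1
        self._space_ids[name] = space_id
        self._files[name] = filename
        return space_id

    def place_object(self, object_name: str, tablespace: str = "ROOT") -> int:
        """Record where an object lives; return its pagespace ID."""
        if tablespace not in self._space_ids:
            raise KeyError(f"unknown tablespace {tablespace!r}")
        self._placement[object_name] = tablespace
        return self._space_ids[tablespace]

    def drop_tablespace(self, name: str) -> None:
        """Refuse to drop the root or a tablespace that still holds objects."""
        if name == self.ROOT:
            raise ValueError("the root tablespace cannot be dropped")
        if name in self._placement.values():
            raise ValueError(f"tablespace {name!r} is still in use")
        del self._space_ids[name]
        del self._files[name]


catalog = TablespaceCatalog()
catalog.create_tablespace("TS_INDEXES", "/data/indexes.ts")  # made-up path
print(catalog.place_object("ORDERS"))                  # table stays in root
print(catalog.place_object("ORDERS_PK", "TS_INDEXES")) # index placed apart
```

Objects created without an explicit tablespace fall back to the root, so
databases that never use tablespaces need no extra configuration.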

Of course, there are more details hidden that must be addressed. Maybe
I'm missing something in my review. But I think this thread could be a
good starting point for discussion.

Others are welcome to contribute their thoughts.

(*) My personal opinion is that legacy multi-file databases must die,
preferably in FB4. They make zero sense in modern filesystems. They're
not supported by nbackup. They may complicate implementation of
tablespaces. Anyone here still using multi-file databases?


Dmitry


------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel