[
https://issues.apache.org/jira/browse/IGNITE-18535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Chugunov updated IGNITE-18535:
-------------------------------------
Epic Link: IGNITE-19502 (was: IGNITE-17766)
> Define new classes for versioned tables/indexes schemas
> -------------------------------------------------------
>
> Key: IGNITE-18535
> URL: https://issues.apache.org/jira/browse/IGNITE-18535
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Reporter: Ivan Bessonov
> Assignee: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> The current approach to schema management is faulty and can't support indexes.
> On top of that, it doesn't allow us to truly have multi-versioned historical
> data. Once a table is removed, it's removed for good, meaning that
> "current" RO transactions will not be able to finish. This is not acceptable.
> h3. Schema definitions
> What we need to have is the following:
> {code:java}
> SchemaDefinitions = map {version -> SchemaDefinition}
> SchemaDefinition = {timestamp, set {TableDefinition}, set {IndexDefinition}}
> TableDefinition = {name, id, array[ColumnDefinition], ...}
> IndexDefinition = {name, id, tableId, state, array[IdxColumnDefinition], ...}
> {code}
> The schema must be versioned, that's the first point. It's already
> versioned in "main"; what I mean here is global versioning that ties everything
> to transactions and to the management of SQL indexes.
> Each definition corresponds to a time period during which it represents the
> "actual" state of things. It must be used for RO queries, for example. RW
> transactions always use the LATEST schema.
> Now, the meaning of the defined values:
> * "version" - a simple auto-incrementing integer value;
> * "timestamp" - the schema is considered valid from this timestamp
> until the timestamp of the "next" version (or "infinity" if the next version
> doesn't exist yet);
> * most table and index properties are self-explanatory;
> * index state - RO or RW. We should differentiate indexes that are not
> yet built from indexes that are fully available.
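
To make the versioning rules above concrete, here is a minimal sketch of the structure and of the lookup by timestamp. It is not the proposed implementation: the class name SchemaHistory, the method names, the string column type and the use of a plain long for timestamps are all assumptions made for illustration, and it relies on Java 16+ records.

```java
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical in-memory history of schema versions; all names here are illustrative. */
final class SchemaHistory {
    /** The two index states named in the description. */
    enum IndexState { RO, RW }

    record ColumnDefinition(String name, String type) {}
    record TableDefinition(String name, int id, List<ColumnDefinition> columns) {}
    record IndexDefinition(String name, int id, int tableId, IndexState state, List<String> columns) {}
    record SchemaDefinition(int version, long timestamp,
                            Set<TableDefinition> tables, Set<IndexDefinition> indexes) {}

    /** Versions keyed by the timestamp they become valid from (assumption: a plain long). */
    private final NavigableMap<Long, SchemaDefinition> byTimestamp = new TreeMap<>();

    /** Appends the next version; versions must grow by one and timestamps must increase. */
    void append(SchemaDefinition schema) {
        Map.Entry<Long, SchemaDefinition> last = byTimestamp.lastEntry();
        if (last != null) {
            boolean versionOk = schema.version() == last.getValue().version() + 1;
            boolean timestampOk = schema.timestamp() > last.getKey();
            if (!versionOk || !timestampOk) {
                throw new IllegalArgumentException("Out-of-order schema version: " + schema.version());
            }
        }
        byTimestamp.put(schema.timestamp(), schema);
    }

    /** Schema for an RO read at ts: valid from its own timestamp until the next version's. */
    SchemaDefinition activeAt(long ts) {
        Map.Entry<Long, SchemaDefinition> entry = byTimestamp.floorEntry(ts);
        if (entry == null) {
            throw new IllegalStateException("No schema version is valid at timestamp " + ts);
        }
        return entry.getValue();
    }

    /** Schema for RW transactions: always the latest version. */
    SchemaDefinition latest() {
        Map.Entry<Long, SchemaDefinition> last = byTimestamp.lastEntry();
        if (last == null) {
            throw new IllegalStateException("No schema versions registered");
        }
        return last.getValue();
    }
}
```

With this shape, a table dropped in version 2 is still visible to an RO read whose timestamp falls into the validity period of version 1, which is the behaviour the description asks for.
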
> Currently, it's not clear where to store this structure. The problem lies
> in the realm of metadata synchronization, which is not yet designed. But
> all nodes must eventually have an up-to-date state, and every
> data/index update must be consistent with the schema version that corresponds
> to the current operation's timestamp.
> There are two likely candidates: Meta-Storage or Configuration. We'll figure
> it out later.
> h3. Serialization / storage
> It would be convenient to store only the oldest version plus the collection of
> diffs. Every node would unpack that locally, and we would save a lot of
> storage space in meta-storage when a user has a lot of tables/indexes.
> This approach would also be beneficial for another reason: we need to know
> what has changed between versions. That may be hard to calculate if all we
> have are the definitions themselves.
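
The "oldest version + diffs" idea can be sketched roughly as follows. This is only an illustration under simplifying assumptions: a schema is reduced to a map of tableId to table name, a diff carries only dropped ids and added tables, and the names SchemaDiffLog, Diff, unpack and changesLeadingTo are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical storage of a base schema plus one diff per later version. */
final class SchemaDiffLog {
    /** Simplified diff: table ids dropped and tables (id -> name) added by a version. */
    record Diff(List<Integer> droppedTableIds, Map<Integer, String> addedTables) {}

    private final int baseVersion;
    private final Map<Integer, String> baseTables;
    private final List<Diff> diffs = new ArrayList<>();

    SchemaDiffLog(int baseVersion, Map<Integer, String> baseTables) {
        this.baseVersion = baseVersion;
        this.baseTables = new TreeMap<>(baseTables);
    }

    /** Stores the diff that produces the next version and returns that version number. */
    int append(Diff diff) {
        diffs.add(diff);
        return baseVersion + diffs.size();
    }

    /** Rebuilds the full table set of a version by replaying diffs on top of the base. */
    Map<Integer, String> unpack(int version) {
        if (version < baseVersion || version > baseVersion + diffs.size()) {
            throw new IllegalArgumentException("Unknown schema version: " + version);
        }
        Map<Integer, String> tables = new TreeMap<>(baseTables);
        for (int i = 0; i < version - baseVersion; i++) {
            Diff diff = diffs.get(i);
            for (Integer id : diff.droppedTableIds()) {
                tables.remove(id);
            }
            tables.putAll(diff.addedTables());
        }
        return tables;
    }

    /** What changed between version - 1 and version: a direct lookup, no comparison needed. */
    Diff changesLeadingTo(int version) {
        if (version <= baseVersion || version > baseVersion + diffs.size()) {
            throw new IllegalArgumentException("No diff stored for version: " + version);
        }
        return diffs.get(version - baseVersion - 1);
    }
}
```

The second benefit mentioned above shows up in changesLeadingTo: the change set is read directly instead of being computed by comparing two full definitions.
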
> h3. General thoughts
> This may be a good place to start using integer tableId and indexId more
> often. UUIDs are too much. What's good is that the "serializability" of schemas
> gives us an easy way of generating integer ids, just like it's done right now
> with configuration.
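
A small sketch of why serialized schema updates make integer ids easy: if every node applies the same schema changes in the same order, a plain counter yields identical ids everywhere. The TableIdAllocator class below is hypothetical and does not reflect how configuration actually does this today.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical allocator; assumes schema changes are applied in one agreed order on every node. */
final class TableIdAllocator {
    private int nextId = 1;
    private final Map<String, Integer> idsByName = new HashMap<>();

    /** Assigns the next integer id; the result depends only on the order of applied changes. */
    int createTable(String name) {
        if (idsByName.containsKey(name)) {
            throw new IllegalArgumentException("Table already exists: " + name);
        }
        int id = nextId++;
        idsByName.put(name, id);
        return id;
    }

    /** Drops the table; its id is never reused, so historical versions stay unambiguous. */
    void dropTable(String name) {
        idsByName.remove(name);
    }
}
```
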