[
https://issues.apache.org/jira/browse/IGNITE-19834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Tupitsyn updated IGNITE-19834:
------------------------------------
Description:
h2. Motivation
Current Ignite 3 behavior is inconsistent when user data has unmapped columns:
* POJO: unmapped columns (not in schema) are ignored;
* Tuple: unmapped columns are ignored on the client, but cause an exception on
the server (when using the server-side API from a Compute task).
We should ensure consistent and reliable behavior across all APIs and clients.
h2. Non-goals
* Validate column types (already handled by serializers)
* Deal with any other schema aspects (indexes, constraints) which are not
present on the client side
h2. Requirements
Incompatible rows must be rejected by all APIs (Record, KeyValue, RecordBinary,
KeyValueBinary). A row is incompatible when:
* Unmapped columns are present;
* Columns without a default value are missing.

Where validation happens:
* Validation should be performed by the server when possible.
* Unmapped columns should be validated by the client, because rows are
serialized according to the schema (the server does not see unmapped columns).
h2. Design
h3. Case 1: Missing Columns
Already mostly handled by the client and the server:
* Client sends noValueSet to indicate which columns were not provided by the
user;
* Server rejects rows when a column is not set by the user and does not have a
default value.

The only required fix is to always set requireAllFields to true in Marshaller.
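
The server-side check for Case 1 could look roughly like the sketch below. It assumes noValueSet is a bit set indexed by column position; the class, the Column descriptor and the method names are hypothetical and are not taken from the Ignite code base.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of the missing-column check; not Ignite API. */
final class MissingColumnCheck {
    /** Minimal column descriptor (assumption: only name and default presence matter here). */
    static final class Column {
        final String name;
        final boolean hasDefault;

        Column(String name, boolean hasDefault) {
            this.name = name;
            this.hasDefault = hasDefault;
        }
    }

    /**
     * Returns the names of columns that the user did not set and that have no default value.
     * Bit i of noValueSet is assumed to be set when column i was not provided by the user.
     */
    static List<String> findMissing(List<Column> schema, BitSet noValueSet) {
        List<String> missing = new ArrayList<>();

        for (int i = noValueSet.nextSetBit(0); i >= 0 && i < schema.size(); i = noValueSet.nextSetBit(i + 1)) {
            Column col = schema.get(i);

            if (!col.hasDefault) {
                missing.add(col.name);
            }
        }

        return missing;
    }

    /** Rejects the row when at least one unset column has no default value. */
    static void validate(List<Column> schema, BitSet noValueSet) {
        List<String> missing = findMissing(schema, noValueSet);

        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("Missing values for columns without default: " + missing);
        }
    }
}
```

A column that is unset but has a default passes the check, so only truly unresolvable rows are rejected.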
h3. Case 2: Unmapped Columns
*Server-side API*
* Fix Marshaller to reject POJOs with unmapped fields;
* Reject tuples from client connector when schema is outdated (see explanation
below).
*Client-side API*
Client serializes user rows according to the latest known schema. Unmapped
columns will not reach the server side. Therefore, the client must reject
unmapped columns in user rows (Tuples, POJOs).
However, there is no guarantee that the client always has the latest schema:
* Column might be removed on the server, but the client uses old schema and
validation passes when it should fail;
** Solution: server rejects rows with outdated schema from the client.
* Column might be added on the server, but the client uses old schema and
validation fails when it should pass.
** Solution: when an unmapped column is detected by the client, it should
request the latest schema and retry the validation to avoid false-positive
exceptions.
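
The client-side rule above (validate against the cached schema, reload once on failure, then retry) can be sketched as follows. The Schema holder, the loadLatest supplier and the exception type are assumptions made for illustration; the actual client classes may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical sketch of client-side unmapped column validation; not Ignite API. */
final class UnmappedColumnCheck {
    /** Hypothetical schema snapshot: a version plus the set of column names. */
    static final class Schema {
        final int version;
        final Set<String> columns;

        Schema(int version, Set<String> columns) {
            this.version = version;
            this.columns = columns;
        }
    }

    /** Returns the row columns that are not present in the given schema. */
    static Set<String> unmapped(Schema schema, Set<String> rowColumns) {
        Set<String> extra = new LinkedHashSet<>(rowColumns);
        extra.removeAll(schema.columns);
        return extra;
    }

    /**
     * Validates the row against the cached schema. If unmapped columns are found,
     * loads the latest schema once and retries, so that a column added on the
     * server does not cause a false-positive exception.
     * Returns the schema that accepted the row.
     */
    static Schema validate(Schema cached, Supplier<Schema> loadLatest, Set<String> rowColumns) {
        if (unmapped(cached, rowColumns).isEmpty()) {
            return cached;
        }

        Schema latest = loadLatest.get();
        Set<String> extra = unmapped(latest, rowColumns);

        if (!extra.isEmpty()) {
            throw new IllegalArgumentException(
                "Unmapped columns " + extra + " (schema version " + latest.version + ")");
        }

        return latest;
    }
}
```

The reload happens only on the failure path, so rows that match the cached schema cost no extra round trip.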
The fact that the server rejects rows with outdated schema from the client also
simplifies client schema synchronization logic: we no longer have to deal with
things like IGNITE-19241 ("Java thin 3.0: propagate table schema updates to
client on write-only operations"). The client will simply reload the schema
when it receives a certain error code.
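
A minimal sketch of that reload-on-error loop is shown below. The status codes, the Server interface and the attempt limit are all hypothetical; the description does not name the actual error code or protocol calls. In a real client the row would also be re-serialized with the reloaded schema before the retry.

```java
/** Hypothetical sketch of "reload schema on error code and retry"; not Ignite API. */
final class SchemaVersionRetry {
    /** Hypothetical status codes. */
    static final int OK = 0;
    static final int SCHEMA_VERSION_MISMATCH = 1;

    /** Hypothetical view of the server as seen by the client. */
    interface Server {
        /** Attempts a write serialized with the given schema version; returns a status code. */
        int write(int clientSchemaVersion);

        /** Returns the latest schema version known to the server. */
        int latestSchemaVersion();
    }

    /**
     * Writes with the cached schema version and reloads the version whenever the
     * server reports a mismatch. Returns the schema version that was accepted.
     */
    static int writeWithReload(Server server, int cachedVersion, int maxAttempts) {
        int version = cachedVersion;

        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int status = server.write(version);

            if (status == OK) {
                return version;
            }

            if (status != SCHEMA_VERSION_MISMATCH) {
                throw new IllegalStateException("Unexpected status: " + status);
            }

            // The server rejected an outdated schema: reload and try again.
            version = server.latestSchemaVersion();
        }

        throw new IllegalStateException("Schema version still outdated after " + maxAttempts + " attempts");
    }
}
```

Bounding the number of attempts keeps the client from looping indefinitely if the schema keeps changing between reload and write.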
*Schemas and Transactions*
IEP-98 Schema Synchronization proposes a more complex logic of handling schema
updates within transactions. This may alter the way we validate schemas on the
server, but should not affect the client: if a given schema version is observed
by the client, any server node should be able to handle this version
(potentially waiting for it to be installed before proceeding).
h2. Implementation Notes
Client and server APIs implement the same interfaces. Therefore, the same tests
should run against both APIs and ensure identical behavior (see
ItSqlSynchronousApiTest as an example of this approach).
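
The shared-test idea can be illustrated without a test framework: one contract check is written against the common interface and then run against each implementation. RowWriter and strictWriter below are hypothetical stand-ins, not the real Ignite table view interfaces, and the exception type is an assumption.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of one contract check shared by several API implementations. */
final class SharedContractSketch {
    /** Hypothetical stand-in for an interface implemented by both client and server APIs. */
    interface RowWriter {
        void upsert(Map<String, Object> row);
    }

    /**
     * Contract: a row that matches the schema is accepted, and a row with an
     * unmapped column is rejected. Returns true when the writer behaves that way.
     */
    static boolean rejectsUnmappedColumns(RowWriter writer) {
        writer.upsert(Map.of("ID", 1)); // Valid row, must not throw.

        try {
            writer.upsert(Map.of("ID", 1, "EXTRA", "x"));
            return false; // Unmapped column was silently accepted.
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    /** Example implementation that rejects any column outside the given schema. */
    static RowWriter strictWriter(Set<String> schemaColumns) {
        return row -> {
            if (!schemaColumns.containsAll(row.keySet())) {
                throw new IllegalArgumentException("Unmapped columns in row: " + row.keySet());
            }
        };
    }
}
```

Calling rejectsUnmappedColumns once per implementation gives the same verdict criteria for both APIs, which is the point of running identical tests against them.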
was:
h2. Motivation
Current Ignite 3 behavior is inconsistent when user data has unmapped columns:
POJO: unmapped columns (not in schema) are ignored;
Tuple: unmapped columns are ignored on client, but cause exception on server
(when using server-side API from a Compute task).
We should ensure consistent and reliable behavior across all APIs and clients.
h2. Non-goals
* Validate column types (already handled by serializers)
* Deal with any other schema aspects (indexes, constraints) which are not
present on the client side
h2. Requirements
Incompatible rows must be rejected by all APIs (Record, KeyValue, RecordBinary,
KeyValueBinary):
* Unmapped columns are present;
* Columns without default value are missing.
* Validation should be performed by the server when possible.
* Unmapped columns should be validated by the client, because rows are
serialized according to the schema (server does not see unmapped columns).
h2. Design
h3. Case 1: Missing Columns
Already mostly handled by the client and the server:
Client sends noValueSet to indicate which columns were not provided by the user;
Server rejects rows when the column is not set by the user and does not have a
default value.
The only required fix is to always set requireAllFields to true in Marshaller.
h3. Case 2: Unmapped Columns
*Server-side API*
* Fix Marshaller to reject POJOs with unmapped fields;
* Reject tuples from client connector when schema is outdated (see explanation
below).
*Client-side API*
Client serializes user rows according to the latest known schema. Unmapped
columns will not reach the server side. Therefore, the client must reject
unmapped columns in user rows (Tuples, POJOs).
However, there is no guarantee that the client always has the latest schema:
* Column might be removed on the server, but the client uses old schema and
validation passes when it should fail;
** Solution: server rejects rows with outdated schema from the client.
* Column might be added on the server, but the client uses old schema and
validation fails when it should pass.
** Solution: when an unmapped column is detected by the client, it should
request the latest schema and retry the validation to avoid false-positive
exceptions.
The fact that the server rejects rows with outdated schema from the client also
simplifies client schema synchronization logic - we won't have to deal with
things like IGNITE-19241 Java thin 3.0: propagate table schema updates to
client on write-only operations anymore. Client will simply reload the schema
when given a certain error code.
*Schemas and Transactions*
IEP-98 Schema Synchronization proposes a more complex logic of handling schema
updates within transactions. This may alter the way we validate schemas on the
server, but should not affect the client: if a given schema version is observed
by the client, any server node should be able to handle this version
(potentially waiting for it to be installed before proceeding).
h2. Implementation Notes
Client and server APIs implement the same interfaces. Therefore, the same tests
should run against both APIs and ensure identical behavior (see
ItSqlSynchronousApiTest as an example of this approach).
> Thin 3.0: Schema validation
> ---------------------------
>
> Key: IGNITE-19834
> URL: https://issues.apache.org/jira/browse/IGNITE-19834
> Project: Ignite
> Issue Type: Epic
> Components: thin client
> Reporter: Pavel Tupitsyn
> Assignee: Pavel Tupitsyn
> Priority: Major
> Fix For: 3.0.0-beta2