Pavel Tupitsyn created IGNITE-19834:
---------------------------------------

             Summary: Thin 3.0: Schema validation
                 Key: IGNITE-19834
                 URL: https://issues.apache.org/jira/browse/IGNITE-19834
             Project: Ignite
          Issue Type: Epic
          Components: thin client
            Reporter: Pavel Tupitsyn
            Assignee: Pavel Tupitsyn
             Fix For: 3.0.0-beta2


h2. Motivation
Current Ignite 3 behavior is inconsistent when user data has unmapped columns:
* POJO: unmapped columns (not in schema) are ignored;
* Tuple: unmapped columns are ignored on the client, but cause an exception on the server 
(when using the server-side API from a Compute task).

We should ensure consistent and reliable behavior across all APIs and clients.

h2. Non-goals
* Validate column types (already handled by serializers)
* Deal with any other schema aspects (indexes, constraints) which are not 
present on the client side

h2. Requirements
Incompatible rows must be rejected by all APIs (Record, KeyValue, RecordBinary, 
KeyValueBinary). A row is incompatible when:
* Unmapped columns are present;
* Columns without a default value are missing.

Where validation happens:
* Validation should be performed by the server when possible;
* Unmapped columns should be validated by the client, because rows are 
serialized according to the schema (the server does not see unmapped columns).
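
The two rejection rules above can be illustrated with a minimal sketch. It is not taken from the Ignite code base: the class name, the method, and the representation of a schema as a map from column name to "has default" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the Ignite code base.
final class RowCompatibilitySketch {
    /**
     * @param schema     column name -> whether the column has a default value
     * @param rowColumns names of the columns present in the user row
     * @return violations found; an empty list means the row is compatible
     */
    static List<String> violations(Map<String, Boolean> schema, Set<String> rowColumns) {
        List<String> result = new ArrayList<>();

        // Rule 1: every column in the row must exist in the schema.
        for (String name : rowColumns) {
            if (!schema.containsKey(name)) {
                result.add("unmapped column: " + name);
            }
        }

        // Rule 2: a column may be omitted only if it has a default value.
        for (Map.Entry<String, Boolean> column : schema.entrySet()) {
            if (!rowColumns.contains(column.getKey()) && !column.getValue()) {
                result.add("missing column without default: " + column.getKey());
            }
        }

        return result;
    }
}
```

In the proposed design these two rules would not run in one place: the first is expected to run on the client, the second on the server.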

h2. Design
h3. Case 1: Missing Columns
Already mostly handled by the client and the server:
* The client sends noValueSet to indicate which columns were not provided by the user;
* The server rejects rows when a column is not set by the user and does not have a 
default value.

The only required fix is to always set requireAllFields to true in Marshaller.
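
As a rough illustration of the noValueSet idea, the sketch below models it as a java.util.BitSet with one bit per schema column. The actual wire format and server-side check are not described here, so the class, method names, and exception type are assumptions.

```java
import java.util.BitSet;
import java.util.Set;

// Hypothetical sketch: models noValueSet as a BitSet; the real protocol may differ.
final class NoValueSetSketch {
    /** Client side: set bit i when the user did not provide schema column i. */
    static BitSet buildNoValueSet(String[] schemaColumns, Set<String> provided) {
        BitSet noValueSet = new BitSet(schemaColumns.length);
        for (int i = 0; i < schemaColumns.length; i++) {
            if (!provided.contains(schemaColumns[i])) {
                noValueSet.set(i);
            }
        }
        return noValueSet;
    }

    /** Server side: reject the row when an unset column has no default value. */
    static void validate(String[] schemaColumns, boolean[] hasDefault, BitSet noValueSet) {
        for (int i = noValueSet.nextSetBit(0); i >= 0; i = noValueSet.nextSetBit(i + 1)) {
            if (!hasDefault[i]) {
                throw new IllegalArgumentException(
                        "Missing value for column without default: " + schemaColumns[i]);
            }
        }
    }
}
```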

h3. Case 2: Unmapped Columns
*Server-side API*
* Fix Marshaller to reject POJOs with unmapped fields;
* Reject tuples from client connector when schema is outdated (see explanation 
below).

*Client-side API*
Client serializes user rows according to the latest known schema. Unmapped 
columns will not reach the server side. Therefore, the client must reject 
unmapped columns in user rows (Tuples, POJOs).

However, there is no guarantee that the client always has the latest schema:
* Column might be removed on the server, but the client uses old schema and 
validation passes when it should fail;
** Solution: server rejects rows with outdated schema from the client.
* Column might be added on the server, but the client uses old schema and 
validation fails when it should pass.
** Solution: when an unmapped column is detected by the client, it should 
request the latest schema and retry the validation to avoid false-positive 
exceptions.
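
The "reload and retry" step for the second case might look roughly like the following. This is a sketch under stated assumptions: the schema is reduced to a set of column names, and the loader is a hypothetical Supplier standing in for a request to the server.

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch: the schema is reduced to a set of column names.
final class UnmappedColumnCheckSketch {
    private Set<String> knownColumns;
    private final Supplier<Set<String>> latestSchemaLoader;
    int reloads; // exposed only so the behavior is observable in a test

    UnmappedColumnCheckSketch(Set<String> knownColumns, Supplier<Set<String>> latestSchemaLoader) {
        this.knownColumns = knownColumns;
        this.latestSchemaLoader = latestSchemaLoader;
    }

    /** Throws if the row has columns that are absent even from the latest schema. */
    void validate(Set<String> rowColumns) {
        if (knownColumns.containsAll(rowColumns)) {
            return; // all columns are mapped in the cached schema
        }

        // The cached schema may be stale: reload once and validate again.
        knownColumns = latestSchemaLoader.get();
        reloads++;

        for (String column : rowColumns) {
            if (!knownColumns.contains(column)) {
                throw new IllegalArgumentException("Unmapped column: " + column);
            }
        }
    }
}
```

Reloading only on a suspected failure keeps the common path free of extra round trips, at the cost of one additional request before a genuine error is reported.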

The fact that the server rejects rows with an outdated schema from the client also 
simplifies client schema synchronization logic: we won't have to deal with 
things like IGNITE-19241 ("Java thin 3.0: propagate table schema updates to 
client on write-only operations") anymore. The client will simply reload the schema 
when given a certain error code.
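
A minimal sketch of this interaction follows. The server stub, the exception that stands in for the error code, and the single retry after the reload are all assumptions; only "reject outdated schema, reload on a specific error" is stated above.

```java
// Hypothetical sketch: the exception stands in for "a certain error code".
final class SchemaVersionRetrySketch {
    static final class SchemaMismatchException extends RuntimeException {
        final int expectedVersion;

        SchemaMismatchException(int expectedVersion) {
            super("Outdated schema, expected version " + expectedVersion);
            this.expectedVersion = expectedVersion;
        }
    }

    /** Server stub: accepts a write only if it was serialized with the current schema version. */
    static final class Server {
        int currentVersion;
        int accepted;

        Server(int currentVersion) {
            this.currentVersion = currentVersion;
        }

        void write(int clientSchemaVersion) {
            if (clientSchemaVersion != currentVersion) {
                throw new SchemaMismatchException(currentVersion);
            }
            accepted++;
        }
    }

    /** Client stub: on a mismatch, reloads the schema (here, just the version) and retries once. */
    static final class Client {
        int schemaVersion;
        private final Server server;

        Client(int schemaVersion, Server server) {
            this.schemaVersion = schemaVersion;
            this.server = server;
        }

        void write() {
            try {
                server.write(schemaVersion);
            } catch (SchemaMismatchException e) {
                schemaVersion = e.expectedVersion; // assumed: reload, re-serialize, retry
                server.write(schemaVersion);
            }
        }
    }
}
```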

*Schemas and Transactions*
IEP-98 Schema Synchronization proposes more complex logic for handling schema 
updates within transactions. This may alter the way we validate schemas on the 
server, but should not affect the client: if a given schema version is observed 
by the client, any server node should be able to handle this version 
(potentially waiting for it to be installed before proceeding).

h2. Implementation Notes
Client and server APIs implement the same interfaces. Therefore, the same tests 
should run against both APIs and ensure identical behavior (see 
ItSqlSynchronousApiTest as an example of this approach).
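
The shared-test idea can be sketched as one contract check written against the common interface and run over every implementation. The interface and the stand-in implementation below are hypothetical and far simpler than the real table views; this is not how ItSqlSynchronousApiTest is structured, only an illustration of the approach.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: RecordApi stands in for an interface shared by client and server.
final class SharedContractSketch {
    interface RecordApi {
        void upsert(Set<String> rowColumns);
    }

    /** Stand-in implementation; a real suite would wrap the client and the embedded server. */
    static final class FakeApi implements RecordApi {
        private final Set<String> schema;

        FakeApi(Set<String> schema) {
            this.schema = schema;
        }

        @Override
        public void upsert(Set<String> rowColumns) {
            if (!schema.containsAll(rowColumns)) {
                throw new IllegalArgumentException("Unmapped columns in " + rowColumns);
            }
        }
    }

    /** One contract check, written once against the interface. */
    static boolean rejectsUnmappedColumns(RecordApi api) {
        api.upsert(Set.of("id")); // a valid row must pass
        try {
            api.upsert(Set.of("id", "extra"));
            return false; // the unmapped column was silently accepted
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    /** Runs the same check against every implementation. */
    static boolean allConform(List<RecordApi> apis) {
        return apis.stream().allMatch(SharedContractSketch::rejectsUnmappedColumns);
    }
}
```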




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
