[ https://issues.apache.org/jira/browse/IGNITE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Mashenkov updated IGNITE-14743:
--------------------------------------
Description:
For now, TupleAssembler writes offsets for varlen columns as the 2-byte
{{short}} type.
This implicitly limits the total key/value size to 64 kB.
Let's use adaptive 1-2-4 byte types (byte, short, int) for offsets.
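A minimal sketch of how an adaptive offset width could be chosen and applied. The names below (OffsetFormat, offsetSize, writeOffset, readOffset) are hypothetical and are not the actual TupleAssembler API; the sketch also assumes offsets are stored as unsigned values at absolute buffer positions.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for illustration only; not part of the Ignite code base. */
final class OffsetFormat {
    private OffsetFormat() {
    }

    /** Returns the narrowest offset width (1, 2 or 4 bytes) that can address dataSize bytes. */
    public static int offsetSize(int dataSize) {
        if (dataSize < 0)
            throw new IllegalArgumentException("Negative size: " + dataSize);

        if (dataSize <= 0xFF)
            return 1; // fits an unsigned byte

        if (dataSize <= 0xFFFF)
            return 2; // fits an unsigned short (the current 64 kB limit)

        return 4; // int
    }

    /** Writes an offset at the given absolute position using the given width. */
    public static void writeOffset(ByteBuffer buf, int pos, int offset, int size) {
        switch (size) {
            case 1:
                buf.put(pos, (byte) offset);
                break;
            case 2:
                buf.putShort(pos, (short) offset);
                break;
            case 4:
                buf.putInt(pos, offset);
                break;
            default:
                throw new IllegalArgumentException("Unsupported offset size: " + size);
        }
    }

    /** Reads an offset back, treating 1- and 2-byte values as unsigned. */
    public static int readOffset(ByteBuffer buf, int pos, int size) {
        switch (size) {
            case 1:
                return buf.get(pos) & 0xFF;
            case 2:
                return buf.getShort(pos) & 0xFFFF;
            case 4:
                return buf.getInt(pos);
            default:
                throw new IllegalArgumentException("Unsupported offset size: " + size);
        }
    }
}
```

In this sketch the chosen width would have to be recorded somewhere a reader can find it (for example in row flags or in the schema) so that readOffset is called with the same size that was used on write.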
was:
For now, TupleAssembler writes offsets for varlen columns as the 2-byte
{{short}} type.
This implicitly limits the total key/value size to 64 kB.
Let's use the 4-byte {{int}} type for offsets for large tuples.
Possible ways are:
# Just use ints for offsets, this increases the memory overhead for Rows.
# Pre-calculate potential row size during SchemaDescriptor initialization and
keep 'offset_size' in schema.
Since the unlimited varlen type is the default, users will end up with a
4-byte offset size in most cases.
# Pre-calculate exact tuple size for each row and use row flags.
This requires Tuple data analysis which we already do to detect non-null
varlen values and nulls. Strings may be a headache as we have to analyze each
char for accurate tuple size calculation.
# Pre-calculate tuple size skipping chars analysis.
Using an adaptive offset_size approach allows us to use 1-2-4 byte numbers
(byte, short, int) for offsets.
Collations for String columns may be introduced and used as a hint, but we will
need to check a collation for every char on write.
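To illustrate the trade-off between exact size calculation and skipping the char analysis, here is a rough sketch assuming String values are stored as UTF-8 (an assumption, not something stated in this issue). The class and method names are hypothetical. The upper bound relies on the fact that a single Java char (UTF-16 code unit) never needs more than 3 bytes in UTF-8, and a surrogate pair (2 chars) needs 4.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of the Ignite code base. */
final class VarlenSizeEstimate {
    private VarlenSizeEstimate() {
    }

    /** Upper bound of the UTF-8 size, computed without looking at any char. */
    public static long utf8UpperBound(String s) {
        return 3L * s.length();
    }

    /** Exact UTF-8 size; this has to visit every char of the string. */
    public static int utf8Exact(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Narrowest offset width (1, 2 or 4 bytes) for a tuple of the given size. */
    public static int offsetSizeFor(long tupleSize) {
        if (tupleSize <= 0xFF)
            return 1;

        if (tupleSize <= 0xFFFF)
            return 2;

        return 4;
    }
}
```

With this sketch, a 100-char ASCII string has an exact size of 100 bytes (1-byte offsets would do) but an estimated bound of 300 bytes (2-byte offsets), so skipping the analysis is cheaper on write but may pick a wider offset than strictly necessary.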
> Support Row with large values.
> ------------------------------
>
> Key: IGNITE-14743
> URL: https://issues.apache.org/jira/browse/IGNITE-14743
> Project: Ignite
> Issue Type: Improvement
> Reporter: Andrey Mashenkov
> Assignee: Andrey Mashenkov
> Priority: Major
> Labels: iep-54, ignite-3
> Fix For: 3.0.0-alpha3
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> For now, TupleAssembler writes offsets for varlen columns as the 2-byte
> {{short}} type.
> This implicitly limits the total key/value size to 64 kB.
> Let's use adaptive 1-2-4 byte types (byte, short, int) for offsets.