chaoyli edited a comment on issue #5142:
URL:
https://github.com/apache/incubator-doris/issues/5142#issuecomment-750838601
The disk layout should take the encoding algorithm and read efficiency into
consideration.
So the disk layout will store the array_size, rather than the absolute
offset, for every array.
```
The disk layout
* First-level array_sizes (int32); the first level has no nulls
| Bytes 0-3  | Bytes 4-7  | Bytes 8-11 |
|------------|------------|------------|
| 2          | 3          | 1          |
* Second-level nulls (uint8)
| Bytes 0-2  | Byte 3     | Bytes 4-5  |
|------------|------------|------------|
| 0          | 1          | 0          |
* Second-level array_sizes (int32)
| Bytes 0-23       |
|------------------|
| 2, 2, 3, 0, 1, 2 |
* Elements array (int32)
| Bytes 0-39                    |
|-------------------------------|
| 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
```
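As a rough illustration of how one nested value maps onto these four streams, here is a minimal Python sketch. The input `rows` is an assumption: it is one nested array that is consistent with the sizes, nulls, and elements shown above, not something stated in the layout itself. The helper name `flatten` is hypothetical too.

```python
def flatten(rows):
    """Split a two-level nested array into the four flat streams."""
    first_sizes, second_nulls, second_sizes, elements = [], [], [], []
    for outer in rows:
        # First level: only sizes are stored, there are no nulls.
        first_sizes.append(len(outer))
        for inner in outer:
            if inner is None:
                # A null array is flagged and stored with size 0.
                second_nulls.append(1)
                second_sizes.append(0)
            else:
                second_nulls.append(0)
                second_sizes.append(len(inner))
                elements.extend(inner)
    return first_sizes, second_nulls, second_sizes, elements


# Assumed input, reconstructed from the example layout.
rows = [[[1, 2], [3, 4]], [[5, 6, 7], None, [8]], [[9, 10]]]
first_sizes, second_nulls, second_sizes, elements = flatten(rows)
```

Storing sizes instead of absolute offsets keeps the values small, which should suit the encoding; the cost is that an offset has to be recomputed by summing sizes on read.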
For the nulls, the array_sizes, and the elements, a dedicated ColumnWriter
is constructed to write each kind of data.
The ArrayColumnWriter only needs to call these three writers.
```
1. Call the null writer to write the nulls.
2. Call the array_size writer to write the array_sizes, and add a new meta
entry to record the mapping between array_size ordinals
and element ordinals.
3. Call the element writer recursively.
```
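The three steps can be sketched roughly as follows. This is only an illustration in Python, not the Doris implementation: `ScalarWriter`, `ArrayWriter`, `page_size`, and `ordinal_meta` are hypothetical names, the "page" boundary every `page_size` arrays is an assumption about where the meta would be recorded, and for simplicity the sketch writes a null flag at every level.

```python
class ScalarWriter:
    """Hypothetical stand-in for a ColumnWriter: it only buffers values."""

    def __init__(self):
        self.values = []

    @property
    def next_ordinal(self):
        return len(self.values)

    def append(self, value):
        self.values.append(value)


class ArrayWriter:
    """Sketch of an array writer that delegates to three writers."""

    def __init__(self, element_writer, page_size=2):
        self.null_writer = ScalarWriter()
        self.size_writer = ScalarWriter()
        self.element_writer = element_writer  # may itself be an ArrayWriter
        self.page_size = page_size            # assumed page granularity
        self.ordinal_meta = []                # (array ordinal, element ordinal)

    @property
    def next_ordinal(self):
        return self.size_writer.next_ordinal

    def append(self, array):
        # 1. Write the null flag.
        self.null_writer.append(1 if array is None else 0)
        # 2. Write the array size; at each page start, record which
        #    element ordinal the page's first array begins at.
        ordinal = self.size_writer.next_ordinal
        if ordinal % self.page_size == 0:
            self.ordinal_meta.append((ordinal, self.element_writer.next_ordinal))
        self.size_writer.append(0 if array is None else len(array))
        # 3. Hand the elements to the element writer (recursive for nesting).
        for item in array or []:
            self.element_writer.append(item)


inner = ArrayWriter(ScalarWriter())
outer = ArrayWriter(inner)
for row in [[[1, 2], [3, 4]], [[5, 6, 7], None, [8]], [[9, 10]]]:
    outer.append(row)
```

Because the element writer is just another writer, a deeper nesting level needs no extra logic beyond wrapping one more `ArrayWriter`.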
On read, seeking to a specified ordinal seeks the null, array_size, and
element columns separately.
To seek the element column, the reader gets the start ordinal from the
array_size reader.
Because the array_size reader has already seeked to the specified ordinal,
only one sum is needed.
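One possible reading of the "one sum", sketched in Python: the recorded meta gives the element ordinal at the start of a page, and the sizes between the page start and the target ordinal are added on top. The function name `element_start_ordinal`, the `ordinal_meta` pairs, and the page boundaries are assumptions made for illustration, not the actual reader API.

```python
import bisect


def element_start_ordinal(sizes, ordinal_meta, target):
    """Return the element ordinal at which array `target` starts.

    sizes        -- array_size values, indexed by array ordinal
    ordinal_meta -- sorted (array ordinal, element ordinal) pairs, one per
                    page start (hypothetical meta layout)
    """
    page_starts = [array_ordinal for array_ordinal, _ in ordinal_meta]
    # Find the last page that starts at or before the target ordinal.
    page = bisect.bisect_right(page_starts, target) - 1
    page_start, element_start = ordinal_meta[page]
    # The single sum: sizes of the arrays between page start and target.
    return element_start + sum(sizes[page_start:target])


# Assumed example data: second-level sizes from the layout above,
# with a meta entry every two arrays.
sizes = [2, 2, 3, 0, 1, 2]
ordinal_meta = [(0, 0), (2, 4), (4, 7)]
elements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

start = element_start_ordinal(sizes, ordinal_meta, 5)
last_array = elements[start:start + sizes[5]]
```

With these assumed values, array ordinal 5 starts at element ordinal 8, so the slice yields the last array, `[9, 10]`.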