Tian Jiang created IOTDB-6140:
---------------------------------
Summary: Handle inconsistent time encoding during LOAD
Key: IOTDB-6140
URL: https://issues.apache.org/jira/browse/IOTDB-6140
Project: Apache IoTDB
Issue Type: Bug
Components: Core/Engine
Affects Versions: master branch
Reporter: Tian Jiang
Attachments: image-2023-09-07-09-35-29-181.png
When loading TsFiles into IoTDB with the LOAD command, if some chunks in the
files span multiple time partitions of the loader, those chunks are decoded and
re-split according to the loader's partition interval.
Although the value encoding is recorded within the files in ChunkHeader or
ChunkMetadata, the time encoding is not recorded anywhere.
The current implementation uses the loader's time encoding to decode time
chunks, which may differ from the time encoding used by the node that generated
the files.
!image-2023-09-07-09-35-29-181.png|thumbnail!
For example, an edge server generates TsFiles with PLAIN as the time encoding
(one possible reason is that the server lacks the computing resources for
encoding) and uploads them to the cloud cluster, where the time encoding is
TS_2DIFF.
As a result, the receiver cluster will try to decode the PLAIN (unencoded)
chunks as TS_2DIFF, and the result is unpredictable.
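The failure mode can be illustrated with a minimal sketch. It is not IoTDB code:
the delta codec below is a simplified, hypothetical stand-in for TS_2DIFF (the
real format differs), and all function names are assumptions made for
illustration. It only shows that bytes written as plain values decode without
error, but to wrong timestamps, when the reader assumes a delta-based encoding.

```python
import struct


def encode_plain(timestamps):
    """PLAIN-style: each timestamp stored as a big-endian signed 64-bit value."""
    return b"".join(struct.pack(">q", t) for t in timestamps)


def decode_plain(data):
    """Read consecutive big-endian signed 64-bit values."""
    return [struct.unpack_from(">q", data, i)[0] for i in range(0, len(data), 8)]


def encode_delta(timestamps):
    """Simplified delta encoding (NOT the real TS_2DIFF layout):
    the first value, followed by successive differences."""
    if not timestamps:
        return b""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return encode_plain([timestamps[0]] + deltas)


def decode_delta(data):
    """Inverse of encode_delta: a running sum over the stored values."""
    result = []
    total = 0
    for value in decode_plain(data):
        total += value
        result.append(total)
    return result


timestamps = [1000, 2000, 3000]
plain_bytes = encode_plain(timestamps)

print(decode_plain(plain_bytes))  # [1000, 2000, 3000], matching decoder
print(decode_delta(plain_bytes))  # [1000, 3000, 6000], silently wrong
```

In this toy case the mismatch produces plausible-looking but incorrect
timestamps rather than an error, which is why it can go unnoticed.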
To avoid the encoding inconsistency, the simplest approach may be to specify the
time encoding of the files in the LOAD command, like:
{code:sql}
LOAD '/data/tsfiles_to_load/' time_encoding=PLAIN
{code}
The important part is how to deal with such an inconsistency once it is
detected.
The lowest-effort option is to reject the LOAD command if the provided
parameter is inconsistent with the local setting.
A friendlier approach could be for the receiver to decode the chunks with the
provided encoding and re-encode them with its own encoding.
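The two options above can be sketched as a small decision function. This is
only an illustration under stated assumptions: `MismatchPolicy`,
`TimeEncodingMismatch`, and `plan_load` are hypothetical names and do not exist
in IoTDB; the encodings are passed as plain strings.

```python
from enum import Enum


class MismatchPolicy(Enum):
    """Hypothetical policies for a time-encoding mismatch during LOAD."""
    REJECT = "reject"
    REENCODE = "reencode"


class TimeEncodingMismatch(Exception):
    """Raised when the declared encoding differs and the policy is REJECT."""


def plan_load(declared_encoding, local_encoding, policy):
    """Return (decode_with, write_with) for chunks that must be re-split.

    declared_encoding: time encoding given in the LOAD command (assumed option).
    local_encoding:    time encoding configured on the receiver.
    """
    declared = declared_encoding.upper()
    local = local_encoding.upper()
    if declared == local:
        # No mismatch: decode and write with the same encoding.
        return (local, local)
    if policy is MismatchPolicy.REJECT:
        raise TimeEncodingMismatch(
            f"files use {declared}, but the receiver is configured with {local}"
        )
    # REENCODE: decode with the sender's encoding, write with the receiver's.
    return (declared, local)


print(plan_load("PLAIN", "TS_2DIFF", MismatchPolicy.REENCODE))
```

With REJECT the user must fix the mismatch before loading; with REENCODE the
receiver pays the extra decode and encode cost, but only for chunks that have
to be re-split anyway.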
--
This message was sent by Atlassian Jira
(v8.20.10#820010)