Tian Jiang created IOTDB-6140:
---------------------------------

             Summary: Handle inconsistent time encoding during LOAD
                 Key: IOTDB-6140
                 URL: https://issues.apache.org/jira/browse/IOTDB-6140
             Project: Apache IoTDB
          Issue Type: Bug
          Components: Core/Engine
    Affects Versions: master branch
            Reporter: Tian Jiang
         Attachments: image-2023-09-07-09-35-29-181.png

When loading TsFiles into IoTDB with the LOAD command, if some chunks in the 
files span multiple partitions of the loader, the chunks will be decoded and 
re-split according to the loader's partition interval.
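
The re-split step can be pictured with a minimal sketch. It assumes that a partition id is simply the timestamp floor-divided by the partition interval and that points arrive sorted by time; the function name and the sample interval are hypothetical and do not come from the IoTDB code base.

```python
from itertools import groupby


def split_by_partition(points, partition_interval):
    """Group time-sorted (timestamp, value) points by partition id.

    Assumption: partition id = timestamp // partition_interval.
    """
    return {
        partition_id: list(group)
        for partition_id, group in groupby(
            points, key=lambda point: point[0] // partition_interval
        )
    }


# A decoded chunk whose points cross two partition boundaries
# (interval of 10 chosen only for illustration).
chunk_points = [(5, "a"), (9, "b"), (10, "c"), (25, "d")]
pieces = split_by_partition(chunk_points, partition_interval=10)
```

Each entry of `pieces` would then be re-encoded and written as a separate chunk, which is why the loader must be able to decode the original time column correctly.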
Although the value encoding is recorded within the files in ChunkHeader or 
ChunkMetadata, the time encoding is not recorded anywhere.
The current implementation uses the loader's time encoding to decode time 
chunks, which may not match the encoding used by the generator of the files.
 !image-2023-09-07-09-35-29-181.png|thumbnail! 
For example, an edge server generates TsFiles with PLAIN as the time encoding 
(one possible reason is that the server lacks the computing resources for 
encoding) and uploads them to the cloud cluster, where the time encoding is 
TS_2DIFF.
As a result, the receiver cluster will try to decode unencoded chunks with 
TS_2DIFF, and the result is unpredictable.
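
The effect of such a mismatch can be illustrated with a toy example. This is only a sketch: `decode_delta` is a simplified first-order delta decoder used as a stand-in, not the actual TS_2DIFF implementation, and the byte layout (big-endian 64-bit integers) is an assumption made for the illustration.

```python
import struct


def encode_plain(timestamps):
    """Write each timestamp as a big-endian signed 64-bit integer."""
    return b"".join(struct.pack(">q", t) for t in timestamps)


def decode_plain(data):
    """Read back big-endian signed 64-bit integers."""
    return [struct.unpack_from(">q", data, i)[0] for i in range(0, len(data), 8)]


def decode_delta(data):
    """Toy delta decoder (NOT real TS_2DIFF): treats every stored
    value after the first as a difference from the previous timestamp."""
    result, running = [], 0
    for stored in decode_plain(data):
        running += stored
        result.append(running)
    return result


timestamps = [1694050000000, 1694050001000, 1694050002000]
plain_bytes = encode_plain(timestamps)

matched = decode_plain(plain_bytes)     # decoder agrees with the writer
mismatched = decode_delta(plain_bytes)  # decoder assumes a delta encoding
```

With the matching decoder the timestamps round-trip unchanged; with the mismatched one only the first value survives and the rest are silently wrong, with no error raised.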

To avoid the encoding inconsistency, the simplest way may be to specify the 
time encoding in the LOAD command, like:
{code:sql}
LOAD '/data/tsfiles_to_load/' time_encoding=PLAIN
{code}
The important part is how to deal with such an inconsistency once it is detected. 
The lowest-effort approach is to reject the LOAD command if the provided 
parameter is inconsistent with the local setting.
A friendlier approach could be for the receiver to decode the chunks with the 
declared encoding and re-encode them with its own encoding.
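
The two options could be sketched as a small policy check. All names here (`MismatchPolicy`, `EncodingMismatchError`, `plan_load`) are hypothetical and only illustrate the decision; they are not existing IoTDB APIs.

```python
from enum import Enum


class MismatchPolicy(Enum):
    REJECT = "reject"      # refuse the LOAD when encodings differ
    REENCODE = "reencode"  # decode with the declared encoding, write with the local one


class EncodingMismatchError(Exception):
    """Raised when the declared time encoding differs from the local one."""


def plan_load(declared_encoding, local_encoding, policy):
    """Return (encoding to decode with, encoding to write with)."""
    if declared_encoding == local_encoding:
        return declared_encoding, local_encoding
    if policy is MismatchPolicy.REJECT:
        raise EncodingMismatchError(
            f"declared time encoding {declared_encoding} does not match "
            f"local time encoding {local_encoding}"
        )
    # REENCODE: trust the declared encoding for reading, keep local for writing.
    return declared_encoding, local_encoding


plan = plan_load("PLAIN", "TS_2DIFF", MismatchPolicy.REENCODE)
```

Under REENCODE the receiver pays the cost of decoding and re-encoding every time chunk, while REJECT pushes that cost back to whoever generated the files.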







--
This message was sent by Atlassian Jira
(v8.20.10#820010)
