Re: discuss about Procedure module

2022-05-09 Thread HW-Chao Wang
The procedure framework is an interface that provides unified state management
for tasks that consist of multiple operations. It ensures consistency by
rolling back and retrying, so it can be used to handle fault scenarios.

1. A README and an example will be written for other developers to use.
2. At that time, it was considered that the procedure could run in a separate
process, so it was placed in a separate module. After discussion, the
procedure runs in the config node process, and the module will be combined
with it in the next PR, pull/5811.
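
As a rough illustration of the rollback-and-retry idea, here is a minimal
sketch in Python. It is not the IoTDB procedure API: the names run_procedure
and Step and the max_retries parameter are made up for this example. Each step
has an execute action and a rollback action; a failing step is retried, and if
it keeps failing, the steps that already completed are undone in reverse order.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration only, not taken from IoTDB.
# A step is (name, execute, rollback).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_procedure(steps: List[Step], max_retries: int = 2) -> bool:
    """Run steps in order. Retry a failing step up to max_retries times.

    If a step still fails, roll back every completed step in reverse
    order and return False. Return True when all steps succeed.
    """
    done: List[Step] = []
    for step in steps:
        _name, execute, _rollback = step
        for attempt in range(max_retries + 1):
            try:
                execute()
                done.append(step)
                break
            except Exception:
                if attempt == max_retries:
                    # Retries exhausted: undo completed steps, newest first.
                    for _, _, rollback in reversed(done):
                        rollback()
                    return False
    return True
```

The failed step itself is not rolled back here, on the assumption that a
failed execute leaves nothing behind; a real framework would also need to
persist the state so that it survives a restart.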




---Original---
From: "Xiangdong Huang"
https://github.com/apache/iotdb/pull/5477

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院

Change semantics of TsFile filename

2022-05-09 Thread Haiming Zhu
Hi, everyone

Currently, the filename format of each tsfile is
{file_created_time}-{version_id}-{inner_space_merge_num}-{cross_space_merge_num}.tsfile.
Within one time partition, the order of tsfiles is guaranteed by the
version_id; for example, 1651825804093-2-0-0.tsfile comes after
1651825804092-1-0-0.tsfile.

The problem is that filename conflicts may occur in the cross-space
compaction and load scenarios. In cross-space compaction, assume that
3-2-0-0.tsfile, 4-3-0-0.tsfile and 5-5-0-0.tsfile exist in the sequence
folder. If 4-3-0-0.tsfile is selected, compaction cannot generate 3 or
more target files, because only 2 version_ids (3 and 4) are available
between 2 and 5, so some large target files may be generated. In load,
assume that 3-2-0-0.tsfile and 3-3-0-0.tsfile exist in the sequence
folder. No more sequence files can be loaded between them, because no
version_id is left between 2 and 3; such files can only be loaded into
the unsequence folder.
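
To make the arithmetic in the two examples concrete, here is a small sketch in
Python that parses the filename format and counts the version_ids available
between two neighbouring files. The helper names parse_tsfile_name and
free_version_ids are invented for this example and do not come from the IoTDB
code base.

```python
from typing import NamedTuple


class TsFileName(NamedTuple):
    created_time: int
    version_id: int
    inner_merge_num: int
    cross_merge_num: int


def parse_tsfile_name(name: str) -> TsFileName:
    """Parse {created_time}-{version_id}-{inner}-{cross}.tsfile."""
    suffix = ".tsfile"
    if not name.endswith(suffix):
        raise ValueError(f"not a tsfile name: {name}")
    parts = name[: -len(suffix)].split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields in: {name}")
    return TsFileName(*(int(p) for p in parts))


def free_version_ids(prev_name: str, next_name: str) -> int:
    """Count the version_ids strictly between two neighbouring files."""
    prev = parse_tsfile_name(prev_name)
    nxt = parse_tsfile_name(next_name)
    return max(0, nxt.version_id - prev.version_id - 1)
```

For the compaction example, free_version_ids("3-2-0-0.tsfile",
"5-5-0-0.tsfile") gives 2 once 4-3-0-0.tsfile is selected, and for the load
example, free_version_ids("3-2-0-0.tsfile", "3-3-0-0.tsfile") gives 0.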

In response to these problems, the format won't change, but the meaning
of file_created_time and version_id will. Instead of version_id, we use
file_created_time to guarantee the order of tsfiles, and if two tsfiles
have the same file_created_time, we use version_id to break the tie.
This semantic change may affect the query, compaction and load modules.
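
The difference between the two orderings can be sketched as two sort keys in
Python. This is only an illustration of the rule described above; the function
names current_key and proposed_key are hypothetical, and the real comparison
lives in the IoTDB Java code.

```python
from typing import List, Tuple


def _fields(name: str) -> List[int]:
    """Split '{created}-{version}-{inner}-{cross}.tsfile' into integers."""
    stem = name.rsplit(".", 1)[0]
    return [int(p) for p in stem.split("-")]


def current_key(name: str) -> int:
    """Current semantics: order by version_id only."""
    return _fields(name)[1]


def proposed_key(name: str) -> Tuple[int, int]:
    """Proposed semantics: file_created_time first, version_id breaks ties."""
    created_time, version_id = _fields(name)[:2]
    return (created_time, version_id)
```

With made-up names such as 4-7-0-0.tsfile, 3-9-0-0.tsfile and 4-2-0-0.tsfile,
sorted(files, key=current_key) puts 3-9-0-0.tsfile last, while
sorted(files, key=proposed_key) puts it first.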

Suggestions are welcome.

Best,

Haiming Zhu
School of Software, Tsinghua University

朱海铭
清华大学 软件学院


discuss about Procedure module

2022-05-09 Thread Xiangdong Huang
Hi,

I see there is a procedure module on the master branch, and there is a
design document [1] about it.

But I still have some questions about the module, and want to have a
discussion:

1. What is it for? (Can someone use several sentences or paragraphs to
introduce it, and put the introduction into a README.md file?) Also, I
can find 11 kinds of implementations... why?

2. Why should the procedure be considered a new "module" rather than
just a "package" of classes?

[1] https://github.com/apache/iotdb/pull/5477

Best,
---
Xiangdong Huang
School of Software, Tsinghua University

黄向东
清华大学 软件学院