Good idea!
I think it's OK to separate the TsFiles into different folders according to the
time range, if the time range is small enough.
Do we need to handle queries whose time range is very large (historical
queries)? For a range like one hundred days, we would have to read all TsFiles
of those days.
Thanks!
Ah, don't we already have this? We have had the time partition folder since v0.10.
Best,
---
Xiangdong Huang
School of Software, Tsinghua University
黄向东
清华大学 软件学院
On Tue, Jul 21, 2020 at 8:37 PM, Jialin Qiao wrote:
> Hi,
>
> This is not the current implementation... We do not have a
Hi,
This is not the current implementation... We do not have a partition folder on
disk now.
By adding a partition folder, there is no need to keep all TsFileResources in
memory, and the device index will not hurt us.
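To make the idea concrete, here is a minimal sketch of how a partition folder layer could narrow a query down to a few folders. The partition interval (one week) and the function names `partition_id` and `partitions_for_query` are assumptions for illustration only, not the current IoTDB implementation.

```python
from typing import List

# Assumed for illustration: one time partition covers one week.
PARTITION_INTERVAL_MS = 7 * 24 * 3600 * 1000


def partition_id(timestamp_ms: int, interval_ms: int = PARTITION_INTERVAL_MS) -> int:
    """Map a timestamp to the id of the time partition that contains it."""
    return timestamp_ms // interval_ms


def partitions_for_query(start_ms: int, end_ms: int,
                         interval_ms: int = PARTITION_INTERVAL_MS) -> List[int]:
    """Return the ids of all partitions overlapping [start_ms, end_ms]."""
    if start_ms > end_ms:
        return []
    first = partition_id(start_ms, interval_ms)
    last = partition_id(end_ms, interval_ms)
    return list(range(first, last + 1))
```

With such a mapping, a query would only need to load the TsFileResources under the partition folders returned by `partitions_for_query`, rather than keeping every resource in memory.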
Thanks,
--
Jialin Qiao
School of Software, Tsinghua University
乔嘉林
Hi Jialin,
Yes, that is the current logic. But I do not see how what you said relates
to this discussion...
Best,
---
Xiangdong Huang
School of Software, Tsinghua University
黄向东
清华大学 软件学院
On Tue, Jul 21, 2020 at 4:47 PM, Jialin Qiao wrote:
> Hi,
>
> I would like to give
Hi,
I would like to share a vision for managing the data files according to time
partition.
After we introduced the time partition (data is partitioned by time interval),
we do split the data in memory and into different TsFiles. But we may lack a
partition folder layer on top of the TsFiles.
Maybe
Hi,
> I wonder whether we could index the file by its name. (naming the tsfile
> by date)
I think it is a good idea, but maybe not very easy to implement. If we can
organize the data like this, then the layout is very regular and it becomes
easy to access or delete expired data...
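A rough sketch of what name-based lookup and expiry could look like, assuming one file per day named as in the quoted example (sg-2020-07-20.TsFile). The helper names `tsfile_name`, `candidate_files` and `expired_files` are hypothetical, and the sketch assumes every matching file name carries a valid ISO date.

```python
from datetime import date, timedelta
from typing import List


def tsfile_name(storage_group: str, day: date) -> str:
    """Build the per-day file name, e.g. sg-2020-07-20.TsFile."""
    return f"{storage_group}-{day.isoformat()}.TsFile"


def candidate_files(storage_group: str, start: date, end: date) -> List[str]:
    """List the file names a query over the days [start, end] would check."""
    days = (end - start).days
    return [tsfile_name(storage_group, start + timedelta(days=i))
            for i in range(days + 1)]


def expired_files(names: List[str], storage_group: str, cutoff: date) -> List[str]:
    """Select the files whose date in the name is strictly before cutoff."""
    prefix = f"{storage_group}-"
    suffix = ".TsFile"
    expired = []
    for name in names:
        if name.startswith(prefix) and name.endswith(suffix):
            # Assumes the middle part of the name is a valid ISO date.
            day = date.fromisoformat(name[len(prefix):-len(suffix)])
            if day < cutoff:
                expired.append(name)
    return expired
```

The point of the sketch is that both the query path and the expiry path only look at file names, so no in-memory index is consulted. The cost is that a long query range produces one candidate name per day.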
> we would need is a
Hi,
> I wonder whether we could index the file by its name. (naming the tsfile by
> date) E.g., we store each day's data in one file and name it as
> sg-2020-07-20.TsFile. Then, we do not need to maintain the index in memory,
> we just need to check whether the file exists in the queried
Thanks for the clear explanation. Yes, I remember that performance issues were
also reported with that.
But to generalize the concept of a device, all we would need is a tree structure
where each node has a start time / end time for "everything" in the file.
Like in your example:
Root (1,
Hi,
> The question I would ask is why "devices" hurt us.
I'd like to introduce this a bit. For each storage group, we flush the memtable
into TsFiles one by one. For each TsFile, we maintain a temporal index on
device level in memory. Suppose there are 3 devices in one TsFile, the index is
Hey Jialin, Xiangdong,
very good question!
And I tend to agree with Xiangdong.
If the users do it that way it probably makes most sense for them.
The question I would ask is why "devices" hurt us (I know a bit about the
implementation of course, but probably we have to adapt our data model also a
Hi,
This is a quite good topic!
1. Maybe we should hear more users' opinions.
For me, I think emphasizing the concept of "device" is good. We can even
expose the concept in our APIs.
2.
> A more efficient way is
> root.sg.device1.measurement1_int0
> root.sg.device1.measurement1_int1
>
Hi
Recently, I found that some users create timeseries that do not follow the
real-world semantics of a device.
E.g., a device has a sensor that collects some data in array format (int[3])
and some as long type.
Many users will create timeseries like this:
root.sg.device1.measurement1.int0