[Discuss] Enlarge the default value of avg_series_point_number_threshold
Hi,

As we know, we have a parameter avg_series_point_number_threshold=1, which caps the number of points that one series may hold in a memtable. Its main purpose is to avoid the long sorting time on query and flush that occurs when a single TVList holds too many points.

However, we have recently received a lot of feedback about importing historical data series by series: each time a series with 10,000 points is inserted, those 10,000 points form a TsFile of their own. As a result, the import leaves many small TsFiles behind, and query efficiency is not as good as we expected.

Therefore, I would like to change the default value of this parameter to 100,000, for two reasons. First, a larger threshold improves efficiency in the scenario above. Second, enlarging it should have no side effects in other application scenarios. As some may remember, this configuration was introduced to IoTDB before the memory control mechanism was added. Now that we have memory control, this threshold can hardly be reached except when importing historical data.

This is the first step towards solving the problem. We may need to find a smarter strategy to replace this parameter in the future.

What do you think?

BR,
Haonan Hou
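To make the effect of the threshold concrete, here is a minimal sketch of the series-by-series import described above. It is not IoTDB's implementation: the class SketchMemTable, the averaging rule (total points divided by the number of series, inferred only from the parameter name) and the threshold of 10,000 (taken from the scenario in this mail) are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified memtable; NOT IoTDB's actual implementation.
class SketchMemTable {
  private final long avgSeriesPointNumberThreshold;
  private final Map<String, Integer> pointsPerSeries = new HashMap<>();
  private long totalPoints = 0;
  private int flushCount = 0;

  SketchMemTable(long avgSeriesPointNumberThreshold) {
    this.avgSeriesPointNumberThreshold = avgSeriesPointNumberThreshold;
  }

  // Insert one point, then flush if the average number of points per series
  // has reached the threshold (assumed semantics).
  void insert(String series) {
    pointsPerSeries.merge(series, 1, Integer::sum);
    totalPoints++;
    if (totalPoints / pointsPerSeries.size() >= avgSeriesPointNumberThreshold) {
      flush();
    }
  }

  // Each flush stands for one TsFile written to disk.
  private void flush() {
    flushCount++;
    pointsPerSeries.clear();
    totalPoints = 0;
  }

  int flushCount() {
    return flushCount;
  }

  // Import seriesCount series one after another, pointsEach points per series,
  // and report how many flushes the threshold triggered.
  static int importSeriesBySeries(long threshold, int seriesCount, int pointsEach) {
    SketchMemTable memTable = new SketchMemTable(threshold);
    for (int s = 0; s < seriesCount; s++) {
      for (int p = 0; p < pointsEach; p++) {
        memTable.insert("root.sg.d1.s" + s);
      }
    }
    return memTable.flushCount();
  }
}
```

Under these assumptions, importing 20 series of 10,000 points each with a threshold of 10,000 triggers one flush per series, while a threshold of 100,000 is never reached by this workload, so flushing would be left to the memory control mechanism.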
[BUILD-UNSTABLE]: Job 'IoTDB/IoTDB-Pipe/master [master] [596]'
BUILD-UNSTABLE: Job 'IoTDB/IoTDB-Pipe/master [master] [596]': Check console output at https://ci-builds.apache.org/job/IoTDB/job/IoTDB-Pipe/job/master/596/
Re: About replacing byteBuffer
In fact, we can do that. There is no copy if the byteBufs passed to the compositeByteBuf are wrapped ByteBuffers. The problem is that we cannot force every module to do this (it can only be checked in code review). If anyone uses Netty's byteBuf directly and does not manage memory properly (and we have no good monitoring to detect this), it could result in a memory leak.

How to make better use of Netty, provide more monitoring capabilities, and constrain the development standards of all modules may need further consideration.

BR,
---
Sijia Li

-----Original Message-----
From: Xiangdong Huang
Sent: May 17, 2022 8:57
To: dev
Subject: Re: About replacing byteBuffer

If we introduce Netty, data copy when scaling a bytebuf is not what we want.
Can we use compositeByteBuf to replace it and meanwhile enjoy the benefit of pooledByteBuf?

---
Xiangdong Huang
School of Software, Tsinghua University
黄向东 清华大学 软件学院

On Mon, May 16, 2022 at 12:35, Jialin Qiao wrote:
> Hi,
>
> The serialization interface needs to be refactored afterward.
>
> Before that, using ByteArrayOutputStream is easier.
>
> Thanks,
> --
> Jialin Qiao
> Apache IoTDB PMC
>
> On Mon, May 16, 2022 at 11:44, 李思佳 wrote:
> > Hi all,
> >
> > When I was developing the snapshot interface for the configNode module, I noticed that the parameters received by the serialization interface were all defined as ByteBuffer, which seemed to have some problems. For example, the external main process has no way of knowing how big the buffer will be. We can only estimate a large value to allocate memory.
> >
> > Then I looked at the serialization interfaces of other modules, and it seemed that most modules did the same thing. This could be a problem once the actual size of the buffer exceeds our estimate. So I did a quick survey of Netty's byteBuf last week, and here's the Chinese version of the results <https://apache-iotdb.feishu.cn/docs/doccnW1EFoyLOScys9GTOuaEUbh>.
> >
> > At the same time, we found that the consensus module also has some ByteBuf requirements. But byteBuf doesn't seem to be enough to give us precise control over the size of the memory pool, and we may need to wrap it if we decide to use it.
> >
> > Finally, we decided to use stream type instead of byteBuffer in configNode for the time being. I will start this work to see if this is the better way this week. If any idea, please let me know.
> >
> > By the way, Netty’s ByteBuf provides powerful tool operations that we will not discard outright, but rather as an option.
> >
> > BR,
> > ---
> > Sijia Li
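The sizing problem raised in the quoted thread can be sketched with the JDK alone. The example below is illustrative only: SnapshotSketch and its length-prefixed string format are hypothetical and are not the configNode serialization interface. It contrasts a fixed-capacity ByteBuffer, whose size the caller has to estimate, with a ByteArrayOutputStream that grows on demand.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical serializer, for illustration only.
class SnapshotSketch {

  // ByteBuffer style: the caller must guess the capacity up front.
  // Returns false when the estimate turns out to be too small.
  static boolean fitsInBuffer(List<String> names, int estimatedCapacity) {
    ByteBuffer buffer = ByteBuffer.allocate(estimatedCapacity);
    try {
      for (String name : names) {
        byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
        buffer.putInt(bytes.length); // 4-byte length prefix
        buffer.put(bytes);
      }
      return true;
    } catch (BufferOverflowException e) {
      return false;
    }
  }

  // Stream style: the backing array grows as needed, so no estimate is required.
  static byte[] toBytes(List<String> names) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (DataOutputStream data = new DataOutputStream(out)) {
      for (String name : names) {
        byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
        data.writeInt(bytes.length); // same 4-byte length prefix
        data.write(bytes);
      }
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked for simplicity.
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }
}
```

The stream variant trades the up-front estimate for internal array copies when the stream grows, which is the copy-on-scaling cost that the compositeByteBuf suggestion in this thread aims to avoid.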
Re: maintain the IoTDB-Skywalking plugin codes
Hi Wei,

Thanks for your contribution.

I will ask the PMC to create a new repo. (Then you may need to maintain the GitHub Actions workflow, and Jenkins if needed.)

Best,
---
Xiangdong Huang
School of Software, Tsinghua University
黄向东 清华大学 软件学院

On Mon, May 16, 2022 at 22:32, 刘威 wrote:
> Hi, I'm the original author of IoTDB-SkyWalking plugin.
> I have voted to apache/iotdb-skywalking-storage in my last mail.
>
> Now, let me introduce the current situation of the plugin.
>
> The plugin now support SkyWalking v8.9.0, v8.9.1 and v9.0.0 with passing all e2e tests.
> I wrote a blog to introduce its design in v8.9.0. You could read it at the link
> https://skywalking.apache.org/blog/2021-11-23-design-of-iotdb-storage-option/ .
> In v9.0.0, it has been refactored and optimized according to the DataConverter in the new design of SkyWalking storage, which could refer to PR#8755 in SkyWalking repo.
>
> What's more, I wrote another blog about how to apply IoTDB as backend storage. You could see the link
> https://skywalking.apache.org/blog/2021-12-08-application-guide-of-iotdb-storage-option/ .
>
> In this case, considering its low maintenance frequency and its small scope of influence,
> I think it's more appropriate to move it to a separate repository.
>
> Thanks for the support of two mentors (@Xiangdong Huang and @Sheng Wu) and all other contributors.
> IoTDB-SkyWalking plugin is a meaningful attempt.
>
> --
> Wei Liu
> School of Computer Science, NPU
>
> 刘威
> 西北工业大学计算机学院
Re: About replacing byteBuffer
If we introduce Netty, data copy when scaling a bytebuf is not what we want.
Can we use compositeByteBuf to replace it and meanwhile enjoy the benefit of pooledByteBuf?

---
Xiangdong Huang
School of Software, Tsinghua University
黄向东 清华大学 软件学院

On Mon, May 16, 2022 at 12:35, Jialin Qiao wrote:
> Hi,
>
> The serialization interface needs to be refactored afterward.
>
> Before that, using ByteArrayOutputStream is easier.
>
> Thanks,
> --
> Jialin Qiao
> Apache IoTDB PMC
>
> On Mon, May 16, 2022 at 11:44, 李思佳 wrote:
> > Hi all,
> >
> > When I was developing the snapshot interface for the configNode module, I noticed that the parameters received by the serialization interface were all defined as ByteBuffer, which seemed to have some problems. For example, the external main process has no way of knowing how big the buffer will be. We can only estimate a large value to allocate memory.
> >
> > Then I looked at the serialization interfaces of other modules, and it seemed that most modules did the same thing. This could be a problem once the actual size of the buffer exceeds our estimate. So I did a quick survey of Netty's byteBuf last week, and here's the Chinese version of the results <https://apache-iotdb.feishu.cn/docs/doccnW1EFoyLOScys9GTOuaEUbh>.
> >
> > At the same time, we found that the consensus module also has some ByteBuf requirements. But byteBuf doesn't seem to be enough to give us precise control over the size of the memory pool, and we may need to wrap it if we decide to use it.
> >
> > Finally, we decided to use stream type instead of byteBuffer in configNode for the time being. I will start this work to see if this is the better way this week. If any idea, please let me know.
> >
> > By the way, Netty’s ByteBuf provides powerful tool operations that we will not discard outright, but rather as an option.
> >
> > BR,
> > ---
> > Sijia Li
Re: maintain the IoTDB-Skywalking plugin codes
Hi, I'm the original author of the IoTDB-SkyWalking plugin. I voted for apache/iotdb-skywalking-storage in my last mail.

Now, let me introduce the current state of the plugin.

The plugin now supports SkyWalking v8.9.0, v8.9.1 and v9.0.0 and passes all e2e tests. I wrote a blog post introducing its design in v8.9.0, which you can read at https://skywalking.apache.org/blog/2021-11-23-design-of-iotdb-storage-option/ . In v9.0.0, it was refactored and optimized according to the DataConverter in the new design of SkyWalking storage; see PR#8755 in the SkyWalking repo.

What's more, I wrote another blog post about how to use IoTDB as backend storage: https://skywalking.apache.org/blog/2021-12-08-application-guide-of-iotdb-storage-option/ .

Considering the plugin's low maintenance frequency and small scope of influence, I think it is more appropriate to move it to a separate repository.

Thanks for the support of the two mentors (@Xiangdong Huang and @Sheng Wu) and all other contributors. The IoTDB-SkyWalking plugin is a meaningful attempt.

--
Wei Liu
School of Computer Science, NPU
Re: Refactor process of DeleteStorageGroups to support rollback
Hi,

The separation of logical and physical deletion is a good idea.

The dev@ list doesn't support attachments...

Thanks,
--
Jialin Qiao
Apache IoTDB PMC

On Mon, May 16, 2022 at 20:18, lu wrote:
> Hi all:
> There are three steps in the current DeleteStorageGroup process, there are three steps that need to go through.
> 1. Delete partition informations on the PartitionRegion
> 2. Delete metadata informations on schemaRegion based on the partition information and clean up the cache
> 3. Delete the data files and directories on dataRegion based on the partition information
> I/O operations in step 3 may take a long time and is prone to exceptions, which is prone to deletion failures.
> Here is a logical deletion plan instead of deleting data files directly. The deletion is divided into logical and physical deletions, the logical deletion simply screens the read and write operations related to storage group, and the physical deletion deletes files and is executed asynchronously. This simplifies the operation of deleting RPCs and reduce the probability of timeouts and exceptions and is able to rollback when exceptions occur.
> Detail Chinese doc is attached. Any idea is welcomed.
>
> FYI
>
> Lu Ming
Re: maintain the IoTDB-Skywalking plugin codes
+1 for apache/iotdb-skywalking-storage

--
Wei Liu
School of Computer Science, NPU

-- Original --
From: "dev"
https://github.com/apache/skywalking/discussions/9059

Best,
---
Xiangdong Huang
School of Software, Tsinghua University
Refactor process of DeleteStorageGroups to support rollback
Hi all:

The current DeleteStorageGroup process goes through three steps:

1. Delete the partition information on the PartitionRegion
2. Delete the metadata information on the schemaRegion based on the partition information and clean up the cache
3. Delete the data files and directories on the dataRegion based on the partition information

The I/O operations in step 3 may take a long time and are prone to exceptions, which makes the deletion likely to fail.

I propose a logical deletion plan instead of deleting the data files directly. The deletion is divided into a logical and a physical deletion: the logical deletion simply screens out the read and write operations on the storage group, while the physical deletion removes the files and is executed asynchronously. This simplifies the deletion RPCs, reduces the probability of timeouts and exceptions, and makes it possible to roll back when exceptions occur.

A detailed Chinese doc is attached. Any ideas are welcome.

FYI

Lu Ming
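As a rough illustration of the logical/physical split proposed above, the sketch below marks the storage group as deleted on every region first, rolls the marks back if any region fails, and only then hands the file deletion to an executor. All names here (DeleteStorageGroupSketch, Region, InMemoryRegion) are hypothetical, and the point at which rollback happens is one possible reading of the proposal rather than its confirmed design.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Executor;

// Hypothetical sketch; none of these types are IoTDB classes.
class DeleteStorageGroupSketch {

  // One region (partition, schema or data) taking part in the deletion.
  interface Region {
    void markDeleted(String storageGroup);   // logical deletion, may throw
    void unmarkDeleted(String storageGroup); // undo of the logical deletion
    void deleteFiles(String storageGroup);   // physical deletion (slow I/O)
  }

  // Returns true if the logical deletion succeeded on all regions and the
  // physical deletion was scheduled; returns false after a rollback.
  static boolean delete(String storageGroup, List<? extends Region> regions, Executor executor) {
    List<Region> marked = new ArrayList<>();
    try {
      for (Region region : regions) {
        region.markDeleted(storageGroup);
        marked.add(region);
      }
    } catch (RuntimeException e) {
      // Roll back: no file has been touched yet, so unmarking is enough.
      for (Region region : marked) {
        region.unmarkDeleted(storageGroup);
      }
      return false;
    }
    // Physical deletion runs asynchronously, outside the RPC's critical path.
    for (Region region : regions) {
      executor.execute(() -> region.deleteFiles(storageGroup));
    }
    return true;
  }

  // In-memory stand-in for a region, used only to exercise the sketch.
  static class InMemoryRegion implements Region {
    private final boolean failOnMark;
    private final Set<String> marked = new HashSet<>();
    private final Set<String> filesDeleted = new HashSet<>();

    InMemoryRegion(boolean failOnMark) {
      this.failOnMark = failOnMark;
    }

    @Override
    public void markDeleted(String storageGroup) {
      if (failOnMark) {
        throw new IllegalStateException("simulated failure");
      }
      marked.add(storageGroup);
    }

    @Override
    public void unmarkDeleted(String storageGroup) {
      marked.remove(storageGroup);
    }

    @Override
    public void deleteFiles(String storageGroup) {
      filesDeleted.add(storageGroup);
    }

    // Reads and writes would be screened out while this returns true.
    boolean isMarked(String storageGroup) {
      return marked.contains(storageGroup);
    }

    boolean filesDeleted(String storageGroup) {
      return filesDeleted.contains(storageGroup);
    }
  }
}
```

In this sketch the caller chooses the executor, so a thread pool gives the asynchronous behaviour described in the proposal, while a direct executor such as Runnable::run keeps the example deterministic for testing.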