Hi Jialin,

Thanks a lot for your inputs. I will test v0.12 then and also start
using insertTablets().

I had seen that the original IoTDB paper
<https://www.vldb.org/pvldb/vol13/p2901-wang.pdf>, on which you are a
co-author :), showed IoTDB working on the Raspberry Pi. I have been
testing IoTDB data ingestion and querying on desktop and Raspberry Pi,
in comparison with InfluxDB v2.0. Do let me know if your team plans any
benchmarking or comparative study of IoTDB, InfluxDB, or similar TSDBs.
I could contribute to that.

Regards,
dgargcs

On Sun, 11 Apr 2021 at 11:55, Jialin Qiao <[email protected]> wrote:

> Hi,
>
> 1. *Can I use the API on v0.11.2 (I am currently using this) or do I
> need v0.11.3?*
>
> You can use 0.11.2; we do not change the RPC API in a minor (bug-fix)
> version.
>
> 2. *Is there any limit to the number of rows that can be inserted at
> once using insertTablet()? Or, is there an optimal number of rows per
> insertTablet() to get best performance?*
>
> It depends on the number of columns in the Tablet. The default RPC
> size limit is 64MB, so rows * columns * 8 bytes < 64MB.
> I usually set rows to 1000 when I have 1000 columns.
>
> 3. *Since insertTablet() also takes rows for one device, what exactly
> is the difference between insertTablet() and
> insertRecordsOfOneDevice()? Are they for different use-cases? Which
> performs better?*
>
> insertRecordsOfOneDevice is actually an improved version of
> insertRecords(): all records belong to one device, so we only acquire
> one write lock in the writing process.
>
> InsertTablet requires that each row has all measurements (using
> primitive data types to store data); insertRecordsOfOneDevice allows
> each row to have different measurements (using Object to store data).
>
> Performance: InsertTablet > insertRecordsOfOneDevice > insertRecords >
> insertRecord
>
> InsertTablet is always the fastest :)
>
> 4. *I suppose IoTDB does not have benchmark numbers for such devices?*
>
> We haven't tested these cases.
>
> 5. *How soon can we expect it to come out? Or can I test it out even
> now since I eventually plan to start working with it? Is it stable
> enough?*
>
> The release has already got more than 3 binding votes; it will come
> out in one or two days.
> You can get it from
> https://dist.apache.org/repos/dist/dev/iotdb/0.12.0/rc1
>
> We fixed nearly all important known bugs in 0.12.0. The single-node
> version is stable enough.
> Data migration in the cluster version is not supported, and the
> cluster version has not been tested much.
> You are welcome to test it and give feedback :)
>
> 6. *What benefits does the new Tsfile structure in v0.12 bring? Does
> it improve DB data ingestion/query performance?*
>
> It removes some redundant fields in the previous version (reducing
> disk usage) and optimizes the performance of the raw data query.
>
> Thanks,
> -----------------
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
> On Sun, 11 Apr 2021 at 13:38, Dhruv Garg <[email protected]> wrote:
>
> > Hello Jialin,
> >
> > Thanks for your response.
> >
> > 1. Alright, I will definitely move from insertRecords() to
> > insertTablet() then. *Can I use the API on v0.11.2 (I am currently
> > using this) or do I need v0.11.3?*
> >
> > 2. The info you provided on insertTablet() is very helpful. I could
> > start with that. I think specifying the data types at the top of the
> > file will also help reduce the data-type inference time for the API
> > and possibly reduce ingestion time. *Is there any limit to the number
> > of rows that can be inserted at once using insertTablet()? Or, is
> > there an optimal number of rows per insertTablet() to get best
> > performance?*
> >
> > 3. Also, I see that there is a new API in v0.12 called
> > insertRecordsOfOneDevice(). *Since insertTablet() also takes rows for
> > one device, what exactly is the difference between insertTablet() and
> > insertRecordsOfOneDevice()? Are they for different use-cases? Which
> > performs better?*
> >
> > 4. The Raspberry Pis are low-end devices. They have less RAM (2GB),
> > have ARM processors, and use SD cards for persistent storage. *I
> > suppose IoTDB does not have benchmark numbers for such devices?*
> >
> > A couple of additional questions:
> >
> > 5. I see that the IoTDB team is nearing the release of v0.12. *How
> > soon can we expect it to come out? Or can I test it out even now
> > since I eventually plan to start working with it? Is it stable
> > enough?*
> >
> > 6. *What benefits does the new Tsfile structure in v0.12 bring? Does
> > it improve DB data ingestion/query performance?*
> >
> > Thanks in advance!
> >
> > On Thu, 8 Apr 2021 at 17:26, Jialin Qiao <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > 1. InsertTablets can be more than 3 times faster than
> > > insertRecords.
> > >
> > > 2. Yes, a Tablet is actually a small table with some columns and
> > > rows. It has a time column and many value columns. In each row, all
> > > columns must have a value.
> > >
> > > An example:
> > >
> > > time, root.sg.d1.s1, root.sg.d1.s2
> > > 1, 1, 2.2
> > > 2, 1, 2.2
> > > 3, 1, 2.2
> > >
> > > Tablets do not allow null values.
> > >
> > > However, the tablet uses arrays of primitive types to store data in
> > > a columnar format, e.g., long[], int[], double[].
> > > If you store data in a text file, a data type indicator is needed,
> > > like this:
> > >
> > > time, root.sg.d1.s1, root.sg.d1.s2
> > > long, int, double
> > > 1, 1, 2.2
> > > 2, 1, 2.2
> > > 3, 1, 2.2
> > >
> > > Then you could generate a tablet from the data and use
> > > insertTablet.
> > >
> > > 3. You could check the memory allocated on the Raspberry Pi. Is it
> > > the same as on the desktop? This may impact the write throughput.
> > >
> > > Thanks,
> > > -----------------
> > > Jialin Qiao
> > > School of Software, Tsinghua University
> > >
> > > 乔嘉林
> > > 清华大学 软件学院
> > >
> > > On Thu, 8 Apr 2021 at 13:36, Dhruv Garg <[email protected]> wrote:
> > >
> > > > Hello all,
> > > >
> > > > In the past month I have been using the JDBC client of IoTDB to
> > > > write data from CSV into IoTDB and also to query the data.
> > > > Looking at the CSV code in ImportCsv.java
> > > > <https://github.com/apache/iotdb/blob/master/cli/src/main/java/org/apache/iotdb/tool/ImportCsv.java>,
> > > > it seems that the CSV itself is again parsed into an
> > > > IoTDB-friendly structure and then ingested. I would like to avoid
> > > > the CSV-parsing time and directly provide the data as needed to
> > > > IoTDB. This should improve the ingestion performance.
> > > >
> > > > What I am talking about is similar to InfluxDB, where we can also
> > > > parse from CSV to InfluxDB's native line protocol and then ingest
> > > > the data into the DB. However, for better performance, InfluxDB
> > > > also provides write APIs
> > > > <https://github.com/influxdata/influxdb-client-java/blob/master/client/src/main/java/com/influxdb/client/WriteApi.java>
> > > > to take records in line protocol as input and directly ingest
> > > > those.
> > > >
> > > > I have three questions:
> > > >
> > > > 1. I see that IoTDB is getting newer write APIs like
> > > >    InsertTablets and it seems that it is designed to be faster
> > > >    than insertRecords. Approximately how much performance
> > > >    improvement have you seen with InsertTablets?
> > > > 2. Is there a way to create a data file such that it is easy to
> > > >    use InsertTablets with it? This is to know if I can create an
> > > >    IoTDB-friendly and IoTDB-specific input file and then directly
> > > >    use InsertTablets to ingest the data.
> > > > 3. As a preliminary check, I am also trying out IoTDB on
> > > >    Raspberry Pi 4B devices. However, the ingestion time with CSV
> > > >    on the Raspberry Pi is 10 times what it is on the desktop
> > > >    (amd64). This ratio should ideally have been closer to 4X,
> > > >    based on other applications that I have benchmarked. Have you
> > > >    run any Raspberry Pi benchmarks for IoTDB earlier?
> > > >
> > > > I would be awaiting your response. Thanks!
> > > >
> > > > Regards,
> > > > dgargcs
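
The typed text layout described in the thread (a header row of series paths, a row of data types, then data rows with no missing values) and the sizing rule rows * columns * 8 bytes < 64MB can be sketched roughly as below. This is only an illustration in plain Python and does not call the IoTDB client. The names `parse_typed_file`, `max_rows_per_tablet`, and `PARSERS` are hypothetical, and reading "64MB" as 64 * 1024 * 1024 bytes and counting 8 bytes per value are assumptions taken from the rule of thumb above, not from the IoTDB source.

```python
import csv
import io

# Assumptions based on the thread: a 64MB default RPC limit (read here as
# 64 * 1024 * 1024 bytes) and roughly 8 bytes per stored value.
RPC_LIMIT_BYTES = 64 * 1024 * 1024
BYTES_PER_VALUE = 8

# Hypothetical mapping from the type names in the file to Python parsers.
PARSERS = {"long": int, "int": int, "float": float, "double": float}

SAMPLE = """\
time, root.sg.d1.s1, root.sg.d1.s2
long, int, double
1, 1, 2.2
2, 1, 2.2
3, 1, 2.2
"""


def max_rows_per_tablet(num_columns, limit=RPC_LIMIT_BYTES):
    """Largest row count with rows * num_columns * 8 bytes strictly below limit."""
    return (limit - 1) // (num_columns * BYTES_PER_VALUE)


def parse_typed_file(text):
    """Parse header, type row, and data rows into one list per column."""
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(text))
        if row  # skip blank lines
    ]
    names, types, data = rows[0], rows[1], rows[2:]
    columns = {name: [] for name in names}
    for row in data:
        # Every column must have a value in every row (no nulls).
        if len(row) != len(names) or any(cell == "" for cell in row):
            raise ValueError(f"row {row} has a missing value")
        for name, type_name, cell in zip(names, types, row):
            columns[name].append(PARSERS[type_name](cell))
    return names, types, columns


names, types, columns = parse_typed_file(SAMPLE)
rows_per_batch = max_rows_per_tablet(len(names))
```

The per-column lists mirror the columnar long[], int[], double[] arrays mentioned in the thread, and `rows_per_batch` gives an upper bound for how many rows to place in one batch under the stated assumptions. Turning these columns into an actual Tablet and sending it would still need the IoTDB client library.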
