Hi Jialin,

Thanks a lot for your input. I will test v0.12 then and also start using
insertTablets().

I saw that the original IoTDB paper
<https://www.vldb.org/pvldb/vol13/p2901-wang.pdf> - on which you are a
co-author :) - showed IoTDB working on the Raspberry Pi. I have been
testing IoTDB data ingestion/querying on a desktop and a Raspberry Pi, in
comparison with InfluxDB v2.0. Do let me know if your team plans any
benchmarking or comparative study of IoTDB, InfluxDB, or similar TSDBs.
I could contribute to that.

Regards,
dgargcs

On Sun, 11 Apr 2021 at 11:55, Jialin Qiao <[email protected]> wrote:

> Hi,
>
> 1. *Can I use the API on v0.11.2 (I am currently using this) or do I
> need v0.11.3?*
>
> You can use 0.11.2; we do not change the RPC API in a minor (bug-fix)
> version.
>
> 2. *Is there any limit to the number of rows that can
> be inserted at once using insertTablet()? Or, is there an optimal number of
> rows per insertTablet() to get best performance?*
>
> It depends on the number of columns in the Tablet. The default RPC size
> limit is 64 MB, so rows * columns * 8 bytes < 64 MB.
> I usually set rows to 1000 when I have 1000 columns.
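
A quick sketch of the sizing rule above, in Python for brevity. It assumes "64MB" means 64 * 1024 * 1024 bytes and takes the 8 bytes per value from the formula as given; the timestamp column and any serialization overhead are not counted, so the result is an upper bound rather than a recommended batch size. The function names are made up for illustration and are not part of the IoTDB API.

```python
RPC_LIMIT_BYTES = 64 * 1024 * 1024  # assumption: "64MB" read as 64 MiB
BYTES_PER_VALUE = 8                 # taken from the rule of thumb above


def fits(rows, columns, limit_bytes=RPC_LIMIT_BYTES):
    """True if rows * columns * 8 bytes stays strictly below the limit."""
    return rows * columns * BYTES_PER_VALUE < limit_bytes


def max_rows_per_tablet(columns, limit_bytes=RPC_LIMIT_BYTES):
    """Largest row count for which fits(rows, columns) still holds."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    return (limit_bytes - 1) // (columns * BYTES_PER_VALUE)
```

With 1000 columns, max_rows_per_tablet(1000) comes out at 8388, so the 1000 rows mentioned above stay well inside the limit.
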
>
> 3. *Since insertTablet() also takes rows for one
> device, what exactly is the difference between insertTablet() and
> insertRecordsOfOneDevice()? Are they for different use-cases? Which
> performs better?*
>
> insertRecordsOfOneDevice is actually an improved version of
> insertRecords(): all records belong to one device, so we only acquire one
> write lock in the writing process.
>
> InsertTablet requires that each row has all measurements (using primitive
> data types to store data), whereas insertRecordsOfOneDevice allows each row
> to have different measurements (using Object to store data).
>
> Performance: InsertTablet > insertRecordsOfOneDevice > insertRecords >
> insertRecord
>
> InsertTablet is always the fastest :)
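
The distinction in point 3 can be checked before choosing an API. Below is a minimal sketch in Python, assuming each row of one device is held as a dict from measurement name to value; the function name and the row representation are hypothetical and only illustrate the "every row has every measurement" requirement.

```python
def rows_fit_tablet(rows):
    """True if every row has a non-null value for the same set of measurements."""
    if not rows:
        return False
    expected = set(rows[0])
    return all(
        set(row) == expected and all(v is not None for v in row.values())
        for row in rows
    )
```

Rows that pass this check are candidates for insertTablet; rows with differing measurements would need insertRecordsOfOneDevice instead.
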
>
> 4. *I suppose IoTDB does not have benchmark numbers for such devices?*
>
> We haven't tested these cases.
>
> 5. *How soon can we expect it to come out? Or can I test it out even now
> since I eventually plan to start working with it? Is it stable enough?*
>
> The release has already received more than 3 binding votes; it will come
> out in one or two days.
> You can get it from
> https://dist.apache.org/repos/dist/dev/iotdb/0.12.0/rc1
>
> We fixed nearly all important known bugs in 0.12.0. The single-node version
> is stable enough.
> Data migration is not supported in the cluster version, and the cluster
> version has not been tested much.
> You are welcome to test it and give feedback :)
>
> 6. *What benefits does the new Tsfile structure in v0.12 bring? Does it
> improve DB data ingestion/query performance?*
>
> It removes some redundant fields from the previous version (reducing disk
> usage) and optimizes the performance of raw data queries.
>
> Thanks,
> —————————————————
> Jialin Qiao
> School of Software, Tsinghua University
>
> 乔嘉林
> 清华大学 软件学院
>
>
> On Sun, 11 Apr 2021 at 13:38, Dhruv Garg <[email protected]> wrote:
>
> > Hello Jialin,
> >
> > Thanks for your response.
> >
> > 1. Alright I will definitely move from insertRecords() to insertTablet()
> > then. *Can I use the API on v0.11.2 (I am currently using this) or do I
> > need v0.11.3?*
> >
> > 2. The info you provided on insertTablet() is very helpful. I could start
> > with that. I think specifying the data types at the top of the file will
> > also help reduce the data-type inference time for the api and possibly
> > reduce ingestion time. *Is there any limit to the number of rows that can
> > be inserted at once using insertTablet()? Or, is there an optimal number
> > of rows per insertTablet() to get best performance?*
> >
> > 3. Also, I see that there is a new API in v0.12 called
> > insertRecordsOfOneDevice(). *Since insertTablet() also takes rows for one
> > device, what exactly is the difference between insertTablet() and
> > insertRecordsOfOneDevice()? Are they for different use-cases? Which
> > performs better?*
> >
> > 4. The Raspberry Pis are low-end devices. They have less RAM (2GB),
> > have ARM processors and use SD cards for persistent storage. *I suppose
> > IoTDB does not have benchmark numbers for such devices?*
> >
> > A couple of additional questions:
> > 5. I see that the IoTDB team is nearing release of v0.12. *How soon can
> > we expect it to come out? Or can I test it out even now since I eventually
> > plan to start working with it? Is it stable enough?*
> >
> > 6. *What benefits does the new Tsfile structure in v0.12 bring? Does it
> > improve DB data ingestion/query performance?*
> >
> > Thanks in advance!
> >
> > On Thu, 8 Apr 2021 at 17:26, Jialin Qiao <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > 1. InsertTablets can be more than 3 times faster than insertRecords.
> > >
> > > 2. Yes, a Tablet is actually a small table with some columns and rows.
> > > It has a time column and many value columns. In each row, all columns
> > > must have a value.
> > >
> > > An example:
> > >
> > > time, root.sg.d1.s1, root.sg.d1.s2
> > > 1, 1, 2.2
> > > 2, 1, 2.2
> > > 3, 1, 2.2
> > >
> > > Tablets do not allow null values.
> > >
> > > Note that the tablet uses arrays of primitive types to store data in a
> > > columnar format, e.g., long[], int[], double[].
> > > If you store data in a text file, a data type indicator is therefore
> > > needed, like this:
> > >
> > > time, root.sg.d1.s1, root.sg.d1.s2
> > > long, int, double
> > > 1, 1, 2.2
> > > 2, 1, 2.2
> > > 3, 1, 2.2
> > >
> > > You could then generate a tablet from the data and pass it to
> > > insertTablet.
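
A minimal sketch of turning such a typed text file into columnar arrays, in Python for brevity. Python's array module stands in for Java's long[], int[] and double[]; the TYPES mapping and the function name are hypothetical, and building the actual Tablet and calling insertTablet is left out.

```python
from array import array

# Hypothetical mapping from the type names in the second header row to
# (array typecode, cell parser). 'q' is a signed 64-bit integer, 'i' a
# signed C int (usually 32-bit), 'd' a double.
TYPES = {
    "long": ("q", int),
    "int": ("i", int),
    "double": ("d", float),
}


def parse_typed_table(text):
    """Parse 'names / types / data rows' text into {column name: array}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = [c.strip() for c in lines[0].split(",")]
    types = [c.strip() for c in lines[1].split(",")]
    if len(names) != len(types):
        raise ValueError("name row and type row differ in length")
    columns = [array(TYPES[t][0]) for t in types]
    for n, line in enumerate(lines[2:], start=1):
        cells = [c.strip() for c in line.split(",")]
        # Tablets do not allow nulls, so every column needs a value.
        if len(cells) != len(names) or "" in cells:
            raise ValueError(f"data row {n}: every column needs a value")
        for col, t, cell in zip(columns, types, cells):
            col.append(TYPES[t][1](cell))
    return dict(zip(names, columns))
```

Each returned array holds one column, which is the layout a tablet expects; a row with a missing value is rejected rather than silently filled.
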
> > >
> > > 3. You could check the memory allocated on the Raspberry Pi. Is it the
> > > same as on the desktop? This may impact the write throughput.
> > >
> > > Thanks,
> > > —————————————————
> > > Jialin Qiao
> > > School of Software, Tsinghua University
> > >
> > > 乔嘉林
> > > 清华大学 软件学院
> > >
> > >
> > > On Thu, 8 Apr 2021 at 13:36, Dhruv Garg <[email protected]> wrote:
> > >
> > > > Hello all,
> > > >
> > > > In the past month I have been using the JDBC client of IoTDB to write
> > > > data from CSV into IoTDB and also query the data. Looking at the CSV
> > > > code in ImportCsv.java
> > > > <https://github.com/apache/iotdb/blob/master/cli/src/main/java/org/apache/iotdb/tool/ImportCsv.java>,
> > > > it seems that the CSV is itself parsed into an IoTDB-friendly structure
> > > > and then ingested. I would like to avoid the CSV-parsing time and
> > > > directly provide the data in the form IoTDB needs. This should improve
> > > > the ingestion performance.
> > > >
> > > > What I am talking about is similar to InfluxDB, where we can also parse
> > > > from CSV to InfluxDB's native line protocol and then ingest the data
> > > > into the DB. However, for better performance, InfluxDB also provides
> > > > write APIs
> > > > <https://github.com/influxdata/influxdb-client-java/blob/master/client/src/main/java/com/influxdb/client/WriteApi.java>
> > > > to take records in line protocol as input and directly ingest those.
> > > >
> > > > I have three questions:
> > > >
> > > >    1. I see that IoTDB is getting newer write APIs like InsertTablets,
> > > >    which seem designed to be faster than insertRecords. Approximately
> > > >    how much performance improvement have you seen with InsertTablets?
> > > >    2. Is there a way to create a data file such that it is easy to use
> > > >    InsertTablets with it? This is to know if I can create an
> > > >    IoTDB-friendly and IoTDB-specific input file and then directly use
> > > >    InsertTablets to ingest the data.
> > > >    3. As a preliminary check, I am also trying out IoTDB on Raspberry Pi
> > > >    4B devices. However, ingestion with CSV on the Raspberry Pi is taking
> > > >    10 times as long as on the desktop (amd64). This ratio should
> > > >    ideally have been closer to 4X, based on other applications that I
> > > >    have benchmarked. Have you run any Raspberry Pi benchmarks for
> > > >    IoTDB earlier?
> > > >
> > > > I look forward to your response. Thanks!
> > > >
> > > > Regards,
> > > > dgargcs
> > > >
> > >
> >
>
