Tian Jiang created IOTDB-2210:
---------------------------------
Summary: Streaming ingestion interfaces to reduce headers
Key: IOTDB-2210
URL: https://issues.apache.org/jira/browse/IOTDB-2210
Project: Apache IoTDB
Issue Type: Improvement
Reporter: Tian Jiang
When ingesting with an extremely small batch size, the tablet header may become
a significant overhead. For example, if the batch size is 1 and the data type
is double, each timeseries in a tablet carries only 16 bytes of data (an 8-byte
timestamp plus an 8-byte value). The header (measurement name and data type) of
each timeseries may also be around 10 bytes, so more than 1/3 of the bandwidth
(10 of 26 bytes) is wasted on transmitting headers. The small-batch situation
is rather common when the sampling frequency is low or clients do not have
enough capacity to batch data.
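As a rough check of the figures above, the header share of a tablet can be
computed as header bytes / (header bytes + data bytes). The sketch below is
only illustrative: `HeaderOverhead` and `headerFraction` are hypothetical names,
not part of IoTDB, and the 10-byte header and 16 bytes per row are the estimates
from this description, not measured values.

```java
// Hypothetical helper, not part of IoTDB: estimates the share of bytes
// spent on headers for one timeseries in a tablet.
final class HeaderOverhead {
    /**
     * @param headerBytes bytes of header per timeseries (measurement name + data type)
     * @param bytesPerRow bytes of data per row (timestamp + value)
     * @param batchSize   number of rows sent per tablet
     * @return fraction of the transmitted bytes that are header bytes
     */
    static double headerFraction(int headerBytes, int bytesPerRow, int batchSize) {
        double dataBytes = (double) bytesPerRow * batchSize;
        return headerBytes / (headerBytes + dataBytes);
    }

    public static void main(String[] args) {
        // Estimates from the description: ~10-byte header, 16 bytes of data per row.
        System.out.println("batch size 1:   " + headerFraction(10, 16, 1));
        System.out.println("batch size 100: " + headerFraction(10, 16, 100));
    }
}
```

With a batch size of 1 the fraction is 10/26, which is above 1/3; with a batch
size of 100 it drops below 1%, which is why the overhead only matters for small
batches.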
To avoid wasting bandwidth on headers when headers are fixed, we propose a
streaming ingestion interface by splitting tablets into headers and contents.
The interface is composed of two functions:
`long createTabletStream(String deviceId, List<IMeasurementSchema> schemas, int maxRowNum, boolean isAligned)`
`void writeTabletStream(long tabletId, long[] timestamps, Object[] values, BitMap[] bitMaps)`
First, a client invokes `createTabletStream`; the server creates an empty
tablet with the provided headers, assigns it a unique id, and returns the id to
the client. Next, the client invokes `writeTabletStream` with the tabletId and
the data to write. The server looks up the tablet associated with the tabletId,
fills it with the data from the client, and inserts the tablet.
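The two-step flow could look roughly like the following in-memory sketch. It is
not IoTDB code and makes several simplifying assumptions: the class
`TabletStreamSketch` and its `Header` type are hypothetical, schemas are reduced
to measurement names, all values are doubles passed as `double[][]` instead of
`Object[]`, bitmaps are omitted, the `inserted` list stands in for the storage
engine, and the class is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the server side of the proposal; not IoTDB code.
final class TabletStreamSketch {
    // Header kept on the server after createTabletStream, so later writes can omit it.
    static final class Header {
        final String deviceId;
        final List<String> measurements;
        final int maxRowNum;
        final boolean isAligned;

        Header(String deviceId, List<String> measurements, int maxRowNum, boolean isAligned) {
            this.deviceId = deviceId;
            this.measurements = measurements;
            this.maxRowNum = maxRowNum;
            this.isAligned = isAligned;
        }
    }

    private final Map<Long, Header> headers = new HashMap<>();
    private long nextId = 1;
    // Stands in for the storage engine: one entry per inserted point.
    final List<String> inserted = new ArrayList<>();

    // Step 1: store the header once and hand back a unique id.
    long createTabletStream(String deviceId, List<String> measurements,
                            int maxRowNum, boolean isAligned) {
        long tabletId = nextId++;
        headers.put(tabletId,
                new Header(deviceId, new ArrayList<>(measurements), maxRowNum, isAligned));
        return tabletId;
    }

    // Step 2: only the id and the content travel; values[col][row] is the value
    // of measurement col at timestamps[row].
    void writeTabletStream(long tabletId, long[] timestamps, double[][] values) {
        Header header = headers.get(tabletId);
        if (header == null) {
            throw new IllegalArgumentException("unknown tabletId: " + tabletId);
        }
        for (int row = 0; row < timestamps.length; row++) {
            for (int col = 0; col < header.measurements.size(); col++) {
                inserted.add(header.deviceId + "." + header.measurements.get(col)
                        + "@" + timestamps[row] + "=" + values[col][row]);
            }
        }
    }

    public static void main(String[] args) {
        TabletStreamSketch server = new TabletStreamSketch();
        long id = server.createTabletStream("root.sg.d1", Arrays.asList("s1", "s2"), 16, false);
        // Each later write carries the id and the content, but no header.
        server.writeTabletStream(id, new long[] {1L}, new double[][] {{1.5}, {2.5}});
        server.inserted.forEach(System.out::println);
    }
}
```

The point of the split is visible in `writeTabletStream`: the device id and
measurement names are resolved from the stored header, so repeated small writes
do not resend them.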
Several exceptional cases should be handled: the server cannot find the
tabletId; the content does not fit the tablet; unused tablets should be
cleaned...
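One possible way to cover the three cases listed above is sketched below. All
of it is an assumption made for illustration: `TabletStreamGuard`, its method
names, the choice of exception types, and in particular the idle-timeout policy
for cleaning unused tablets are hypothetical and are not specified by this
issue. "Does not fit" is interpreted here as more rows than `maxRowNum`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical bookkeeping for the failure cases; not IoTDB code.
final class TabletStreamGuard {
    private static final class Entry {
        final int maxRowNum;
        long lastUsedMillis;

        Entry(int maxRowNum, long lastUsedMillis) {
            this.maxRowNum = maxRowNum;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final Map<Long, Entry> entries = new HashMap<>();

    // Called when a stream is created.
    void register(long tabletId, int maxRowNum, long nowMillis) {
        entries.put(tabletId, new Entry(maxRowNum, nowMillis));
    }

    // Validates a write and, if it is accepted, refreshes the last-used time.
    void checkWrite(long tabletId, int rowCount, long nowMillis) {
        Entry entry = entries.get(tabletId);
        if (entry == null) {
            // Case 1: the tabletId cannot be found (never created, or already cleaned).
            throw new NoSuchElementException("unknown tabletId: " + tabletId);
        }
        if (rowCount > entry.maxRowNum) {
            // Case 2: the content does not fit the tablet.
            throw new IllegalArgumentException(
                    rowCount + " rows exceed maxRowNum " + entry.maxRowNum);
        }
        entry.lastUsedMillis = nowMillis;
    }

    // Case 3 (assumed policy): drop streams idle for longer than the timeout.
    // Returns how many streams were removed.
    int evictIdle(long nowMillis, long idleTimeoutMillis) {
        int before = entries.size();
        entries.values().removeIf(e -> nowMillis - e.lastUsedMillis > idleTimeoutMillis);
        return before - entries.size();
    }
}
```

After eviction, a late write to a cleaned stream falls into the first case, so
a client would need to call `createTabletStream` again; whether that is the
desired behavior is left open here.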