This is an automated email from the ASF dual-hosted git repository.

qiaojialin pushed a commit to branch develop
in repository https://gitbox.apache.org/repos/asf/tsfile.git


The following commit(s) were added to refs/heads/develop by this push:
     new 8c8cc524 add README-zh (#89)
8c8cc524 is described below

commit 8c8cc524e15ccec09e0852a637134b13a2bdc0d0
Author: CritasWang <[email protected]>
AuthorDate: Tue May 28 17:27:05 2024 +0800

    add README-zh (#89)
---
 README-zh.md             | 125 +++++++++++++
 README.md                | 464 +++++------------------------------------------
 cpp/tsfile/README-zh.md  |  34 ++++
 cpp/tsfile/README.md     |  34 ++++
 java/tsfile/README-zh.md | 198 ++++++++++++++++++++
 java/tsfile/README.md    | 183 ++++++++++++++++---
 6 files changed, 594 insertions(+), 444 deletions(-)

diff --git a/README-zh.md b/README-zh.md
new file mode 100644
index 00000000..11ed40f8
--- /dev/null
+++ b/README-zh.md
@@ -0,0 +1,125 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+
+# TsFile Document
+<pre>
+___________    ___________.__.__          
+\__    ___/____\_   _____/|__|  |   ____  
+  |    | /  ___/|    __)  |  |  | _/ __ \ 
+  |    | \___ \ |     \   |  |  |_\  ___/ 
+  |____|/____  >\___  /   |__|____/\___  >  version 1.0.0
+             \/     \/                 \/  
+</pre>
+[![Maven Version](https://maven-badges.herokuapp.com/maven-central/org.apache.tsfile/tsfile-parent/badge.svg)](http://search.maven.org/#search|gav|1|g:"org.apache.tsfile")
+
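+## Introduction
+
+TsFile is a columnar storage file format designed for time series data. It supports efficient compression and high read/write throughput, and is compatible with frameworks such as Spark and Flink. TsFile is easy to integrate into IoT big data processing frameworks.
+
+Time series data is data that carries a time label, that is, data that changes in time order. It comes from many sources, arrives in large volumes, and is widely used in IoT, intelligent manufacturing, financial analysis, and other fields. In today's data-driven world, the importance of time series data is self-evident.
+
+Although time series data is so common and important, its management has long lacked a standardized file format. TsFile provides users with a unified file format for managing time series data.
+
+[Click for more information](https://www.timecho.com/archives/tian-bu-shi-chang-kong-bai-apache-tsfile-ru-he-chong-xin-ding-yi-shi-xu-shu-ju-guan-li)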
+
+
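+## TsFile Features
+
+TsFile is independently developed to manage time series data efficiently, transfer it flexibly, and integrate deeply with many kinds of software. Its features include:
+
+- Time series model: A data model designed specifically for IoT. Each time series is associated with a specific device, and all devices are connected through a hierarchical structure.
+
+- Multi-language standalone use: SDKs in multiple languages can read and write TsFile directly, which makes lightweight data read/write scenarios possible.
+
+- Efficient writing and compression: A columnar storage format tailored to time series. Data is organized by device, and the data of each series is stored contiguously, minimizing storage space. Compared with CSV, the compression ratio can be improved by more than 90%.
+
+- High query performance: With indexes on the device, measurement, and time dimensions, TsFile supports fast filtering and querying of time series data within specific time ranges. Compared with general-purpose file formats, query throughput can be improved by 2-10 times.
+
+- Open integration: TsFile is the underlying storage file format of the time series database IoTDB and can form a pluggable architecture with IoTDB that separates storage from compute. TsFile supports seamless ecosystem integration with big data software such as Spark and Flink, ensuring compatibility and interoperability across data processing environments and enabling deep cross-ecosystem analysis of time series data.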
+
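+## TsFile Basic Concepts
+
+TsFile can manage the time series data of multiple devices. Each device can have different measurements.
+
+Each measurement of each device corresponds to one time series.
+
+The TsFile data model (Schema) defines the set of measurements for all devices, as shown in the table below (m1 ~ m5).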
+
+| Time | deviceId | m1 | m2 | m3 | m4 | m5 |
+|------|----------|----|----|----|----|----|
+| 1    | device1  | 1  | 2  | 3  |    |    |
+| 2    | device1  | 1  | 2  | 3  |    |    |
+| 3    | device2  | 1  |    | 3  | 4  | 5  |
+| 4    | device2  | 1  |    | 3  | 4  | 5  |
+| 5    | device3  | 1  | 2  | 3  | 4  | 5  |
+
+Time and deviceId are built-in fields that do not need to be defined and can be written directly.
+
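+## TsFile Design
+
+### File Structure
+
+The file structure of Apache TsFile is as follows.
+
+- Page: A contiguous segment of time series data and the basic unit of storage, sorted by time in ascending order, with timestamps and values stored in separate columns.
+
+- Chunk: Consists of multiple consecutive Pages of the same series. A file can store multiple Chunks for the same series.
+
+- ChunkGroup: Consists of one or more Chunks of one device. Multiple Chunks can share a single stored time column (multi-value model).
+
+- Index: The file metadata at the end of a TsFile contains the time-dimension index within each series and the index information across series.
+
+![TsFile file structure](https://alioss.timecho.com/upload/Apache%20TsFile%20%E5%8F%91%E5%B8%83%E5%9B%BE3-20240315.png)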
+
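+### Encoding and Compression
+
+TsFile uses encoding and compression techniques such as second-order differential encoding, run-length encoding (RLE), bit-packing, and Snappy to optimize the storage and access of time series data. Timestamp columns and value columns can be encoded separately for better data processing efficiency.
+
+Its distinguishing feature is that the encoding algorithms are designed specifically for the characteristics of time series data, focusing on the correlation between the time attribute and the data.
+
+![Comparison of the TsFile, Parquet, and ORC file formats](https://alioss.timecho.com/upload/Apache%20TsFile%20%E5%8F%91%E5%B8%83%E5%9B%BE4-20240315.png)
+
+
+Built on a deep understanding of the application needs of time series data, TsFile helps achieve high compression ratios and real-time access speeds for time series data, and provides the underlying file technology for enterprises to build efficient, scalable, and flexible data analytics platforms.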
+
+| Data Type | Recommended Encoding | Recommended Compression |
+|-----------|----------------------|-------------------------|
+| INT32   | TS_2DIFF   | LZ4    |
+| INT64   | TS_2DIFF   | LZ4    |
+| FLOAT   | GORILLA    | LZ4    |
+| DOUBLE  | GORILLA    | LZ4    |
+| BOOLEAN | RLE        | LZ4    |
+| TEXT    | DICTIONARY | LZ4    |
+
+For more encoding and compression options, see the [documentation](https://iotdb.apache.org/zh/UserGuide/latest/Basic-Concept/Encoding-and-Compression.html).
+
+## Develop TsFile
+
+[Java](./java/tsfile/README-zh.md#开发)
+
+[C++](./cpp/tsfile/README-zh.md#开发)
+
+
+## Use TsFile
+
+[Java](./java/tsfile/README-zh.md#使用)
+
+[C++](./cpp/tsfile/README-zh.md#使用)
\ No newline at end of file
diff --git a/README.md b/README.md
index 9219ee2c..7b0c8b05 100644
--- a/README.md
+++ b/README.md
@@ -25,53 +25,56 @@ ___________    ___________.__.__
 \__    ___/____\_   _____/|__|  |   ____  
   |    | /  ___/|    __)  |  |  | _/ __ \ 
   |    | \___ \ |     \   |  |  |_\  ___/ 
-  |____|/____  >\___  /   |__|____/\___  >  version 1.0.1-SNAPSHOT
+  |____|/____  >\___  /   |__|____/\___  >  version 1.0.0
              \/     \/                 \/  
 </pre>
 [![Maven Version](https://maven-badges.herokuapp.com/maven-central/org.apache.tsfile/tsfile-parent/badge.svg)](http://search.maven.org/#search|gav|1|g:"org.apache.tsfile")
 
-## Abstract
+## Introduction
 
 TsFile is a columnar storage file format designed for time series data, which supports efficient compression, high throughput of read and write, and compatibility with various frameworks, such as Spark and Flink. It is easy to integrate TsFile into IoT big data processing frameworks.
 
-[Click for More Information](https://www.timecho.com/archives/tian-bu-shi-chang-kong-bai-apache-tsfile-ru-he-chong-xin-ding-yi-shi-xu-shu-ju-guan-li)
-
-## Motivation
-
 Time series data is becoming increasingly important in a wide range of applications, including IoT, intelligent control, finance, log analysis, and monitoring systems.
 
-TsFile is the first existing standard file format for time series data. The industry companies usually write time series data without unification, or use general columnar file format, which makes data collection and processing complicated without a standard. With TsFile, organizations could write data in TsFile inside end devices or gateway, then transfer TsFile to the cloud for unified management in IoTDB and other systems. In this way, we lower the network transmission and the computin [...]
+TsFile is the first standard file format for time series data. Although time series data is widespread and important, its management has long lacked a standardized file format. TsFile provides a unified file format that helps users manage time series data.
 
-TsFile is a specially designed file format rather than a database. Users can open, write, read, and close a TsFile easily like doing operations on a normal file. Besides, more interfaces are available on a TsFile.
+[Click for More Information](https://www.timecho-global.com/archives/apache-tsfile-time-series-data-storage-redefined)
 
-TsFile offers several distinctive features and benefits:
+## TsFile Features
 
-* Efficient Storage and Compression: TsFile employs advanced compression techniques to minimize storage requirements, resulting in reduced disk space consumption and improved system efficiency.
+TsFile offers several distinctive features and benefits:
 
-* Flexible Schema and Metadata Management: TsFile allows for directly write data without pre defining the schema, which is flexible for data aquisition.
+- Multi-Language Standalone Use: SDKs in multiple languages can read and write TsFile directly, which makes lightweight data reading and writing scenarios possible.
 
-* High Query Performance with time range: TsFile has indexed devices, sensors and time dimensions to accelerate query performance, enabling fast filtering and retrieval of time series data.
+- Efficient Writing and Compression: A columnar storage format tailored for time series that organizes data by device and stores the data of each series contiguously, minimizing storage space. Compared to CSV, the compression ratio can be increased by more than 90%.
 
-* Seamless Integration: TsFile is designed to seamlessly integrate with existing time series databases such as IoTDB, data processing frameworks, such as Spark and Flink.
+- High Query Performance: By indexing the device, measurement, and time dimensions, TsFile supports fast filtering and querying of time series data within specific time ranges. Compared to general file formats, query throughput can be increased by 2-10 times.
 
+- Open Integration: TsFile is the underlying storage file format of the time series database IoTDB and can form a pluggable architecture with IoTDB that separates storage from compute. TsFile integrates seamlessly with Spark, Flink, and other big data software, ensuring compatibility and interoperability across different data processing environments and enabling deep analysis of time series data across ecosystems.
 
-# Features
+## TsFile Basic Concepts
 
-When conceptualizing the structure of TsFile, there were several key considerations:
+TsFile can manage the time series data of multiple devices. Each device can have different measurements.
 
-- Efficient Compression: Recognizing the importance of space optimization, TsFile compresses data extensively to minimize storage requirements.
+Each measurement of each device corresponds to a time series.
 
-- Device Packing: Multiple devices are packed together to reduce the number of files, streamlining data management.
+The TsFile schema defines the set of measurements for all devices, as shown in the table below (m1~m5).
 
-- Data Locality: Time series data expected to be queried together are kept close in physical locations to enhance query performance.
+| Time | deviceId | m1 | m2 | m3 | m4 | m5 |
+|------|----------|----|----|----|----|----|
+| 1    | device1  | 1  | 2  | 3  |    |    |
+| 2    | device1  | 1  | 2  | 3  |    |    |
+| 3    | device2  | 1  |    | 3  | 4  | 5  |
+| 4    | device2  | 1  |    | 3  | 4  | 5  |
+| 5    | device3  | 1  | 2  | 3  | 4  | 5  |
 
-- Disk Fragmentation: TsFile ensures data is packed with sizes aligned with file systems to avoid disk fragmentation.
+Time and deviceId are built-in fields that do not need to be defined and can be written directly.
 
-- Efficient Access: With millions of time series needing efficient access, TsFile is optimized for rapid data retrieval.
+## TsFile Design
 
-# Columnar Storage and File Structure
+### File Structure
 
-TsFile adopts a columnar storage design, similar to other file formats, primarily to optimize time-series data's storage efficiency and query performance. This design aligns with the nature of time series data, which often involves large volumes of similar data types recorded over time. However, TsFile was developed particularly with a structure of page, chunk, chunk group, block, and index:
+TsFile adopts a columnar storage design, similar to other file formats, primarily to optimize the storage efficiency and query performance of time series data. This design aligns with the nature of time series data, which often involves large volumes of similar data types recorded over time. What sets TsFile apart is its structure of page, chunk, chunk group, and index:
 
 - Page: The basic unit for storing time series data, sorted by time in ascending order with separate columns for timestamps and values.
 
@@ -79,18 +82,15 @@ TsFile adopts a columnar storage design, similar to other file formats, primaril
 
 - Chunk Group: Multiple chunks within a chunk group belong to one or multiple series of a device written in the same period, facilitating efficient query processing.
 
-- Block: Buffered in memory before being flushed to TsFile, all chunk groups form a block, allowing for efficient data locality in distributed file systems like HDFS.
-
 - Index: The file metadata at the end of TsFile contains a chunk-level index and file-level statistics for efficient data access.
 
-The following diagram illustrates TsFile's innovative columnar storage design, showcasing the efficiency of its page, chunk, and block structure.
-
+![TsFile Architecture](https://alioss.timecho.com/upload/Apache%20TsFile%20%E5%8F%91%E5%B8%83%E5%9B%BE3-20240315.png)
 
+## Encoding and Compression
 
-![TsFile Architecture](https://alioss.timecho.com/upload/Apache%20TsFile%20%E5%8F%91%E5%B8%83%E5%9B%BE3-20240315.png)
+TsFile employs advanced encoding and compression techniques to optimize storage and access for time series data. It uses methods like run-length encoding (RLE), bit-packing, and Snappy for efficient compression, allowing separate encoding of timestamp and value columns for better data processing. Its unique encoding algorithms are designed specifically for the characteristics of time series data in IoT scenarios, focusing on regular time intervals and the correlation among series.
 
-# Encoding and Compression Techniques
-TsFile employs advanced encoding and compression techniques to optimize storage and access for time series data. It uses methods like run-length encoding (RLE), bit-packing, and Snappy for efficient compression, allowing separate encoding of timestamp and value columns for better data processing. Its unique encoding algorithms are designed specifically for the characteristics of time series data in IoT scenarios, focusing on regular time intervals and the correlation among series. Additi [...]
+Its uniqueness lies in the encoding algorithm designed specifically for time series data characteristics, focusing on the correlation between time attributes and data.
 
 The table below compares 3 file formats in different dimensions.
 
@@ -99,402 +99,26 @@ The table below compares 3 file formats in different dimensions.
 
 Its development facilitates efficient data encoding, compression, and access, reflecting a deep understanding of industry needs, pioneering a path toward efficient, scalable, and flexible data analytics platforms.
 
-# Building With Java
-
-## Prerequisites
-
-To build TsFile wirh Java, you need to have:
-
-1. Java >= 1.8 (1.8, 11 to 17 are verified. Please make sure the environment 
path has been set accordingly).
-2. Maven >= 3.6 (If you want to compile TsFile from source code).
-
-
-## Build TsFile with Maven
-
-```
-mvn clean package -P with-java -DskipTests
-```
-
-## Install to local machine
-
-```
-mvn install -P with-java -DskipTests
-```
+| Data Type    | Recommended Encoding       | Recommended Compression |
+|---------|------------|--------|
+| INT32   | TS_2DIFF   | LZ4    |
+| INT64   | TS_2DIFF   | LZ4    |
+| FLOAT   | GORILLA    | LZ4    |
+| DOUBLE  | GORILLA    | LZ4    |
+| BOOLEAN | RLE        | LZ4    |
+| TEXT    | DICTIONARY | LZ4    |
 
-# Add TsFile as a dependency in Maven
+For more encoding and compression options, see the [documentation](https://iotdb.apache.org/UserGuide/latest/Basic-Concept/Encoding-and-Compression.html).
 
-The current SNAPSHOT version is `1.0.1-SNAPSHOT`, you can use it after Maven 
install
+## Build TsFile
 
-```xml  
-<dependencies>
-    <dependency>
-      <groupId>org.apache.tsfile</groupId>
-      <artifactId>tsfile-java</artifactId>
-      <version>1.0.1-SNAPSHOT</version>
-    </dependency>
-<dependencies>
-```
-
-The current release version is `1.0.0`
-
-```xml  
-<dependencies>
-    <dependency>
-      <groupId>org.apache.tsfile</groupId>
-      <artifactId>tsfile</artifactId>
-      <version>1.0.0</version>
-    </dependency>
-<dependencies>
-```
-
-# TsFile Java API
-
-## Write TsFile
-
-1. construct a `TsFileWriter` instance.
-    * Without pre-defined schema
-        
-    ```java
-    public TsFileWriter(File file) throws IOException
-    ```
-    * With pre-defined schema
-
-    ```java
-    public TsFileWriter(File file, Schema schema) throws IOException
-    ```
-    This one is for using the HDFS file system. `TsFileOutput` can be an 
instance of class `HDFSOutput`.
-
-    ```java
-    public TsFileWriter(TsFileOutput output, Schema schema) throws IOException 
-    ```
-
-    If you want to set some TSFile configuration on your own, you could use 
param `config`. For example:
-
-    ```java
-    TSFileConfig conf = new TSFileConfig();
-    conf.setTSFileStorageFs("HDFS");
-    TsFileWriter tsFileWriter = new TsFileWriter(file, schema, conf);
-    ```
-
-    In this example, data files will be stored in HDFS, instead of local file 
system. If you'd like to store data files in local file system, you can use 
`conf.setTSFileStorageFs("LOCAL")`, which is also the default config.
-
-    You can also config the ip and rpc port of your HDFS by 
`config.setHdfsIp(...)` and `config.setHdfsPort(...)`. The default ip is 
`localhost` and default rpc port is `9000`.
-
-    **Parameters:**
-
-    * file : The TsFile to write
-
-    * schema : The file schemas, will be introduced in next part.
-
-    * config : The config of TsFile.
-2. add measurements
-  
-    Or you can make an instance of class `Schema` first and pass this to the 
constructor of class `TsFileWriter`
-    
-    The class `Schema` contains a map whose key is the name of one measurement 
schema, and the value is the schema itself.
-    
-    Here are the interfaces:
-
-    ```java
-    // Create an empty Schema or from an existing map
-    public Schema()
-    public Schema(Map<String, MeasurementSchema> measurements)
-    // Use this two interfaces to add measurements
-    public void registerMeasurement(MeasurementSchema descriptor)
-    public void registerMeasurements(Map<String, MeasurementSchema> 
measurements)
-    // Some useful getter and checker
-    public TSDataType getMeasurementDataType(String measurementId)
-    public MeasurementSchema getMeasurementSchema(String measurementId)
-    public Map<String, MeasurementSchema> getAllMeasurementSchema()
-    public boolean hasMeasurement(String measurementId)
-    ```
-
-    You can always use the following interface in `TsFileWriter` class to add 
additional measurements: 
-
-    ```java
-    public void addMeasurement(MeasurementSchema measurementSchema) throws 
WriteProcessException
-    ```
-
-    The class `MeasurementSchema` contains the information of one measurement, 
there are several constructors:
-    ```java
-    public MeasurementSchema(String measurementId, TSDataType type, TSEncoding 
encoding)
-    public MeasurementSchema(String measurementId, TSDataType type, TSEncoding 
encoding, CompressionType compressionType)
-    public MeasurementSchema(String measurementId, TSDataType type, TSEncoding 
encoding, CompressionType compressionType, 
-    Map<String, String> props)
-    ```
-    
-    **Parameters:**
-    ​    
-    * measurementID: The name of this measurement, typically the name of the 
sensor.
-      
-    * type: The data type, now support six types: `BOOLEAN`, `INT32`, `INT64`, 
`FLOAT`, `DOUBLE`, `TEXT`;
-    
-    * encoding: The data encoding. 
-    
-    * compression: The data compression. 
-
-    * props: Properties for special data types.Such as `max_point_number` for 
`FLOAT` and `DOUBLE`, `max_string_length` for
-    `TEXT`. Use as string pairs into a map such as ("max_point_number", "3").
-    
-    > **Notice:** Although one measurement name can be used in multiple 
deltaObjects, the properties cannot be changed. I.e. 
-        it's not allowed to add one measurement name for multiple times with 
different type or encoding.
-        Here is a bad example:
-
-    ```java
-    // The measurement "sensor_1" is float type
-    addMeasurement(new MeasurementSchema("sensor_1", TSDataType.FLOAT, 
TSEncoding.RLE));
-    
-    // This call will throw a WriteProcessException exception
-  addMeasurement(new MeasurementSchema("sensor_1", TSDataType.INT32, 
TSEncoding.RLE));
-  ```
-  ```
-
-  ```
-
-3. insert and write data continually.
-  
-    Use this interface to create a new `TSRecord`(a timestamp and device pair).
-    
-    ```java
-    public TSRecord(long timestamp, String deviceId)
-  ```
-  ```
-    Then create a `DataPoint`(a measurement and value pair), and use the 
addTuple method to add the DataPoint to the correct
-    TsRecord.
-    
-    Use this method to write
-    
-    ```java
-    public void write(TSRecord record) throws IOException, 
WriteProcessException
-  ```
-
-4. call `close` to finish this writing process. 
-  
-    ```java
-    public void close() throws IOException
-    ```
-
-We are also able to write data into a closed TsFile.
-
-1. Use `ForceAppendTsFileWriter` to open a closed file.
-
-       ```java
-       public ForceAppendTsFileWriter(File file) throws IOException
-       ```
-
-2. call `doTruncate` truncate the part of Metadata
-
-3. Then use `ForceAppendTsFileWriter` to construct a new `TsFileWriter`
-
-```java
-public TsFileWriter(TsFileIOWriter fileWriter) throws IOException
-```
-Please note, we should redo the step of adding measurements before writing new 
data to the TsFile.
-
-### Example
-
-You could write a TsFile by constructing **TSRecord** if you have the 
**non-aligned** (e.g. not all sensors contain values) time series data.
-
-A more thorough example can be found at 
`java/examples/src/main/java/org/apache/tsfile/tsfile/TsFileWriteWithTSRecord.java`
-
-You could write a TsFile by constructing **Tablet** if you have the 
**aligned** time series data.
-
-A more thorough example can be found at 
`java/examples/src/main/java/org/apache/tsfile/tsfile/TsFileWriteWithTablet.java`
-
-You could write data into a closed TsFile by using **ForceAppendTsFileWriter**.
-
-A more thorough example can be found at 
`java/examples/src/main/java/org/apache/tsfile/tsfile/TsFileForceAppendWrite.java`
-
-## Interface for Reading TsFile
-
-* Definition of Path
-
-A path is a dot-separated string which uniquely identifies a time-series in 
TsFile, e.g., "root.area_1.device_1.sensor_1". 
-The last section "sensor_1" is called "measurementId" while the remaining 
parts "root.area_1.device_1" is called deviceId. 
-As mentioned above, the same measurement in different devices has the same 
data type and encoding, and devices are also unique.
-
-In read interfaces, The parameter `paths` indicates the measurements to be 
selected.
-
-Path instance can be easily constructed through the class `Path`. For example:
-
-```java
-Path p = new Path("device_1.sensor_1");
-```
-
-We will pass an ArrayList of paths for final query call to support multiple 
paths.
-
-```java
-List<Path> paths = new ArrayList<Path>();
-paths.add(new Path("device_1.sensor_1"));
-paths.add(new Path("device_1.sensor_3"));
-```
-
-> **Notice:** When constructing a Path, the format of the parameter should be 
a dot-separated string, the last part will
- be recognized as measurementId while the remaining parts will be recognized 
as deviceId.
-
-
-* Definition of Filter
-
- * Usage Scenario
-Filter is used in TsFile reading process to select data satisfying one or more 
given condition(s). 
-
- * IExpression
-The `IExpression` is a filter expression interface and it will be passed to 
our final query call.
-We create one or more filter expressions and may use binary filter operators 
to link them to our final expression.
-
-* **Create a Filter Expression**
-  
-    There are two types of filters.
-    
-     * TimeFilter: A filter for `time` in time-series data.
-        ```
-        IExpression timeFilterExpr = new GlobalTimeExpression(TimeFilter);
-        ```
-        Use the following relationships to get a `TimeFilter` object (value is 
a long int variable).
-        
-        |Relationship|Description|
-        |---|---|
-        |TimeFilter.eq(value)|Choose the time equal to the value|
-        |TimeFilter.lt(value)|Choose the time less than the value|
-        |TimeFilter.gt(value)|Choose the time greater than the value|
-        |TimeFilter.ltEq(value)|Choose the time less than or equal to the 
value|
-        |TimeFilter.gtEq(value)|Choose the time greater than or equal to the 
value|
-        |TimeFilter.notEq(value)|Choose the time not equal to the value|
-        |TimeFilter.not(TimeFilter)|Choose the time not satisfy another 
TimeFilter|
-       
-     * ValueFilter: A filter for `value` in time-series data.
-       
-        ```
-        IExpression valueFilterExpr = new SingleSeriesExpression(Path, 
ValueFilter);
-        ```
-        The usage of  `ValueFilter` is the same as using `TimeFilter`, just to 
make sure that the type of the value
-        equal to the measurement's(defined in the path).
-    
-* **Binary Filter Operators**
+[Java](./java/tsfile/README.md#building-with-java)
 
-    Binary filter operators can be used to link two single expressions.
-
-     * BinaryExpression.and(Expression, Expression): Choose the value satisfy 
for both expressions.
-     * BinaryExpression.or(Expression, Expression): Choose the value satisfy 
for at least one expression.
-    
-
-Filter Expression Examples
-
-* **TimeFilterExpression Examples**
-
-    ```java
-    IExpression timeFilterExpr = new GlobalTimeExpression(TimeFilter.eq(15)); 
// series time = 15
-    ```
-```
-    ```java
-    IExpression timeFilterExpr = new 
GlobalTimeExpression(TimeFilter.ltEq(15)); // series time <= 15
-```
-```java
-    IExpression timeFilterExpr = new GlobalTimeExpression(TimeFilter.lt(15)); 
// series time < 15
-```
-    ```java
-IExpression timeFilterExpr = new GlobalTimeExpression(TimeFilter.gtEq(15)); // 
series time >= 15
-    ```
-    ```java
-    IExpression timeFilterExpr = new 
GlobalTimeExpression(TimeFilter.notEq(15)); // series time != 15
-```
-    ```java
-    IExpression timeFilterExpr = BinaryExpression.and(
-        new GlobalTimeExpression(TimeFilter.gtEq(15L)),
-    new GlobalTimeExpression(TimeFilter.lt(25L))); // 15 <= series time < 25
-```
-    ```java
-    IExpression timeFilterExpr = BinaryExpression.or(
-        new GlobalTimeExpression(TimeFilter.gtEq(15L)),
-        new GlobalTimeExpression(TimeFilter.lt(25L))); // series time >= 15 or 
series time < 25
-    ```
-* Read Interface
-
-First, we open the TsFile and get a `ReadOnlyTsFile` instance from a file path 
string `path`.
-
-```java
-TsFileSequenceReader reader = new TsFileSequenceReader(path);
-   
-ReadOnlyTsFile readTsFile = new ReadOnlyTsFile(reader);
-```
-Next, we prepare the path array and query expression, then get final 
`QueryExpression` object by this interface:
-
-```java
-QueryExpression queryExpression = QueryExpression.create(paths, statement);
-```
-
-The ReadOnlyTsFile class has two `query` method to perform a query.
-* **Method 1**
-
-    ```java
-    public QueryDataSet query(QueryExpression queryExpression) throws 
IOException
-    ```
-
-* **Method 2**
-
-    ```java
-    public QueryDataSet query(QueryExpression queryExpression, long 
partitionStartOffset, long partitionEndOffset) throws IOException
-    ```
-
-    This method is designed for advanced applications such as the TsFile-Spark 
Connector.
-
-    * **params** : For method 2, two additional parameters are added to 
support partial query:
-        *  ```partitionStartOffset```: start offset for a TsFile
-        *  ```partitionEndOffset```: end offset for a TsFile
-
-        > **What is Partial Query ?**
-        >
-        > In some distributed file systems(e.g. HDFS), a file is split into 
severval parts which are called "Blocks" and stored in different nodes. 
Executing a query paralleled in each nodes involved makes better efficiency. 
Thus Partial Query is needed. Paritial Query only selects the results stored in 
the part split by ```QueryConstant.PARTITION_START_OFFSET``` and 
```QueryConstant.PARTITION_END_OFFSET``` for a TsFile.
-
-* QueryDataset Interface
-
-The query performed above will return a `QueryDataset` object.
-
-Here's the useful interfaces for user.
-
-  * `bool hasNext();`
-
-    Return true if this dataset still has elements.
-  * `List<Path> getPaths()`
-
-    Get the paths in this data set.
-  * `List<TSDataType> getDataTypes();` 
-
-   Get the data types. The class TSDataType is an enum class, the value will 
be one of the following:
-
-       BOOLEAN,
-       INT32,
-       INT64,
-       FLOAT,
-       DOUBLE,
-       TEXT;
- * `RowRecord next() throws IOException;`
-
-    Get the next record.
-    
-    The class `RowRecord` consists of a `long` timestamp and a `List<Field>` 
for data in different sensors,
-     we can use two getter methods to get them.
-    
-    ```java
-    long getTimestamp();
-    List<Field> getFields();
-    ```
-    
-    To get data from one Field, use these methods:
-    
-    ```java
-    TSDataType getDataType();
-    Object getObjectValue();
-    ```
-
-
-
-### Example
+[C++](./cpp/tsfile/README.md#build)
 
 
-You should install TsFile to your local maven repository.
+## Use TsFile
 
+[Java](./java/tsfile/README.md#use-tsfile)
 
-A more thorough example with query statement can be found at 
-`java/examples/src/main/java/org/apache/tsfile/TsFileRead.java`
-`java/examples/src/main/java/org/apache/tsfile/TsFileSequenceRead.java`
\ No newline at end of file
+[C++](./cpp/tsfile/README.md#use-tsfile)
diff --git a/cpp/tsfile/README-zh.md b/cpp/tsfile/README-zh.md
new file mode 100644
index 00000000..0878f43a
--- /dev/null
+++ b/cpp/tsfile/README-zh.md
@@ -0,0 +1,34 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+
+# TsFile C++ Document
+<pre>
+___________    ___________.__.__          
+\__    ___/____\_   _____/|__|  |   ____  
+  |    | /  ___/|    __)  |  |  | _/ __ \ 
+  |    | \___ \ |     \   |  |  |_\  ___/ 
+  |____|/____  >\___  /   |__|____/\___  >  version 1.0.0
+             \/     \/                 \/  
+</pre>
+
+## 开发
+
+## 使用
\ No newline at end of file
diff --git a/cpp/tsfile/README.md b/cpp/tsfile/README.md
new file mode 100644
index 00000000..e6db19b1
--- /dev/null
+++ b/cpp/tsfile/README.md
@@ -0,0 +1,34 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+
+# TsFile C++ Document
+<pre>
+___________    ___________.__.__          
+\__    ___/____\_   _____/|__|  |   ____  
+  |    | /  ___/|    __)  |  |  | _/ __ \ 
+  |    | \___ \ |     \   |  |  |_\  ___/ 
+  |____|/____  >\___  /   |__|____/\___  >  version 1.0.0
+             \/     \/                 \/  
+</pre>
+
+## Build
+
+## Use TsFile
\ No newline at end of file
diff --git a/java/tsfile/README-zh.md b/java/tsfile/README-zh.md
new file mode 100644
index 00000000..45820503
--- /dev/null
+++ b/java/tsfile/README-zh.md
@@ -0,0 +1,198 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+        http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+
+# TsFile Java Document
+<pre>
+___________    ___________.__.__          
+\__    ___/____\_   _____/|__|  |   ____  
+  |    | /  ___/|    __)  |  |  | _/ __ \ 
+  |    | \___ \ |     \   |  |  |_\  ___/ 
+  |____|/____  >\___  /   |__|____/\___  >  version 1.0.0
+             \/     \/                 \/  
+</pre>
+
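+## Development
+
+### Prerequisites
+
+To build the Java version of TsFile, the following dependencies must be installed:
+
+1. Java >= 1.8 (1.8 and 11 through 17 are verified. Please make sure the environment variables are set).
+2. Maven >= 3.6 (if you want to compile TsFile from source code).
+
+
+### Build with Maven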
+
+```
+mvn clean package -P with-java -DskipTests
+```
+
+### Install to the local machine
+
+```
+mvn install -P with-java -DskipTests
+```
+
+## Usage
+
+### Add TsFile as a dependency in Maven
+
+The current release version is `1.0.0`; it can be referenced as follows:
+
+```xml  
+<dependencies>
+    <dependency>
+      <groupId>org.apache.tsfile</groupId>
+      <artifactId>tsfile</artifactId>
+      <version>1.0.0</version>
+    </dependency>
+</dependencies>
+```
+
+The current SNAPSHOT version is `1.0.1-SNAPSHOT`; it can be referenced as follows:
+
+```xml  
+<dependencies>
+    <dependency>
+      <groupId>org.apache.tsfile</groupId>
+      <artifactId>tsfile-java</artifactId>
+      <version>1.0.1-SNAPSHOT</version>
+    </dependency>
+</dependencies>
+```
+
+### TsFile Java API
+
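+#### Write TsFile
+A TsFile can be generated in the following three steps; for the complete code, see the "Write TsFile Example" section.
+
+1. Register the metadata (Schema)
+
+    Create an instance of the `Schema` class.
+    
+    The `Schema` class holds a map whose key is the name of a measurement and whose value is the measurement schema.
+    
+    The relevant interfaces are: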
+    
+    ```java
+
+    /**
+     * measurementId: the name of the measurement, typically the name of the sensor
+     * type: the data type; six types are currently supported: `BOOLEAN`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `TEXT`
+     * encoding: the encoding type
+     */
+    public MeasurementSchema(String measurementId, TSDataType type, TSEncoding encoding) // uses LZ4 compression by default
+
+    // Initialize the Schema with a predefined map of measurements
+    public Schema(Map<String, MeasurementSchema> measurements)
+
+    /** 
+     * Construct a TsFileWriter for writing data
+     * file : the file to write TsFile data to
+     * schema : the schemas of the file
+     */
+    public TsFileWriter(File file, Schema schema) throws IOException
+    ```
+
+2. Use `TsFileWriter` to write data.
+  
+    ```java
+    /**
+     * Use this constructor to create a new `TSRecord` (a timestamp and device pair)
+     */
+    public TSRecord(long timestamp, String deviceId)
+
+    /**
+     * Create a `DataPoint` (a measurement and value pair) for each measurement, and use the addTuple method to add it to the `TSRecord`.
+     */
+      for (IMeasurementSchema schema : schemas) {
+        tsRecord.addTuple(
+            DataPoint.getDataPoint(
+                schema.getType(),
+                schema.getMeasurementId(),
+                Objects.requireNonNull(DataGenerator.generate(schema.getType(), (int) startValue))
+                    .toString()));
+        startValue++;
+      }
+    /**
+     * Write the data
+     */
+    public void write(TSRecord record) throws IOException, WriteProcessException
+    ```
+
+3. Call `close` to close the file; queries can only be run after the file is closed.
+
+    ```java
+    public void close() throws IOException
+    ```
+
+Write TsFile Example
+
+[Write data by constructing TSRecords](../examples/src/main/java/org/apache/tsfile/TsFileWriteAlignedWithTSRecord.java).
+
+[Write data by constructing Tablets](../examples/src/main/java/org/apache/tsfile/TsFileWriteAlignedWithTablet.java).
+
+
+#### Read TsFile
+
+* Construct the query expression
+```java
+/**
+ * Construct the time series to be read
+ * A time series has the format deviceId.measurementId (the deviceId may itself contain '.')
+ */
+List<Path> paths = new ArrayList<Path>();
+paths.add(new Path("device_1.sensor_1"));
+paths.add(new Path("device_1.sensor_3"));
+
+/**
+ * Construct a time range filter
+ */
+IExpression timeFilterExpr = BinaryExpression.and(
+    new GlobalTimeExpression(TimeFilter.gtEq(15L)),
+    new GlobalTimeExpression(TimeFilter.lt(25L))); // 15 <= time < 25
+
+/**
+ * Construct the complete query expression
+ */
+QueryExpression queryExpression = QueryExpression.create(paths, timeFilterExpr);
+```
+
+* Read data
+
+```java
+/**
+ * Construct a `ReadOnlyTsFile` instance from the file path `filePath`.
+ */
+TsFileSequenceReader reader = new TsFileSequenceReader(filePath);
+ReadOnlyTsFile readTsFile = new ReadOnlyTsFile(reader);
+
+/**
+ * Query the data
+ */
+public QueryDataSet query(QueryExpression queryExpression) throws IOException
+```
+
+Read TsFile Example
+
+[Query data](../examples/src/main/java/org/apache/tsfile/TsFileRead.java)
+
+[Read the whole file](../examples/src/main/java/org/apache/tsfile/TsFileSequenceRead.java)
diff --git a/java/tsfile/README.md b/java/tsfile/README.md
index 2afa2fb9..3706dff7 100644
--- a/java/tsfile/README.md
+++ b/java/tsfile/README.md
@@ -19,7 +19,7 @@
 
 -->
 
-# TsFile Document
+# TsFile Java Document
 <pre>
 ___________    ___________.__.__          
 \__    ___/____\_   _____/|__|  |   ____  
@@ -28,36 +28,171 @@ ___________    ___________.__.__
   |____|/____  >\___  /   |__|____/\___  >  version 1.0.0
              \/     \/                 \/  
 </pre>
-## Abstract
 
-TsFile is a columnar storage file format designed for time series data, which supports efficient compression and query. It is easy to integrate TsFile into your IoT big data processing frameworks.
+## Building With Java
 
+### Prerequisites
 
-## Motivation
+To build TsFile with Java, you need to have:
 
-Nowadays, the implementation of IoT is becoming increasingly popular in areas such as Industry 4.0, Smart Home, wearables and Connected Healthcare. Comparing with traditional IT infrastructure usage monitoring scenarios, applications like intelligent control and alarm reporting stimulate more advanced analytics requirements on time series data generated by sensors. Especially when IoT dives into industrial Internet, intelligent equipments produce one to two orders of magnitudes of data m [...]
+1. Java >= 1.8 (1.8 and 11 through 17 are verified. Please make sure the environment path has been set accordingly).
+2. Maven >= 3.6 (If you want to compile TsFile from source code).
 
-Recent advances in time series data management system are developed for data center monitoring. Currently there is not a file format optimized specifically for time series data in above scenarios. So TsFile was born. TsFile is a specially designed file format rather than a database. Users can open, write, read, and close a TsFile easily like doing operations on a normal file. Besides, more interfaces are available on a TsFile.
 
-The target of TsFile project is to support: high ingestion rate up to tens of million data points per second and rare updates only for the correction of low quality data; compact data packaging and deep compression for long-live historical data; traditional sequential and conditional query, complex exploratory query, signal processing, data mining and machine learning.
+### Build TsFile with Maven
 
-The features of TsFile is as follow:
+```
+mvn clean package -P with-java -DskipTests
+```
 
-* **Write**
-       * Fast data import
-       * Efficiently compression
-       * diverse data encoding types
-* **Read**
-       * Efficiently query 
-       * Time-sorted query data set
-* **Integration**
-       * HDFS
-       * Spark and Hive
-       * etc. 
+### Install to local machine
 
-## Online Documents
-* [Installation](https://github.com/thulab/tsfile/wiki/Installation)
-* [Get Started](https://github.com/thulab/tsfile/wiki/Get-Started)
-* [TsFile-Spark Connector](https://github.com/thulab/tsfile/wiki/TsFile-Spark-Connector)
+```
+mvn install -P with-java -DskipTests
+```
 
- 
+## Use TsFile
+
+### Add TsFile as a dependency in Maven
+
+The current release version is `1.0.0`
+
+```xml  
+<dependencies>
+    <dependency>
+      <groupId>org.apache.tsfile</groupId>
+      <artifactId>tsfile</artifactId>
+      <version>1.0.0</version>
+    </dependency>
+</dependencies>
+```
+
+The current SNAPSHOT version is `1.0.1-SNAPSHOT`; you can use it after running the Maven install step.
+
+```xml  
+<dependencies>
+    <dependency>
+      <groupId>org.apache.tsfile</groupId>
+      <artifactId>tsfile-java</artifactId>
+      <version>1.0.1-SNAPSHOT</version>
+    </dependency>
+</dependencies>
+```
+
+### TsFile Java API
+
+#### Write TsFile
+A TsFile can be generated in the following three steps; the complete code can be found in the "Write TsFile Example" section.
+
+1. Register Schema
+
+    Create an instance of class `Schema` first and pass it to the constructor of class `TsFileWriter`.
+    
+    The class `Schema` contains a map whose key is the name of a measurement and whose value is the measurement schema.
+
+    Here are the interfaces:
+    
+    ```java
+
+    /**
+     * measurementId: The name of this measurement, typically the name of the sensor
+     * type: The data type; six types are currently supported: `BOOLEAN`, `INT32`, `INT64`, `FLOAT`, `DOUBLE`, `TEXT`
+     * encoding: The data encoding
+     */
+    public MeasurementSchema(String measurementId, TSDataType type, TSEncoding encoding) // uses LZ4 compression by default
+
+    // Initialize the schema using a predefined measurement list
+    public Schema(Map<String, MeasurementSchema> measurements)
+
+    /** 
+     * construct TsFileWriter for write
+     * file : The TsFile to write
+     * schema : The file schemas
+     */
+    public TsFileWriter(File file, Schema schema) throws IOException
+    ```
+
+2. Use `TsFileWriter` to write data.
+  
+    ```java
+    /**
+     * Use this constructor to create a new `TSRecord` (a timestamp and device pair)
+     */
+    public TSRecord(long timestamp, String deviceId)
+
+    /**
+     * Then create a `DataPoint` (a measurement and value pair) for each measurement, and use the addTuple method to add it to the `TSRecord`.
+     */
+      for (IMeasurementSchema schema : schemas) {
+        tsRecord.addTuple(
+            DataPoint.getDataPoint(
+                schema.getType(),
+                schema.getMeasurementId(),
+                Objects.requireNonNull(DataGenerator.generate(schema.getType(), (int) startValue))
+                    .toString()));
+        startValue++;
+      }
+    /**
+     * Write the data
+     */
+    public void write(TSRecord record) throws IOException, WriteProcessException
+    ```
+
+3. Call `close` to finish the writing process. Queries can only be performed after `close`.
+
+    ```java
+    public void close() throws IOException
+    ```
+
+Write TsFile Example
+
+[Write data by constructing TSRecords](../examples/src/main/java/org/apache/tsfile/TsFileWriteAlignedWithTSRecord.java).
+
+[Write data by constructing Tablets](../examples/src/main/java/org/apache/tsfile/TsFileWriteAlignedWithTablet.java).
+
+
+#### Read TsFile
+
+* Construct Query Expression
+```java
+/**
+ * Construct a time series to be read
+ * A time series has the format deviceId.measurementId (the deviceId may itself contain '.')
+ */
+List<Path> paths = new ArrayList<Path>();
+paths.add(new Path("device_1.sensor_1"));
+paths.add(new Path("device_1.sensor_3"));
+
+/**
+ * Construct Time Filter 
+ */
+IExpression timeFilterExpr = BinaryExpression.and(
+    new GlobalTimeExpression(TimeFilter.gtEq(15L)),
+    new GlobalTimeExpression(TimeFilter.lt(25L))); // 15 <= time < 25
+
+/**
+ * Construct Full Query Expression
+ */
+QueryExpression queryExpression = QueryExpression.create(paths, timeFilterExpr);
+```
+
+* Read Data
+
+```java
+/**
+ * Construct an instance of 'ReadOnlyTsFile' based on the file path 'filePath'.
+ */
+TsFileSequenceReader reader = new TsFileSequenceReader(filePath);
+ReadOnlyTsFile readTsFile = new ReadOnlyTsFile(reader);
+
+/**
+ * Query Data
+ */
+public QueryDataSet query(QueryExpression queryExpression) throws IOException
+```
+
+Read TsFile Example
+
+[Read Data](../examples/src/main/java/org/apache/tsfile/TsFileRead.java)
+
+[Sequence Read Data](../examples/src/main/java/org/apache/tsfile/TsFileSequenceRead.java)
\ No newline at end of file
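
The read examples in this patch address each series as `deviceId.measurementId`, and the README-zh.md hunk notes that the device ID may itself contain dots. The sketch below is plain JDK Java, not the TsFile API. It shows one way such a string could be split, under the assumption that the final dot is the one separating the measurement from the device. `SeriesPathSplit` and `split` are hypothetical names used only for illustration; TsFile's own `Path` class may parse paths differently.

```java
public final class SeriesPathSplit {

    private SeriesPathSplit() {}

    /**
     * Splits "deviceId.measurementId" at the LAST dot, so that dots inside
     * the device ID are kept. Returns {deviceId, measurementId}.
     * Assumption (not confirmed by the source): the final dot is the separator.
     */
    public static String[] split(String fullPath) {
        int idx = fullPath.lastIndexOf('.');
        // Reject strings with no dot, a leading dot only, or a trailing dot.
        if (idx <= 0 || idx == fullPath.length() - 1) {
            throw new IllegalArgumentException(
                "expected deviceId.measurementId, got: " + fullPath);
        }
        return new String[] {
            fullPath.substring(0, idx), fullPath.substring(idx + 1)
        };
    }

    public static void main(String[] args) {
        String[] parts = split("root.plant.device_1.sensor_1");
        // prints "root.plant.device_1 | sensor_1"
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

Splitting at the last dot rather than the first keeps a dotted device ID such as `root.plant.device_1` intact, at the cost of not supporting measurement names that contain dots.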

