(tsfile) branch docs/dev updated: add to_dataframe. (#674)

jiangtian Tue, 23 Dec 2025 18:06:30 -0800

This is an automated email from the ASF dual-hosted git repository.

jiangtian pushed a commit to branch docs/dev
in repository https://gitbox.apache.org/repos/asf/tsfile.git



The following commit(s) were added to refs/heads/docs/dev by this push:
     new 889e78ec add to_dataframe. (#674)
889e78ec is described below

commit 889e78ec815420d76a343973c225cf57b9fbc44f
Author: Colin Lee <[email protected]>
AuthorDate: Wed Dec 24 10:06:17 2025 +0800

    add to_dataframe. (#674)
    
    * add to_dataframe.
    
    * update py datatype.
---
 .../InterfaceDefinition-Python.md                  | 74 ++++++++++++++++++++++
 .../develop/QuickStart/QuickStart-PYTHON.md        |  9 +++
 .../InterfaceDefinition-Python.md                  | 73 +++++++++++++++++++++
 .../latest/QuickStart/QuickStart-PYTHON.md         |  9 +++
 .../InterfaceDefinition-Python.md                  | 72 +++++++++++++++++++++
 .../develop/QuickStart/QuickStart-PYTHON.md        |  9 +++
 .../InterfaceDefinition-Python.md                  | 72 +++++++++++++++++++++
 .../latest/QuickStart/QuickStart-PYTHON.md         | 11 ++++
 8 files changed, 329 insertions(+)

diff --git a/src/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md b/src/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
index 4798efad..2ed66d8f 100644
--- a/src/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
+++ b/src/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
@@ -34,6 +34,9 @@ class TSDataType(IntEnum):
     FLOAT = 3
     DOUBLE = 4
     TEXT = 5
+    TIMESTAMP = 8
+    DATE = 9
+    BLOB = 10
     STRING = 11
 
 class ColumnCategory(IntEnum):
@@ -280,3 +283,74 @@ class ResultSet:
     def close(self)
 ```
 
+
+### to_dataframe
+
+```python
+
+def to_dataframe(file_path: str,
+                 table_name: Optional[str] = None,
+                 column_names: Optional[list[str]] = None,
+                 start_time: Optional[int] = None,
+                 end_time: Optional[int] = None,
+                 max_row_num: Optional[int] = None,
+                 as_iterator: bool = False) -> Union[pd.DataFrame, Iterator[pd.DataFrame]]:
+
+    """
+       Read data from a TsFile and convert it into a Pandas DataFrame or
+       an iterator of DataFrames.
+
+       This function supports both table-model and tree-model TsFiles.
+       Users can filter data by table name, column names, time range,
+       and maximum number of rows.
+
+       Parameters
+       ----------
+       file_path : str
+           Path to the TsFile to be read.
+
+       table_name : Optional[str], default None
+           Name of the table to query in table-model TsFiles.
+           If None and the file is in table model, the first table
+           found in the schema will be used.
+
+       column_names : Optional[list[str]], default None
+           List of column names to query.
+           - If None, all columns will be returned.
+           - Column existence will be validated in table-model TsFiles.
+
+       start_time : Optional[int], default None
+           Start timestamp for the query.
+           If None, the minimum int64 value is used.
+
+       end_time : Optional[int], default None
+           End timestamp for the query.
+           If None, the maximum int64 value is used.
+
+       max_row_num : Optional[int], default None
+           Maximum number of rows to read.
+           - If None, all available rows will be returned.
+           - When `as_iterator` is False, the final DataFrame will be
+             truncated to this size if necessary.
+
+       as_iterator : bool, default False
+           Whether to return an iterator of DataFrames instead of
+           a single concatenated DataFrame.
+           - True: returns an iterator yielding DataFrames in batches
+           - False: returns a single Pandas DataFrame
+
+       Returns
+       -------
+       Union[pandas.DataFrame, Iterator[pandas.DataFrame]]
+           - A Pandas DataFrame if `as_iterator` is False
+           - An iterator of Pandas DataFrames if `as_iterator` is True
+
+       Raises
+       ------
+       TableNotExistError
+           If the specified table name does not exist in a table-model TsFile.
+
+       ColumnNotExistError
+           If any specified column does not exist in the table schema.
+       """
+```
diff --git a/src/UserGuide/develop/QuickStart/QuickStart-PYTHON.md b/src/UserGuide/develop/QuickStart/QuickStart-PYTHON.md
index 30207bb5..e748ed50 100644
--- a/src/UserGuide/develop/QuickStart/QuickStart-PYTHON.md
+++ b/src/UserGuide/develop/QuickStart/QuickStart-PYTHON.md
@@ -148,6 +148,15 @@ with TsFileReader(table_data_dir) as reader:
             print(result.read_data_frame())
 ```
 
+Use `to_dataframe` to read a TsFile into a pandas DataFrame.
+
+```Python
+import os
+import tsfile as ts
+table_data_dir = os.path.join(os.path.dirname(__file__), "table_data.tsfile")
+print(ts.to_dataframe(table_data_dir))
+```
+
 ## Sample Code
 
 The sample code of using these interfaces is in：https://github.com/apache/tsfile/blob/develop/python/examples/example.py
diff --git a/src/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md b/src/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
index 4798efad..849f1c00 100644
--- a/src/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
+++ b/src/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
@@ -34,6 +34,9 @@ class TSDataType(IntEnum):
     FLOAT = 3
     DOUBLE = 4
     TEXT = 5
+    TIMESTAMP = 8
+    DATE = 9
+    BLOB = 10
     STRING = 11
 
 class ColumnCategory(IntEnum):
@@ -280,3 +283,73 @@ class ResultSet:
     def close(self)
 ```
 
+### to_dataframe
+
+```python
+
+def to_dataframe(file_path: str,
+                 table_name: Optional[str] = None,
+                 column_names: Optional[list[str]] = None,
+                 start_time: Optional[int] = None,
+                 end_time: Optional[int] = None,
+                 max_row_num: Optional[int] = None,
+                 as_iterator: bool = False) -> Union[pd.DataFrame, Iterator[pd.DataFrame]]:
+
+    """
+       Read data from a TsFile and convert it into a Pandas DataFrame or
+       an iterator of DataFrames.
+
+       This function supports both table-model and tree-model TsFiles.
+       Users can filter data by table name, column names, time range,
+       and maximum number of rows.
+
+       Parameters
+       ----------
+       file_path : str
+           Path to the TsFile to be read.
+
+       table_name : Optional[str], default None
+           Name of the table to query in table-model TsFiles.
+           If None and the file is in table model, the first table
+           found in the schema will be used.
+
+       column_names : Optional[list[str]], default None
+           List of column/measurement names to query.
+           - If None, all columns will be returned.
+           - Column existence will be validated in table-model TsFiles.
+
+       start_time : Optional[int], default None
+           Start timestamp for the query.
+           If None, the minimum int64 value is used.
+
+       end_time : Optional[int], default None
+           End timestamp for the query.
+           If None, the maximum int64 value is used.
+
+       max_row_num : Optional[int], default None
+           Maximum number of rows to read.
+           - If None, all available rows will be returned.
+           - When `as_iterator` is False, the final DataFrame will be
+             truncated to this size if necessary.
+
+       as_iterator : bool, default False
+           Whether to return an iterator of DataFrames instead of
+           a single concatenated DataFrame.
+           - True: returns an iterator yielding DataFrames in batches
+           - False: returns a single Pandas DataFrame
+
+       Returns
+       -------
+       Union[pandas.DataFrame, Iterator[pandas.DataFrame]]
+           - A Pandas DataFrame if `as_iterator` is False
+           - An iterator of Pandas DataFrames if `as_iterator` is True
+
+       Raises
+       ------
+       TableNotExistError
+           If the specified table name does not exist in a table-model TsFile.
+
+       ColumnNotExistError
+           If any specified column does not exist in the table schema.
+       """
+```
\ No newline at end of file
diff --git a/src/UserGuide/latest/QuickStart/QuickStart-PYTHON.md b/src/UserGuide/latest/QuickStart/QuickStart-PYTHON.md
index 30207bb5..9aa9e571 100644
--- a/src/UserGuide/latest/QuickStart/QuickStart-PYTHON.md
+++ b/src/UserGuide/latest/QuickStart/QuickStart-PYTHON.md
@@ -148,6 +148,15 @@ with TsFileReader(table_data_dir) as reader:
             print(result.read_data_frame())
 ```
 
+Use `to_dataframe` to read a TsFile into a pandas DataFrame.
+
+```Python
+import os
+import tsfile as ts
+table_data_dir = os.path.join(os.path.dirname(__file__), "table_data.tsfile")
+print(ts.to_dataframe(table_data_dir))
+```
+
 ## Sample Code
 
 The sample code of using these interfaces is in：https://github.com/apache/tsfile/blob/develop/python/examples/example.py
diff --git a/src/zh/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md b/src/zh/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
index 7143a583..60515bff 100644
--- a/src/zh/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
+++ b/src/zh/UserGuide/develop/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
@@ -33,6 +33,9 @@ class TSDataType(IntEnum):
     FLOAT = 3
     DOUBLE = 4
     TEXT = 5
+    TIMESTAMP = 8
+    DATE = 9
+    BLOB = 10
     STRING = 11
 
 class ColumnCategory(IntEnum):
@@ -262,3 +265,72 @@ class ResultSet:
 
 ```
 
+### to_dataframe
+
+```Python
+
+def to_dataframe(file_path: str,
+                 table_name: Optional[str] = None,
+                 column_names: Optional[list[str]] = None,
+                 start_time: Optional[int] = None,
+                 end_time: Optional[int] = None,
+                 max_row_num: Optional[int] = None,
+                 as_iterator: bool = False) -> Union[pd.DataFrame, Iterator[pd.DataFrame]]:
+    """
+       Read data from a TsFile and convert it into a Pandas DataFrame
+       or an iterator of DataFrames.
+
+       This function supports both table-model and tree-model TsFiles.
+       Users can filter data by table name, column names, time range, and maximum number of rows.
+
+       Parameters
+       ----------
+       file_path : str
+           Path to the TsFile to be read.
+
+       table_name : Optional[str], default None
+           Name of the table to query in a table-model TsFile.
+           If None and the file is in table model,
+           the first table found in the schema is used.
+
+       column_names : Optional[list[str]], default None
+           List of column/measurement names to query.
+           - If None, all columns are returned.
+           - Column existence is validated in table-model TsFiles.
+
+       start_time : Optional[int], default None
+           Start timestamp for the query.
+           If None, the minimum int64 value is used.
+
+       end_time : Optional[int], default None
+           End timestamp for the query.
+           If None, the maximum int64 value is used.
+
+       max_row_num : Optional[int], default None
+           Maximum number of rows to read.
+           - If None, all available rows are returned.
+           - When `as_iterator` is False,
+             the DataFrame is truncated if the result exceeds this size.
+
+       as_iterator : bool, default False
+           Whether to return an iterator of DataFrames instead of a single concatenated DataFrame.
+           - True: returns an iterator yielding DataFrames in batches
+           - False: returns a single Pandas DataFrame
+
+       Returns
+       -------
+       Union[pandas.DataFrame, Iterator[pandas.DataFrame]]
+           - A Pandas DataFrame if `as_iterator` is False
+           - An iterator of Pandas DataFrames if `as_iterator` is True
+
+       Raises
+       ------
+       TableNotExistError
+           Raised if the specified table name does not exist in a table-model TsFile.
+
+       ColumnNotExistError
+           Raised if a specified column does not exist in the table schema.
+    """
+
+```
+
diff --git a/src/zh/UserGuide/develop/QuickStart/QuickStart-PYTHON.md b/src/zh/UserGuide/develop/QuickStart/QuickStart-PYTHON.md
index b3aad522..543333e2 100644
--- a/src/zh/UserGuide/develop/QuickStart/QuickStart-PYTHON.md
+++ b/src/zh/UserGuide/develop/QuickStart/QuickStart-PYTHON.md
@@ -149,6 +149,15 @@ with TsFileReader(table_data_dir) as reader:
             print(result.read_data_frame())
 ```
 
+Use `to_dataframe` to read a TsFile into a pandas DataFrame.
+
+```Python
+import os
+import tsfile as ts
+table_data_dir = os.path.join(os.path.dirname(__file__), "table_data.tsfile")
+print(ts.to_dataframe(table_data_dir))
+```
+
 ## Sample Code
 
 Sample code using these interfaces is available at: https://github.com/apache/tsfile/blob/develop/python/examples/example.py
diff --git a/src/zh/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md b/src/zh/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
index 7143a583..08a4b2f6 100644
--- a/src/zh/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
+++ b/src/zh/UserGuide/latest/QuickStart/InterfaceDefinition/InterfaceDefinition-Python.md
@@ -33,6 +33,9 @@ class TSDataType(IntEnum):
     FLOAT = 3
     DOUBLE = 4
     TEXT = 5
+    TIMESTAMP = 8
+    DATE = 9
+    BLOB = 10
     STRING = 11
 
 class ColumnCategory(IntEnum):
@@ -262,3 +265,72 @@ class ResultSet:
 
 ```
 
+
+### to_dataframe
+
+```Python
+
+def to_dataframe(file_path: str,
+                 table_name: Optional[str] = None,
+                 column_names: Optional[list[str]] = None,
+                 start_time: Optional[int] = None,
+                 end_time: Optional[int] = None,
+                 max_row_num: Optional[int] = None,
+                 as_iterator: bool = False) -> Union[pd.DataFrame, Iterator[pd.DataFrame]]:
+    """
+       Read data from a TsFile and convert it into a Pandas DataFrame
+       or an iterator of DataFrames.
+
+       This function supports both table-model and tree-model TsFiles.
+       Users can filter data by table name, column names, time range, and maximum number of rows.
+
+       Parameters
+       ----------
+       file_path : str
+           Path to the TsFile to be read.
+
+       table_name : Optional[str], default None
+           Name of the table to query in a table-model TsFile.
+           If None and the file is in table model,
+           the first table found in the schema is used.
+
+       column_names : Optional[list[str]], default None
+           List of column/measurement names to query.
+           - If None, all columns are returned.
+           - Column existence is validated in table-model TsFiles.
+
+       start_time : Optional[int], default None
+           Start timestamp for the query.
+           If None, the minimum int64 value is used.
+
+       end_time : Optional[int], default None
+           End timestamp for the query.
+           If None, the maximum int64 value is used.
+
+       max_row_num : Optional[int], default None
+           Maximum number of rows to read.
+           - If None, all available rows are returned.
+           - When `as_iterator` is False,
+             the DataFrame is truncated if the result exceeds this size.
+
+       as_iterator : bool, default False
+           Whether to return an iterator of DataFrames instead of a single concatenated DataFrame.
+           - True: returns an iterator yielding DataFrames in batches
+           - False: returns a single Pandas DataFrame
+
+       Returns
+       -------
+       Union[pandas.DataFrame, Iterator[pandas.DataFrame]]
+           - A Pandas DataFrame if `as_iterator` is False
+           - An iterator of Pandas DataFrames if `as_iterator` is True
+
+       Raises
+       ------
+       TableNotExistError
+           Raised if the specified table name does not exist in a table-model TsFile.
+
+       ColumnNotExistError
+           Raised if a specified column does not exist in the table schema.
+    """
+
+```
diff --git a/src/zh/UserGuide/latest/QuickStart/QuickStart-PYTHON.md b/src/zh/UserGuide/latest/QuickStart/QuickStart-PYTHON.md
index a4353d7f..4d24c410 100644
--- a/src/zh/UserGuide/latest/QuickStart/QuickStart-PYTHON.md
+++ b/src/zh/UserGuide/latest/QuickStart/QuickStart-PYTHON.md
@@ -149,6 +149,17 @@ with TsFileReader(table_data_dir) as reader:
             print(result.read_data_frame())
 ```
 
+
+Use `to_dataframe` to read a TsFile into a pandas DataFrame.
+
+```Python
+import os
+import tsfile as ts
+table_data_dir = os.path.join(os.path.dirname(__file__), "table_data.tsfile")
+print(ts.to_dataframe(table_data_dir))
+```
+
+
 ## Sample Code
 
 Sample code using these interfaces is available at: https://github.com/apache/tsfile/blob/develop/python/examples/example.py
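
Note on the documented `to_dataframe` parameters: the docstring in this commit says that `start_time` and `end_time` fall back to the int64 minimum and maximum when left as None, and that `as_iterator=True` yields DataFrames in batches. The following is a minimal sketch of how a caller might lean on those rules, not the library's implementation. The helper names `resolve_time_range`, `read_window` and `count_rows_in_batches` are hypothetical, and the `ValueError` for an inverted range is an assumption of this sketch rather than documented behavior. Only the `import tsfile as ts` form and the parameter names shown in the diff are taken from the commit.

```python
from typing import List, Optional, Tuple

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def resolve_time_range(start_time: Optional[int] = None,
                       end_time: Optional[int] = None) -> Tuple[int, int]:
    """Mirror the documented defaults: None means int64 min / int64 max."""
    start = INT64_MIN if start_time is None else start_time
    end = INT64_MAX if end_time is None else end_time
    if start > end:
        # Assumption of this sketch: reject inverted ranges up front.
        raise ValueError(f"start_time {start} is after end_time {end}")
    return start, end


def read_window(file_path: str, columns: Optional[List[str]] = None,
                start_time: Optional[int] = None,
                end_time: Optional[int] = None,
                max_row_num: Optional[int] = None):
    """Hypothetical wrapper returning a single DataFrame for a time window."""
    import tsfile as ts  # imported lazily so the sketch loads without tsfile

    start, end = resolve_time_range(start_time, end_time)
    return ts.to_dataframe(file_path,
                           column_names=columns,
                           start_time=start,
                           end_time=end,
                           max_row_num=max_row_num)


def count_rows_in_batches(file_path: str) -> int:
    """Hypothetical helper: walk the batches without holding them all."""
    import tsfile as ts

    total = 0
    for batch in ts.to_dataframe(file_path, as_iterator=True):
        total += len(batch)  # each batch is documented to be a DataFrame
    return total
```

Calling `read_window("table_data.tsfile", start_time=0, end_time=1000)` would then request only that time window, while `count_rows_in_batches` shows the iterator form for files too large to load at once.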
