Hi Qiao,

Thank you for sharing the exciting news about the TsFile 2.2.0 release and the potential directions for community collaboration. I am particularly interested in the "Viewer of a TsFile" initiative and believe it could significantly enhance the usability and accessibility of TsFile for developers and data analysts.

A dedicated TsFile Viewer would allow users to intuitively inspect the internal structure and data content of a TsFile without writing code: for example, browsing devices, measurements, time-series data points, and metadata (e.g., schema, encoding, compression details). This is especially valuable for data validation, debugging, and exploratory analysis.

Furthermore, I propose integrating format conversion capabilities (e.g., to/from Parquet, CSV) directly into the Viewer in the future. This would create a seamless workflow: users could open a TsFile, inspect its content, and then convert it to other common formats for further analysis in tools like Pandas or Spark (or vice versa) without switching between multiple tools. This aligns well with TsFile's goal of simplifying data operations across terminal, edge, and cloud environments.

Best regards,
Xuan Wang

From: Yuan Tian <[email protected]>
Date: Wednesday, December 31, 2025 09:53
To: [email protected] <[email protected]>
Subject: Re: Future Directions of Apache TsFile

Hi Qiao,

So nice to see that.

Recently, I have noticed that some data analysts, after storing data in a particular file format, want to query that file directly using SQL. Taking DuckDB as an example, for Parquet and CSV files they can execute the following SQL query directly in DuckDB without first importing the file [1]. They are likely performing some simple pre-import analysis on the file to determine whether it should ultimately be imported into the database. Until they confirm the import, they do not want to load the file, as doing so would risk contaminating the data already in the database.

```sql
SELECT * FROM read_parquet('input.parquet');
```

Therefore, I am wondering if we can provide similar functionality in IoTDB. Based on my research, the read_parquet function is a dynamic table-valued function (TVF) in DuckDB.
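
To make the "dynamic" part concrete, here is a minimal Python sketch of the idea. It is not DuckDB's or IoTDB's actual API: the names `Table` and `read_csv_table` are hypothetical, and CSV stands in for Parquet or TsFile so that only the standard library is needed. The point it illustrates is that the output schema is discovered from the file when the function is called, rather than declared in advance, and that the file is read in place without being imported anywhere.

```python
import csv
import io
from typing import Iterator, List, NamedTuple, TextIO


class Table(NamedTuple):
    """Result of a table function: a schema plus a lazy row iterator."""
    columns: List[str]
    rows: Iterator[List[str]]


def read_csv_table(source: TextIO) -> Table:
    """Hypothetical stand-in for a dynamic table-valued function.

    The output schema is not declared up front; it is taken from the
    header row of the file at call time. Assumes the file is non-empty.
    """
    reader = csv.reader(source)
    columns = next(reader)  # schema comes from the data itself
    return Table(columns=columns, rows=reader)


# Usage: filter rows of a file in place, without loading it into a database.
sample = io.StringIO("time,device,temperature\n1,d1,20.5\n2,d1,21.0\n")
table = read_csv_table(sample)
hot = [row for row in table.rows if float(row[2]) > 20.7]
```

A real TVF would additionally infer column types and push filters down into the file reader; this sketch leaves both out.
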
Since IoTDB's table model also supports dynamic table-valued functions, implementing this feature should be relatively straightforward.

[1] https://duckdb.org/docs/stable/guides/file_formats/query_parquet

On Tue, Dec 30, 2025 at 8:37 PM Jialin Qiao <[email protected]> wrote:
> Hi all,
>
> With the release of TsFile 2.2.0, the project now offers
> multi-language SDKs (Python, Java, C++, C), enabling seamless data
> storage for terminal devices, real-time edge-side processing, and
> cloud-based data analysis. Its support for table models further
> simplifies data analysis and model training in Python.
>
> As AI continues to gain momentum, TsFile can serve as a foundational
> format for building industrial time-series datasets in the AI era.
>
> Here is some potential work we could do:
> 1. Deeper alignment with the Python ecosystem, such as Pandas & DataFrame.
> 2. Integration with HuggingFace Datasets.
> 3. Viewer of a TsFile.
> 4. Converter between other formats (such as Parquet, CSV, HDF5) and TsFile.
>
> Welcome further ideas to advance the TsFile community :-)
>
> Thanks,
> Jialin Qiao
