Hi Qiao,

I care a lot about the deeper alignment with the Python ecosystem. TsFile already provides a comprehensive Python SDK for reading and writing data, with native support for conversion into Pandas DataFrame format. We could further extend the SDK's compatibility with major Python analytical data structures, for instance:

- NumPy arrays for direct numerical computation;
- PyTorch tensors for seamless model input preparation;
- Apache Arrow tables for efficient in-memory data sharing across ecosystems.

I believe TsFile could gradually become a first-class citizen for time series data analysis as we implement these enhancements.

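To make the idea concrete, here is a rough sketch of the conversions that are already possible today once the data is in a Pandas DataFrame. It is only an illustration under assumptions: the DataFrame is built by hand as a stand-in for one read through the TsFile Python SDK, the column names are made up, and the helpers `to_arrow` and `to_tensor` are hypothetical names, not part of the SDK. They only wrap existing pandas, pyarrow, and torch calls.

```python
import numpy as np
import pandas as pd

# Stand-in for a DataFrame obtained from the TsFile Python SDK.
# Built by hand here; the SDK read call itself is not shown, and the
# column names are assumptions for illustration only.
df = pd.DataFrame(
    {
        "time": [1, 2, 3],
        "temperature": [20.5, 21.0, 21.5],
        "humidity": [0.30, 0.32, 0.31],
    }
)

# NumPy: a 2-D float64 array of the measurement columns.
values = df[["temperature", "humidity"]].to_numpy(dtype=np.float64)


def to_arrow(frame):
    """Hypothetical helper: convert to an Apache Arrow table (needs pyarrow)."""
    import pyarrow as pa

    return pa.Table.from_pandas(frame, preserve_index=False)


def to_tensor(array):
    """Hypothetical helper: wrap a NumPy array as a PyTorch tensor (needs torch)."""
    import torch

    # from_numpy shares memory with the source array instead of copying it.
    return torch.from_numpy(array)
```

Native support in the SDK could skip the intermediate DataFrame and fill these structures directly from the file, which is where I would expect most of the benefit to come from.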
On 2025/12/30 12:37:20 Jialin Qiao wrote:
> Hi all,
>
> With the release of TsFile 2.2.0, the project now offers
> multi-language SDKs (Python, Java, C++, C), enabling seamless data
> storage for terminal devices, real-time edge-side processing, and
> cloud-based data analysis. Its support for table models further
> simplifies data analysis and model training in Python.
>
> As AI continues to gain momentum, TsFile can serve as a foundational
> format for building industrial time-series datasets in the AI era.
>
> Here are some potential work we could do
> 1. Deeper alignment with the Python ecosystem, such as Pandas & DataFrame.
> 2. Integration with HuggingFace Datasets.
> 3. Viewer of a TsFile.
> 4. Converter between other formats(such as Parquet, CSV, HDF5) and TsFile.
>
> Welcome further ideas to advance the TsFile community :-)
>
> Thanks,
> Jialin Qiao
>
