Hi Jialin, I strongly support the integration with Hugging Face Datasets.
The primary bottleneck in time-series AI today is not a lack of data, but the lack of standardized, high-performance data I/O. Native integration would turn TsFile into foundational infrastructure for the time-series community, rather than just another file format.

From our experience developing the Sundial model, such a bridge would make sharing datasets like TimeBench seamless. More importantly, it would unlock massive industrial IoT data from IoTDB directly for AI training pipelines.

Let's make TsFile a "first-class citizen" for time series in the AI ecosystem. I'm eager to help define the technical requirements!

Best,
Caiyin Yang

On 2025/12/30 12:37:20 Jialin Qiao wrote:
> Hi all,
>
> With the release of TsFile 2.2.0, the project now offers
> multi-language SDKs (Python, Java, C++, C), enabling seamless data
> storage for terminal devices, real-time edge-side processing, and
> cloud-based data analysis. Its support for table models further
> simplifies data analysis and model training in Python.
>
> As AI continues to gain momentum, TsFile can serve as a foundational
> format for building industrial time-series datasets in the AI era.
>
> Here is some potential work we could do:
> 1. Deeper alignment with the Python ecosystem, such as Pandas & DataFrame.
> 2. Integration with HuggingFace Datasets.
> 3. Viewer of a TsFile.
> 4. Converter between other formats (such as Parquet, CSV, HDF5) and TsFile.
>
> Welcome further ideas to advance the TsFile community :-)
>
> Thanks,
> Jialin Qiao
