Hi, I created an issue on the Hugging Face Datasets project [1]; we can also discuss it here.

[1] https://github.com/huggingface/datasets/issues/7922

Jialin Qiao

On Wed, Dec 31, 2025 at 16:12, Tian Jiang <[email protected]> wrote:
>
> It is exciting to hear about the new directions, most of which focus on
> integration with the AI ecosystem.
>
> I personally do not participate in much AI-related work. Nevertheless, I
> find it interesting to apply TsFile in as many areas as possible.
>
> If some detailed use cases can be provided, I am more than happy to join the
> brainstorming on evolving TsFile to the next generation.
>
> Best,
> Tian Jiang
>
> ---- Replied Message ----
> | From | Caiyin Yang<[email protected]> |
> | Date | 12/31/2025 15:48 |
> | To | <[email protected]> |
> | Subject | Re: Future Directions of Apache TsFile |
> Hi Jialin,
>
> I strongly support the integration with Hugging Face Datasets.
>
> The primary bottleneck in Time-Series AI today is not a lack of data, but the
> lack of standardized, high-performance Data IO. Native integration would
> transform TsFile into a foundational infrastructure for the TS community,
> rather than just another file format.
>
> From our experience developing the Sundial model, such a bridge would make
> sharing datasets like TimeBench seamless. More importantly, it unlocks
> massive industrial IoT data from IoTDB directly into AI training pipelines.
>
> Let's make TsFile the "first-class citizen" for Time-Series in the AI
> ecosystem. I'm eager to help define the technical requirements!
>
> Best, Caiyin Yang
>
> On 2025/12/30 12:37:20 Jialin Qiao wrote:
> Hi all,
>
> With the release of TsFile 2.2.0, the project now offers
> multi-language SDKs (Python, Java, C++, C), enabling seamless data
> storage for terminal devices, real-time edge-side processing, and
> cloud-based data analysis. Its support for table models further
> simplifies data analysis and model training in Python.
>
> As AI continues to gain momentum, TsFile can serve as a foundational
> format for building industrial time-series datasets in the AI era.
>
> Here is some potential work we could do:
> 1. Deeper alignment with the Python ecosystem, such as Pandas & DataFrame.
> 2. Integration with HuggingFace Datasets.
> 3. Viewer of a TsFile.
> 4. Converter between other formats (such as Parquet, CSV, HDF5) and TsFile.
>
> Further ideas to advance the TsFile community are welcome :-)
>
> Thanks,
> Jialin Qiao
>
