Hi Jialin,

I strongly support the integration with Hugging Face Datasets.

The primary bottleneck in Time-Series AI today is not a lack of data, but the 
lack of standardized, high-performance Data IO. Native integration would 
transform TsFile into a foundational infrastructure for the TS community, 
rather than just another file format.

From our experience developing the Sundial model, such a bridge would make
sharing datasets like TimeBench seamless. More importantly, it would bring
massive industrial IoT data from IoTDB directly into AI training pipelines.

Let's make TsFile the "first-class citizen" for Time-Series in the AI 
ecosystem. I'm eager to help define the technical requirements!

Best,
Caiyin Yang

On 2025/12/30 12:37:20 Jialin Qiao wrote:
> Hi all,
> 
> With the release of TsFile 2.2.0, the project now offers
> multi-language SDKs (Python, Java, C++, C), enabling seamless data
> storage for terminal devices, real-time edge-side processing, and
> cloud-based data analysis. Its support for table models further
> simplifies data analysis and model training in Python.
> 
> As AI continues to gain momentum, TsFile can serve as a foundational
> format for building industrial time-series datasets in the AI era.
> 
> Here is some potential work we could do:
> 1. Deeper alignment with the Python ecosystem, such as Pandas & DataFrame.
> 2. Integration with HuggingFace Datasets.
> 3. Viewer of a TsFile.
> 4. Converter between other formats (such as Parquet, CSV, HDF5) and TsFile.
> 
> Welcome further ideas to advance the TsFile community :-)
> 
> Thanks,
> Jialin Qiao
> 
