Hi all,

I would also like to give my +1 for a standalone viewer of TsFile payloads.
This year, as soon as I have finished porting the PLC4X drivers to the new
architecture, my next goal will be to build PLC libraries for writing TsFile
data on PLCs directly and then forwarding it to a server at regular intervals
via MQTT. So such a viewer component would be invaluable as a tool to monitor
what’s being written and what’s going over the wire.

Regarding shifting focus more to Python: as long as this doesn’t have a
negative impact on the Java versions, I’m fine with that. But considering the
performance benchmarks shared on this list, I am not sure if using TsFile
directly in Python is a good thing. I mean … its performance was always a tenth
of that of Java, C++ etc. Making it super convenient to use directly in the
Python toolchains, wouldn’t that make us lose one of the key benefits of TsFile
… its performance?

Chris


From: Pengcheng Zheng <[email protected]>
Date: Wednesday, 31 December 2025 at 15:06
To: [email protected] <[email protected]>
Subject: Re: Future Directions of Apache TsFile

Hi all,

Great discussion :) I’d like to add a bit of context based on some
observations and discussions we’ve seen across industrial use cases,
academic perspectives, and recent community feedback around TsFile.

One direction that has emerged from these discussions is to view TsFile not
only as an efficient time-series file format, but also as a long-term
carrier for high-quality industrial time-series datasets, especially in
AI-related workflows.

In many industrial scenarios, the key challenge is no longer data
collection, but how time-series data can be preserved and reused across
different tools, languages, and modeling pipelines over a long lifecycle.
From this perspective, clear time semantics, metadata, and efficient I/O
matter as much as raw read/write performance.

That’s why ideas like closer Python/DataFrame alignment, Hugging Face
integration, format converters, and lightweight viewers are interesting to
us. At the same time, we believe this should evolve incrementally and be
driven by concrete use cases.

Happy to continue the discussion.


Thanks,
Pengcheng


On Wed, 31 Dec 2025 at 16:32, Jialin Qiao <
[email protected]> wrote:

> Hi,
>
> I created an issue on the Hugging Face Datasets project [1]; we could also
> discuss it here.
>
> [1] https://github.com/huggingface/datasets/issues/7922
>
> Jialin Qiao
>
> On Wed, 31 Dec 2025 at 16:12, Tian Jiang <[email protected]> wrote:
> >
> > It is exciting to hear about the new directions, most of which focus on
> > integration with the AI ecosystem.
> >
> >
> > I personally am not involved in much AI-related work. Nevertheless, I
> > find it interesting to apply TsFile in as many areas as possible.
> >
> >
> > If some detailed use cases can be provided, I am more than happy to join
> > the brainstorming on evolving TsFile into its next generation.
> >
> >
> > Best,
> > Tian Jiang
> >
> >
> > ---- Replied Message ----
> > | From | Caiyin Yang<[email protected]> |
> > | Date | 12/31/2025 15:48 |
> > | To | <[email protected]> |
> > | Subject | Re: Future Directions of Apache TsFile |
> > Hi Jialin,
> >
> > I strongly support the integration with Hugging Face Datasets.
> >
> > The primary bottleneck in Time-Series AI today is not a lack of data,
> > but the lack of standardized, high-performance Data IO. Native integration
> > would transform TsFile into a foundational infrastructure for the TS
> > community, rather than just another file format.
> >
> > From our experience developing the Sundial model, such a bridge would
> > make sharing datasets like TimeBench seamless. More importantly, it unlocks
> > massive industrial IoT data from IoTDB directly into AI training pipelines.
> >
> > Let's make TsFile the "first-class citizen" for Time-Series in the AI
> > ecosystem. I'm eager to help define the technical requirements!
> >
> > Best, Caiyin Yang
> >
> > On 2025/12/30 12:37:20 Jialin Qiao wrote:
> > Hi all,
> >
> >  With the release of TsFile 2.2.0, the project now offers
> >  multi-language SDKs (Python, Java, C++, C), enabling seamless data
> >  storage for terminal devices, real-time edge-side processing, and
> >  cloud-based data analysis. Its support for table models further
> >  simplifies data analysis and model training in Python.
> >
> >  As AI continues to gain momentum, TsFile can serve as a foundational
> >  format for building industrial time-series datasets in the AI era.
> >
> >  Here is some potential work we could do:
> >  1. Deeper alignment with the Python ecosystem, such as Pandas &
> >     DataFrame.
> >  2. Integration with HuggingFace Datasets.
> >  3. Viewer of a TsFile.
> >  4. Converter between other formats (such as Parquet, CSV, HDF5) and
> >     TsFile.
> >
> >  Further ideas to advance the TsFile community are welcome :-)
> >
> >  Thanks,
> >  Jialin Qiao
> >
>
