Hi all,

During the recent work on enabling batch read/write support for TsFile based on Apache Arrow, I noticed a divergence between Arrow and our current project in terms of supported C++ standards. This raises an important question regarding the technical strategy for TsFile as a common component across both embedded and cloud environments.

According to the official Arrow documentation, starting from version 10.0.0, C++11 is no longer supported and C++17 is required (see issue [1]). Therefore, introducing Arrow-related capabilities inevitably requires a build environment that supports a newer C++ standard.
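To make the consequence concrete, here is a minimal, hypothetical sketch of how an Arrow-dependent code path could be gated behind both an opt-in macro and a C++17 check, so that a plain C++11 build of the core is unaffected. All names below (TSFILE_ENABLE_ARROW, core_write, arrow_batch_write) are invented for illustration and are not existing TsFile symbols:

```cpp
// Hypothetical sketch only: the macro and function names here are
// illustrative, not real TsFile build options or APIs.
#include <cstdio>

// Core path: restricted to C++11, so it builds on legacy toolchains.
int core_write(const char* path) {
    std::printf("writing %s with the C++11 core\n", path);
    return 0;
}

// Optional extension: compiled only when the user opts in *and* the
// compiler is in C++17 mode (as Arrow >= 10.0.0 requires).
#if defined(TSFILE_ENABLE_ARROW) && __cplusplus >= 201703L
#include <string_view>  // a C++17 header the extension is allowed to use

int arrow_batch_write(std::string_view path) {
    std::printf("batch-writing %.*s via the Arrow extension\n",
                static_cast<int>(path.size()), path.data());
    return 0;
}
#endif
```

An embedded build would compile with `-std=c++11` and the guarded block simply disappears, while a cloud build could add `-std=c++17 -DTSFILE_ENABLE_ARROW` to pull the extension in; the same split can be mirrored at the build-system level by making the extension an optional target.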
In embedded scenarios, although we already provide C-based interfaces to minimize the dependency on specific C++ versions, in practice some embedded or legacy edge devices still support only relatively old C++ standards. In contrast, cloud environments and platforms used for Python-based training and data processing usually have more modern toolchains, where C++17 support is generally not an issue.

Given these differences in usage scenarios, I propose adopting a tiered strategy for C++ standard support within the project:

* The core and basic functionality of the project continues to use C++11 as the minimum required standard, to ensure compatibility with resource-constrained or legacy platforms.
* Features that depend on newer capabilities or on third-party libraries such as Arrow are provided as optional extension modules, implemented and built with C++17.
* Users decide whether to enable these optional features based on their runtime environment and functional requirements, and choose the C++ standard for compilation accordingly.

With this approach, we aim to preserve overall compatibility while still enabling the introduction of high-performance data processing features, and to better accommodate the differences between embedded and cloud environments in terms of toolchains and functional needs.

[1] https://github.com/apache/arrow/issues/32415

Best regards,
Colin.
