Re: [Tiff] Proposal for conversion of the codebase to C++ while retaining the existing C API

Roger Leigh via Tiff Wed, 31 Dec 2025 05:07:28 -0800

> On 30 Dec 2025, at 20:12, Olivier Paquet <[email protected]> wrote:
> On Tue, 30 Dec 2025, at 09:50, Roger Leigh via Tiff <[email protected]> wrote:
>> On 29 Dec 2025, at 16:43, Olivier Paquet <[email protected]> wrote:
>  
>> * Type-safe C++ field and pixel data access are header-only inline 
>> templates, which ultimately call into the existing unsafe code.  It’s all 
>> layered on top.
>  
> This is something we should have had ages ago but it is almost entirely 
> unrelated to the implementation language. The only link is that we'll likely 
> need some new safer (non vararg) C APIs and it might be easier to do those 
> with cleaner code inside.
>  
>> On top of this, there are some additional considerations.  I’ve mentioned 
>> them previously on the list I think.  You might have seen the reporting over 
>> the last couple of years about CISA and the phasing-out of unsafe code from 
>> January 2026 onwards:
>> 
>> https://www.cisa.gov/resources-tools/resources/product-security-bad-practices
>> https://www.cisa.gov/sites/default/files/2023-12/The-Case-for-Memory-Safe-Roadmaps-508c.pdf
>> https://www.cisa.gov/news-events/alerts/2025/06/24/new-guidance-released-reducing-memory-related-vulnerabilities
>  
> Not familiar with these but does C++ ever count as safe? Granted, it's 
> somewhat easier to write good code but it still has an awful lot of ways to 
> shoot yourself in the foot.


Absolutely, but it’s all relative, and if used well, we can use C++ to have a 
smaller, simpler, more robust codebase with user-facing interfaces which are 
much harder to use incorrectly.  It’s never going to achieve complete safety, 
but it can get close.

We can also get closer with C.  Not as close as with C++, but we can do a lot 
better than today if we wish to.  The primary concern is API and ABI breakage.  
Breaking the C side was intentionally not part of the original proposal, 
because I wanted to minimise friction and inconvenience to existing users at 
all costs.  If we have the scope to break things, then there is a lot of room 
for significant improvements on the C side as well.  That would also make it 
much easier to layer a good C++ API on top (or vice versa).
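
As a rough illustration of the layering discussed above (type-safe, header-only 
templates that ultimately call into the existing unsafe code), here is a 
minimal sketch.  Everything in it is hypothetical: legacy_get_field() is a toy 
stand-in written in the style of a vararg getter, not libtiff's actual 
function, and Tag, TagType and get_field are names invented for this example.

```cpp
#include <cstdarg>
#include <cstdint>
#include <optional>

// Hypothetical tag identifiers for this sketch only.
enum Tag : std::uint32_t { kImageWidth = 256, kBitsPerSample = 258, kCompression = 259 };

// Toy stand-in for a legacy vararg getter. The caller must pass a pointer of
// exactly the right type for each tag; the compiler cannot check this.
int legacy_get_field(std::uint32_t tag, ...) {
    va_list ap;
    va_start(ap, tag);
    int ok = 1;
    switch (tag) {
    case kImageWidth:
        *va_arg(ap, std::uint32_t*) = 640;
        break;
    case kBitsPerSample:
        *va_arg(ap, std::uint16_t*) = 8;
        break;
    default:
        ok = 0;  // field not present in this toy example
        break;
    }
    va_end(ap);
    return ok;
}

// Compile-time mapping from tag to value type. A tag without a
// specialisation fails to compile instead of misbehaving at run time.
template <Tag T> struct TagType;
template <> struct TagType<kImageWidth>    { using type = std::uint32_t; };
template <> struct TagType<kBitsPerSample> { using type = std::uint16_t; };
template <> struct TagType<kCompression>   { using type = std::uint16_t; };

// Thin inline wrapper: the only place that touches the vararg call, and it
// always passes a pointer of the type recorded in TagType.
template <Tag T>
inline std::optional<typename TagType<T>::type> get_field() {
    typename TagType<T>::type value{};
    if (!legacy_get_field(T, &value)) {
        return std::nullopt;
    }
    return value;
}
```

With this shape, get_field<kImageWidth>() yields an optional 32-bit value and 
get_field<kCompression>() yields an empty optional in the toy setup, while the 
pointer type handed to the vararg call is fixed by the tag rather than by the 
caller.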

>> The point here isn’t just about the casts.  It’s that the codebase quality 
>> overall is not at all great, and C the language as well as legacy coding 
>> practices and compatibility concerns are a large part of the problem here.  
>> The vast majority of the problems causing the CVEs would not occur in other 
>> languages, even C++ with its much stricter type conversion rules.  While we 
>> could no doubt do a lot of work on stricter static analysis etc., there 
>> are diminishing returns, so it’s worth asking whether the effort is worth 
>> it, and where the point lies beyond which it is not.
> 
> No need to sugar coat it, the code was written before security for image 
> libraries was a thing. It shows. The code of the tools is even worse and 
> seems to be causing many of the bug reports. The thing is, many of its users 
> don't really have security concerns.
>  
>> It’s also that the wider world is starting to make hard demands upon 
>> software developers regarding software development practices.  It’s been 
>> coming in earnest since 2023, and 2026 is when the mandates begin 
>> for real.  Whether we like it or not, unsafe code elimination is now being 
>> mandated by national bodies in the USA as well as being copied by other 
>> countries around the world, and this will preclude a lot of legacy C 
>> libraries.  And C++, it might be a lot safer than C but it’s still 
>> imperfect.  If you want to write an application using libtiff, you now need 
>> to justify why you’re using a C library and not a safer alternative.  And 
>> the strictness will increase over time.  At some point you won’t be able to 
>> justify using an unsafe C library at all.  So for me the question is not so 
>> much “do we do this”, as “where do we do this”.  Do we do it as part of 
>> libtiff, and bring the C API along to help service the existing users.  Or 
>> do we create a separate branch or separate repository in the libtiff group 
>> on GitLab and do it there?  Or do we split entirely and do it completely 
>> independently?  Whatever we choose to do, the rest of the world will move on 
>> with or without us, and this is one of the most important points behind 
>> this work.  Planning to be a part of the future by doing the work to remain 
>> relevant in it.
>  
> All very good questions. Especially as the current C API has some 
> fundamentally unsafe parts which may remain a risk as long as we carry them. 
> A case could be made that it should be deprecated or maintained as a separate 
> "unsafe" project and projects which care about security should only use newer 
> APIs.
> 
>> So the core thing I’d want out of the discussion is an agreed plan.  One 
>> possibility would be as proposed, refactor the core to be C++ with a C 
>> compatibility layer.  But another could be to plan to later iterate on that 
>> and then replace the core with Rust, again with a C (and C++) compatibility 
>> layer.  Or go directly to Rust with a C API [not suggesting we should do 
>> this, but bringing it as a possibility].  This would then allow downstream 
>> users of libtiff to plan this in and use libtiff with the justification that 
>> we’re doing the work to become safe, even if we’re not safe right now.
> 
> For what it's worth, I think a safe API should come first. It will take the 
> longest to be adopted as it requires people to change their code.

On your points above about a safer C API: I’ve explored some of the options 
available to us for making the existing C API safer.

https://gitlab.com/libtiff/libtiff/-/issues/772

I’ve looked at error handling, safety, exposing more of the internal API for 
testing and tools to use, user extensibility, parallelism, and codec-related 
configuration.  For each area, the issue covers a range of approaches, from 
minimally invasive, most-compatible changes with the least breakage, to more 
invasive changes with more breakage (but more gains).  These are an exploration 
of the options for discussion, and we can dig into these and any other 
suggested areas in more detail.
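
To make the "non vararg" idea from earlier in the thread concrete, here is a 
small sketch of what a fixed-signature, C-callable accessor with an explicit 
status code could look like.  It is not taken from the issue above and is not 
an existing libtiff API; ExampleStatus, ExampleDirectory and 
example_get_image_width are all names invented for illustration.

```cpp
#include <cstdint>

// Hypothetical status codes; not part of libtiff.
enum ExampleStatus {
    EXAMPLE_OK = 0,
    EXAMPLE_NO_SUCH_FIELD = 1,
    EXAMPLE_BAD_ARGUMENT = 2
};

// Hypothetical, heavily simplified stand-in for an open directory.
struct ExampleDirectory {
    int has_width;        // non-zero if the width field is present
    std::uint32_t width;  // value of the width field when present
};

// One fixed-signature function per field: the output pointer type is part of
// the prototype, so a mismatched argument is a compile-time diagnostic rather
// than silent memory corruption. Failure is reported through the return
// value instead of a global error handler.
extern "C" ExampleStatus example_get_image_width(const ExampleDirectory* dir,
                                                 std::uint32_t* out) {
    if (dir == nullptr || out == nullptr) {
        return EXAMPLE_BAD_ARGUMENT;
    }
    if (!dir->has_width) {
        return EXAMPLE_NO_SUCH_FIELD;
    }
    *out = dir->width;
    return EXAMPLE_OK;
}
```

The trade-off is a larger API surface (one function per field or per value 
type) in exchange for prototypes the compiler can actually check.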

> I'm not familiar enough with Rust to know which way is the more sensible to 
> get there or if we even should. Another thing to consider is it will likely 
> shrink the pool of contributors compared to C++. More is not always better 
> but libtiff has never exactly had an abundance of volunteers.

This is all true.  I don’t know Rust well yet myself, but I am currently 
learning it.  As I work in a regulated industry on embedded systems with 
functional safety requirements, it’s highly likely both C and C++ will 
eventually end up being phased out and forbidden.  Not this year, but in the 
next few years.  The primary blocker is vendor support for native Rust HALs.  
Change and adaptation are just the nature of things.  C has had a very 
impressive ~55 year run, which is much longer than most languages manage, but 
with this CISA mandate, I’m afraid that for application developers such as 
myself, it’s reaching its end and the writing is now on the wall.  The FDA and 
other regulatory bodies will ultimately cease to allow products written in 
unsafe languages entirely, and that will be the end of it.  I’ve worked on 
numerous projects in C99 as well as C++ up to C++20, but I suspect in five 
years’ time it will be Rust/Ferrocene for new projects.  So far, I’ve been very 
impressed with its capabilities.

I have not used libtiff in any of these projects (it’s clearly far too unsafe 
to sensibly validate in a safety-critical system as it stands), but I have used 
it in non-critical test tooling in previous projects.

We can certainly look at improving the C API, and I’ll be able to contribute 
work towards that.  It would be a good step forward.  But that alone will not 
be sufficient to address the longer-term factors at play.


Kind regards,
Roger

_______________________________________________
Tiff mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/tiff
