* Demi Marie Obenour:
> On 2/27/26 14:39, Florian Weimer wrote:
>> * Alan Coopersmith:
>>
>>> https://sympa.inria.fr/sympa/arc/ocsf-ocaml-security-announcements/2026-02/msg00000.html
>>> announces:
>>>> From: Hannes Mehnert <[email protected]>
>>>> To: [email protected]
>>>> Subject: [ocsf-ocaml-security-announcements] OSEC-2026-01 in the OCaml
>>>> runtime: Buffer Over-Read in OCaml Marshal Deserialization
>>>> Date: Tue, 17 Feb 2026 15:16:54 +0100
>>>> Dear everyone,
>>>> it is my pleasure to announce the first security announcement of
>>>> this year,
>>>> and the first on this mailing list.
>>>> It should any moment now also appear at
>>>> https://osv.dev/list?q=OSEC-2026-01
>>>> Human link:
>>>> https://github.com/ocaml/security-advisories/tree/main/advisories/2026/OSEC-2026-01.md
>>
>> Surprised to read this. I think this comment from 2018 is still
>> appropriate:
>>
>> | Marshal should not used in contexts where an attacker can control the
>> | data. I don't believe it is, at least in any project I'm aware of, and
>> | if it were, it's unlikely that those project perform enough check on
>> | the result of Marshal to make the use safe anyway.
>>
>> <https://github.com/ocaml/ocaml/issues/7765#issuecomment-473076288>
>>
>> The demarshaller does not have access to type information from the
>> program, so it has the ability to construct an arbitrary object graph.
>
> That is indeed true. However, unlike in many other languages, this
> does not directly allow arbitrary code execution.
Not really.
This code
  type x = A of int | B of int | C of int | D of int | E of int
  let f x fA fB fC fD fE =
    match x with
    | A a -> fA a
    | B b -> fB b
    | C c -> fC c
    | D d -> fD d
    | E e -> fE e
gets compiled to:
0000000000000000 <camlBlah.f_5>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 49 89 c0 mov %rax,%r8
7: 49 89 d1 mov %rdx,%r9
a: 4d 3b 3e cmp (%r14),%r15
d: 76 51 jbe 60 <camlBlah.f_5+0x60>
f: 49 0f b6 40 f8 movzbq -0x8(%r8),%rax
14: 48 8d 15 00 00 00 00 lea 0x0(%rip),%rdx # 1b <camlBlah.f_5+0x1b>
17: R_X86_64_PC32 .rodata-0x4
1b: 48 63 04 82 movslq (%rdx,%rax,4),%rax
1f: 48 01 c2 add %rax,%rdx
22: ff e2 jmp *%rdx
24: 49 8b 00 mov (%r8),%rax
27: 48 8b 3b mov (%rbx),%rdi
2a: 5d pop %rbp
2b: ff e7 jmp *%rdi
2d: 0f 1f 00 nopl (%rax)
30: 49 8b 00 mov (%r8),%rax
33: 48 8b 37 mov (%rdi),%rsi
36: 48 89 fb mov %rdi,%rbx
39: 5d pop %rbp
3a: ff e6 jmp *%rsi
3c: 49 8b 00 mov (%r8),%rax
3f: 48 8b 3e mov (%rsi),%rdi
42: 48 89 f3 mov %rsi,%rbx
45: 5d pop %rbp
46: ff e7 jmp *%rdi
48: 49 8b 00 mov (%r8),%rax
4b: 49 8b 39 mov (%r9),%rdi
4e: 4c 89 cb mov %r9,%rbx
51: 5d pop %rbp
52: ff e7 jmp *%rdi
54: 49 8b 00 mov (%r8),%rax
57: 48 8b 39 mov (%rcx),%rdi
5a: 48 89 cb mov %rcx,%rbx
5d: 5d pop %rbp
5e: ff e7 jmp *%rdi
60: e8 00 00 00 00 call 65 <camlBlah.f_5+0x65>
61: R_X86_64_PLT32 caml_call_gc-0x4
65: eb a8 jmp f <camlBlah.f_5+0xf>
67: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
6e: 00 00
At offset 0xf, there's the tag load (one byte from the block header at
-0x8(%r8)), and at offset 0x1b this tag is used to index a jump table
without a bounds check.
Admittedly, this does not give full control over program execution
directly. One would have to search for a suitable gadget. There are
likely better ways to exploit unsafe demarshalling; this is just the
first approach I could think of.
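To make the missing check concrete, here is a minimal sketch of how a
marshalled block can carry a tag that the type x never produces. It is
an illustration only: the type y (used purely to build a payload with
block tag 5) and the helper is_valid_x are hypothetical names, and the
sketch deliberately never pattern-matches on the forged value, because
that is exactly the unchecked jump-table dispatch shown above.

```ocaml
(* The five-constructor type from the example: valid block tags are 0..4. *)
type x = A of int | B of int | C of int | D of int | E of int

(* Hypothetical stand-in for attacker-controlled input: a type with a sixth
   non-constant constructor, so that T5 is represented with block tag 5. *)
type y = T0 of int | T1 of int | T2 of int | T3 of int | T4 of int | T5 of int

(* Bytes that an attacker could equally well have written by hand. *)
let payload : string = Marshal.to_string (T5 42) []

(* Marshal.from_string returns 'a, so this annotation is accepted without
   any runtime check against the shape of x. *)
let forged : x = Marshal.from_string payload 0

(* Inspect the representation instead of matching on the value. *)
let forged_tag : int = Obj.tag (Obj.repr forged)

(* Assumed shape check for this one flat type only: a block with tag 0..4
   and a single immediate-integer field. *)
let is_valid_x (v : x) : bool =
  let r = Obj.repr v in
  Obj.is_block r
  && Obj.tag r < 5
  && Obj.size r = 1
  && Obj.is_int (Obj.field r 0)

let () =
  Printf.printf "tag = %d, valid = %b\n" forged_tag (is_valid_x forged)
```

The program reports tag 5 for a value the type checker believes to be of
type x. A shape check like is_valid_x only covers this one flat type; it
does not generalise to nested or polymorphic data, which is part of why
validating the result of Marshal after the fact is impractical.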
Thanks,
Florian