GitHub user ppkarwasz created a discussion: Log4j 2.x deserialization hardening

Hi all,

Below is a list of LLM-discovered findings (tool: Claude Code, model: Claude 
Opus 4.8 (1M context)). **These are not security issues.** Per our [threat 
model](https://logging.apache.org/log4j/2.x/security.html) (CWE-502 section), 
safe deserialization is not something Log4j guarantees, and bypasses of the 
`FilteredObjectInputStream` allow-list are explicitly treated as opportunities 
for further hardening, not as vulnerabilities. Serialization is dropped 
entirely in Log4j 3, so this is really for legacy 2.x purposes only.

That said, if anyone is interested in improving the 
serialization/deserialization paths, these are good starting points, and you 
are welcome to open issues and PRs.

The prompt was roughly: audit every `Serializable` class in the tree for 
deserialization gadget surface reachable through the 
`FilteredObjectInputStream` / `DefaultObjectInputFilter` allow-list. Findings 
are ranked by hardening priority (not by CVSS / exploitability, since there is 
no supported use case being broken here).

---

## 1. What this hardening is (and is not)

Recap of the relevant threat-model points, so the framing below is unambiguous:

- Log4j does **not** deserialize data as part of normal operation. Several 
classes still implement `Serializable` purely for backward compatibility; 
deserializing them is discouraged.
- We provide **no guarantee** that deserializing a stream containing classes 
from these projects is safe, regardless of the source of the stream.
- Filtering such a stream by the `org.apache.logging` Java package is **not** 
sufficient to make deserialization safe.
- The hardening utilities we ship are **partial** and **not exhaustive**; 
bypasses are opportunities for further hardening, not vulnerabilities.
- The application performing the deserialization is responsible for ensuring 
that the byte stream originates from a **trusted source**.

So the purpose of `FilteredObjectInputStream` / `DefaultObjectInputFilter` is 
**damage limitation**, not a safety boundary. The scenario it assists is: an 
application that, for legacy reasons, must deserialize a Log4j log-event stream 
it believes to be trusted, but where an attacker has gained **partial influence 
over that nominally trusted input**. The allow-list narrows what such a 
partially-tampered stream can instantiate; it does not promise that a fully 
attacker-controlled stream is safe.

A concrete example of this "trusted-but-tampered stream" shape is CVE-2020-9484 
in Apache Tomcat (9.0.0.M1 to 9.0.34): when Tomcat is configured with a 
`PersistenceManager` backed by a `FileStore`, session state is written to disk 
as serialized Java objects and deserialized again on load. Tomcat trusts that 
store, but an attacker who could plant a file at a known path (and knew its 
relative name) could get the manager to deserialize an attacker-crafted session 
and reach RCE via a gadget chain. Notably, Tomcat's own mitigation is exactly 
the same shape as ours: a class-name allow-list, 
`sessionAttributeValueClassNameFilter`, which was `null` (allow-all) by default.

The follow-up CVE-2021-25329 (fixed in 9.0.42) shows the recurring lesson: such 
allow-list filtering is a partial, best-effort mitigation whose edge cases get 
chipped away over time, which is precisely why our threat model declines to 
treat it as a guarantee. If an application ever persisted or transported 
serialized Log4j log events the way Tomcat persists sessions, the findings 
below are what an attacker with that partial write access could still reach 
through the allow-list.

## 2. The filter mechanism

There are two enforcement mechanisms, chosen at runtime, both enforcing the 
same allow-list (`SerializationUtil.REQUIRED_JAVA_PACKAGES` / 
`REQUIRED_JAVA_CLASSES`):

| Runtime | Mechanism | Where |
|---------|-----------|-------|
| Java 8 | `FilteredObjectInputStream` (overrides `resolveClass`) | `log4j-api/.../util/FilteredObjectInputStream.java` |
| Java 9+ | `DefaultObjectInputFilter` (a JEP 290 `ObjectInputFilter`, installed via `setObjectInputFilter`) | `log4j-api-java9/.../util/internal/DefaultObjectInputFilter.java` |

Allow-list: packages `java.lang.`, `java.time.`, `java.util.`, 
`org.apache.logging.log4j.`; plus classes `java.math.BigDecimal`, 
`java.math.BigInteger`, `java.rmi.MarshalledObject`, and all primitives.

Because the filter applies to the whole object graph, a gadget chain must 
normally be built entirely from those allowed packages. The bulk of the 
serializable surface honours this correctly.
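
To make the Java 8 mechanism concrete, here is a minimal sketch of a
`resolveClass`-based allow-list. It is not the actual
`FilteredObjectInputStream` source: the class and method names are
hypothetical, the package and class lists are copied from the paragraph above,
primitive type names are left out, and judging arrays by their element type is
an assumption of the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AllowListStreamSketch {
    // Prefixes and class names taken from the allow-list described in the text.
    static final List<String> PACKAGES = Arrays.asList(
            "java.lang.", "java.time.", "java.util.", "org.apache.logging.log4j.");
    static final List<String> CLASSES = Arrays.asList(
            "java.math.BigDecimal", "java.math.BigInteger", "java.rmi.MarshalledObject");

    static boolean isAllowed(String name) {
        // Assumption of this sketch: an array is judged by its element type.
        int dims = 0;
        while (dims < name.length() && name.charAt(dims) == '[') {
            dims++;
        }
        if (dims > 0) {
            if (name.charAt(dims) != 'L') {
                return true; // primitive array such as "[B"
            }
            name = name.substring(dims + 1, name.length() - 1); // "[Lfoo.Bar;" -> "foo.Bar"
        }
        if (CLASSES.contains(name)) {
            return true;
        }
        for (String prefix : PACKAGES) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Hypothetical stand-in for a resolveClass-only filter. */
    static final class AllowListInputStream extends ObjectInputStream {
        AllowListInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            // Called once per ordinary class descriptor in the stream.
            if (!isAllowed(desc.getName())) {
                throw new InvalidClassException(desc.getName(), "not on the allow-list");
            }
            return super.resolveClass(desc);
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object readFiltered(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new AllowListInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // java.util.ArrayList of strings passes the allow-list.
        System.out.println(readFiltered(serialize(new ArrayList<>(Arrays.asList("a", "b")))));
        try {
            // java.net.URI is outside the allow-listed packages.
            readFiltered(serialize(URI.create("urn:demo")));
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.classname);
        }
    }
}
```

The check runs per class descriptor, which is why it covers the whole object
graph of that one stream, and only that stream.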

**What is already well defended (no action needed):**

- The following types reject direct deserialization outright: 
`Log4jLogEvent`, `MutableLogEvent`, `RingBufferLogEvent`, and 
`ThreadDumpMessage` each have a `readObject` that calls 
`SerializationUtil.assertFiltered(...)` and then throws 
`InvalidObjectException("Proxy required")`.
- Nested untrusted objects are read back through 
`SerializationUtil.readWrappedObject`, which wraps the bytes in a fresh 
filtered stream and re-applies the same allow-list to the nested graph (used by 
`ObjectMessage`, `ParameterizedMessage`, `SortedArrayStringMap` values).
- Interface-typed fields (`Message`, `Marker`, `MessageFactory`, ...) exist on 
many serializable classes, but they are only stored during deserialization, 
never invoked, and their concrete classes must still pass the filter. Not 
exploitable.
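
The "Proxy required" pattern from the first bullet can be sketched as follows.
This is an illustration with hypothetical `Event` / `EventProxy` classes, not
the Log4j code; in particular it omits the `assertFiltered` call that the real
`readObject` methods make before throwing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ProxyRequiredSketch {
    /** Hypothetical event type that refuses to be deserialized directly. */
    static final class Event implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;

        Event(String text) {
            this.text = text;
        }

        // On write, the stream stores an EventProxy instead of the Event.
        private Object writeReplace() {
            return new EventProxy(text);
        }

        // A stream that names Event directly (i.e. a crafted one) ends up here.
        private void readObject(ObjectInputStream in) throws InvalidObjectException {
            throw new InvalidObjectException("Proxy required");
        }
    }

    /** Hypothetical serialization proxy carrying the event's state. */
    static final class EventProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String text;

        EventProxy(String text) {
            this.text = text;
        }

        // On read, the proxy rebuilds the Event through its normal constructor.
        private Object readResolve() {
            return new Event(text);
        }
    }

    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Event(text));
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((Event) in.readObject()).text;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Because a legitimate stream only ever contains the proxy, the throwing
`readObject` is reached only by a stream that was built by hand.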

---

## 3. Findings summary

| # | Priority | Class / file | Issue |
|---|----------|--------------|-------|
| 1 | High | `Log4jLogEvent.LogEventProxy` | `MarshalledObject.get()` deserializes with **no** filter (allow-list escape) |
| 5 | High (gap) | `FilteredObjectInputStream` | `resolveProxyClass` not overridden: dynamic proxies bypass the allow-list on Java 8 |
| 2 | Low | `LocalizedMessage` | `ResourceBundle.getBundle(baseName)` on a deserialized string (class-load on attacker data) |
| 3 | Low (DoS) | `SortedArrayStringMap` | attacker-controlled `capacity` drives array allocation before validation (OOM) |
| 4 | Info | `ObjectArrayMessage`, `LocalizedMessage` | array fields read via raw `readObject` instead of `readWrappedObject` |

Findings 1 and 5 are the material ones: the only two places where the 
allow-list is genuinely *escaped* rather than merely *stretched*. Again, per 
section 1 these are not vulnerabilities and are out of scope for the bug 
bounty; "priority" ranks how worthwhile the hardening would be.

---

## 4. Finding 1 (High) — `MarshalledObject.get()` is an allow-list escape

**File:** 
`log4j-core/src/main/java/org/apache/logging/log4j/core/impl/Log4jLogEvent.java`
 (class `LogEventProxy`).

- Field `private MarshalledObject<Message> marshalledMessage;` (line ~1136) is 
**non-transient**, so it is part of the serialized form and fully 
attacker-controlled.
- Deserializing a `LogEventProxy` runs `readResolve()` (line ~1243) which calls 
`message()` (line ~1267), which calls `marshalledMessage.get()` (line ~1270).

`java.rmi.MarshalledObject` is on the allow-list ("for Message delegate"), but 
`MarshalledObject.get()` deserializes its embedded bytes on its own private 
`ObjectInputStream` (`MarshalledObjectInputStream`), which does **not** inherit 
Log4j's per-stream filter:

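- On Java 8 (the runtime baseline) that stream has **no filter at all**.
- On Java 9+ the inner stream uses the filter that `MarshalledObject` captured 
from the stream it was itself read from (the field is transient, so it is not 
part of the payload). If that enclosing stream had no `ObjectInputFilter` 
installed, for example because it only overrides `resolveClass`, the captured 
filter is `null`.

Where no filter reaches the inner stream and the deploying JVM sets no global 
`jdk.serialFilter`, the inner bytes are deserialized completely unfiltered: any 
classic gadget, not just a `Message`. 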
This is the same gadget surface (CWE-502, historically CVE-2017-5645) the 
allow-list is meant to narrow as a damage-limiting measure. It does not violate 
a safety guarantee (there is none), but it means the allow-list provides 
essentially **zero** damage limitation for the delegate `Message` of a tampered 
`LogEventProxy`: the one field an attacker is most likely to weaponise is 
exactly the one that escapes the filter. That makes it the single most valuable 
place to extend the hardening.
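
The escape can be reproduced in isolation. The sketch below uses a
hypothetical `resolveClass`-only stream (not Log4j's class) with a reduced
allow-list, and `java.net.URI` as a harmless stand-in for "a class that is not
allow-listed". It assumes a JVM without a global `jdk.serialFilter`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.net.URI;
import java.rmi.MarshalledObject;

final class MarshalledEscapeSketch {
    /** Hypothetical resolveClass-only filter with a reduced allow-list. */
    static final class ResolveClassOnlyStream extends ObjectInputStream {
        ResolveClassOnlyStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            String name = desc.getName();
            boolean allowed = name.startsWith("java.lang.")
                    || name.startsWith("java.util.")
                    || name.equals("java.rmi.MarshalledObject")
                    || name.equals("[B"); // byte[] fields of MarshalledObject
            if (!allowed) {
                throw new InvalidClassException(name, "not on the allow-list");
            }
            return super.resolveClass(desc);
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object readOuter(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ResolveClassOnlyStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** A URI sent directly is seen by the outer filter and rejected. */
    static boolean directReadRejected() {
        try {
            readOuter(serialize(URI.create("urn:demo")));
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The same URI wrapped in a MarshalledObject: the outer filter sees only byte[]. */
    static Object smuggledRead() {
        try {
            byte[] data = serialize(new MarshalledObject<>(URI.create("urn:demo")));
            MarshalledObject<?> wrapper = (MarshalledObject<?>) readOuter(data);
            return wrapper.get(); // inner stream never consults resolveClass above
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("direct read rejected: " + directReadRejected());
        System.out.println("smuggled type: " + smuggledRead().getClass().getName());
    }
}
```

The outer stream only ever resolves `MarshalledObject` and `[B`; whatever sits
inside the byte array is resolved later by a different stream.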

**Reachability:** the old `SocketServer` / `TcpSocketServer` receivers are gone 
from 2.x main sources, so no shipped component feeds bytes here. The concern is 
the documented assist use case: an application that deserializes a Log4j 
log-event stream through `FilteredObjectInputStream` for legacy reasons, from a 
source it trusts, that an attacker has partially tampered with (the Tomcat 
persistent-session class of issue). For that scenario the delegate-message 
escape means the allow-list buys the application almost nothing.

**Fix options (in order of preference):**

1. Serialize/deserialize the delegate `Message` through 
`SerializationUtil.writeWrappedObject` / `readWrappedObject` (a filtered 
`byte[]` wrapper) instead of `MarshalledObject`.
2. Drop `marshalledMessage` entirely and rely on the already-present 
`messageString` fallback.
3. At minimum, remove `java.rmi.MarshalledObject` from the allow-list.

Options 1 and 2 change the serialized wire format and would need a 
compatibility note.
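
For option 1, the idea behind a filtered `byte[]` wrapper can be sketched like
this. The helpers `writeWrapped` / `readWrapped` and the `FilteredStream` class
are hypothetical names for illustration and are not the signatures or
implementation of `SerializationUtil`; the allow-list is reduced to keep the
example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.net.URI;

final class WrappedObjectSketch {
    /** Hypothetical resolveClass-only filter with a reduced allow-list. */
    static final class FilteredStream extends ObjectInputStream {
        FilteredStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            String name = desc.getName();
            if (!(name.startsWith("java.lang.") || name.startsWith("java.util.")
                    || name.equals("[B"))) {
                throw new InvalidClassException(name, "not on the allow-list");
            }
            return super.resolveClass(desc);
        }
    }

    /** Writes the nested object as an opaque byte[] into the outer stream. */
    static void writeWrapped(Serializable value, ObjectOutputStream out) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream nested = new ObjectOutputStream(bytes)) {
            nested.writeObject(value);
        }
        out.writeObject(bytes.toByteArray());
    }

    /** Reads the byte[] and decodes it on a fresh stream with the same filter. */
    static Object readWrapped(ObjectInputStream in) throws IOException, ClassNotFoundException {
        byte[] data = (byte[]) in.readObject();
        try (ObjectInputStream nested = new FilteredStream(new ByteArrayInputStream(data))) {
            return nested.readObject();
        }
    }

    static Object roundTrip(Serializable value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            writeWrapped(value, out);
        }
        try (ObjectInputStream in =
                new FilteredStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return readWrapped(in);
        }
    }

    /** True if the nested object was refused by the inner filter. */
    static boolean nestedRejected(Serializable value) {
        try {
            roundTrip(value);
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("String rejected: " + nestedRejected("hello"));
        System.out.println("URI rejected: " + nestedRejected(URI.create("urn:demo")));
    }
}
```

Unlike `MarshalledObject.get()`, the code that opens the inner stream is under
the library's control, so it can choose which filter that stream carries.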

---

## 5. Finding 5 (High, hardening gap) — proxies bypass the filter on Java 8

**File:** 
`log4j-api/src/main/java/org/apache/logging/log4j/util/FilteredObjectInputStream.java`.

`FilteredObjectInputStream` overrides only `resolveClass(ObjectStreamClass)` 
and installs **no** `ObjectInputFilter`. Dynamic proxy class descriptors 
(`TC_PROXYCLASSDESC`) do not flow through `resolveClass`; the JDK routes them 
through `resolveProxyClass(String[] interfaces)` 
(`ObjectInputStream.readProxyDesc`). Since `FilteredObjectInputStream` does not 
override it, the default implementation runs and the allow-list is never 
consulted for a proxy or its interfaces.

The two hardening paths are asymmetric:

- **Java 9+** (`DefaultObjectInputFilter` via `setObjectInputFilter`): 
`readProxyDesc` calls `filterCheck` on every interface (`for (Class<?> clazz : 
cl.getInterfaces()) filterCheck(clazz, -1);`) and on the proxy class itself. 
The proxy class name (`jdk.proxy1.$Proxy0`) and any non-allow-listed interface 
are `REJECTED`. Covered.
- **Java 8** (`FilteredObjectInputStream`, the actual per-stream mechanism): 
`resolveProxyClass` is not overridden and no `ObjectInputFilter` is set, so 
`filterCheck` has nothing to enforce. Proxy interface names bypass the 
allow-list entirely. Not covered.

**How far it goes today:** a serialized `Proxy` still carries its 
`InvocationHandler` (field `h`), which is read as an ordinary object and so 
must pass `resolveClass` (allow-listed packages only). The classic proxy-gadget 
handler `sun.reflect.annotation.AnnotationInvocationHandler` is blocked, and 
there is no `Serializable` `InvocationHandler` in Log4j or in the allow-listed 
`java.*` packages. So a full proxy gadget chain is not constructible with the 
allow-listed set today. That is why it ranks below Finding 1: the 
damage-limitation still mostly holds because the handler must be allow-listed, 
but proxy interfaces are a category the allow-list silently does not cover on 
Java 8, so it is one allow-listed (or future) gadget handler away from 
mattering. It also composes with Finding 1: once inside the unfiltered 
`MarshalledObject.get()` stream, proxies are unrestricted regardless of runtime.
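
A small experiment shows the gap. In this sketch the stream class is a
hypothetical `resolveClass`-only filter, `java.io.Closeable` stands in for "an
interface outside the allow-list", and `NoopHandler` is a made-up placeholder
for an allow-listed `Serializable` handler (which, as noted above, does not
exist today), so it is allow-listed by name purely for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

final class ProxyBypassSketch {
    /** Hypothetical placeholder for an allow-listed Serializable handler. */
    static final class NoopHandler implements InvocationHandler, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            return null;
        }
    }

    /** Hypothetical filter: java.lang.* plus the placeholder handler, nothing from java.io. */
    static final class ResolveClassOnlyStream extends ObjectInputStream {
        ResolveClassOnlyStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            String name = desc.getName();
            if (!(name.startsWith("java.lang.") || name.equals(NoopHandler.class.getName()))) {
                throw new InvalidClassException(name, "not on the allow-list");
            }
            return super.resolveClass(desc);
        }
    }

    static Object closeableProxy() {
        return Proxy.newProxyInstance(
                ProxyBypassSketch.class.getClassLoader(),
                new Class<?>[] {Closeable.class},
                new NoopHandler());
    }

    /** True if the value survives a round trip through the filtered stream. */
    static boolean accepted(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ResolveClassOnlyStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidClassException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // An ordinary java.io class goes through resolveClass and is refused.
        System.out.println("java.io.File accepted: " + accepted(new File("x")));
        // java.io.Closeable is named only in the proxy descriptor and is never checked.
        System.out.println("Closeable proxy accepted: " + accepted(closeableProxy()));
    }
}
```

Only `java.lang.reflect.Proxy` and the handler class reach `resolveClass`; the
interface list travels in the proxy descriptor and goes to
`resolveProxyClass`.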

**Fix** — override `resolveProxyClass` to apply the same allow-list per 
interface (or reject proxies outright, as most hardened filters do). 
Self-contained, no wire-format change, no public API change:

```java
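// Goes inside FilteredObjectInputStream; relies on java.io.IOException and
// java.io.InvalidObjectException being imported there.
@Override
protected Class<?> resolveProxyClass(final String[] interfaces)
        throws IOException, ClassNotFoundException {
    // Apply the same allow-list to every interface named in the proxy descriptor.
    for (final String intf : interfaces) {
        if (!(isAllowedByDefault(intf) || allowedExtraClasses.contains(intf))) {
            throw new InvalidObjectException(
                    "Interface is not allowed for deserialization: " + intf);
        }
    }
    // Every interface passed, so let the JDK build the proxy class as usual.
    return super.resolveProxyClass(interfaces);
}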
```
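
For comparison, the stricter alternative mentioned above (reject proxies
outright) fits in a standalone sketch that could also serve as the basis for a
regression test. `NoProxyStream` and `NoopHandler` are hypothetical names, and
the stream deliberately has no `resolveClass` override so that only the proxy
check is exercised.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;

final class RejectProxiesSketch {
    /** Hypothetical stream that refuses every dynamic proxy descriptor. */
    static final class NoProxyStream extends ObjectInputStream {
        NoProxyStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveProxyClass(String[] interfaces) throws IOException {
            throw new InvalidObjectException(
                    "Dynamic proxies are not allowed: " + String.join(", ", interfaces));
        }
    }

    /** Hypothetical Serializable handler, needed only to build a test proxy. */
    static final class NoopHandler implements InvocationHandler, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            return null;
        }
    }

    static Object closeableProxy() {
        return Proxy.newProxyInstance(
                RejectProxiesSketch.class.getClassLoader(),
                new Class<?>[] {Closeable.class},
                new NoopHandler());
    }

    /** True if the value survives a round trip through NoProxyStream. */
    static boolean accepted(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new NoProxyStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ArrayList accepted: " + accepted(new ArrayList<String>()));
        System.out.println("proxy accepted: " + accepted(closeableProxy()));
    }
}
```

Whether any legitimate Log4j stream contains proxies, and hence whether
outright rejection is acceptable, is a question for whoever picks this up.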

---

## 6. Lower-severity findings

**Finding 2 (Low) — `LocalizedMessage` resource-bundle load on attacker 
string.** `log4j-api/.../message/LocalizedMessage.java`. After deserialization, 
`getFormattedMessage()` reaches `getResourceBundle(...)` which calls 
`ResourceBundle.getBundle(baseName)` on a deserialized string. 
`ResourceBundle.getBundle` can load a `ResourceBundle` subclass by name from 
the context class loader. Requires such a class already on the classpath and 
only fires on later formatting, so weak, but it is attacker-string-driven class 
loading. It also reads `stringArgs = (String[]) in.readObject()` via raw 
(filter-constrained) `readObject`.

**Finding 3 (Low, DoS) — `SortedArrayStringMap` unbounded pre-allocation.** 
`log4j-api/.../util/SortedArrayStringMap.java` (readExternal path, ~lines 
497-527). A deserialized `capacity` is passed to `inflateTable(capacity)`, 
allocating `new String[capacity]` / `new Object[capacity]` before the entries 
are read; only `capacity < 0` is rejected. A crafted large `capacity` causes 
OOM. Not code execution. Fix: bound `capacity` against the declared entry count 
/ a sane maximum.
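
One possible shape for such a bound is sketched below. It is not a patch
against `SortedArrayStringMap`: the method names are hypothetical, the
`MAX_CAPACITY` ceiling is an arbitrary placeholder, and the sketch assumes the
stream declares an entry count alongside the capacity, as the fix suggestion
implies.

```java
import java.io.InvalidObjectException;

final class BoundedCapacitySketch {
    /** Placeholder ceiling; the right value is a project decision. */
    static final int MAX_CAPACITY = 1 << 16;

    /** Validates the deserialized numbers before any array is allocated. */
    static int checkedCapacity(int capacity, int size) throws InvalidObjectException {
        if (capacity < 0 || size < 0) {
            throw new InvalidObjectException("Negative capacity or size");
        }
        if (size > capacity) {
            throw new InvalidObjectException("Size " + size + " exceeds capacity " + capacity);
        }
        if (capacity > MAX_CAPACITY) {
            throw new InvalidObjectException("Capacity too large: " + capacity);
        }
        return capacity;
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    static boolean accepts(int capacity, int size) {
        try {
            checkedCapacity(capacity, size);
            return true;
        } catch (InvalidObjectException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(16, 3));
        System.out.println(accepts(Integer.MAX_VALUE, 1));
    }
}
```

Rejecting large values could break streams written by maps that really were
that large; clamping the initial allocation and growing on demand is a softer
variant of the same idea.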

**Finding 4 (Info) — raw `readObject` array reads.** `ObjectArrayMessage` 
(`(Object[]) in.readObject()`) and `LocalizedMessage` (`String[]`) read arrays 
via raw `readObject` instead of `readWrappedObject`. Still bounded by the outer 
filter and no method is invoked on the elements at deserialization time, so no 
gadget fires. This is a deliberate design choice (see LOG4J2-3680, "Allow 
deserialization of arrays"); noted only for consistency.

---

## 7. Related prior / in-progress work

- **PR #4098** (merged) "Harden `readObject(ObjectInputStream)` method argument 
checks": added `SerializationUtil.assertFiltered()` to 
`ObjectArrayMessage.readObject()`, matching `ObjectMessage` / 
`ParameterizedMessage`. The security team triaged that family as "Informative — 
not a vulnerability, code-quality improvement welcomed," and it explicitly left 
`LocalizedMessage` and `FormattedMessage` as known follow-ups (overlaps Finding 
2).
- **LOG4J2-3680** (commit `71ca0865b8`) "Allow deserialization of arrays": the 
raw array reads in Finding 4 are intentional.
- No open issue or PR currently addresses `MarshalledObject.get()` (Finding 1) 
or `resolveProxyClass` (Finding 5); both appear to be new.

Note on `assertFiltered`: it only checks that the stream is a 
`FilteredObjectInputStream` or that the JDK exposes `setObjectInputFilter`. On 
Java 9+ it does not verify that an actual filter is installed on a plain 
`ObjectInputStream`, so it is a coarse gate rather than a guarantee. Secondary 
to the above.
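
On Java 9+ a finer check is possible in principle, because
`ObjectInputStream.getObjectInputFilter()` reports whether a filter is
attached. The sketch below only illustrates that API; it is not how
`assertFiltered` works, the class and method names are hypothetical, and it
assumes a JVM without a global `jdk.serialFilter` (which would otherwise show
up on every stream).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class FilterPresenceSketch {
    /** True if some ObjectInputFilter (per-stream or global) is attached. */
    static boolean hasFilter(ObjectInputStream in) {
        return in.getObjectInputFilter() != null;
    }

    /** A plain stream over a tiny payload, with nothing installed on it. */
    static ObjectInputStream plainStream() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject("payload");
            }
            return new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The same stream with a pattern-based filter installed before reading. */
    static ObjectInputStream filteredStream() {
        ObjectInputStream in = plainStream();
        in.setObjectInputFilter(ObjectInputFilter.Config.createFilter("java.lang.*;!*"));
        return in;
    }

    public static void main(String[] args) {
        System.out.println("plain stream has filter: " + hasFilter(plainStream()));
        System.out.println("filtered stream has filter: " + hasFilter(filteredStream()));
    }
}
```

A presence check still says nothing about what the filter allows, so it would
tighten the gate without turning it into a guarantee.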

---

## 8. Suggested starting points, if anyone wants to pick this up

1. Finding 5 first — override `resolveProxyClass` in 
`FilteredObjectInputStream`. Self-contained, no wire-format or API change, 
closes the Java 8 asymmetry. Cheap win.
2. Finding 1 — route the `LogEventProxy` message delegate through 
`readWrappedObject` (or drop `marshalledMessage`), and remove 
`java.rmi.MarshalledObject` from the allow-list. Requires a wire-compatibility 
note.
3. Finding 3 — bound `SortedArrayStringMap` `capacity` on read.
4. Finding 2 — fold `LocalizedMessage` / `FormattedMessage` into the same 
`assertFiltered` + `readWrappedObject` treatment the security team already 
anticipated in the #4098 discussion.

GitHub link: https://github.com/apache/logging-log4j2/discussions/4168
