JinwooHwang commented on PR #7941:
URL: https://github.com/apache/geode/pull/7941#issuecomment-3617666432
Hi @sboorlagadda,
Thank you so much for your thorough review and for raising these thoughtful
architectural concerns. I appreciate the time and care you’ve taken to consider
the design in depth. I would like to address each of your points with
supporting technical evidence and references to industry best practices.
## Why These Must Be Separate Systems
The architectural concern lists four issues:
- Configuration complexity (two separate whitelist systems)
- Operational burden (maintaining dual security policies)
- User confusion (different config for sessions vs regions)
- Potential security gaps (inconsistent policies)
These aren't problems - they're the expected and necessary characteristics
of layered security architecture. Each concern reflects proper separation of
security boundaries, not architectural weakness.
This separation is intentional. The two layers serve different purposes and
protect distinct attack surfaces with different threat models.
**Geode's infrastructure filter protects:**
- Cluster communication (peer-to-peer messages)
- Internal region operations
- Distributed system metadata
- Cache consistency protocols
- Threats: Malicious cluster members, corrupted internal data, version
incompatibilities
**Session filter must protect:**
- HTTP request boundary (untrusted user input)
- Application-level objects from web clients
- Session hijacking attempts
- Deserialization gadget chains in user data
- Threats: External attackers via HTTP, compromised client applications, RCE
exploits
These are fundamentally different security boundaries.
**Real-world analogy:**
Your firewall filters network packets. Your application still validates HTTP
input. Both are active simultaneously, both filter "data entering the system,"
but at different trust boundaries with different policies. You wouldn't merge
them just because they both filter.
Similarly:
- Network layer: Firewall blocks malicious IPs, port scans, DDoS
- Application layer: WAF blocks SQL injection, XSS, malformed requests
Both active. Different purposes. Different policies.
## Industry Validation: Defense-in-Depth
This approach isn’t something I invented; it’s a well-established industry
practice:
**Spring Framework (OWASP layered security):**
- Web layer: OncePerRequestFilter, CSRF protection, HTTP request filters,
secure headers
Handles HTTP-level security (authentication, session management, CSRF, input
validation)
Operates independently from service and data layers.
- Service layer: @PreAuthorize, @PostAuthorize, @Secured annotations,
method-level security
Enforces authorization and business rules at the service/business-logic
level.
Provides a separate control boundary independent of web filters.
- Data layer: Safe data-access patterns (prepared statements, parameter
binding in JDBC/JPA)
Protects against SQL injection when used correctly.
Each layer has independent security controls. Web-layer filters are not
merged with service or database security, enabling defense-in-depth and
reducing the impact of a single-layer compromise.
**AWS multi-layer architecture:**
- ALB/WAF: HTTP-level filtering
- Lambda/EC2: IAM roles, network security groups
- RDS/DynamoDB: Database-specific encryption, access controls
**Apache Kafka (CWE-502 context-specific filters):**
- Internal Protocol: Kafka's binary protocol, only Kafka internal classes
- User Data: Configurable deserializers (StringDeserializer,
ByteArrayDeserializer, or custom).
- Separation: User data serialization should be isolated from internal
messages.
**Apache Cassandra:**
- Internal cluster communication: Custom binary protocol, only Cassandra
classes
- User data (CQL queries): User-defined types (UDTs) and application objects
can be queried and stored.
**Hazelcast (direct competitor of our Apache Geode):**
```java
Config config = new Config();
// Internal cluster objects
config.getSerializationConfig()
.addDataSerializableFactory(100, new MyInternalFactory());
// Application/user objects
config.getSerializationConfig()
.addPortableFactory(200, new MyApplicationFactory())
.setAllowUnsafe(false);
```
Hazelcast explicitly separates internal vs application serialization.
**Microsoft SDL Trust Boundary Analysis (accurate, generalized):**
- Trust Boundary 1: External (HTTP clients) → apply safe deserialization and
input validation; use a whitelist of classes for deserialization.
- Trust Boundary 2: Internal (cluster members) → validate serialized objects
exchanged between nodes; apply appropriate deserialization safeguards.
- Principle: Different trust boundaries have different threat models,
requiring independent controls.
**Real-world CVE fixes validate this:**
- CVE-2015-7501 (JBoss): Fixed by safe deserialization and class
whitelisting in HTTP/EJB entry points.
- CVE-2016-0638 (WebLogic): Fixed by restricting allowed classes for T3
protocol and HTTP deserialization.
- CVE-2017-5645 (Log4j): Fixed by patching JMS deserialization logic and
enforcing allowed class lists.
Principle: These CVEs demonstrate the importance of context-aware,
layer-specific deserialization safeguards, rather than generic “layered
filters.”
Industry consensus: Multi-layered, context-specific filters are the standard.
## Addressing the Specific Concerns
**"Configuration complexity (two separate whitelist systems)"**
This is defense-in-depth, not complexity. Each boundary needs its own policy:
- Geode allowlist: 485 classes for internal operations (ResourceManager,
StatisticDescriptor, DistributionMessage, etc.)
- Session allowlist: 80 classes for user data (String, Integer, ArrayList,
HashMap, etc.)
**"Operational burden (maintaining dual security policies)"**
The policies serve different purposes and must be maintained separately:
Geode's 485-class allowlist includes:
- org.apache.geode.internal.cache.tier.sockets.command.*
- org.apache.geode.distributed.internal.*
- org.apache.geode.management.internal.*
- Classes needed for cluster operations
Session's 80-class allowlist includes:
- java.lang.String, Integer, Boolean
- java.util.ArrayList, HashMap, HashSet
- java.util.Date, Calendar
- Classes safe for user-facing web applications
These lists have zero overlap in purpose.
**"User confusion (different config for sessions vs regions)"**
This is expected and correct. Different architectural layers have different
configuration:
**Infrastructure layer (managed by cluster admins):**
- Geode cluster setup
- Region replication policies
- `validate-serializable-objects` configuration
- Internal class allowlist (485 Geode classes)
**Application layer (managed by application developers):**
- Session timeout settings
- Custom session class registration via
`SafeDeserializationFilter.allowAdditionalClasses()`
- Application-specific allowlist (80 JDK classes + custom classes)
Having different configuration at different layers isn't confusion - it's
separation of concerns.
**"Potential security gaps (inconsistent policies)"**
The "gap" is actually the security boundary. Inconsistent policies are
correct here - they enforce isolation:
- HTTP boundary: Strict 80-class allowlist (only safe user data types)
- Cluster boundary: 485-class allowlist (internal operations need more
classes)
The security gap exists when boundaries don't have appropriate policies.
PR-7941 adds the missing policy at the HTTP boundary.
## Discussion Points Answered
**1. Should we extend existing Geode serialization infrastructure?**
I don't think we should, because they protect different things:
- Geode infrastructure: Protects cluster integrity (peer communication,
internal operations)
- Session filter: Protects against external attackers via HTTP
The fix must be at the HTTP boundary with an HTTP-appropriate policy.
**2. How do we provide unified configuration for users?**
I don't believe we do, because unification breaks isolation. The
configurations serve different purposes and are typically managed by different
roles:
**3. What's the migration path for existing session deployments?**
Zero migration for standard use cases:
- PR-7941 includes 80 common JDK classes in the hardcoded allowlist
- Covers String, Integer, Collections, Date, etc.
Existing sessions stored in regions continue working. The filter only
applies during deserialization, and legitimate classes pass through once added
to the allowlist.
**4. How do we handle the different threat models?**
By maintaining appropriate policies at appropriate boundaries:
**Web application threat model:**
- Attacker: External, untrusted HTTP clients
- Attack vector: Crafted session attributes with gadget chains
- Required defense: Strict allowlist of safe user-facing classes (80 classes)
- Boundary: HTTP session setAttribute/getAttribute
**Distributed cache threat model:**
- Attacker: Compromised cluster member, corrupted internal data
- Attack vector: Malicious region data, corrupted metadata
- Required defense: Allowlist of internal operation classes (485 classes)
- Boundary: Cluster communication, region operations
The correct approach: Different threats require different policies at
different boundaries. This is standard security architecture.
## The Complete Architecture: Two Filters, Two Boundaries
PR-7941 implements defense-in-depth with filters at both security boundaries:
```
┌──────────────────────────────────────────────────────────────────┐
│ External Attacker (HTTP Client) │
└────────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ SafeDeserializationFilter │ ← NEW: PR-7941
│ (80 JDK classes) │ HTTP Boundary Protection
└──────────────┬───────────────┘
│ ✓ String, Integer, ArrayList allowed
│ ✗ ResourceManager, CacheConfig BLOCKED
▼
┌──────────────────────────────┐
│ Session Object │
│ (validated data) │
└──────────────┬───────────────┘
│
▼
┌──────────────────────────────┐
│ Geode Region │
│ (serialized bytes) │
└──────────────┬───────────────┘
│
┌──────────────▼───────────────┐
│ Region Replication │
│ Internal Operations │
│ Cluster Communication │
└──────────────┬───────────────┘
│
▼
┌──────────────────────────────┐
│ validate-serializable- │ ← EXISTING: Geode Infrastructure
│ objects │ Cluster Boundary Protection
│ (485 Geode classes) │
└──────────────┬───────────────┘
│ ✓ BucketAdvisor, DistributionMessage allowed
│ ✗ Untrusted application classes BLOCKED
▼
┌──────────────────────────────┐
│ Cluster Member │
│ (peer node) │
└──────────────────────────────┘
```
**Two independent security layers designed for different boundaries:**
1. **SafeDeserializationFilter** (HTTP → Session) - **NEW, Always Active:**
- Protects against external attackers sending malicious session attributes
- Prevents gadget chain RCE attacks via HTTP (CVSS 9.8 vulnerability)
- Blocks internal Geode classes (ResourceManager, CacheConfig) from HTTP
input
- Allows only safe user-facing types (String, Integer, Collections)
2. **validate-serializable-objects** (Cluster Operations) - **Existing,
Opt-in (disabled by default):**
- When enabled, protects against corrupted internal data during
replication
- Prevents malicious cluster members from injecting bad classes
- Blocks arbitrary application classes from internal operations
- Allows only Geode infrastructure classes needed for cluster integrity
**Why separate filters at separate boundaries:**
- **Session data CAN flow through both filters** - but at different points
in its lifecycle:
- Filter 1 (SafeDeserializationFilter) applies when user calls
`session.setAttribute()` (HTTP boundary) - **always active**
- Filter 2 (validate-serializable-objects) applies when Geode replicates
session region to other nodes (cluster boundary) - **only if enabled by admin**
- **Different attack vectors require different defenses:**
- HTTP attacker tries to inject malicious classes via session → Filter 1
blocks (new protection)
- Compromised cluster member tries to inject malicious classes via
replication → Filter 2 blocks (if enabled)
- **Critical insight: PR-7941 adds the FIRST layer of defense:**
- **Before PR-7941:** Zero protection at HTTP boundary (CVSS 9.8 RCE
vulnerability)
- **After PR-7941:** HTTP boundary protected with strict 80-class allowlist
- **Geode's cluster filter:** Exists but disabled by default - users can
enable for additional defense-in-depth
**This is the same pattern every distributed system uses:**
- **Spring Security:** FilterChainProxy with multiple filters for different
URL patterns and layers
- **AWS:** Security groups (network) + IAM policies (identity) + S3 bucket
policies (data)
- **Kubernetes:** NetworkPolicies (network) + RBAC (API) +
PodSecurityPolicies (runtime)
- **Database systems:** Connection authentication + SQL authorization +
row-level security
The architectural principle: **Match security controls to trust boundaries,
not to minimize filter count.**
**Current state and impact of PR-7941:**
- **Before PR-7941:**
- HTTP boundary (most exposed): No protection
- Cluster boundary: Optional protection (disabled by default)
- **After PR-7941:**
- HTTP boundary: **Always protected** with SafeDeserializationFilter
- Cluster boundary: Optional protection (unchanged - users can enable if
needed)
This PR closes the critical vulnerability by adding mandatory protection at
the HTTP boundary where the attack occurs. The cluster-level filter remains
available as an optional additional layer for users who want defense-in-depth
against internal threats.
Before this PR, Geode had zero protection at the HTTP boundary - the most
exposed attack surface. This fix adds the missing layer where the vulnerability
exists.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]