Data Oriented Programming, Beyond Records

Brian Goetz Tue, 13 Jan 2026 13:53:33 -0800

Here's a snapshot of where my head is at with respect to extending the record goodies (including pattern matching) to a broader range of classes, deconstructors for classes and interfaces, and compatible evolution of records. Hopefully this will unblock quite a few things.


As usual, let's discuss concepts and directions rather than syntax.




# Data-oriented Programming for Java: Beyond records

Everyone loves records; they allow us to create shallowly immutable data holder
classes -- which we can think of as "nominal tuples" -- derived from a concise
state description, and to destructure records through pattern matching.  But
records have strict constraints, and not all data holder classes fit into the
restrictions of records.  Maybe they have some mutable state, or derived or
cached state that is not part of the state description, or their representation
and their API do not match up exactly, or they need to break up their state
across a hierarchy.  In these classes, even though they may also be “data
holders”, the user experience is like falling off a cliff.  Even a small
deviation from the record ideal means one has to go back to a blank slate and
write explicit constructor declarations, accessor method declarations, and
Object method implementations -- and give up on destructuring through pattern
matching.

Since the start of the design process for records, we’ve kept in mind the goal
of enabling a broader range of classes to gain access to the "record goodies":
reduced declaration burden, participating in destructuring, and soon,
[reconstruction](https://openjdk.org/jeps/468). During the design of records, we
also explored a number of weaker semantic models that would allow for greater
flexibility. While at the time they all failed to live up to the goals _for
records_, there is a weaker set of semantic constraints we can impose that
allows for more flexibility and still enables the features we want, along with
some degree of syntactic concision that is commensurate with the distance from
the record-ideal, without fall-off-the-cliff behaviors.

Records, sealed classes, and destructuring with record patterns constitute the
first feature arc of "data-oriented programming" for Java. After considering
numerous design ideas, we're now ready to move forward with the next
"data-oriented programming" feature arc: _carrier classes_ (and interfaces.)

## Beyond record patterns

Record patterns allow a record instance to be destructured into its components.
Record patterns can be used in `instanceof` and `switch`, and when a record
pattern is also exhaustive, will be usable in the upcoming [_pattern assignment
statement_](https://mail.openjdk.org/pipermail/amber-spec-experts/2026-January/004306.html)
feature.

In exploring the question "how will classes be able to participate in the same
sort of destructuring as records", we had initially focused on a new form of
declaration in a class -- a "deconstructor" -- that operated as a constructor in
reverse. Just as a constructor takes component values and produces an aggregate
instance, a deconstructor would take an aggregate instance and recover its
component values.

But as this exploration played out, the more interesting question turned out to
be: which classes are suitable for destructuring in the first place? And the
answer to that question led us to a different approach for expressing
deconstruction. The classes that are suitable for destructuring are those that,
like records, are little more than carriers for a specific tuple of data. This
is not just a thing that a class _has_, like a constructor or method, but
something a class _is_.  And as such, it makes more sense to describe
deconstruction as a top-level property of a class. This, in turn, leads to a
number of simplifications.

## The power of the state description

Records are a semantic feature; they are only incidentally concise. But they
_are_ concise; when we declare a record

    record Point(int x, int y) { ... }

we automatically get a sensible API (canonical constructor, deconstruction
pattern, accessor methods for each component) and implementation (fields,
constructor, accessor methods, Object methods.) We can explicitly specify most
of these (except the fields) if we like, but most of the time we don't have to,
because the default is exactly what we want.

A record is a shallowly-immutable, final class whose API and representation are
_completely defined_ by its _state description_.  (The slogan for records is
"the state, the whole state, and nothing but the state.") The state description
is the ordered list of _record components_ declared in the record's header. A
component is more than a mere field or accessor method; it is an API element on
its own, describing a state element that instances of the class have.

The state description of a record has several desirable properties:

 - The components, in the order specified, are the _canonical_ description of
   the record's state.
 - The components are the _complete_ description of the record’s state.
 - The components are _nominal_; their names are a committed part of the
   record's API.

Records derive their benefits from making two commitments:

 - The _external_ commitment that the data-access API of a record (constructor,
   deconstruction pattern, and component accessor methods) is defined by the
   state description.
 - The _internal_ commitment that the _representation_ of the record (its
   fields) is also completely defined by the state description.

These semantic properties are what enable us to derive almost everything about
records. We can derive the API of the canonical constructor because the state
description is canonical.  We can derive the API for the component accessor
methods because the state description is nominal.  And we can derive a
deconstruction pattern from the accessor methods because the state description
is complete (along with sensible implementations for the state-related `Object`
methods.)

The internal commitment that the state description is also the representation
allows us to completely derive the rest of the implementation. Records get a
(private, final) field for each component, but more importantly, there is a
clear mapping between these fields and their corresponding components, which is
what allows us to derive the canonical constructor and accessor method
implementations.
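To see how much falls out of the state description, here is a minimal sketch in
today's Java (21 or later). `PointDemo` and its `sum` method are names made up
for illustration; everything else it uses is derived from the one-line record
declaration.

```java
// The entire API and implementation below come from this state description.
record Point(int x, int y) { }

class PointDemo {
    // Uses the derived deconstruction pattern.
    static int sum(Object o) {
        if (o instanceof Point(int x, int y)) {
            return x + y;
        }
        return 0;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);                      // derived canonical constructor
        System.out.println(p.x() + ", " + p.y());       // derived accessor methods
        System.out.println(p.equals(new Point(1, 2)));  // derived, state-based equals
        System.out.println(sum(p));                     // derived deconstruction pattern
    }
}
```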

Records can additionally declare a _compact constructor_ that allows us to elide
the boilerplate aspects of record constructors -- the argument list and field
assignments -- and just specify the code that is _not_ mechanically derivable.
This is more concise, less error-prone, and easier to read:

    record Rational(int num, int denom) {
        Rational {
            if (denom == 0)
                throw new IllegalArgumentException("denominator cannot be zero");
        }
    }

is shorthand for the more explicit

    record Rational(int num, int denom) {
        Rational(int num, int denom) {
            if (denom == 0)
                throw new IllegalArgumentException("denominator cannot be zero");
            this.num = num;
            this.denom = denom;
        }
    }

While compact constructors are pleasantly concise, the more important benefit is
that by eliminating the mechanically derivable code, the "more interesting" code
comes to the fore.
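As an illustration of how the "interesting" code stands out, here is a sketch of
a compact constructor that also normalizes its components. The normalization
policy (lowest terms, sign carried on the numerator) and the `gcd` helper are
assumptions of this example, not part of the `Rational` shown above.

```java
record Rational(int num, int denom) {
    Rational {
        if (denom == 0)
            throw new IllegalArgumentException("denominator cannot be zero");
        // Reassign the constructor parameters; the fields are still
        // initialized implicitly from their ending values.
        int g = gcd(Math.abs(num), Math.abs(denom));
        if (denom < 0)
            g = -g;             // keep the sign on the numerator
        num /= g;
        denom /= g;
    }

    // Euclid's algorithm; a helper assumed for this sketch.
    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}
```

Only the validation and normalization appear in the source; the parameter list
and the field assignments remain implicit.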

Looking ahead, the state description is a gift that keeps on giving.  These
semantic commitments are enablers for a number of potential future language and
library features for managing object lifecycle, such as:

 - [Reconstruction](https://openjdk.org/jeps/468) of record instances, allowing
   the appearance of controlled mutation of record state.
 - Automatic marshalling and unmarshalling of record instances.
 - Instantiating or destructuring record instances identifying components
   nominally rather than positionally.

### Reconstruction

JEP 468 proposes a mechanism by which a new record instance can be derived from
an existing one using syntax that is evocative of direct mutation, via a `with`
expression:

    record Complex(double re, double im) { }
    Complex c = ...
    Complex cConjugate = c with { im = -im; };

The block on the right side of `with` can contain any Java statements, not just
assignments. The block is enhanced with mutable variables (_component
variables_) for each component of the record, initialized to the value of that
component in the record instance on the left; the block is executed, and a new
record instance is created whose component values are the ending values of the
component variables.

A reconstruction expression implicitly destructures the record instance using
the canonical deconstruction pattern, executes the block in a scope enhanced
with the component variables, and then creates a new record using the canonical
constructor. Invariant checking is centralized in the canonical constructor, so
if the new state is not valid, the reconstruction will fail.  JEP 468 has been
"on hold" for a while, primarily because we were waiting for sufficient
confidence that there was a path to extending it to suitable classes before
committing to it for records. The ideal path would be for those classes to also
support a notion of canonical constructor and deconstruction pattern.

Careful readers will note a similarity between the transformation block of a
`with` expression and the body of a compact constructor.  In both cases, the
block is "preloaded" with a set of component variables, initialized to suitable
starting values, the block can mutate those variables as desired, and upon
normal completion of the block, those variables are passed to a canonical
constructor to produce the final result.  The main difference is where the
starting values come from; for a compact constructor, it is from the constructor
parameters, and for a reconstruction expression, it is from the canonical
deconstruction pattern of the source record to the left of `with`.
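Since `with` is not yet available, the steps described above can be approximated
by hand in today's Java (21 or later). This is only a sketch of the described
semantics, not the actual translation; `WithSketch` and `conjugate` are
hypothetical names, and rejecting `null` with an exception is an assumption of
this sketch.

```java
record Complex(double re, double im) { }

class WithSketch {
    // Hand-written approximation of:  c with { im = -im; }
    static Complex conjugate(Complex c) {
        // 1. Destructure using the canonical deconstruction pattern.
        if (!(c instanceof Complex(double re, double im))) {
            throw new NullPointerException("cannot reconstruct null");
        }
        // 2. Run the block against the (mutable) component variables.
        im = -im;
        // 3. Pass the ending values to the canonical constructor, which is
        //    where any invariant checks would run again.
        return new Complex(re, im);
    }
}
```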

### Breaking down the cliff

Records make a strong semantic commitment to derive both their API and
representation from the state description, and in return get a lot of help from
the language.  We can now turn our attention to smoothing out "the cliff" --
identifying weaker semantic commitments that classes can make that would still
allow classes to get _some_ help from the language. And ideally, the amount of
help you give up would be proportional to the degree of deviation from the
record ideal.

With records, we got a lot of mileage out of having a complete, canonical,
nominal state description.  Where the record contract is sometimes too
constraining is the _implementation_ contract that the representation aligns
exactly with the state description, that the class is final, that the fields are
final, and that the class may not extend anything but `Record`.

Our path here takes one step back and one step forward: keeping the external
commitment to the state description, but dropping the internal commitment that
the state description _is_ the representation -- and then _adding back_ a simple
mechanism for mapping fields representing components back to their corresponding
components, where practical.  (With records, because we derive the
representation from the state description, this mapping can be safely inferred.)

As a thought experiment, imagine a class that makes the external commitment to a
state description -- that the state description is a complete, canonical,
nominal description of its state -- but is on its own to provide its
representation. What can we do for such a class? Quite a bit, actually. For
all the same reasons we can for records, we can derive the API requirement for a
canonical constructor and component accessor methods. From there, we can derive
both the requirement for a canonical deconstruction pattern, and also the
implementation of the deconstruction pattern (as it is implemented in terms of
the accessor methods). And since the state description is complete, we can
further derive sensible default implementations of the Object methods `equals`,
`hashCode`, and `toString` in terms of the accessor methods as well. And given
that there is a canonical constructor and deconstruction pattern, it can also
participate in reconstruction.  The author would just have to provide the
fields, accessor methods, and canonical constructor. This is good progress, but
we'd like to do better.

What enables us to derive the rest of the implementation for records (fields,
constructor, accessor methods, and Object methods) is the knowledge of how the
representation maps to the state description.  Records commit to their state
description _being_ the representation, so it is a short leap from there to a
complete implementation.

To make this more concrete, let's look at a typical "almost record" class, a
carrier for the state description `(int x, int y, Optional<String> s)` but which
has made the representation choice to internally store `s` as a nullable
`String`.

```
class AlmostRecord {
    private final int x;
    private final int y;
    private final String s;                                 // *

    public AlmostRecord(int x, int y, Optional<String> s) {
        this.x = x;
        this.y = y;
        this.s = s.orElse(null);                            // *
    }

    public int x() { return x; }
    public int y() { return y; }
    public Optional<String> s() {
        return Optional.ofNullable(s);                      // *
    }

    public boolean equals(Object other) { ... }     // derived from x(), y(), s()
    public int hashCode() { ... }                   //    "
    public String toString() { ... }                //    "
}
```

The main differences between this class and the expansion of its record analogue
are the lines marked with a `*`; these are the ones that deal with the disparity
between the state description and the actual representation. It would be nice
if the author of this class _only_ had to write the code that was different from
what we could derive for a record; not only would this be pleasantly concise,
but it would mean that all the code that _is_ there exists to capture the
differences between its representation and its API.
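For concreteness, the three elided `Object` methods might be filled in as
follows, going through the accessors rather than the fields. This is one
plausible hand-written version; the `toString` format and the use of
`Objects.hash` are assumptions of this sketch, not what a derived implementation
is specified to do.

```java
import java.util.Objects;
import java.util.Optional;

class AlmostRecord {
    private final int x;
    private final int y;
    private final String s;     // nullable representation of Optional<String> s

    public AlmostRecord(int x, int y, Optional<String> s) {
        this.x = x;
        this.y = y;
        this.s = s.orElse(null);
    }

    public int x() { return x; }
    public int y() { return y; }
    public Optional<String> s() { return Optional.ofNullable(s); }

    // All three methods below consult only the accessors x(), y(), s().
    @Override
    public boolean equals(Object other) {
        return other instanceof AlmostRecord that
                && x() == that.x()
                && y() == that.y()
                && s().equals(that.s());
    }

    @Override
    public int hashCode() {
        return Objects.hash(x(), y(), s());
    }

    @Override
    public String toString() {
        return "AlmostRecord[x=" + x() + ", y=" + y() + ", s=" + s() + "]";
    }
}
```

Because equality is phrased in terms of `s()`, the nullable-`String`
representation never leaks into the observable behavior.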

## Carrier classes

A _carrier class_ is a normal class declared with a state description. As with
a record, the state description is a complete, canonical, nominal description of
the class's state. In return, the language derives the same API constraints as
it does for records: canonical constructor, canonical deconstruction pattern,
and component accessor methods.

    class Point(int x, int y) {                // class, not record!
        // explicitly declared representation
        ...

        // must have a constructor taking (int x, int y)
        // must have accessors for x and y
        // supports a deconstruction pattern yielding (int x, int y)
    }

Unlike a record, the language makes no assumptions about the object's
representation; the class author has to declare that just as with any other
class.

Saying the state description is "complete" means that it carries all the
“important” state of the class -- if we were to extract this state and recreate
the object, that should yield an “equivalent” instance. As with records, this
can be captured by tying together the behavior of construction, accessors, and
equality:

```
Point p = ...
Point q = new Point(p.x(), p.y());
assert p.equals(q);
```

We can also derive _some_ implementation from the information we have so far; we
can derive sensible implementations of the `Object` methods (implemented in
terms of component accessor methods) and we can derive the canonical
deconstruction pattern (again in terms of the component accessor methods). And
from there, we can derive support for reconstruction (`with` expressions.)
Unfortunately, we cannot (yet) derive the bulk of the state-related
implementation: the canonical constructor and component accessor methods.

### Component fields and accessor methods

One of the most tedious aspects of data-holder classes is the accessor methods;
there are often many of them, and they are almost always pure boilerplate. Even
though IDEs can reduce the writing burden by generating these for us, readers
still have to slog through a lot of low-information code -- just to learn that
they didn't actually need to slog through that code after all. We can derive
the implementation of accessor methods for records because records make the
internal commitment that the components are all backed with individual fields
whose name and type align with the state description.

For a carrier class, we don't know whether _any_ of the components are directly
backed by a single field that aligns to the name or type of the component. But
it is a pretty good bet that many carrier classes will do exactly this for at
least _some_ of their components.  If we can tell the language that this
correspondence is not merely accidental, the language can do more for us.

We do so by allowing suitable fields of a carrier class to be declared as
`component` fields. (As usual at this stage, syntax is provisional, but not
currently a topic for discussion.) A component field must have the same name
and type as a component of the current class (though it need not be `private` or
`final`, as record fields are.)  This signals that this field _is_ the
representation for the corresponding component, and hence we can derive the
accessor method for this component as well.

```
class Point(int x, int y) {
    private /* mutable */ component int x;
    private /* mutable */ component int y;

    // must have a canonical constructor, but (so far) must be explicit
    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // derived implementations of accessors for x and y
    // derived implementations of equals, hashCode, toString
}
```

This is getting better; the class author had to bring the representation and the
mapping from representation to components (in the form of the `component`
modifier), and the canonical constructor.

### Compact constructors

Just as we are able to derive the accessor method implementation if we are
given an explicit correspondence between a field and a component, we can do the
same for constructors.  For this, we build on the notion of _compact
constructors_ that was introduced for records.

As with a record, a compact constructor in a carrier class is a shorthand for a
canonical constructor, which has the same shape as the state description, but
which is freed of the responsibility of actually committing the ending value of
the component parameters to the fields.  The main difference is that for a
record, _all_ of the components are backed by a component field, whereas for a
carrier class, only some of them might be.  But we can generalize compact
constructors by freeing the author of the responsibility to initialize the
_component_ fields, while leaving them responsible for initializing the rest of
the fields. In the limiting case where all components are backed by component
fields, and there is no other logic desired in the constructor, the compact
constructor may be elided.

For our mutable `Point` class, this means we can elide nearly everything, except
the field declarations themselves:

```
class Point(int x, int y) {
    private /* mutable */ component int x;
    private /* mutable */ component int y;

    // derived compact constructor
    // derived accessors for x, y
    // derived implementations of equals, hashCode, toString
}
```

We can think of this class as having an implicit empty compact constructor,
which in turn means that the component fields `x` and `y` are initialized from
their corresponding constructor parameters. There are also implicitly derived
accessor methods for each component, and implementations of `Object` methods
based on the state description.

This is great for a class where all the components are backed by fields, but
what about our `AlmostRecord` class?  The story here is good as well; we can
derive the accessor methods for the components backed by component fields, and
we can elide the initialization of the component fields from the compact
constructor, meaning that we _only_ have to specify the code for the parts that
deviate from the "record ideal":

```
class AlmostRecord(int x,
                   int y,
                   Optional<String> s) {

    private final component int x;
    private final component int y;
    private final String s;

    public AlmostRecord {
        this.s = s.orElse(null);
        // x and y fields implicitly initialized
    }

    public Optional<String> s() {
        return Optional.ofNullable(s);
    }

    // derived implementation of x and y accessors
    // derived implementation of equals, hashCode, toString
}
```

Because so many real-world almost-records differ from their record ideal in
minor ways, we expect to get a significant concision benefit for most carrier
classes, as we did for `AlmostRecord`.  As with records, if we want to
explicitly implement the constructor, accessor methods, or `Object` methods, we
are still free to do so.

### Derived state

One of the most frequent complaints about records is the inability to derive
state from the components and cache it for fast retrieval.  With carrier
classes, this is simple: declare a non-component field for the derived quantity,
initialize it in the constructor, and provide an accessor:

```
class Point(int x, int y) {
    private final component int x;
    private final component int y;
    private final double norm;

    Point {
        norm = Math.hypot(x, y);
    }

    public double norm() { return norm; }

    // derived implementation of x and y accessors
    // derived implementation of equals, hashCode, toString
}
```
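Written out by hand in today's Java, this class might look roughly like the
sketch below. It assumes that the derived `equals` and `hashCode` consult only
the components `x` and `y` (which is what completeness of the state description
suggests), with `norm` acting purely as a cache; the `toString` format is
likewise an assumption.

```java
import java.util.Objects;

class Point {
    private final int x;
    private final int y;
    private final double norm;      // derived and cached; not a component

    Point(int x, int y) {
        this.x = x;                 // would be implicit in the compact constructor
        this.y = y;                 // would be implicit in the compact constructor
        this.norm = Math.hypot(x, y);
    }

    public int x() { return x; }    // would be derived
    public int y() { return y; }    // would be derived
    public double norm() { return norm; }

    // Only the components take part in equality and hashing.
    @Override
    public boolean equals(Object other) {
        return other instanceof Point that
                && x() == that.x()
                && y() == that.y();
    }

    @Override
    public int hashCode() {
        return Objects.hash(x(), y());
    }

    @Override
    public String toString() {
        return "Point[x=" + x() + ", y=" + y() + "]";
    }
}
```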

### Deconstruction and reconstruction

Like records, carrier classes automatically acquire deconstruction patterns that
match the canonical constructor, so we can destructure our `Point` class as if
it were a record:

    case Point(var x, var y):

Because reconstruction (`with`) derives from a canonical constructor and
corresponding deconstruction pattern, when we support reconstruction of records,
we will also be able to do so for carrier classes:

    point = point with { x = 3; }

## Carrier interfaces

A state description makes sense on interfaces as well. It makes the statement
that the state description is a complete, canonical, nominal description of the
interface's state (subclasses are allowed to add additional state), and
accordingly, implementations must provide accessor methods for the components.
This enables such interfaces to participate in pattern matching:

```
interface Pair<T,U>(T first, U second) {
    // implicit abstract accessors for first() and second()
}

...

if (o instanceof Pair(var a, var b)) { ... }
```
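In today's Java, roughly the same effect has to be spelled out by hand, with the
deconstruction expressed in terms of the accessor methods. `PairImpl`,
`PairDemo`, and `describe` are illustrative names used only for this sketch.

```java
interface Pair<T, U> {
    T first();      // would be an implicit abstract accessor
    U second();     // would be an implicit abstract accessor
}

record PairImpl<T, U>(T first, U second) implements Pair<T, U> { }

class PairDemo {
    // Hand-written stand-in for:  if (o instanceof Pair(var a, var b)) { ... }
    static String describe(Object o) {
        if (o instanceof Pair<?, ?> p) {
            var a = p.first();
            var b = p.second();
            return a + " / " + b;
        }
        return "not a pair";
    }
}
```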

Along with the upcoming feature for pattern assignment in foreach-loop headers,
if `Map.Entry` became a carrier interface (which it will), we would be able to
iterate a `Map` like:

    for (Map.Entry(var key, var val) : map.entrySet()) { ... }

It is a common pattern in libraries to export an interface that is sealed to a
single private implementation.  In this pattern, the interface and
implementation can share a common state description:

```
public sealed interface Pair<T,U>(T first, U second) { }

private record PairImpl<T, U>(T first, U second) implements Pair<T, U> { }
```

Compared to the old way of doing this, we get enhanced semantics, better type
checking, and more concision.

### Extension

The main obligation of a carrier class author is to ensure that the fundamental
claim -- that the state description is a complete, canonical, nominal
description of the object's state -- is actually true. This does not rule out
having the representation of a carrier class spread out over a hierarchy, so
unlike records, carrier classes are not required to be final or concrete, nor
are they restricted in their extension.

There are several cases that arise when carrier classes can participate in
extension:

 - A carrier class extends a non-carrier class;
 - A non-carrier class extends a carrier class;
 - A carrier class extends another carrier class, where all of the superclass
   components are subsumed by the subclass state description;
 - A carrier class extends another carrier class, but there are one or more
   superclass components that are not subsumed by the subclass state
   description.

Extending a non-carrier class with a carrier class will usually be motivated by
the desire to "wrap" a state description around an existing hierarchy which we
cannot or do not want to modify directly, but we wish to gain the benefits of
deconstruction and reconstruction. Such an implementation would have to ensure
that the class actually conforms to the state description, and that the
canonical constructor and component accessors are implemented.

When one carrier class extends another, the more straightforward case is that it
simply adds new components to the state description of the superclass.  For
example, given our `Point` class:

```
class Point(int x, int y) {
    component int x;
    component int y;

    // everything else for free!
}
```

we can use this as the base class for a 3d point class:

```
class Point3d(int x, int y, int z) extends Point {
    component int z;

    Point3d {
        super(x, y);
    }
}
```

In this case -- because the superclass components are all part of the subclass
state description -- we can actually omit the constructor as well, because we
can derive the association between subclass components and superclass
components, and thereby derive the needed super-constructor invocation.  So we
could actually write:

```
class Point3d(int x, int y, int z) extends Point {
    component int z;

    // everything else for free!
}
```

One might think that we would need some marking on the `x` and `y` components of
`Point3d` to indicate that they map to the corresponding components of `Point`,
as we did for associating component fields with their corresponding components.
But in this case, we need no such marking, because there is no way that an
`int x` component of `Point` and an `int x` component of its subclass could
possibly refer to different things -- since they both are tied to the same
`int x()` accessor method. So we can safely infer which subclass components are
managed by superclasses, just by matching up their names and types.
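Spelled out in today's Java, the inferred association amounts to something like
the following sketch. The `private final` fields are an assumption made to keep
the example small (the carrier versions above declare plain `component` fields),
and the `Object` methods are omitted.

```java
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int x() { return x; }
    public int y() { return y; }
}

class Point3d extends Point {
    private final int z;

    Point3d(int x, int y, int z) {
        super(x, y);    // derivable: x and y match superclass components by name and type
        this.z = z;
    }

    // x() and y() are inherited, so in Point3d they cannot mean anything
    // other than what they mean in Point.
    public int z() { return z; }
}
```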

In the other carrier-to-carrier extension case, where one or more superclass
components are _not_ subsumed by the subclass state description, it is necessary
to provide an explicit `super` constructor call in the subclass constructor.

A carrier class may also be declared abstract; the main effect of this is that
we will not derive `Object` method implementations, instead leaving that for the
subclass to do.

### Abstract records

This framework also gives us an opportunity to relax one of the restrictions on
records: that records can't extend anything other than `java.lang.Record`. We
can also allow records to be declared `abstract`, and for records to extend
abstract records.

Just as with carrier classes that extend other carrier classes, there are two
cases: when the component list of the superclass is entirely contained within
that of the subclass, and when one or more superclass components are derived
from subclass components (or are constant), but are not components of the
subclass itself.  And just as with carrier classes, the main difference is
whether an explicit `super` call is required in the subclass constructor.

When a record extends an abstract record, any components of the subclass that
are also components of the superclass do not implicitly get component fields in
the subclass (because they are already in the superclass), and they inherit the
accessor methods from the superclass.

### Records are carriers too

With this framework in place, records can now be seen to be "just" carrier
classes that are implicitly final, extend `java.lang.Record`, implicitly have
private final component fields for each component, and can have no other
fields.

## Migration compatibility

There will surely be some existing classes that would like to become carrier
classes. This is a compatible migration as long as none of the mandated members
conflict with existing members of the class, and the class adheres to the
requirement that the state description is a complete, canonical, and nominal
description of the object state.

### Compatible evolution of records and carrier classes

To date, libraries have been reluctant to use records in public APIs because
of the difficulty of evolving them compatibly.  For a record:

```
record R(A a, B b) { }
```

that wants to evolve by adding new components:

```
record R(A a, B b, C c, D d) { }
```

we have several compatibility challenges to manage.  As long as we are only
adding and not removing/renaming, accessor method invocations will continue to
work. And existing constructor invocations can be allowed to continue to work
by explicitly adding back a constructor that has the old shape:

```
record R(A a, B b, C c, D d) {

    // Explicit constructor for old shape required
    public R(A a, B b) {
        this(a, b, DEFAULT_C, DEFAULT_D);
    }

}
```

But what can we do about existing uses of record _patterns_? While the
translation of record patterns would make adding components binary-compatible,
it would not be source-compatible, and there is no way to explicitly add a
deconstruction pattern for the old shape as we did with the constructor.

We can take advantage of the simplification offered by there being _only_ the
canonical deconstruction pattern, and allow uses of deconstruction patterns to
supply nested patterns for any _prefix_ of the component list.  So for the
evolved record R:

    case R(P1, P2)

would be interpreted as:

    case R(P1, P2, _, _)

where `_` is the match-all pattern. This means that one can compatibly evolve a
record by only adding new components at the end, and adding a suitable
constructor for compatibility with existing constructor invocations.
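The prefix rule cannot be expressed in today's Java, but its intended meaning
can be sketched (Java 21 or later) by writing out the full pattern and ignoring
the trailing bindings. The component types, the default values, and the
`PrefixDemo` class are placeholders invented for this example.

```java
record R(String a, Integer b, String c, Integer d) {
    static final String DEFAULT_C = "";     // placeholder default
    static final Integer DEFAULT_D = 0;     // placeholder default

    // Explicit constructor for the old shape.
    R(String a, Integer b) {
        this(a, b, DEFAULT_C, DEFAULT_D);
    }
}

class PrefixDemo {
    // Under the proposal, a client could keep writing `case R(var a, var b)`;
    // today the trailing components must be matched explicitly and ignored.
    static String firstTwo(Object o) {
        return switch (o) {
            case R(var a, var b, var ignoredC, var ignoredD) -> a + ":" + b;
            default -> "not an R";
        };
    }
}
```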
