msfroh opened a new issue, #15191: URL: https://github.com/apache/lucene/issues/15191
### Description

I was reviewing a PR that dealt with some Protobuf today and saw that their `ByteString` class (which is pretty similar to Lucene's `BytesRef`) is abstract with a pair of concrete subclasses -- `LiteralByteString` and `BoundedByteString`. The `BoundedByteString` has length and offset members, while `LiteralByteString` is just a wrapper around `byte[]`. As a result, `LiteralByteString` is a whole 8 bytes smaller. Woohoo!

This got me thinking -- how many `BytesRef` instances out there have `offset == 0` and `length == bytes.length`? Within a lot of "hot" Lucene code, I believe the answer is "not many", since we do a **very** good job of reusing `BytesRef` instances forever. That said, all of the `Term` constructors end up producing `BytesRef`s of known (fixed) length. So the potential benefit is clearly non-zero. (Maybe close to zero?)

I'm thinking of trying to make `BytesRef` `abstract` and `sealed` with a pair of subclasses, similar to the Protobuf approach. Obviously, this means replacing direct field access with getters (and setters), but I think those can be bimorphically inlined.
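
To make the idea concrete, here is a minimal sketch of what a sealed two-subclass layout could look like. It is not Lucene code: the names `Bytes`, `Whole`, `Slice`, `of` and `byteAt` are hypothetical, and the sketch is immutable for brevity, whereas the real `BytesRef` is mutable and would also need setters. It assumes Java 17 or later for `sealed`.

```java
import java.util.Objects;

// Hypothetical stand-in for an abstract, sealed BytesRef; not part of Lucene.
// With no permits clause, the permitted subclasses are the ones declared in
// this same compilation unit.
abstract sealed class Bytes {
    protected final byte[] bytes;

    private Bytes(byte[] bytes) {
        this.bytes = Objects.requireNonNull(bytes);
    }

    // Getters replace direct field access; only two implementations exist,
    // so a call site sees at most two receiver types.
    abstract int offset();

    abstract int length();

    final byte[] bytes() {
        return bytes;
    }

    // Reads the i-th byte of the logical range through the getters.
    final byte byteAt(int i) {
        Objects.checkIndex(i, length());
        return bytes[offset() + i];
    }

    // Picks the smaller representation when the range covers the whole array.
    static Bytes of(byte[] bytes, int offset, int length) {
        Objects.checkFromIndexSize(offset, length, bytes.length);
        if (offset == 0 && length == bytes.length) {
            return new Whole(bytes);
        }
        return new Slice(bytes, offset, length);
    }

    // Covers the entire array: no offset or length fields are stored.
    static final class Whole extends Bytes {
        Whole(byte[] bytes) {
            super(bytes);
        }

        @Override
        int offset() {
            return 0;
        }

        @Override
        int length() {
            return bytes.length;
        }
    }

    // Covers a sub-range: carries two extra int fields.
    static final class Slice extends Bytes {
        private final int offset;
        private final int length;

        Slice(byte[] bytes, int offset, int length) {
            super(bytes);
            this.offset = offset;
            this.length = length;
        }

        @Override
        int offset() {
            return offset;
        }

        @Override
        int length() {
            return length;
        }
    }
}
```

A caller would go through the factory, for example `Bytes.of(data, 0, data.length)`, and never name a subclass directly. The actual per-instance saving depends on the JVM's object layout and alignment, so the 8-byte figure quoted above for Protobuf should be measured rather than assumed for any Lucene change.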
