On Thu, 2 Feb 2023 12:54:06 GMT, Boris Ulasevich <[email protected]> wrote:

> This is another pull request to replace 
> https://github.com/openjdk/jdk/pull/10025 change which was blocked as not 
> acceptable (see 
> https://github.com/openjdk/jdk/pull/10025#pullrequestreview-1228216330)
> 
> The objections to change #10025 were:
> - specialized algorithm for given data complicates things, makes it hard to 
> learn, test and support
> - algorithm is changed for DebugInfo, and the benefit is only for one type of 
> data
> - statistics of the debug info data can (will) change, breaking the 
> optimization
> 
> The suggestion was:
> - don't change the core algorithm, but add one on top or underneath the 
> existing one, or reuse off-the-shelf zero-reduction schemes such as Cap'n 
> Proto
> 
> With this change I propose a different approach. Instead of bit coding, the 
> sequence of zeros in a data stream is encoded with a special character that 
> normally never appears in Unsinged5 encoding, followed by a byte containing a 
> number of zeros. In this way the updated algorithm is a pure extension of the 
> existing encoding algorithm: data encoded without the zero-reduction trick is 
> unpacked in the same way as before.
> 
> Currently there are several datasets affected by this change: Dependencies 
> info, OopMap info, LineNumber info, Debug info. Only Debug info has a large 
> number of zeros and gets a significant benefit. I experimented with the Cap'n 
> Proto and lz4 algorithms on DebugInfo. The Unsinged5 algorithm has a better 
> compression rate than these.
> 
> DebugInfo data size is reduced by ~20% (actually, 10-30%, depending on the 
> application). Total nmethod size reduction is ~3%.
> 
> Performance impact: Renaisance and DaCapo benchmarks do not show any 
> difference.

src/hotspot/share/code/compressedStream.hpp line 118:

> 116:   bool handle_zero(juint value) {
> 117:     if (value == 0) {
> 118:       _zero_count = (_zero_count == 0xFF) ? 0 : _zero_count;

The case of `_zero_count` overflow is not clear. Apparently, I'm missing 
something here.
Current code is just clearing the previously counted `_zero_count`.
I'd expect some action like storing the current number of zeros or advancing 
the `_position`.
Do you have a test for this?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/12387#discussion_r1254988216

Reply via email to