On Tue, 16 Mar 2021 00:20:34 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

>> Lin Zang has updated the pull request with a new target base due to a merge 
>> or a rebase. The incremental webrev excludes the unrelated changes brought 
>> in by the merge/rebase. The pull request contains five additional commits 
>> since the last revision:
>> 
>>  - Merge branch 'master' into sf
>>  - Revert "reduce memory consumption"
>>    
>>    This reverts commit 70e43ddd453724ce36bf729fa6489c0027957b8e.
>>  - reduce memory consumption
>>  - code refine
>>  - 8262386: resourcehogs/serviceability/sa/TestHeapDumpForLargeArray.java 
>> timed out
>
> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/HeapHprofBinWriter.java
>  line 618:
> 
>> 616:             int bytesToWrite = (int) (longBytes);
>> 617:             
>> hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(bytesToWrite);
>> 618:         }
> 
> It seems to me this is the key part of the fix, and all other changes are 
> driven by this change. What I don't understand is why enabling `writeThrough` 
> is done here in `calculateArrayMaxLength()`, especially since this same code 
> might be executed more than once for the same segment (thus "enabling" 
> `writeThrough` when it is already enabled). What is the actual trigger for 
> wanting `writeThrough` mode? Is it really just seeing an array for the first 
> time in a segment?

Dear Chris, 

> What I don't understand is why enabling `writeThrough` is done here in 
> `calculateArrayMaxLength()`

This is the place where the writer can calculate the array size before the 
array element data are written, so there is no need to cache the array in the 
internal buffer of `SegmentedOutputStream`.
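
To illustrate with a sketch (this is only my illustration of the idea, not 
the actual JDK code; `SegmentSizeFiller`, `typeSize` and `headerSize` are 
made-up names):

```java
import java.io.IOException;

// Hypothetical stand-in for the real stream class, just for this sketch:
interface SegmentSizeFiller {
    void fillSegmentSizeAndEnableWriteThrough(int segmentSize) throws IOException;
}

class ArrayDumpSketch {
    // Sketch of the decision made in calculateArrayMaxLength(): the whole
    // record size is computable before a single element is written.
    static void writeArraySegment(SegmentSizeFiller hprofBufferedOut,
                                  long length, long typeSize, long headerSize)
            throws IOException {
        long longBytes = headerSize + length * typeSize;
        int bytesToWrite = (int) longBytes;
        // Fill the size slot now and let the element data bypass the buffer:
        hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(bytesToWrite);
        // ... element data is then streamed straight to the file ...
    }
}
```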

> especially since this same code might be executed more than once for the same 
> segment (thus "enabling" `writeThrough` when it is already enabled). 

IMO, it is not possible for write-through to be enabled multiple times for the 
same segment. In `fillSegmentSizeAndEnableWriteThrough()` there is a flush 
that first writes all buffered data to the underlying output stream and then 
starts write-through mode.
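
As a rough sketch of what that method does (the size-slot offset and field 
names below are my assumptions, not the real `SegmentedOutputStream`):

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class SegmentedOutSketch extends BufferedOutputStream {
    // Hypothetical offset of the 4-byte length slot in the buffered
    // segment header (illustrative value only).
    private static final int SIZE_SLOT = 5;
    boolean writeThrough = false;

    SegmentedOutSketch(OutputStream out) { super(out); }

    void fillSegmentSizeAndEnableWriteThrough(int size) throws IOException {
        // The segment header is still in the internal buffer, so the size
        // slot can be patched in place before anything reaches the file:
        buf[SIZE_SLOT]     = (byte) (size >>> 24);
        buf[SIZE_SLOT + 1] = (byte) (size >>> 16);
        buf[SIZE_SLOT + 2] = (byte) (size >>> 8);
        buf[SIZE_SLOT + 3] = (byte) size;
        flush();             // drain everything buffered so far first
        writeThrough = true; // later writes can bypass the internal buffer
    }
}
```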
And from the implementation at 
https://github.com/openjdk/jdk/blob/c484d8904285652246c3af212a4211b9a8955149/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/AbstractHeapGraphWriter.java#L62
the method `writeHeapRecordEpilogue()` is called for every iterated object. In 
that method, `exitSegmentMode()` is called, which disables write-through mode.
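
In other words, roughly (an illustration of the per-object flow, not the JDK 
code):

```java
import java.io.IOException;

class HeapIterationSketch {
    private boolean writeThrough;

    // Called once per iterated object by the heap walk:
    void writeHeapObject(Object obj) throws IOException {
        // ... write the object data (write-through if it is a large array) ...
        writeHeapRecordEpilogue();
    }

    void writeHeapRecordEpilogue() throws IOException {
        exitSegmentMode();
    }

    private void exitSegmentMode() {
        // Ends the current segment, so write-through can never stay
        // enabled across two segments:
        writeThrough = false;
    }
}
```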

Briefly, a segment can contain multiple objects, but an array object is 
treated as a new standalone segment, which is written to the underlying output 
stream directly in write-through mode.
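
The two write paths then look roughly like this (again only a sketch with 
made-up names):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WriteDispatchSketch {
    private final OutputStream out;
    private final ByteArrayOutputStream cache = new ByteArrayOutputStream();
    boolean writeThrough;

    WriteDispatchSketch(OutputStream out) { this.out = out; }

    void write(byte[] b, int off, int len) throws IOException {
        if (writeThrough) {
            out.write(b, off, len);   // array segment: stream straight through
        } else {
            cache.write(b, off, len); // normal objects: cache until the
                                      // segment size is known
        }
    }
}
```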

> What is the actual trigger for wanting writeThrough mode? Is it really just 
> seeing an array for the first time in a segment?

The actual trigger for wanting write-through mode is avoiding cached data in 
the internal buffer of `SegmentedOutputStream`: that caching causes the 
excessive memory consumption and the performance problem that are the root 
cause of this bug.

Moreover, the reason data need to be cached in the internal buffer before 
being written to the file is that the current SA heap dumper implementation 
can only fill the size slot in the segment header after the segment data have 
been written. That is hard to do once gzipped heap dumps are introduced, 
because the header and the data are compressed as they are written to the 
file and cannot easily be modified afterwards.
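
For an uncompressed dump file that size slot can be patched retroactively, 
roughly like this (illustrative names and offsets only); with a 
`GZIPOutputStream` no such seek-back is possible because the header bytes are 
already compressed:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class PatchAfterwardsSketch {
    // Patch the 4-byte size slot of a segment header once the segment
    // data have been fully written (uncompressed files only):
    static void patchSegmentSize(RandomAccessFile dumpFile,
                                 long sizeSlotOffset, int segmentSize)
            throws IOException {
        long end = dumpFile.getFilePointer(); // remember end of segment data
        dumpFile.seek(sizeSlotOffset);        // jump back to the size slot
        dumpFile.writeInt(segmentSize);       // fill in the real size
        dumpFile.seek(end);                   // continue after the segment
    }
}
```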

However, the current implementation of dumping an array object calculates the 
array size first, so the size of the segment data is already known when the 
segment header is created. This fix uses that information to avoid caching the 
array data.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2803
