On Mon, 10 May 2021 23:19:37 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

>> Hi Chris (@plummercj )
>> 
>> After several tries, I think the current patch may be the better way to 
>> fix this issue. 
>> In detail, there are three issues that cause the problem. 
>> - The incorrectly calculated array length when segmented heap dump is 
>> enabled; it can be fixed easily by modifying the code in 
>> `calculateArrayMaxLength()`. That fix is already in this PR.
>> - The internal segment buffer used for segmented heap dump. Resizing 
>> this buffer adds memory consumption because of byte[] copies. 
>> - Because the internal segment buffer is a byte array, its length cannot 
>> be larger than Integer.MAX_VALUE. This also causes problems when dumping 
>> a large array: the computed size can overflow an int, and the limit does 
>> not align with the hprof specification (as explained below; see also the 
>> sketch after this list).
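>> 
>> To make the limits concrete, here is a minimal, self-contained sketch 
>> (hypothetical names and sizes, not the actual writer code) showing how the 
>> byte size of a large primitive array can fit the hprof u4 segment size 
>> slot and yet exceed what a single byte[] buffer, or an int, can hold:
>> 
>> ```java
>> public class SegmentLimitSketch {
>>     static final long MAX_U4_VALUE = 0xFFFFFFFFL;   // u4 segment size slot in hprof
>> 
>>     public static void main(String[] args) {
>>         long elements    = 400_000_000L;  // a large but legal long[] length
>>         long elementSize = Long.BYTES;
>>         long headerSize  = 18;            // illustrative sub-record header size
>> 
>>         long segmentBytes = headerSize + elements * elementSize;
>> 
>>         System.out.println("segment bytes           = " + segmentBytes);
>>         System.out.println("fits hprof u4 size slot = " + (segmentBytes <= MAX_U4_VALUE));
>>         System.out.println("fits a single byte[]    = " + (segmentBytes <= Integer.MAX_VALUE));
>>         // The cast below wraps around to a negative value, i.e. the overflow.
>>         System.out.println("(int) cast gives        = " + (int) segmentBytes);
>>     }
>> }
>> ```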
>> 
>> Here are the possible solutions I tried in order to avoid `unbufferedMode`:
>> 
>> - For the segmented heap dump, add a limit in `calculateArrayMaxLength()` 
>> to guarantee that at most `Integer.MAX_VALUE` bytes can be filled into the 
>> internal byte array buffer, so that `unbufferedMode` is not needed. But I 
>> found it can cause large memory usage, because the internal buffer is 
>> resized to hold all the array data, and an OOM is thrown when I use the 
>> `jhsdb jmap --binaryheap` command. Moreover, it changes the behavior of 
>> dumping a large array: the original limit is MAX_U4_VALUE, which follows 
>> the hprof spec in which a heap dump segment has a 32-bit slot to hold the 
>> segment size. After the change the limit becomes `Integer.MAX_VALUE`, 
>> which is 2^31-1, only half of MAX_U4_VALUE, so the behavior no longer 
>> follows the spec exactly. 
>> The change also introduces many overflow checks wherever the size of the 
>> internal segment buffer is used, which IMHO is not clean.
>> 
>> - I also tried to avoid resizing the internal segment buffer by using a 
>> linked list to hold all the data to be flushed, which avoids the overflow 
>> checks (a rough sketch of the idea follows below). But it still causes an 
>> OOM, since all the data is held in the list, and it also introduces more 
>> code for the linked-list logic.
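>> 
>> For reference, a rough sketch of that chunked-list idea (hypothetical 
>> code, not the exact experiment): it avoids resizing a single byte[] and 
>> the Integer.MAX_VALUE limit, but the whole segment still sits in memory 
>> until it is flushed, which is where the OOM came from:
>> 
>> ```java
>> import java.io.IOException;
>> import java.io.OutputStream;
>> import java.util.ArrayList;
>> import java.util.List;
>> 
>> class ChunkedSegmentBuffer {
>>     private static final int CHUNK_SIZE = 1 << 20;   // 1 MB chunks
>>     private final List<byte[]> chunks = new ArrayList<>();
>>     private int posInLast = CHUNK_SIZE;               // forces the first allocation
>>     private long total = 0;
>> 
>>     void write(byte[] src, int off, int len) {
>>         while (len > 0) {
>>             if (posInLast == CHUNK_SIZE) {            // last chunk is full
>>                 chunks.add(new byte[CHUNK_SIZE]);
>>                 posInLast = 0;
>>             }
>>             byte[] last = chunks.get(chunks.size() - 1);
>>             int n = Math.min(len, CHUNK_SIZE - posInLast);
>>             System.arraycopy(src, off, last, posInLast, n);
>>             posInLast += n;
>>             off += n;
>>             len -= n;
>>             total += n;
>>         }
>>     }
>> 
>>     long size() { return total; }                     // may exceed Integer.MAX_VALUE
>> 
>>     // Everything stays resident until this point, hence the OOM risk.
>>     void flushTo(OutputStream out) throws IOException {
>>         for (int i = 0; i < chunks.size(); i++) {
>>             boolean isLast = (i == chunks.size() - 1);
>>             out.write(chunks.get(i), 0, isLast ? posInLast : CHUNK_SIZE);
>>         }
>>         chunks.clear();
>>         posInLast = CHUNK_SIZE;
>>         total = 0;
>>     }
>> }
>> ```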
>> 
>> The main reason for caching data in the internal segment buffer is that 
>> the current implementation needs to update the `size` slot in the segment 
>> header, which must be done before the data is flushed to the file; 
>> otherwise it would cause problems when a gzipped heap dump is used. 
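>> 
>> To illustrate the constraint, here is a hedged sketch (made-up helper 
>> names, not the writer's actual code): with a plain file the size slot can 
>> be patched after the data has been written, but a GZIPOutputStream cannot 
>> seek backwards, so the size has to be correct before anything is flushed, 
>> hence the in-memory segment buffer:
>> 
>> ```java
>> import java.io.*;
>> import java.util.zip.GZIPOutputStream;
>> 
>> public class SizeSlotSketch {
>>     // Plain file: write a placeholder size, write the data, seek back, patch.
>>     static void patchAfterWrite(File f, byte[] data) throws IOException {
>>         try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
>>             long sizeSlotPos = raf.getFilePointer();
>>             raf.writeInt(0);                 // placeholder u4 size slot
>>             raf.write(data);
>>             long end = raf.getFilePointer();
>>             raf.seek(sizeSlotPos);
>>             raf.writeInt(data.length);       // patch in the real size
>>             raf.seek(end);
>>         }
>>     }
>> 
>>     // Gzip: no seeking back, so the size must already be known up front,
>>     // which is why the writer buffers a whole segment in memory.
>>     static void writeUpFront(File f, byte[] data) throws IOException {
>>         try (DataOutputStream out = new DataOutputStream(
>>                 new GZIPOutputStream(new FileOutputStream(f)))) {
>>             out.writeInt(data.length);       // size slot filled before the data
>>             out.write(data);
>>         }
>>     }
>> }
>> ```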
>> 
>> The way to avoid caching is to calculate the object size in advance and 
>> then fill the size slot at the beginning. But that would require other 
>> changes in the code, which IMO are not suitable to introduce in this PR. 
>> That is why I mentioned rewriting the implementation of 
>> `HeapHprofBinWriter.java` in previous comments.
>> 
>> Having tried all these approaches, I think the current solution with 
>> `unbufferedMode` is better: it is easy to calculate the array length in 
>> `calculateArrayMaxLength()` and hence know the size of the data to be 
>> written. The writer can then fill the `size` slot before the data is 
>> written, and avoid the internal segment buffer by writing the large array 
>> data directly to the file.
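>> 
>> In that spirit, a hedged sketch of the unbuffered path (hypothetical 
>> method, simplified, the record header write is elided): clamp the element 
>> count the way `calculateArrayMaxLength()` does, compute the segment size 
>> up front, fill the u4 size slot, and then stream the elements straight to 
>> the output with no internal byte[] staging:
>> 
>> ```java
>> import java.io.DataOutputStream;
>> import java.io.IOException;
>> 
>> public class UnbufferedArrayDumpSketch {
>>     static final long MAX_U4_VALUE = 0xFFFFFFFFL;
>> 
>>     static void dumpLongArray(DataOutputStream out, long[] array,
>>                               int headerSize) throws IOException {
>>         int elemSize = Long.BYTES;
>>         // Clamp the element count so header + elements fits the u4 size slot,
>>         // in the spirit of calculateArrayMaxLength().
>>         long maxElems = (MAX_U4_VALUE - headerSize) / elemSize;
>>         int len = (int) Math.min(array.length, maxElems);
>> 
>>         long segmentSize = headerSize + (long) len * elemSize;
>>         // Size slot is filled before any element data; written as a u4 bit
>>         // pattern, so the (int) cast is intentional.
>>         out.writeInt((int) segmentSize);
>> 
>>         // "Unbuffered mode": elements go directly to the stream, no byte[] staging.
>>         for (int i = 0; i < len; i++) {
>>             out.writeLong(array[i]);
>>         }
>>     }
>> }
>> ```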
>> 
>> What do you think?
>> 
>> Thanks,
>> Lin
>
>> Having tried all these approaches, I think the current solution with 
>> `unbufferedMode` is better: it is easy to calculate the array length in 
>> `calculateArrayMaxLength()` and hence know the size of the data to be 
>> written. The writer can then fill the `size` slot before the data is 
>> written, and avoid the internal segment buffer by writing the large array 
>> data directly to the file.
> 
> I think that makes sense. Is this now fully implemented in this PR, and is 
> it ready for a final review?

Hi @plummercj, 
I think it is now ready for final review. Thanks a lot!
-Lin

-------------

PR: https://git.openjdk.java.net/jdk/pull/2803
