On Mon, 10 May 2021 23:19:37 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:
>> Hi Chris (@plummercj),
>>
>> After several tries, I think the current patch may be the better way to fix this issue.
>> In detail, there are three issues that cause the problem:
>> - The incorrectly calculated array length when segmented heap dump is enabled. This can be
>>   fixed easily by modifying the code in `calculateArrayMaxLength()`, and that fix is already
>>   in this PR.
>> - The internal segment buffer used for segmented heap dump. Resizing this buffer can cause
>>   high memory consumption because of the byte[] copies.
>> - Because the internal segment buffer is a byte array, its length cannot be larger than
>>   `Integer.MAX_VALUE`. This also causes problems when dumping a large array, since there can
>>   be integer overflow, and it does not align with the specification (as explained below).
>>
>> Here are the possible solutions I have tried in order to avoid using `unbufferedMode`:
>>
>> - For segmented heap dump, add a limitation in `calculateArrayMaxLength()` to guarantee that
>>   at most `Integer.MAX_VALUE` bytes can be filled into the internal byte array buffer, so
>>   there is no need to use `unbufferedMode`. But I found this can cause large memory usage,
>>   because the internal buffer is resized to hold all of the array data, and an OOM is thrown
>>   when I use the `jhsdb jmap --binaryheap` command. Moreover, it also changes the behavior of
>>   dumping a large array: the original limit is `MAX_U4_VALUE`, which follows the hprof spec
>>   in that a segment in the heap dump has a 32-bit slot to hold the segment size. After the
>>   change, the limit becomes `Integer.MAX_VALUE` (2^31 - 1), just half of `MAX_U4_VALUE`, so
>>   the behavior no longer follows the spec exactly. The change also introduces many overflow
>>   checks wherever the size of the internal segment buffer is used, which IMHO is not clean.
>>
>> - I also tried to avoid resizing the internal segment buffer by using a linked list to hold
>>   all of the data to be flushed, which avoids the overflow checks. But it also causes OOM,
>>   since all of the data is held in the list, and it introduces more code for the linked list
>>   logic.
>>
>> The main reason for caching data in the internal segment buffer is that the current
>> implementation needs to update the size slot in the segment header, and this must be done
>> before the data is flushed to the file; otherwise it causes problems when a gzipped heap dump
>> is used.
>>
>> The way to avoid caching is to calculate the object size up front and fill in the size slot
>> at the beginning. But that would require other changes in the code, which IMO are not
>> suitable to introduce in this PR. That is why I mentioned rewriting the implementation of
>> `HeapHprofBinWriter.java` in previous comments.
>>
>> Having tried all these ways, I think the current solution with `unbufferedMode` is better.
>> It is easy to calculate the array length in `calculateArrayMaxLength()` and hence know the
>> size of the data to be written. The writer can then fill in the size slot before the data is
>> written, and avoid the internal segment buffer by writing the large array data directly to
>> the file.
>>
>> What do you think?
>>
>> Thanks,
>> Lin
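To make the quoted explanation above more concrete for reviewers who have not followed the whole thread, here is a rough sketch of what the unbuffered path has to do for a primitive array record: compute the record size first, fill in the u4 size slot of the segment header, then stream the element data straight to the output. The class and method names, and the fixed 8-byte identifier size, are only illustrative assumptions; this is not the actual `HeapHprofBinWriter.java` code in this PR.

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative sketch only -- not the actual HeapHprofBinWriter implementation.
public class UnbufferedArraySketch {

    // The hprof record header stores the record length in a u4 slot, so one
    // heap dump segment can describe at most 2^32 - 1 bytes (MAX_U4_VALUE),
    // not just Integer.MAX_VALUE.
    static final long MAX_U4_VALUE = 0xFFFFFFFFL;

    // Write one long[] as a primitive array dump record, streaming the elements
    // directly to the output. Because the record size is computed up front, the
    // u4 size slot can be filled in before any element data is written, so no
    // internal byte[] staging buffer is needed.
    static void writeLongArraySegment(DataOutputStream out, long objectId, long[] array)
            throws IOException {
        final int idSize = 8;                          // assume 64-bit identifiers
        long bodyBytes = 1L + idSize + 4 + 4 + 1       // sub-record tag, id, stack trace, length, type
                       + (long) array.length * 8;      // element data
        if (bodyBytes > MAX_U4_VALUE) {
            throw new IOException("array does not fit into one heap dump segment");
        }

        out.writeByte(0x1C);               // HPROF_HEAP_DUMP_SEGMENT tag
        out.writeInt(0);                   // microseconds since header timestamp (simplified)
        out.writeInt((int) bodyBytes);     // u4 size slot; the int cast keeps the low 32 bits

        out.writeByte(0x23);               // HPROF_GC_PRIM_ARRAY_DUMP sub-record tag
        out.writeLong(objectId);           // array object id
        out.writeInt(0);                   // stack trace serial number
        out.writeInt(array.length);        // number of elements
        out.writeByte(11);                 // element type: long
        for (long v : array) {
            out.writeLong(v);              // elements go straight to the stream, no staging buffer
        }
    }
}
```

In the PR itself this idea is applied only to large arrays: `calculateArrayMaxLength()` makes the record size known in advance, so the size slot can be filled first and the element data can bypass the internal segment buffer via `unbufferedMode`.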
>> Having tried all these ways, I think the current solution with `unbufferedMode` is better.
>> It is easy to calculate the array length in `calculateArrayMaxLength()` and hence know the
>> size of the data to be written. The writer can then fill in the size slot before the data is
>> written, and avoid the internal segment buffer by writing the large array data directly to
>> the file.
>
> I think that makes sense. Is this now fully implemented in this PR, and is it ready for a
> final review?

Thanks @plummercj for reviewing and approving this PR.

Dear All, may I get more reviews? Thanks!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2803