Hi Thomas,

ok from my end.

Best regards
Christoph

From: Thomas Stüfe [mailto:[email protected]]
Sent: Donnerstag, 19. Oktober 2017 15:22
To: Peter Levart <[email protected]>; Langer, Christoph 
<[email protected]>
Cc: Alan Bateman <[email protected]>; [email protected]; 
[email protected]; Java Core Libs 
<[email protected]>
Subject: Re: RFR(xs): (aix but affects shared code too) 8186665: buffer 
overflow in Java_java_nio_MappedByteBuffer_isLoaded0

Hi Peter, Christoph,

if you have no objections, I'd like to push this change. As I explained in my 
earlier mail, I'd prefer not to change MappedByteBuffer::load(), and if you are 
fine with the change in its current form 
(http://cr.openjdk.java.net/~stuefe/webrevs/8186665-buffer-overflow-in-mincore/webrev.02/webrev/),
 I'd like to push it.

Thanks, Thomas


On Wed, Oct 18, 2017 at 12:24 PM, Thomas Stüfe 
<[email protected]<mailto:[email protected]>> wrote:
Hi Peter, Christoph,

Thank you both for reviewing.

New webrev: 
http://cr.openjdk.java.net/~stuefe/webrevs/8186665-buffer-overflow-in-mincore/webrev.02/webrev/

@Peter:

>Shouldn't the following line:
>
>  47     return len2 + pagesize - 1 / pagesize;
>
>..be written as:
>
>           return (len2 + pagesize - 1) / pagesize;

You are right. Did not cause the mincore output buffer to be unnecessarily 
large. Thanks for catching this.

As for your other concern:


On Wed, Oct 18, 2017 at 10:32 AM, Peter Levart 
<[email protected]<mailto:[email protected]>> wrote:
--
In Java_java_nio_MappedByteBuffer_isLoaded0, we call mincore(2) to retrieve the 
paging status of an address range.

The size of the output buffer for mincore(2) depends on the number of pages in 
*system page size* in the given memory range (this is spelled out more or less 
explicitly on AIX and Linux, nothing is said on BSD/OSX, but I assume the 
same). The number of pages in the memory range is calculated by 
MappedByteBuffer.isLoaded() and handed down to 
Java_java_nio_MappedByteBuffer_isLoaded0() together with the memory range to 
test.

MappedByteBuffer.isLoaded() calculates this number of pages based on 
jjdk.internal.misc.Unsafe.pagesize(), which ultimately comes down to 
os::vm_page_size().

For AIX, os::vm_page_size() may return a page size larger than the system page 
size of 4K. The reason for this is that on AIX, memory can be backed by 
different page sizes, usually either 4K or 64K - e.g. posix thread stacks may 
have 4K pages, java heap (system V shared memory) with 64K pages, but mmap 
memory is always 4K page backed...

If this is true and Unsafe.pagesize() returns a value > 4K, then perhaps also 
the MappedByteBuffer.load() method is wrong for AIX?

    public final MappedByteBuffer load() {
        checkMapped();
        if ((address == 0) || (capacity() == 0))
            return this;
        long offset = mappingOffset();
        long length = mappingLength(offset);
        load0(mappingAddress(offset), length);

        // Read a byte from each page to bring it into memory. A checksum
        // is computed as we go along to prevent the compiler from otherwise
        // considering the loop as dead code.
        Unsafe unsafe = Unsafe.getUnsafe();
        int ps = Bits.pageSize(); // << LOOK HERE
        int count = Bits.pageCount(length);
        long a = mappingAddress(offset);
        byte x = 0;
        for (int i=0; i<count; i++) {
            x ^= unsafe.getByte(a);
            a += ps; // << AND HERE
        }
        if (unused != 0)
            unused = x;

        return this;
    }

...this loop reads a byte from the start of each block in lumps of 
Bits.pageSize(). Should it always read in lumps of 4K for AIX? Do we rather 
need a special Unsafe.mmappedPageSize() method instead of just a hack in 
isLoaded0 ?

Yes, I considered this too. In effect, on AIX, we only touch every 16th page, 
thereby reducing MappedByteBuffer::load() to something like 
load_every_16th_page... However, this bug is very old (even our 1.4 VM already 
does this when the touching was still implemented in MappedByteBuffer.c) and 
did not cause any problems AFAIK.

The story behind this is long and quite boring :) basically, 64k pages are used 
for the java heap and give a large performance bonus over 4K paged java heap. 
But we cannot switch all memory regions to 64K pages, so we live with multiple 
page sizes and above us we have a ton of code which assumes one consistent page 
size for everything. So we lie about the page size to everyone - claiming 
system page size to be 64k - and except for very rare cases like this one get 
away with this.

I would like to keep lying consistently. There is not a hard reason for it, 
just that I am afraid that starting to publish a different page size to parts 
of the VM will confuse things and may introduce errors further down the line.

I think a proper solution would be to keep page size on a per-ByteBuffer base, 
because ByteBuffers may be allocated in different memory regions - they are now 
allocated with mmap() in system page size, but that may change in the future. 
That is assuming that one byte buffer cannot span areas of multiple page sizes, 
which would complicate matters further.

Btw, I also wondered whether other platforms could have a clash between the 
real memory page size and MappedByteBuffer's notion of that size - e.g. whether 
it is possible to have MappedByteBuffers with huge pages on Linux. But all 
cases I could think of are cases where the page size the JDK would assume is 
smaller than the actual page size, which would not be a problem for both 
mincore and load/touch. In the above example (huge pages on Linux), pages would 
be pinned anyway, so load() and isLoaded() would be noops.


@Christoph:

> apart from the point that Peter found, I’d also think it would look better if 
> the typedef section (line 51-56) would be placed before the AIX only function 
> (line 41-49).

Sure. I moved the section up, below the includes.

Kind Regards, Thomas


Reply via email to