Fully agree that using Unsafe makes one sad.

I'm just about to send out a new webrev with Alan's and Peter's comments, once I have that done I will give using the NIO-Buffer API a second try to see if using IntBuffer and LongBuffer is able to achieve similar performance.

As I noted in my reply the second goal after adding this API will is to create intrinsics that make use of the crc32c instructions available on both x86 and SPARC which will bump the performance even further. So one thing I try to do is make sure the implementation makes it easy to do that without having to completely rewrite it again.

Regards,
Staffan

On 10/17/2014 11:44 AM, Stanimir Simeonoff wrote:
Actually I was under the impression I had included the list. Getting it done w now.

Overall if you want speed you go unsafe (bit sad) as the JIT may not remove the bound checks off the ByteBuffer. Unsafe is pretty ugly overall, though and personally I try to avoid it giving up performance.

On 64bit machines you might be better off with java.nio.LongBuffer instead.

Stanimir


On Fri, Oct 17, 2014 at 9:06 PM, Staffan Friberg <staffan.frib...@oracle.com <mailto:staffan.frib...@oracle.com>> wrote:

    Hi Stanimir,

    Thanks for you idea, do you mind re-replying with reply all so it
    goes to the list? Or I can simply reply to myself and include your
    comment.

    I forgot to add in my previous email that one of the follow up
    items for this will be to implement intrinsics that uses available
    hardware instructions, having a method that takes the buffer
    address will make that much easier which is another reason to stay
    with the current implementation.

    Thanks,
    Staffan


    On 10/17/2014 10:06 AM, Stanimir Simeonoff wrote:
    Hi Staffan

    The actual "trick" is using ByteBuffer.asIntBuffer() while the
    address is properly aligned. It's not the most intuitive approach
    but you get the read stuff.

    Cheers
    Stanimir


    On Fri, Oct 17, 2014 at 7:47 PM, Staffan Friberg
    <staffan.frib...@oracle.com <mailto:staffan.frib...@oracle.com>>
    wrote:

        On 10/17/2014 01:46 AM, Peter Levart wrote:


            On 10/17/2014 03:42 AM, Staffan Friberg wrote:

                Hi,

                This RFE adds a CRC-32C class. It implements Checksum
                so it will have the same API CRC-32, but use a
                different polynomial when calculating the CRC checksum.

                CRC-32C implementation uses slicing-by-8 to achieve
                high performance when calculating the CRC value.

                A part from adding the new class,
                java.util.zip.CRC32C, I have also added two default
                methods to Checksum. These are methods that were
                added to Adler32 and CRC32 in JDK 8 but before
                default methods were added, which was why they were
                only added to the implementors and not the interface.

                Bug: https://bugs.openjdk.java.net/browse/JDK-6321472
                Webrev:
                http://cr.openjdk.java.net/~sfriberg/JDK-6321472/webrev.00
                <http://cr.openjdk.java.net/%7Esfriberg/JDK-6321472/webrev.00>

                I have started a CCC request for the changes, but was
                asked to get feedback from the core libs group before
                finalizing the request in case there are any API or
                Javadoc changes suggested.

                Thanks,
                Staffan


            Hi Staffan,

            I can see CRC32C.reflect(int) method reverses the bits in
            32 bit int value. You could use Integer.reverse(int) instead.

            The CRC32C.swap32(int) method is (almost) exactly the
            same as Integer.reverseBytes(int) and equivalent.

            I wonder if handling ByteBuffer could be simplified. You
            could leverage it's own byte order manipulation by
            temporarily setting (and resetting afterwards)
            ByteBuffer.order() and then use ByteBuffer.getInt() to
            extract 32 bits at a time for your algorithm. This could
            get you the optimal variant of algorithm for both kinds
            of buffers (direct or byte[] based). Perhaps even the
            byte[] based variant of algorithm could be implemented by
            wrapping the array with ByteBuffer, passing it to common
            private method, and relying on the escape analysis of
            Hotspot to allocate the HeapByteBuffer wrapper object on
            stack.

            Regards, Peter

        Hi Peter,

        Thanks for reviewing.

        I have switched to the Integer methods. Was looking through
        that API but I was too stuck with the reflect and swap names
        so I missed the reverse methods... :)

        As Vitaly noted in his email the wrapped case runs much
        slower. Going through the generated code it looks like the
        getInt method actually read four bytes and then builds and
        int from them, unless we have some intrinsic replacing that code.

        Bits.java
            static int getIntL(long a) {
                return makeInt(_get(a + 3),
                               _get(a + 2),
                               _get(a + 1),
                               _get(a    ));
            }

            static private int makeInt(byte b3, byte b2, byte b1,
        byte b0) {
                return (((b3       ) << 24) |
                        ((b2 & 0xff) << 16) |
                        ((b1 & 0xff) <<  8) |
                        ((b0 & 0xff)      ));
            }

        It looks like the same holds true for DirectByteBuffers
        unless you are on x86 which supports unaligned reads. So I
        think aligning and using Unsafe is the best option here for
        performance.

        DirectByteBuffer.java
            private int getInt(long a) {
                if (unaligned) {
                    int x = unsafe.getInt(a);
                    return (nativeByteOrder ? x : Bits.swap(x));
                }
                return Bits.getInt(a, bigEndian);
            }

        Bits.java
            static boolean unaligned() {
                if (unalignedKnown)
                    return unaligned;
                String arch = AccessController.doPrivileged(
                    new
        sun.security.action.GetPropertyAction("os.arch"));
                unaligned = arch.equals("i386") || arch.equals("x86")
                    || arch.equals("amd64") || arch.equals("x86_64");
                unalignedKnown = true;
                return unaligned;
            }

        Regards,
        Staffan





Reply via email to