Hi David,
On 24/05/2013 2:06 AM, David Chase wrote:
So where do we stand on this?
Can we call it a bug and eligible for inclusion?
And are there other issues to deal with?
I think this form of the optimization is more amenable to inclusion.
I would be happy to discuss exactly what is going on in the mysterious
intrinsic-ful code, if that is an issue for anyone (and sooner is better than
later, else I'll forget the exciting details).
I'm not going to attempt to understand any of the details of this. That
said I still have an issue with this code:
147 if (buf) {
148 /* Don't know for sure how big an unsigned long is,
therefore
149 copy one at a time. */
150 int i;
151 for (i = 0; i < 256; i++) buf[i] = (jint) (crc_table[i]);
152 (*env)->ReleasePrimitiveArrayCritical(env, b, buf, 0);
153 }
buf is an array of 32-bit values; crc_table is either 32-bit or 64-bit
depending on platform. So on 64-bit we are truncating all the values in
crc_table. So presumably these values all fit in 32-bits anyway?
Minor nit:
src/share/classes/java/util/zip/CRC32.java
+ import java.lang.reflect.Field;
I don't think this import is needed now.
---
test/java/util/zip/TimeChecksum.java
You added eg:
time(adler32, data, 2*iters, 16); // warmup short case, too
which is presumably to ensure both of the primary paths are hit during
the warm up. But it isn't obvious to me that the actual test will hit
the short case ??
Thanks,
David
David
On 2013-05-21, at 5:15 PM, David Chase <david.r.ch...@oracle.com> wrote:
http://cr.openjdk.java.net/~drchase/7088419/webrev.03/
Newer, slimmer webrev. No fork/join, the related code is removed (except the
native init routine still returns a boolean, which is currently ignored,
indicating if
it supports the faster crc32).
What remains is:
1) no-JNI fast-path for short CRCs and Adlers
2) reformulated faster-Adler for byte-at-a-time
3) For 32/64 Linux/Solaris/MacOS x86 Sandy Bridge or better (with CLMUL and
AVX),
fast code for CRCs.
For now it uses assembly language, the C-with-intrinsics is included, and
compiles
with current clang and gcc. It tickles a bug in Solaris Studio 12.3 which has
been reported.
Optimization for Windows is disabled because the assembler syntax is too
different
The code has been tested by-hand across Linux (32), Solaris (32 and 64), MacOS
(64)
and has also passed JPRT (which is to say, the failures were unrelated, hand
examination
of the runs showed that the new CRCandAdler test was successful. Anyone
checking my
most recent run will notice that I am not very good at specifying tests to run,
but there is
an earlier run. I'm trying again, just to see if I can figure out how to make
it behave.)
There is a companion webrev on the hotspot side that sets the secret property
that is used and reset by this code.
I hope this will be more easily regarded as a "bug fix" and less as an
interface change.
David
On 2013-05-20, at 8:34 PM, David Holmes <david.hol...@oracle.com> wrote:
On 21/05/2013 3:45 AM, Alan Bateman wrote:
On 20/05/2013 14:49, David Chase wrote:
Suppose I split this bug (i.e., file a new bug) into the
Intel-acceleration part and the fork-join part.
:
This make sense.
I agree.
I also don't have an issue with eliding the optimization on Windows for the
time being.
David
-Alan.