Any decent compiler will inline those functions, but that said, I still need to verify the performance with the fix. Here's the latest/greatest webrev:

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.01/webrev/

There are just a few minor changes compared to webrev.00, primarily fixes to get it to build on Windows and some helpful javadoc comments for the methods. Feedback appreciated, but I'll get back with an official RFR when I have done some benchmarking.

Cheers,
Mikael

On 2015-11-05 17:58, Brian Burkhalter wrote:
This does the same thing but is more elegant than a correct but verbose fix I was playing with.

Would changing read_value() and write_value() into macros be better for performance?

Thanks,

Brian

On Nov 5, 2015, at 5:04 PM, Mikael Vidstedt <mikael.vidst...@oracle.com> wrote:

I've played around a bit with this today to see if we can fix the problem and still have gcc generate the nice, vectorized loop it does today (but without the movdqa of course), and this is what I have so far:

http://cr.openjdk.java.net/~mikael/webrevs/8141491/webrev.00/webrev/

I have not done any benchmarking to see what the effects are, nor have I tried it on any platform except linux-x86_64 so far, but at least it passes the unit tests there.

Feedback appreciated.

Cheers,
Mikael

On 2015-11-05 10:46, Brian Burkhalter wrote:
The follow-on issue which was filed to track the underlying issue is this:

https://bugs.openjdk.java.net/browse/JDK-8141491

As can be seen it is an alignment problem.
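
For readers unfamiliar with this class of bug, here is a minimal sketch in C of one common way to avoid it. This is not the code from either webrev: the names load_u16, store_u16, swap_u16 and copy_swap_u16 are hypothetical and chosen only for illustration. The background is that movdqa requires a 16-byte-aligned memory operand and faults otherwise, and that dereferencing a pointer cast to a wider type than the buffer's alignment guarantees is undefined behavior in C, so the compiler is free to assume alignment when it vectorizes. Going through a fixed-size memcpy makes no alignment assumption:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a 16-bit value from a possibly unaligned
 * address. The fixed-size memcpy carries no alignment assumption. */
static inline uint16_t load_u16(const void *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: store a 16-bit value to a possibly unaligned
 * address. */
static inline void store_u16(void *p, uint16_t v) {
    memcpy(p, &v, sizeof v);
}

/* Reverse the two bytes of a 16-bit value. */
static inline uint16_t swap_u16(uint16_t v) {
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Copy `count` 16-bit elements from src to dst, reversing the byte order
 * of each element. Neither pointer needs to be 2-byte aligned, and the
 * two buffers are assumed not to overlap. */
static void copy_swap_u16(void *dst, const void *src, size_t count) {
    const unsigned char *s = src;
    unsigned char *d = dst;
    for (size_t i = 0; i < count; i++) {
        store_u16(d + 2 * i, swap_u16(load_u16(s + 2 * i)));
    }
}
```

Optimizing compilers typically turn each fixed-size memcpy into a single load or store, so small static inline functions like these usually cost no more than macros would. Whether the loop still vectorizes well depends on the compiler and target and would need to be measured.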

