Hi John,

Thanks for the thoughts and advice.

At this point I plan to proceed with the current patch (I will add tests for 
buffer views), and then follow up with more analysis and recommendations for 
SPARC. The situation is a little murky after some further performance analysis: 
non-optimal loop unrolling and address calculations may be dominating. Without 
eyeballing the generated code and shedding some light on this, I am fumbling 
around in the crepuscule.

Paul.

> On 31 Mar 2016, at 08:33, John Rose <john.r.r...@oracle.com> wrote:
> 
> On Mar 30, 2016, at 2:36 AM, Paul Sandoz <paul.san...@oracle.com> wrote:
>> 
>> When access is performed in loops this can cost, as the alignment checks are 
>> not hoisted out. Theoretically they could be for regular strides of 2, 4 or 
>> 8 through the buffer contents. For such cases the alignment of the base 
>> address can be checked. Not sure how complicated that would be to support.
>> 
>> I lack knowledge of the SPARC instruction set to know if we could do 
>> something clever as an intrinsic.
> 
> A couple of partial thoughts:
> 
> If we had bitfield type inference, we would be able to deduce that the low 
> bits of p and p+8 are the same.  Graal has this (because I gave them the 
> formulae[1]).  C2 may be too brittle to add it into TypeInt.  Bitfield 
> inference on expressions of the form p&7 and (p+8)&7 would allow commoning 
> tests in an unrolled loop, hoisting the alignment logic to the top of the 
> loop body, and (perhaps) through the phi to the loop head.
> 
> [1]: 
> http://hg.openjdk.java.net/graal/graal-core/file/ea5cc66ec5f2/graal/com.oracle.graal.compiler.common/src/com/oracle/graal/compiler/common/type/IntegerStamp.java#l460
> 
> An intrinsic that would guide the JIT more explicitly would (I think) need an 
> extra argument.  Something like:  getLongUnaligned(p, uo, ao), where uo and 
> ao are both longs, but only ao is required to be naturally aligned 
> ((ao&7)==0).  Yuck.  Maybe this could be pattern-matched in C2; it would be a 
> kludge.
> 
> — John
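
To make the quoted point about p&7 and (p+8)&7 concrete, here is a minimal 
sketch in plain Java. It is only an illustration, not part of the patch and 
not what C2 or Graal actually do: the class and method names are hypothetical, 
and addresses are modelled as plain long values rather than going through 
Unsafe or a buffer. It shows that for a stride of 8 the per-iteration 
alignment test gives the same answer as a single test on the base address, 
which is the property that would let the check be hoisted.

```java
// Hypothetical illustration only; names are not taken from the JDK.
final class AlignmentHoist {

    // True when the low three bits of the address are zero (8-byte aligned).
    static boolean isAligned8(long address) {
        return (address & 7) == 0;
    }

    // Counts aligned accesses, testing alignment on every iteration
    // (the shape of the loop when the check is not hoisted).
    static int alignedPerIteration(long base, int count) {
        int aligned = 0;
        for (int i = 0; i < count; i++) {
            long p = base + 8L * i;   // stride of 8 bytes
            if (isAligned8(p)) {
                aligned++;
            }
        }
        return aligned;
    }

    // Same result from one test on the base address, because adding a
    // multiple of 8 never changes the low three bits: (p + 8) & 7 == p & 7.
    static int alignedHoisted(long base, int count) {
        return isAligned8(base) ? Math.max(count, 0) : 0;
    }
}
```

The same argument holds for strides of 2 and 4 with masks of 1 and 3 
respectively.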
