Hi Jim and Tim,

We are implementing NEON instructions by format to allow MC Hammer validation. As Tim pointed out, the corresponding intrinsics are being implemented in parallel with the instruction definitions.
We have about 20% of the NEON instructions implemented. Here is a summary:

- Finished instruction formats: AdvSIMD three same, AdvSIMD modified immediate.
- Finished instruction classes (some of these instructions belong to as-yet-unfinished formats): Vector Arithmetic, Vector Immediate.
- Several clang changes (mostly in NeonCodeEmitter and CGBuiltin) to support ARM v8 intrinsics.

Thanks,
Ana.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Tim Northover
Sent: Monday, January 07, 2013 2:05 PM
To: Jim Grosbach
Cc: llvm-commits; [email protected]
Subject: Re: [llvm-commits] The AArch64 LLVM (& Clang) target

Hi Jim,

Glad to hear from you. Your input will certainly be useful.

> NEON: Are the instructions themselves there, but no intrinsics
> support, or are the instruction definitions missing as well?

These two facets are being implemented together by Ana Pazos. It's largely the case that if an instruction exists, the corresponding intrinsic will also be implemented, but "normal" selection is slightly less likely.

> PIC: Is there a non-PIC mode? I would have expected (naively?) that
> this arch would be such that there'd be negligible benefit to a non-PIC model.

There is a non-PIC mode, but it (currently) uses PC-relative instructions anyway. The only real difference is that in PIC mode actual global variable accesses must go via a GOT because the address may not be known to a static linker. So to load a variable in non-PIC, you'd execute:

    adrp x0, some_var
    ldr x0, [x0, #:lo12:some_var]

whereas in PIC it would be:

    adrp x0, :got:some_var
    ldr x0, [x0, #:got_lo12:some_var]
    ldr x0, [x0]

So it's not as massive a gain as on some architectures, but it's still useful.

> MCJIT: Why not? Given direct-to-object-file integrated assembler
> support (which it sounds like you have), enabling this should be pretty trivial.

I believe so too. And (for me at least) a good fun side-project!
Unfortunately, we've had other priorities during development.

> Memory model: Why 4GB and not the full 64-bit address space?

This is down to the efficiency of loading an address again. I believe the +/- 4GB was chosen precisely because that's the addressing limit of the ADRP instruction, which means any variable can be accessed via an adrp/add or adrp/ldr pair. If you go beyond that you may need up to 4 movz/movk instructions to materialise a 64-bit value.

An interesting consequence of that is that debugging/exception info has to use 64-bit addresses anyway, simply because +/- 4GB is 33 bits.

According to the group developing GCC, the full addressing models aren't actually needed for any software, except possibly the Linux kernel (there was some uncertainty even there when I asked a while back). You could still access the full address space, but most of it will be via the heap (and possibly stack, I suppose) rather than global variables.

> Testing: I assume this is done via a simulator/emulator? What's the
> status of the LLVM test-suite running on it? Everything passing?

Yep. All testing is via emulator at the moment. I ran the test-suite in the LNT configuration from the guide (though I'm still not sure whether it actually ran llc tests, or just clang ones).

I believe the results were that all failures could be traced to problems in the tests. Most commonly, the signedness of char was assumed; most interestingly, the rounding of sqrt in a very pretty raytracing test.

Don't take this as gospel though, it's not a test we run all the time.

Cheers.

Tim
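
The +/- 4GB figure and the 33-bit remark in the memory model answer follow from how ADRP encodes its operand. As a rough sketch only (this is not code from the AArch64 backend), the Python below assumes that ADRP's signed 21-bit immediate counts 4KB pages relative to the page containing the PC, and that a naive movz/movk sequence needs one instruction per non-zero 16-bit chunk. All function and constant names are made up for illustration.

```python
PAGE_SHIFT = 12      # ADRP addresses 4 KiB pages (assumption stated above)
ADRP_IMM_BITS = 21   # width of ADRP's signed page-count immediate


def adrp_split(pc, target):
    """Split `target` into (page_delta, lo12) for an adrp/add or adrp/ldr pair.

    Returns None when the target page is outside ADRP's reach from `pc`.
    """
    page_delta = (target >> PAGE_SHIFT) - (pc >> PAGE_SHIFT)
    limit = 1 << (ADRP_IMM_BITS - 1)
    if not -limit <= page_delta < limit:
        return None
    lo12 = target & ((1 << PAGE_SHIFT) - 1)
    return page_delta, lo12


def adrp_resolve(pc, page_delta, lo12):
    """Rebuild the address that adrp followed by a :lo12: offset would form."""
    page_base = ((pc >> PAGE_SHIFT) + page_delta) << PAGE_SHIFT
    return page_base + lo12


def adrp_reach_bytes():
    """Reach in each direction: 2**20 pages of 2**12 bytes each."""
    return (1 << (ADRP_IMM_BITS - 1)) << PAGE_SHIFT


def movz_movk_count(value):
    """Instructions in a naive movz/movk sequence for a 64-bit value.

    One instruction per non-zero 16-bit chunk, and at least one (a lone movz)
    for zero. Ignores movn and other shortcuts a real backend may use.
    """
    chunks = [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    return max(1, sum(1 for chunk in chunks if chunk))
```

Under these assumptions, adrp_reach_bytes() is 2**32, so the signed range spans 2**33 bytes in total, which is where the 33 bits come from. A value with all four 16-bit chunks non-zero gives movz_movk_count(...) == 4, matching the "up to 4 movz/movk instructions" above.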
