bug#14599: An option to make vector allocation aligned

Jan Schukat Wed, 12 Jun 2013 14:15:13 -0700

Thought a bit about it, and it would really be nice to have an aligned uniform vector API.

ATM all are 8-byte aligned, so you would probably also want to be able to have at least 16- and 32-byte alignment (Intel's AVX has 256-bit registers that work better aligned). But even 64 and more could be useful for cache line alignment, although that would require this to be a separate alignment, because the benefits of cache line alignment are kind of defeated if the header is in a different cache line.

So I guess just one alignment, namely that of the first element, is feasible without wasting whole cache lines. If you really need that you can still use the take_*vector functions, and it's pretty rare to do such things anyway. But being able to control the alignment of the first element allows you to properly use SIMD instructions on those vectors.

You don't even really need any more space to store alignment information, since that can be directly inferred from the bytevector content pointer, although the bytevector flags still have more than enough space to store it.
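To illustrate the point about inferring alignment from the content pointer, here is a minimal sketch in plain C. The helper names `pointer_alignment` and `is_aligned` are hypothetical and are not part of libguile; the sketch only assumes that a pointer can be converted to `uintptr_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest power of two that divides the address of P.
   Hypothetical helper, not a libguile function.  */
size_t
pointer_alignment (const void *p)
{
  uintptr_t addr = (uintptr_t) p;
  assert (addr != 0);
  /* Isolate the lowest set bit of the address.  */
  return (size_t) (addr & (~addr + 1));
}

/* Non-zero if P satisfies ALIGN, which must be a power of two.  */
int
is_aligned (const void *p, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return ((uintptr_t) p & (align - 1)) == 0;
}
```

Note that this reports the alignment a pointer happens to have, which may be larger than the one that was requested.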

Extending the programming API to support this is a bit more tricky. I guess the most straightforward and backward-compatible way would be to just add a set of make-aligned-*vector, aligned-*vector and *->aligned-*vector functions and their scm_* versions with an additional alignment parameter. Optional alignment parameters on the old functions could be nice too, but I guess that is just asking for compatibility trouble.

The other question is the read syntax (one of the primary reasons I'm doing all this). If alignment is something that should be preserved in the permanent representation, you also need to store it in the flags, since the content pointer can be aligned by coincidence. I haven't looked at the compiling of bytevectors yet, to see if alignment can be handled easily there.

As for the text representation, I think the simplest way is to add another reserved character with the alignment number that works for uniform vectors and arrays, like #vu8>8(1 2 3 4 5 6) to have the first element at 8-byte alignment. Right now the allocation pretty much ensures 4-byte alignment of the first element on 32-bit machines and 8-byte alignment on 64-bit machines, because gc_malloc returns 8-byte aligned blocks, but the array starts at cell word 3. Any vector of a 64-bit type like double and long is therefore already guaranteed to be misaligned on 32-bit platforms. That would be even more unfortunate on Linux x32 ABI systems, which use efficient 64-bit ints with 32-bit pointers, since cell size is determined by pointer size.
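The arithmetic behind those numbers can be checked with a small C sketch. It takes the figures from the paragraph above as given (blocks aligned to 8 bytes, contents starting 3 cell words into the block); the function name `guaranteed_alignment` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Alignment that (base + OFFSET) is guaranteed to have when all we
   know about base is that it is BASE_ALIGN-aligned.  BASE_ALIGN must
   be a power of two.  Hypothetical helper for illustration only.  */
size_t
guaranteed_alignment (size_t base_align, size_t offset)
{
  size_t bits;
  assert (base_align != 0 && (base_align & (base_align - 1)) == 0);
  /* The lowest set bit of (base_align | offset) is the largest power
     of two dividing both quantities.  */
  bits = base_align | offset;
  return bits & (~bits + 1);
}
```

With 4-byte cell words the offset is 3 * 4 = 12 bytes, so `guaranteed_alignment (8, 12)` gives 4, which is too little for an 8-byte double. With 8-byte cell words the offset is 24 bytes and the result is 8.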

Or to construct SIMD 4-element arrays: #2f32:2:4>16((1 2 3 4)(1 2 3 4)). Maybe even have a default alignment of 16 when you just use > without a number, so #2f32:2:4>((1 2 3 4)(1 2 3 4)) is the same thing. Or even more convenient, #m128((1 2 3 4)(1.0 1.0 1.0 1.0) (2.0 2.0)), where you can freely mix the underlying types and the size of the elements is inferred from the number of them in each group.

So if there is interest in something like this in the main Guile, I will make the patches. If not, I'll just stick to my crude hack for now and see if I need the full shebang :).



Regards

Jan Schukat


On 06/12/2013 04:59 PM, Ludovic Courtès wrote:

severity 14599 wishlist
thanks

Hi!

Jan Schukat skribis:

If you want to access native uniform vectors from C, sometimes you
really want guarantees about the alignment.

[...]

This isn't necessarily true for vectors created from pre-existing
buffers (the take_*vector functions), but there you have control over
the pointer you pass, so you can make it true if needed.

So if there is interest, maybe this could be integrated into the build
system as a configuration like this:


--- libguile/bytevectors.c    2013-04-11 02:16:30.000000000 +0200
+++ bytevectors.c    2013-06-12 14:45:16.000000000 +0200
@@ -223,10 +223,18 @@

        c_len = len * (scm_i_array_element_type_sizes[element_type] / 8);

+#ifdef SCM_VECTOR_ALIGN
+      contents = scm_gc_malloc_pointerless (SCM_BYTEVECTOR_HEADER_BYTES + c_len + SCM_VECTOR_ALIGN,
+                        SCM_GC_BYTEVECTOR);
+      ret = PTR2SCM (contents);
+      contents += SCM_BYTEVECTOR_HEADER_BYTES;
+      contents = (signed char *) (((scm_t_uintptr) contents + (SCM_VECTOR_ALIGN - 1)) & -SCM_VECTOR_ALIGN);
+#else
        contents = scm_gc_malloc_pointerless (SCM_BYTEVECTOR_HEADER_BYTES + c_len,
                          SCM_GC_BYTEVECTOR);
        ret = PTR2SCM (contents);
        contents += SCM_BYTEVECTOR_HEADER_BYTES;
+#endif

        SCM_BYTEVECTOR_SET_LENGTH (ret, c_len);
        SCM_BYTEVECTOR_SET_CONTENTS (ret, contents);
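The over-allocate-and-round-up idea behind this patch can be sketched in plain C, independent of libguile. This is only an illustration under stated assumptions: it uses `malloc` instead of `scm_gc_malloc_pointerless`, and the names `round_up` and `alloc_aligned_contents` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round ADDR up to the next multiple of ALIGN (a power of two).  */
uintptr_t
round_up (uintptr_t addr, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr + (align - 1)) & ~(align - 1);
}

/* Allocate room for a HEADER-byte header followed by LEN bytes of
   contents whose first byte is ALIGN-aligned.  Returns the contents
   pointer and stores the block to pass to free() in *BLOCK.  On
   allocation failure both are NULL.  */
void *
alloc_aligned_contents (size_t header, size_t len, size_t align,
                        void **block)
{
  /* At most ALIGN - 1 bytes of padding are ever needed.  */
  unsigned char *base = malloc (header + len + align - 1);
  *block = base;
  if (base == NULL)
    return NULL;
  return (void *) round_up ((uintptr_t) (base + header), align);
}
```

The header stays at the start of the block and the padding sits between header and contents, so only the contents pointer needs to be stored separately.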

I don’t think it should be a compile-time option, because it would be
inflexible and inconvenient.

Instead, I would suggest using the scm_take_ functions if allocating
from C, as you noted.

In Scheme, I came up with the following hack:

--8<---------------cut here---------------start------------->8---
(use-modules (system foreign)
              (rnrs bytevectors)
              (ice-9 match))

(define (memalign len alignment)
   (let* ((b (make-bytevector (+ len alignment)))
          (p (bytevector->pointer b))
          (a (pointer-address p)))
     (match (modulo a alignment)
       (0 b)
       (padding
        (let ((p (make-pointer (+ a (- alignment padding)))))
          ;; XXX: Keep a weak reference to B or it can be collected
          ;; behind our back.
          (pointer->bytevector p len))))))
--8<---------------cut here---------------end--------------->8---

Not particularly elegant, but it does the job.  ;-)

Do you think there’s additional support that should be provided?

Thanks,
Ludo’.
