Attached are two more files which should be included in the Broadwell
directory. They should also be faster on Skylake, which will
automatically look in that directory anyway.

I don't think they can work on Haswell, but this could be checked.

Also, I forgot one step in the above post:

* Make a tuning file for Broadwell after everything is running. This
shouldn't be necessary for Haswell, as logops don't feature in tuning,
but it should also be done for Skylake once the files attached to this
post are added.

The tuning file for Broadwell is in
mpn/x86_64/haswell/broadwell/gmp-mparam.h

To generate one, configure on a Broadwell machine and run:

cd tune
make tune

Then copy the output into the relevant file.
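
The copy step can be sketched as a small shell helper. This is only a
sketch, not something that ships with MPIR: the function name
install_mparam, the tune.out file name and the .bak backup are my own
assumptions. It also assumes you have already looked over the tuning
output and trimmed it to the lines that belong in gmp-mparam.h.

```shell
# Hypothetical helper, not part of MPIR: replace a gmp-mparam.h with
# reviewed tuning output, keeping the previous version as <file>.bak.
install_mparam() {
    new="$1"      # file holding the reviewed tuning output
    target="$2"   # e.g. mpn/x86_64/haswell/broadwell/gmp-mparam.h

    # Refuse to overwrite the tuning file with an empty or missing input.
    if [ ! -s "$new" ]; then
        echo "no tuning output in $new" >&2
        return 1
    fi

    # Keep a backup of the existing file, if there is one.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi

    cp "$new" "$target"
}

# Possible use on a Broadwell machine, from the top of a configured tree
# (tune.out is an assumed file name):
#   (cd tune && make tune) | tee tune.out
#   ... review tune.out and keep only the parameter lines ...
#   install_mparam tune.out mpn/x86_64/haswell/broadwell/gmp-mparam.h
```

Keeping the old file as a backup makes it easy to diff the new
thresholds against the previous ones before committing.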

Also, I should point out the ajs superoptimiser that was written for ODK:

https://github.com/akruppa/ajs

There's also a version here:

https://github.com/alexjbest/ajs

I'm not sure which is more up-to-date.

Unfortunately, setting up a machine so this can be used is very hard (it
requires root access and a lot of configuration of the kernel, as Intel
have made getting cycle accurate timings really, really hard nowadays).
This is not an automatic process, so expect to spend weeks working on it if
you decide to do it.

The files attached to this post have already been superoptimised for
Skylake (but not for Broadwell).


Attachment: addsub_n.as
Description: Binary data

Attachment: subadd_n.as
Description: Binary data
