[gentoo-user] Re: OT: GCC 5 Offloading

2015-09-15 Thread james
Fernando Rodriguez  outlook.com> writes:


> > > Do you know of any plans to enable offloading on the gentoo toolchain?

> > > I was able to build the offloading compiler using crossdev with a few
> > > hacks

> The link was for the emulator ebuild[1]. I got distracted with other 
> stuff and didn't make the host compiler, will do it this week.

I've been noodling around with DAGs, Tup, CheckInstall and Ninja for this
sort of build effort, but nothing is solid (yet). I really want a flexible
DAG mechanism for building things outside of the Gentoo devmanual proper, for
quick code testing and hacking.

Besides, I'm not sure you have to. Look, there is an overlay for gcc-5 &
gcc-6 [1]. Down the page a bit:

gcc-5.2.0
multislot regression-test vanilla altivec debug nopie nossp doc gcj awt
hardened multilib objc objc-gc libssp objc++ fixed-point go graphite cilk
+nls +nptl +cxx +fortran +openmp +sanitize


> I built the offloading (accel) compiler as follows:

Sorry about pruning the message; gmane is a picky_pain sometimes
about the length of a post vs. the length of the reply. Those instructions
you posted are keenly appreciated, as well as the other info. Give me
some time to digest/test/regress/obsess on all of this information
and let's start a new thread.

THANKS!
James

[1] http://gpo.zugaina.org/sys-devel/gcc




[gentoo-user] Re: OT: GCC 5 Offloading

2015-09-12 Thread Fernando Rodriguez
On Thursday, September 10, 2015 12:20:39 PM james wrote:
> You are taking a very conservative view of things. Codes being worked
> out now for clusters, will find their way to expand the use of the
> video card resources, for general purpose things. Most of this will
> occur as compiler enhancements, not rewriting by hand or modifying 
> algorithmic designs of existing codes. Granted they are going to
> mostly apply to multi-threaded application codes.

You're being over-optimistic. It seems to me all they're hoping for is to define 
a standardized and portable high-level interface for programming accelerators. 
The applications that will benefit the most are the same ones that can already 
benefit from lower-level technologies like CUDA: scientific/number-crunching 
applications, some kinds of clustering, etc.

With no synchronization, most existing multithreaded designs cannot benefit from 
it. And obviously code running on the accelerator cannot branch into the CPU, 
so no system or library calls. That leaves only pure number-crunching loops. 
There's little of that on the desktop, and few of those loops can be fully 
optimized for parallelization. And to be worth the overhead of offloading, the 
CPU needs to be maxed out. That leaves only the few applications I mentioned 
before.
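A minimal sketch of the kind of loop that qualifies, assuming OpenMP 4.0
target directives as supported by GCC 5 with -fopenmp. The function name
saxpy_offload and the DEMO_MAIN macro are made up for illustration and do not
come from any real project. The loop body touches only the mapped arrays: no
locks, no system calls, no library calls.

```c
#include <stdio.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] on the offload device.
 * x is copied to the device, y is copied there and back again.
 * Without -fopenmp the pragmas are ignored and the loop runs serially. */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

#ifdef DEMO_MAIN
int main(void)
{
    float x[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float y[4] = { 10.0f, 20.0f, 30.0f, 40.0f };

    saxpy_offload(4, 2.0f, x, y);
    for (int i = 0; i < 4; i++)
        printf("%g\n", y[i]);
    return 0;
}
#endif
```

Build with something like "gcc -std=c99 -fopenmp -DDEMO_MAIN saxpy.c". As far
as I know, if no offload target is configured the target region simply falls
back to the host, so the result is the same either way. The map clauses are
where the copy overhead comes from, which is why a loop this small would never
be worth offloading in practice.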

I'm looking at Intel MICs[1] and those look a lot more promising, though still 
of limited use for desktops. They use OpenMP, so there are far fewer 
restrictions than with OpenACC (a few ebuilds in the tree can already benefit 
from it with minor patches) and you can even offload whole processes. You can 
even ssh to the MIC since it runs Linux. It's not for the average desktop, but 
they're not too expensive either. It may be worth it for a high-end Gentoo 
workstation (you can offload compile jobs with distcc), and I've got a project 
on the back burner that can benefit from it.
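To give an idea of what a "minor patch" could look like, here is a sketch
assuming a package that already has an OpenMP parallel loop. The function
dot_offload is hypothetical and not taken from any ebuild in the tree. The
only lines that differ from an ordinary host-side OpenMP reduction are the
target directive and its map clauses.

```c
/* Hypothetical example: dot product of a and b.
 * Dropping the "omp target" line gives back the usual host-only
 * OpenMP loop; adding it asks the compiler to run the loop on the
 * offload device, with host fallback if none is available. */
double dot_offload(int n, const double *a, const double *b)
{
    double sum = 0.0;

    #pragma omp target map(to: a[0:n], b[0:n]) map(tofrom: sum)
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Whether a change like this pays off depends on how much data has to be mapped
to the card for each call, so it only makes sense for loops that run long
enough to hide the transfer.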

Do you know of any plans to enable offloading in the Gentoo toolchain? I was 
able to build the offloading compiler using crossdev with a few hacks and wrote 
an ebuild for Intel's simulator[2]. I will work on enabling the host compiler 
tomorrow and may open a feature request and post patches once I get it 
working. The changes needed to enable it on the host are pretty trivial.

[1] https://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-codename-knights-corner

-- 
Fernando Rodriguez