Hi Matthias, all,

Matthias Noack <[email protected]> writes:
> the following talk from IWOCL'17 has some numbers on PoCL on Xeon (HSW),
> and Xeon Phi (KNL) for two applications using PoCL and other SDKs:
>
> http://www.iwocl.org/wp-content/uploads/iwocl2017_matthias-noack-good-bad-ugly.pdf
>
> The numbers for the comet simulation (slide 18, pdf 53), show some weird
> outliers every 32 work-items, when increasing the overall number of
> work-items (particles).
>
> The HEOM code reveals a large performance gap in comparison with the
> Intel SDK on KNL, while it looks quite competitive on HSW. This is
> especially strange, as the Intel SDK has only AVX2 support, while PoCL
> should be able to generate AVX-512 code using LLVM. I compiled PoCL and
> LLVM on the target architectures, and made sure the result is an actual
> Haswell and KNL build.
>
> Is anyone interested in working on improving these issues?

I use pocl extensively as the reference CL implementation on which all my
numerical research codes are based, so I naturally have an interest in
making those run as fast as possible. While I (sadly) likely won't be able
to contribute myself, I may well be able to contribute time in the form of
student projects. If someone could identify "student-sized" subprojects of
this endeavor, that'd be hugely helpful. :)

Andreas
