Re: DConf 2017 Berlin - Streaming ?
On Saturday, 6 May 2017 at 22:16:34 UTC, سليمان السهمي (Soulaïman Sahmi) wrote:
> Recordings hidden/removed? do we need to wait another month to get them?

Supposedly just until Monday. I really don't get why they couldn't leave the full recordings up while they're in the process of slicing them up, though.
Re: Mir GLAS is a C library and passes Netlib's test suite! And questions :-)
On Saturday, 29 October 2016 at 11:25:17 UTC, Nicholas Wilson wrote:
> On Saturday, 29 October 2016 at 10:21:02 UTC, Guillaume Piolat wrote:
>> On Saturday, 29 October 2016 at 01:43:03 UTC, Nicholas Wilson wrote:
>>> If you have any experience with either OpenCL or CUDA we'd love to have your input.
>>
>> Have experience with both, more CUDA than OpenCL though. Feel free to contact me.
>
> Great! I'll let you know when the compiler stuff is merged, probably in mir's public gitter.

I have experience with both CUDA and OpenCL. As soon as the compiler stuff is in, I'd be happy to port some of my standard microbenchmarks (mostly computational fluid dynamics stuff) and see how it stacks up in both performance/ease of use and get you some feedback. I'm following the git repo and occasionally checking these forums, but feel free to contact me via e-mail or any other medium.
Re: Mir GLAS vs Intel MKL: which is faster?
First of all, awesome work. It's great to see that it's possible to match or even exceed the performance of hand-crafted assembly implementations with generic code.

I would suggest adding more information on how the Eigen results were obtained. Unlike OpenBLAS, Eigen's performance often varies by compiler and depends heavily on which preprocessor macros are defined. In particular, EIGEN_NO_DEBUG is not defined by default (unless NDEBUG is), and leaving it undefined reduces performance because Eigen's assertions stay enabled. EIGEN_FAST_MATH, which as far as I recall is set to 1 by default, trades some accuracy for speed in vectorized math functions. EIGEN_STACK_ALLOCATION_LIMIT matters greatly for performance on very small matrices (where MKL and especially OpenBLAS are very inefficient). It's been a while since I've used Eigen, so I may have forgotten one or two.

It may also be worth noting in the blog post that these are all single-threaded comparisons and that multithreaded implementations are on the way. This is obvious to anyone who's followed the development of Mir, but a general audience on Reddit will likely point it out as a deficiency unless it is stated upfront.
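
To illustrate the kind of build information that could accompany the Eigen numbers, here is a minimal C++ sketch. It is only an assumption about how one might record this, not how the benchmark was actually built. The helper name `eigen_build_config` is hypothetical and is not part of Eigen or Mir. It reports which of the three macros mentioned above are visible at compile time, and it includes Eigen only if the header can be found, so it also compiles on a machine without Eigen.

```cpp
#include <sstream>
#include <string>

// Include Eigen only when the header is available (C++17 __has_include),
// so that Eigen's own defaults for these macros become visible if present.
#ifdef __has_include
#  if __has_include(<Eigen/Core>)
#    include <Eigen/Core>
#    define HAVE_EIGEN 1
#  endif
#endif

// Two-step stringification so that a macro's expanded value is printed.
#define STR_IMPL(x) #x
#define STR(x) STR_IMPL(x)

// Hypothetical helper: returns a short text report of the Eigen-related
// preprocessor settings this translation unit was compiled with.
std::string eigen_build_config() {
    std::ostringstream out;
#ifdef HAVE_EIGEN
    out << "Eigen headers: found\n";
#else
    out << "Eigen headers: not found\n";
#endif
#ifdef EIGEN_NO_DEBUG
    out << "EIGEN_NO_DEBUG: defined\n";
#else
    out << "EIGEN_NO_DEBUG: not defined\n";
#endif
#ifdef EIGEN_FAST_MATH
    out << "EIGEN_FAST_MATH: " << STR(EIGEN_FAST_MATH) << "\n";
#else
    out << "EIGEN_FAST_MATH: not defined\n";
#endif
#ifdef EIGEN_STACK_ALLOCATION_LIMIT
    out << "EIGEN_STACK_ALLOCATION_LIMIT: " << STR(EIGEN_STACK_ALLOCATION_LIMIT) << "\n";
#else
    out << "EIGEN_STACK_ALLOCATION_LIMIT: not defined\n";
#endif
    return out.str();
}
```

Printing the result of `eigen_build_config()` at the start of a benchmark run, together with the compiler name and flags, would let readers see exactly which configuration produced the Eigen results.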