Hey community,

after last week's success with channel construction, this week was calmer. It involved a steep learning curve for SIMD, but I was able to create my first VOLK kernels [3]. There are two new kernels for 8-bit packing and unpacking.

In case someone wants to pack 8 bytes, each carrying one bit in its LSB, into one byte, there's a new VOLK kernel to do this for you. At first I thought this would be as simple as a load + movemask operation. Unfortunately, endianness stopped me from doing so: the bytes have to be reordered first, so the kernel also involves shuffle, AND and COMPARE operations. Without the shuffle it should have worked with SSE2, but since a byte shuffle is involved, SSSE3 is required. I'm reading through all the docs and websites on SIMD and find new ways to do things all the time, so I guess it is a long way to go until I have some decent knowledge of SIMD instructions. Still, I could achieve a 7x speedup for packing bits compared to the generic implementation.

I also created a kernel for unpacking. I wasn't very successful here: the SSSE3 implementation is slower than the generic one for now. Maybe someone can give me a hint on what is going wrong.

I named the two new kernels 'volk_8u_pack8_8u' and 'volk_8u_unpack8_8u'. I hope this explains their operation. Suggestions for alternative names are welcome. I tried to integrate my kernels into VOLK's test framework, but that is quite tough. It seems like it doesn't expect any rate-changing kernels.
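To make the packing idea above a bit more concrete, here is a minimal sketch in C. It is not the kernel code from [3]. It assumes that the first input byte ends up in the most significant bit of the output byte (one possible reading of the ordering problem mentioned above), and the names pack8_generic, unpack8_generic, pack8_ssse3 and pack8 are hypothetical, not the VOLK signatures. The SSSE3 path is only compiled when the compiler defines __SSSE3__ (for example with -mssse3); otherwise the generic loop is used.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __SSSE3__
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 */
#endif

/* Generic version: the LSBs of 8 input bytes form one output byte.
 * Assumption: in[0] becomes the MSB of the output byte. */
static void pack8_generic(uint8_t *out, const uint8_t *in, size_t num_out_bytes)
{
    for (size_t i = 0; i < num_out_bytes; ++i) {
        uint8_t byte = 0;
        for (int k = 0; k < 8; ++k) {
            byte = (uint8_t)((byte << 1) | (in[8 * i + k] & 1));
        }
        out[i] = byte;
    }
}

/* Generic inverse: one input byte is spread over 8 output bytes (MSB first). */
static void unpack8_generic(uint8_t *out, const uint8_t *in, size_t num_in_bytes)
{
    for (size_t i = 0; i < num_in_bytes; ++i) {
        for (int k = 0; k < 8; ++k) {
            out[8 * i + k] = (uint8_t)((in[i] >> (7 - k)) & 1);
        }
    }
}

#ifdef __SSSE3__
/* SSSE3 version: handles 16 input bytes (2 output bytes) per iteration. */
static void pack8_ssse3(uint8_t *out, const uint8_t *in, size_t num_out_bytes)
{
    /* Reverse the byte order inside each 8-byte half, because movemask
     * puts byte 0 into bit 0, while we want in[0] in bit 7. */
    const __m128i reverse =
        _mm_set_epi8(8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7);
    const __m128i one = _mm_set1_epi8(1);
    size_t i = 0;
    for (; i + 2 <= num_out_bytes; i += 2) {
        __m128i v = _mm_loadu_si128((const __m128i *)(in + 8 * i));
        v = _mm_shuffle_epi8(v, reverse); /* reorder bytes */
        v = _mm_and_si128(v, one);        /* keep only the LSB */
        v = _mm_cmpeq_epi8(v, one);       /* 0xFF where the LSB was set */
        int mask = _mm_movemask_epi8(v);  /* collect the top bit of each byte */
        out[i] = (uint8_t)(mask & 0xFF);
        out[i + 1] = (uint8_t)((mask >> 8) & 0xFF);
    }
    /* Leftover output byte, if any, goes through the generic loop. */
    pack8_generic(out + i, in + 8 * i, num_out_bytes - i);
}
#endif

/* Pick the SSSE3 path when it was compiled in, else the generic one. */
void pack8(uint8_t *out, const uint8_t *in, size_t num_out_bytes)
{
#ifdef __SSSE3__
    pack8_ssse3(out, in, num_out_bytes);
#else
    pack8_generic(out, in, num_out_bytes);
#endif
}
```

Note that the input length is 8 times the output length, which is the kind of rate change mentioned above in connection with the test framework.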
My aim for next week is to come up with a kernel for polar code encoding. This will require interleaving a lot of bits, which is the actual problem to overcome.

More info and current project progress can be found in [1], [2] and [3].

Cheers
Johannes

[1] https://github.com/jdemel/gnuradio
[2] https://github.com/jdemel/socis-proposal
[3] https://github.com/jdemel/volk
