Hello, I am Ajai V George from BITS Pilani, Goa Campus, India, majoring in Electrical and Electronics Engineering. I currently work with Professor Santonu Sarkar at the Centre for High Performance and Dependable Systems (Department of Computer Science and Information Systems). My project is building 2D algorithms that use shared memory in Thrust, a CUDA library. I am implementing the Berkeley Structured Grid dwarf by extending Thrust.
Through this project I have gained significant experience with CUDA and with implementing STL-like data structures in CUDA. I created a block_2d data structure for Thrust for 2D data storage, along with an iterator for it that is compatible with Thrust and therefore with the STL. I also created an abstraction layer for accessing windows (as described in the Structured Grid dwarf) within this data structure, and an accompanying iterator that generates these windows lazily.
I have written highly optimized for_each, transform, reduce, and transform_reduce functions for both the Thrust 1D device vector and my block_2d data structure. These algorithms use shared memory efficiently, with coalesced memory accesses and no shared memory bank conflicts. The reduction algorithm also uses a highly optimized tree-based approach. I designed the framework so that it can easily be extended to work with the cuFFT and cuBLAS libraries.
Given this background, I believe I can work on the Parallel Algorithms for HPX project for GSoC 2017, specifically on extending algorithms such as scan, reduce, transform, for_each, and fill to work with hpx::vector. I have already cloned the repository and built HPX, and I am currently reading through the source code to see what the project would affect and what changes would be required.
I would appreciate the community's help in crafting my proposal and in understanding the HPX codebase.
Regards,
Ajai V George
