Hello, everyone.

I'm an undergraduate student who's interested in taking part in the GSoC 2022 
program, and I'd like to discuss an idea that was not listed on the Summer of 
Code Ideas list.

GPUs are used in many machine learning packages to accelerate training and 
inference; however, mlpack does not yet support GPU acceleration. I found 
the plan for adding CUDA support to mlpack in 
<https://www.mlpack.org/papers/vision.pdf>, and I'm interested in it. 
Therefore, I'd like to spend the summer adding GPU support to mlpack. This 
plan includes:

1. Add an mlpack::mat class. Currently, mlpack uses arma::mat. However, if 
we want to compute on GPUs, we have to use coot::mat. Converting between 
arma::mat and coot::mat is easy, but doing this conversion manually every 
time a user wants to move data from one to the other is cumbersome, so a 
wrapper class that manages the device-related information would be useful. 
The lower-level implementation can use bandicoot and armadillo; we only 
need to provide APIs with the same semantics and names as arma::mat, plus 
device-related functions (such as torch.Tensor.to(device)). This is a big 
change, and I don't think I'm capable of designing a perfect class on my 
own, so I want to be careful and implement this class only after a proper 
discussion.

2. Implement ANN layers with bandicoot. Whether or not the mlpack::mat 
class gets implemented, we can still implement ANN layers with bandicoot 
and benefit from GPU acceleration. For example, the current naive 
convolution iteratively multiplies the corresponding elements of the filter 
and the input, then adds the result to the output; this parallelizes easily 
on GPUs. I'd like to start with the most commonly used layers and implement 
GPU versions of them.

3. Contribute to bandicoot. Bandicoot is still an unstable library, so we 
may run into unexpected situations such as missing functions or bugs. For 
example, some CUDA kernels in bandicoot only work correctly when the shape 
of the input matrix is a factor of 2. Therefore, I want to implement 
missing functions and kernels, and fix any bugs I find while implementing 
the layers, to help make bandicoot release-ready.

Thank you for taking the time to read this email; I'm looking forward to 
your feedback. Have a good day!

Regards
Liu, Zhuojin
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
