Hello Liu,

> On Apr 6, 2022, at 6:14 AM, Zhuojin Liu <[email protected]> wrote:
> 
> Hello, everyone.
> 
> I'm an undergraduate student who's interested in taking part in the GSoC 2022 
> program, and I'd like to discuss an idea that was not listed on the Summer of 
> Code Ideas list.
> 
> GPUs are used in many machine learning packages to accelerate training and 
> inference; however, mlpack does not support GPU acceleration yet. I found 
> the plan of adding CUDA support to mlpack in 
> <https://www.mlpack.org/papers/vision.pdf>, and I'm interested in it. 
> Therefore, I want to spend the summer adding GPU support to mlpack. This 
> plan includes:
> 
> 1. Add an mlpack::mat class. Currently, we are using arma::mat in mlpack. 
> However, if we want to compute on GPUs, we have to use coot::mat. The 
> conversion between arma::mat and coot::mat is easy, but doing it manually 
> every time a user wants to switch from one to the other is cumbersome, so a 
> wrapper class that manages the device-related information would be useful. 
> The lower-level implementation can use Bandicoot and Armadillo; we only need 
> to provide APIs with the same semantics and names as arma::mat, plus 
> device-related functions (such as torch.Tensor.to(device)). This looks like 
> a big change, and I don't think I am capable of designing a perfect class on 
> my own, so I want to be careful and only implement this class after a proper 
> discussion.

I think the better approach is to make sure every mlpack method uses a template
type that specifies the matrix type to use. Some of the mlpack implementations
already do that:

https://github.com/mlpack/mlpack/blob/de94eee6881e33f1145d9d4a8b5380a5be8af36a/src/mlpack/methods/det/dtree.hpp#L44-L46

which allows me to construct my DTree with:

DTree<arma::mat>, DTree<arma::fmat>, or DTree<coot::mat>, without needing
another wrapper class around the Armadillo and Bandicoot matrix types. However,
other implementations don't yet support the same interface:

https://github.com/mlpack/mlpack/blob/de94eee6881e33f1145d9d4a8b5380a5be8af36a/src/mlpack/methods/pca/pca.hpp#L55-L58

is one example which needs to be adapted.
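
To make that concrete, here is a minimal sketch of the pattern. The class below
is made up purely for illustration (it is not an existing mlpack class), and it
assumes Bandicoot keeps mirroring the Armadillo API:

#include <armadillo>

// Illustrative only: a tiny "transformer" templated on its matrix type,
// in the same spirit as DTree<MatType>.  It scales data by the largest
// absolute value seen during Train().
template<typename MatType = arma::mat>
class MaxAbsScaler
{
 public:
  void Train(const MatType& data)
  {
    // abs() is called unqualified, so argument-dependent lookup resolves
    // it to arma::abs for Armadillo types and would resolve it to
    // coot::abs for Bandicoot types.
    const MatType absData = abs(data);
    scale = absData.max();
  }

  void Transform(MatType& data) const
  {
    data /= scale;
  }

 private:
  typename MatType::elem_type scale = 1;
};

int main()
{
  arma::mat x(3, 4, arma::fill::randu);

  MaxAbsScaler<arma::mat> scaler;  // later: MaxAbsScaler<coot::mat>
  scaler.Train(x);
  scaler.Transform(x);
}

Adapting PCA would essentially mean turning the hard-coded arma::mat arguments
in the link above into a MatType template parameter like this.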

> 
> 2. Implement ANN layers with Bandicoot. Whether or not the mlpack::mat class 
> gets implemented, we can still implement ANN layers with Bandicoot and 
> benefit from GPU acceleration. E.g., currently naive_convolution iteratively 
> multiplies the corresponding elements of the filter and the input, then adds 
> the result to the output. This can be parallelized easily on GPUs. I want to 
> start with the most commonly used layers and implement their GPU versions.

Do you think a better approach would be to use arma::conv2 instead? That would
allow us to implement a fast coot::conv2 counterpart and use the same code in
mlpack for CPU and GPU, without falling back to a backend-specific
implementation.
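
Roughly what I have in mind, as a sketch only: coot::conv2 does not exist yet,
so the GPU path below is an assumption about what we would add to Bandicoot.

#include <armadillo>

// Backend-agnostic 2-D convolution helper for the ANN code.  conv2() is
// called unqualified, so argument-dependent lookup picks arma::conv2 for
// Armadillo types and would pick coot::conv2 for Bandicoot types once we
// implement it.
template<typename MatType>
void Convolve(const MatType& input,
              const MatType& filter,
              MatType& output)
{
  output = conv2(input, filter, "same");
}

int main()
{
  arma::mat input(32, 32, arma::fill::randu);
  arma::mat filter(3, 3, arma::fill::randu);
  arma::mat output;

  Convolve(input, filter, output);  // CPU today; GPU once coot::conv2 exists.
}

That way the convolution layers wouldn't need a separate GPU code path at all.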

> 
> 3. Contribute to Bandicoot. Bandicoot is still an unstable library, so we 
> may run into unexpected situations, such as necessary functions that are not 
> implemented yet, or bugs. For example, some CUDA kernels in Bandicoot only 
> work correctly when the input matrix dimensions are a multiple of 2. 
> Therefore, I want to implement missing functions and kernels, and fix the 
> bugs I find while implementing the layers, to help make Bandicoot 
> release-ready.
> 
> Thank you for taking the time to read this email; I'm looking forward to 
> hearing your feedback. Have a good day!
> 
> Regards
> Liu, Zhuojin

Thanks
Marcus


_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
