Re: [mlpack] GSoC Proposal: Graph Convolutional Networks (GCN)

2020-03-16 Thread Ryan Curtin
On Sun, Mar 15, 2020 at 11:21:48AM +0530, Hemal Mamtora wrote:
> Hello Mentors,
> 
> Thank you for your reviews regarding the project on IRC. They've helped
> me shape the proposal well.

Hey there Hemal,

Glad to hear that things have been going well.  The basic idea of what
you wrote seems reasonable to me.  The use of arma::sp_mat to represent
graphs *should* be a reasonable approach, but it might be worth some
simple validation tests to make sure that arma::sp_mat is fast enough.
(Sometimes, a sparse representation of data can be slow.  However, I
think that arma::sp_mat *should* be fine... still, good to check.)

> *One query:*
> The Graph datasets available online have different formats like GraphML,
> Adjacency List in a CSV, Edge List in a CSV, etc.
> They would have to be converted to Adjacency Matrix.
> Most of the time the conversion is *straightforward* in O(E) time, where E
> is the number of edges.
> What are your views on having this ToAdjacencyMatrix() functionality in
> mlpack?
> I think it should not be in mlpack. We could say that mlpack expects an
> adjacency matrix (sparse or dense) for graph networks.
> But if you think that this functionality should exist, I could start
> implementing it, asap.

Personally, I think that we could provide support in data::Load() to
load GraphML or adjacency lists or whatever sparse format, and I think
this could be useful to put into the mlpack codebase.  As it stands, I
think Armadillo supports loading coordinate list CSV files, so probably
a good bit of the support is already there.

Hope the input helps.

Thanks!

Ryan

-- 
Ryan Curtin    | "It's too bad she won't live!  But then again, who
r...@ratml.org |   does?" - Gaff
___
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack


[mlpack] GSoC Proposal: Graph Convolutional Networks (GCN)

2020-03-14 Thread Hemal Mamtora
Hello Mentors,

Thank you for your reviews regarding the project on IRC. They've helped
me shape the proposal well.

The following is the flow of the proposal for Graph Convolutional Networks.
I've tried to keep it as concise as possible. We could discuss in depth
regarding each or any of the parts of the proposal, if required.
Please provide suggestions or any comments for the same.
Thanks.

*Proposal:*
Since Graph Convolutional Networks operate directly on graph-structured
data, adding them would extend mlpack's breadth.
The reference paper link: https://arxiv.org/abs/1609.02907

*Preparatory work:*
1. Explore sparse matrices and dense-sparse matrix multiplication in
Armadillo and mlpack.
2. Graphs could be represented as sparse matrices (adjacency matrices), so
the input/output of graph data would not be an issue.

*API structure:*
The API for GCN would be very similar to mlpack's regular Convolution
Operation, Convolution Layer and CNN.
I've gone through the source code for Convolution in the mlpack repository.

*Phase 1:*
1. Implement Graph Convolutional *Operation*:
   H^(l+1) = σ(D̂^(-1/2) Ã D̂^(-1/2) H^(l) W^(l))
   [refer to the paper: https://arxiv.org/abs/1609.02907]
2. Testing & Documentation for Graph Convolutional *Operation*
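
For illustration, the operation in Phase 1 could be sketched roughly as
below. This is only a minimal sketch under several assumptions, not the
proposed mlpack implementation: it uses only the C++ standard library
instead of arma::sp_mat, it takes σ to be ReLU, it treats Ã as A + I with D̂
as the degree matrix of Ã, and it assumes an undirected graph whose edges
are each listed once with no self-loops. The names Matrix, EdgeList and
GcnForward are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types for this sketch (not mlpack or Armadillo types).
using Matrix = std::vector<std::vector<double>>;  // dense, row-major
using EdgeList = std::vector<std::pair<std::size_t, std::size_t>>;

// One propagation step: ReLU(D^(-1/2) (A + I) D^(-1/2) H W).
// Assumes: undirected edges (i, j) with i != j, each listed once, and
// node ids smaller than H.size().
Matrix GcnForward(const EdgeList& edges, const Matrix& H, const Matrix& W)
{
  const std::size_t n = H.size();
  const std::size_t inDim = W.size();
  const std::size_t outDim = W.empty() ? 0 : W[0].size();

  // Degree of each node in A + I; the self-loop contributes 1.
  std::vector<double> deg(n, 1.0);
  for (const auto& e : edges)
  {
    deg[e.first] += 1.0;
    deg[e.second] += 1.0;
  }

  // Dense product HW = H * W (n x outDim).
  Matrix HW(n, std::vector<double>(outDim, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t k = 0; k < inDim; ++k)
      for (std::size_t j = 0; j < outDim; ++j)
        HW[i][j] += H[i][k] * W[k][j];

  // Sparse part: visit only the nonzeros of A + I, scaling entry (i, j)
  // by 1 / sqrt(deg[i] * deg[j]).
  Matrix out(n, std::vector<double>(outDim, 0.0));
  auto accumulate = [&](std::size_t i, std::size_t j)
  {
    const double w = 1.0 / std::sqrt(deg[i] * deg[j]);
    for (std::size_t c = 0; c < outDim; ++c)
      out[i][c] += w * HW[j][c];
  };
  for (std::size_t i = 0; i < n; ++i)
    accumulate(i, i);  // self-loops from the identity
  for (const auto& e : edges)
  {
    accumulate(e.first, e.second);
    accumulate(e.second, e.first);
  }

  // ReLU activation (an assumption; the formula only says sigma).
  for (auto& row : out)
    for (double& v : row)
      v = std::max(v, 0.0);
  return out;
}
```

Multiplying H by W first keeps the sparse step proportional to the number
of edges times the output width, which is the part where a sparse
adjacency representation would matter.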

*Phase 2:*
1. Implement Graph Convolutional *Layer*: Forward(), Backward(),
Serialize(), Gradient(), etc.
2. Testing & Documentation for the same.

*Phase 3:*
1. Implement Graph Convolutional *Network* in mlpack/models
2. Testing, documentation, and benchmarking of Graph Convolutional *Network*.
Benchmarking against the TensorFlow and Keras implementations provided by
the author of the paper.


*Future Scope:*
1. Spatio-Temporal Graph Networks (ST-GCN, DCRNN)
2. Graph Attention Networks (GAT)

*References:*
1. (curated repositories) https://github.com/Jiakui/awesome-gcn
2. (survey paper) https://arxiv.org/abs/1901.00596

*One query:*
The Graph datasets available online have different formats like GraphML,
Adjacency List in a CSV, Edge List in a CSV, etc.
They would have to be converted to Adjacency Matrix.
Most of the time the conversion is *straightforward* in O(E) time, where E
is the number of edges.
What are your views on having this ToAdjacencyMatrix() functionality in
mlpack?
I think it should not be in mlpack. We could say that mlpack expects an
adjacency matrix (sparse or dense) for graph networks.
But if you think that this functionality should exist, I could start
implementing it, asap.
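
To make the conversion concrete, here is a rough sketch of an edge-list
CSV being turned into a sparse adjacency structure in two passes over the
edges. It is only an illustration under assumptions: it uses the C++
standard library alone, the input is headerless "src,dst" lines with
0-based node ids, and the result is stored in compressed sparse row (CSR)
form rather than an Armadillo type. CsrAdjacency and ParseEdgeListCsv are
hypothetical names, and this ToAdjacencyMatrix() signature is a guess, not
an existing mlpack function.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using EdgePairs = std::vector<std::pair<std::size_t, std::size_t>>;

// Hypothetical CSR adjacency structure for an unweighted graph.
struct CsrAdjacency
{
  std::vector<std::size_t> rowStart;  // n + 1 offsets into colIndex
  std::vector<std::size_t> colIndex;  // neighbor ids, grouped by row
};

// Hypothetical helper: parse headerless "src,dst" lines into an edge list.
// Lines without a comma are skipped; non-numeric fields make std::stoul
// throw.
EdgePairs ParseEdgeListCsv(const std::string& text)
{
  EdgePairs edges;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line))
  {
    const std::size_t comma = line.find(',');
    if (comma == std::string::npos)
      continue;
    edges.emplace_back(std::stoul(line.substr(0, comma)),
                       std::stoul(line.substr(comma + 1)));
  }
  return edges;
}

// Build the adjacency structure in O(N + E) time.
// Assumes every node id is smaller than n and, for undirected graphs,
// that the input has no self-loops (they would be stored twice).
CsrAdjacency ToAdjacencyMatrix(const EdgePairs& edges,
                               const std::size_t n,
                               const bool undirected = true)
{
  CsrAdjacency adj;
  adj.rowStart.assign(n + 1, 0);

  // Pass 1: count the entries in each row.
  for (const auto& e : edges)
  {
    ++adj.rowStart[e.first + 1];
    if (undirected)
      ++adj.rowStart[e.second + 1];
  }

  // A prefix sum turns the counts into row offsets.
  for (std::size_t i = 0; i < n; ++i)
    adj.rowStart[i + 1] += adj.rowStart[i];

  // Pass 2: write each neighbor into the next free slot of its row.
  adj.colIndex.resize(adj.rowStart[n]);
  std::vector<std::size_t> next(adj.rowStart.begin(),
                                adj.rowStart.end() - 1);
  for (const auto& e : edges)
  {
    adj.colIndex[next[e.first]++] = e.second;
    if (undirected)
      adj.colIndex[next[e.second]++] = e.first;
  }
  return adj;
}
```

For example, the two lines "0,1" and "1,2" with n = 3 give row offsets
0, 1, 3, 4 and neighbor ids 1, 0, 2, 1.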

Thank you once again!

Regards,
Hemal Mamtora
Final Year, Computer Engineering
Sardar Patel Institute of Technology, Mumbai
Contact: +91 75061 89728