MaximilianSchreff commented on code in PR #1941:
URL: https://github.com/apache/systemds/pull/1941#discussion_r1386178071

##########
scripts/nn/layers/graph_conv.dml:
##########
@@ -0,0 +1,262 @@
+#-------------------------------------------------------------
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+#-------------------------------------------------------------
+
+/*
+ * A graph convolutional layer as presented in 'Semi-Supervised Classification with Graph Convolutional Networks'
+ * by Kipf and Welling
+ */
+
+forward = function(matrix[double] X, matrix[double] edge_index, matrix[double] edge_weight,
+                   matrix[double] W, matrix[double] b, boolean add_self_loops)
+    return (matrix[double] X_out)
+{
+  /* Forward pass of the Graph Convolutional Layer. It transforms the node feature matrix
+   * with linear weights W and then executes the message passing according to the edges.
+   * The message passing is normalized by spectral normalization, i.e. for edge (v, w) the
+   * normalization factor is 1 / sqrt(degree(v) * degree(w)).
+   *
+   * n: number of nodes.
+   * m: number of edges.
+   * f_in: number of input features per node.
+   * f_out: number of output features per node.
+   *
+   * Inputs:
+   * - X: node features, matrix of shape (n, f_in).
+   * - edge_index: directed edge list specifying the out-node (first column) and the
+   *               in-node (second column) of each edge, matrix of shape (m, 2).
+   * - edge_weight: weights of edges in edge_index, matrix of shape (m, 1).
+   *                This should be all 1s if there should be no edge weights.
+   * - W: linear weights, matrix of shape (f_in, f_out).
+   * - b: bias, matrix of shape (1, f_out).
+   * - add_self_loops: boolean that specifies whether self loops should be added.
+   *                   If TRUE new self loops will be added only for nodes that do
+   *                   not yet have a self loop. Added self loops will have weight 1.
+   *
+   * Outputs:
+   * - X_out: convolved and transformed node features, matrix of shape (n, f_out).
+   */
+  n = nrow(X)
+  m = nrow(edge_index)
+
+  # transform
+  X_hat = X %*% W

Review Comment:
   That is a valid point, but I believe the fact that my implementation was 10x slower than PyTorch is down to SystemDS' out-of-the-box performance rather than to the implementation itself. The implementation may indeed be slow, but with the linear matrix multiplication alone being 3x slower than the whole layer in PyTorch, it doesn't matter how fast the convolution part gets; the result won't be acceptable anyway. So trying out the sparse matrices would not be very effective with the current configuration. I would suggest running the stress test on a correct and fast configuration to see whether even the matrix multiplication alone is faster than PyTorch. From there, it is easier to know what actually needs to be improved.

   About the machines that I used for testing: I tested on two different machines. One was a 96-core x86 CPU with 190GB RAM, and the other was a 14-core ARM CPU with 16GB RAM. Since neither of them has a GPU, SystemDS should be very competitive.

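   To make the "matrix multiplication alone vs. whole layer" comparison concrete, the two stages of the layer can be timed separately. The following is only a minimal sketch in plain Python, not the DML from this PR and not representative of SystemDS or PyTorch speed. The function names (`transform`, `propagate`, `forward`, `time_stages`) are hypothetical. It assumes 0-based node ids, takes degree(v) to be the weighted in-degree of v, and assumes the edge weight scales the normalization factor; the PR may define these differently.

```python
import math
import time


def transform(X, W):
    """Linear stage: dense product of X (n x f_in) and W (f_in x f_out)."""
    cols = list(zip(*W))  # columns of W
    return [[sum(x * w for x, w in zip(row, col)) for col in cols] for row in X]


def propagate(X_hat, edges, weights, b, add_self_loops):
    """Message-passing stage over directed edges (out_node, in_node), 0-based."""
    n = len(X_hat)
    f_out = len(b)
    edges, weights = list(edges), list(weights)
    if add_self_loops:
        # Add a weight-1 self loop only to nodes that do not have one yet.
        looped = {v for v, w in edges if v == w}
        for v in range(n):
            if v not in looped:
                edges.append((v, v))
                weights.append(1.0)
    # Assumption: degree(v) is the weighted in-degree of v.
    deg = [0.0] * n
    for (_, w), ew in zip(edges, weights):
        deg[w] += ew
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    out = [list(b) for _ in range(n)]  # every row starts from the bias
    for (v, w), ew in zip(edges, weights):
        norm = ew * inv_sqrt[v] * inv_sqrt[w]  # ew / sqrt(deg(v) * deg(w))
        for j in range(f_out):
            out[w][j] += norm * X_hat[v][j]
    return out


def forward(X, edges, weights, W, b, add_self_loops=True):
    """Whole layer: linear transform followed by normalized message passing."""
    return propagate(transform(X, W), edges, weights, b, add_self_loops)


def time_stages(X, edges, weights, W, b):
    """Return (seconds in transform, seconds in propagate) for one forward pass."""
    t0 = time.perf_counter()
    X_hat = transform(X, W)
    t1 = time.perf_counter()
    propagate(X_hat, edges, weights, b, True)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

   If the first number returned by `time_stages` already exceeds the reference time for the whole layer, optimizing the second stage (for example with sparse matrices) cannot close the gap on its own, which is the argument made above.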