MaximilianSchreff opened a new pull request, #2177: URL: https://github.com/apache/systemds/pull/2177
This PR introduces the **Gaussian Error Linear Unit (GELU)** activation function to SystemDS as a built-in operation. The implementation uses the widely adopted approximate formulation (https://arxiv.org/abs/1606.08415). This PR is part of a series of PRs to support popular Transformer architectures in SystemDS; GELU is one of the most commonly used activation functions in models like BERT and GPT.

Includes:
- Forward pass
- Backward pass

### Testing:
Added two simple test cases comparing the **forward pass** and **backward pass** results against PyTorch's implementation for correctness. The tests validate:
- Forward pass against PyTorch's `torch.nn.functional.gelu`.
- Backward pass against PyTorch's `torch.autograd.grad`.
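For reference, below is a minimal numerical sketch (in Python/NumPy, not the actual SystemDS code from this PR) of the tanh-based approximate GELU and its derivative, cross-checked against PyTorch in the same spirit as the new tests. The use of `approximate="tanh"` in the PyTorch call is an assumption made here to match the approximate formulation cited above.

```python
# Sketch only: illustrates the approximate GELU forward/backward math,
# not the SystemDS built-in added by this PR. Requires numpy and torch.
import numpy as np
import torch

SQRT_2_OVER_PI = np.sqrt(2.0 / np.pi)
C = 0.044715

def gelu_forward(x):
    # GELU(x) ~= 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + np.tanh(SQRT_2_OVER_PI * (x + C * x**3)))

def gelu_backward(dout, x):
    # Chain rule through the tanh approximation:
    #   d/dx = 0.5*(1 + tanh(u)) + 0.5*x*(1 - tanh(u)^2) * du/dx,
    # where u = sqrt(2/pi)*(x + 0.044715*x^3)
    # and du/dx = sqrt(2/pi)*(1 + 3*0.044715*x^2).
    u = SQRT_2_OVER_PI * (x + C * x**3)
    t = np.tanh(u)
    du_dx = SQRT_2_OVER_PI * (1.0 + 3.0 * C * x**2)
    return dout * (0.5 * (1.0 + t) + 0.5 * x * (1.0 - t**2) * du_dx)

# Cross-check against PyTorch's tanh-approximate GELU and autograd.
x_np = np.random.randn(4, 8)
x_pt = torch.tensor(x_np, requires_grad=True)
y_pt = torch.nn.functional.gelu(x_pt, approximate="tanh")
(dx_pt,) = torch.autograd.grad(y_pt.sum(), x_pt)

assert np.allclose(gelu_forward(x_np), y_pt.detach().numpy(), atol=1e-6)
assert np.allclose(gelu_backward(np.ones_like(x_np), x_np), dx_pt.numpy(), atol=1e-6)
```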