rickzx opened a new pull request, #16188:
URL: https://github.com/apache/tvm/pull/16188
The gelu_tanh computation (e.g., the gelu_new activation in the HuggingFace API)
is incorrect in the current Relax code. This PR fixes the bug.
The correctness test below now passes (it failed previously):
```
import numpy as np
import torch
from tvm.relax.frontend import nn
from tvm.relax.frontend.nn import op, spec

inp = torch.randn((2, 4), dtype=torch.float32)

# Reference implementation: tanh approximation of GELU
# (huggingface's gelu_new activation).
def tanh_gelu(input):
    return 0.5 * input * (1.0 + torch.tanh(
        np.sqrt(2.0 / np.pi) * (input + 0.044715 * torch.pow(input, 3.0))))

out1 = tanh_gelu(inp)

class TanhGelu(nn.Module):
    def forward(self, x: nn.Tensor):
        return op.gelu(x, "tanh")

forward_spec = {"forward": {"x": spec.Tensor([2, 4], dtype="float32")}}
gelu = TanhGelu().jit(forward_spec)
out2 = gelu["forward"](inp)
assert torch.allclose(out1, out2)
```
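For context, the tanh form used above is an approximation of the exact GELU, `x * Phi(x)` with `Phi` the standard normal CDF. A minimal, framework-free sketch (scalar helper functions are hypothetical names, not part of TVM or PyTorch) shows how close the two are:

```python
import math

def tanh_gelu_scalar(x: float) -> float:
    # Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def exact_gelu_scalar(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi written via erf.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# The tanh approximation tracks the exact GELU closely across typical inputs.
for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(tanh_gelu_scalar(v) - exact_gelu_scalar(v)) < 1e-2
```

Because the two forms differ slightly, a correctness test like the one above must compare against the tanh variant specifically, not the exact GELU.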
Unit tests have been updated accordingly.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]