rickzx opened a new pull request, #16188:
URL: https://github.com/apache/tvm/pull/16188

   The gelu_tanh computation (i.e., the `gelu_new` activation in the Hugging Face API) is incorrect in the current Relax code. This PR fixes the bug.
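   
   For reference, the tanh approximation of GELU that `gelu_new` implements is:
   
   ```math
   \mathrm{GELU}(x) \approx 0.5\,x\left(1 + \tanh\left(\sqrt{2/\pi}\,\left(x + 0.044715\,x^{3}\right)\right)\right)
   ```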
   
   Correctness test now passes (it previously failed):
   
   ```
   import numpy as np
   import torch
   
   from tvm.relax.frontend import nn
   from tvm.relax.frontend.nn import op, spec
   
   inp = torch.randn((2, 4), dtype=torch.float32)
   
   # Reference: the tanh approximation of GELU (Hugging Face's `gelu_new`).
   def tanh_gelu(input):
       return 0.5 * input * (
           1.0
           + torch.tanh(np.sqrt(2.0 / np.pi) * (input + 0.044715 * torch.pow(input, 3.0)))
       )
   
   out1 = tanh_gelu(inp)
   
   # Minimal Relax module exercising the fixed tanh-approximate gelu op.
   class TanhGelu(nn.Module):
       def forward(self, x: nn.Tensor):
           return op.gelu(x, "tanh")
   
   forward_spec = {"forward": {"x": spec.Tensor([2, 4], dtype="float32")}}
   gelu = TanhGelu().jit(forward_spec)
   out2 = gelu["forward"](inp)
   
   assert torch.allclose(out1, out2)
   ```
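   
   As an independent sanity check (this assumes PyTorch >= 1.12, which introduced the `approximate` argument), the reference output can also be compared against PyTorch's built-in tanh-approximate GELU:
   
   ```
   import torch.nn.functional as F
   
   # PyTorch's built-in tanh approximation should agree with tanh_gelu above.
   ref = F.gelu(inp, approximate="tanh")
   assert torch.allclose(out1, ref)
   ```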
   
   Updated the unit tests accordingly.

