PatrikPerssonInceptron opened a new pull request, #17505:
URL: https://github.com/apache/tvm/pull/17505

   # Problem
   Shape dimension names in an ONNX model can contain expressions. For 
example, the attention mask of an LLM may have the shape 
`int64[batch_size,past_sequence_length + sequence_length]`, whose second 
dimension is the expression `past_sequence_length + sequence_length`. Here 
`past_sequence_length` and `sequence_length` should be separate variables 
that are added together. Currently, however, translating the graph creates a 
single new variable named `"past_sequence_length + sequence_length"` instead. 
   
   # Fix
   I added a simple parser that creates an individual size variable for each 
variable name and generates the resulting prim expression. Note that, to keep 
the parser simple, it evaluates expressions strictly left to right and does 
not account for operator precedence. 
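
To illustrate the left-to-right behaviour described above, here is a minimal, standalone sketch. It is not the code from this PR: the names `tokenize`, `eval_dim_expr` and `make_var`, as well as the supported operator set (`+`, `-`, `*`, `/`), are assumptions made for illustration only. Plain integers stand in for size variables so the example runs without TVM.

```python
import operator
import re

# One token is an integer literal, an identifier, or a binary operator.
_TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[+\-*/])")

# Hypothetical operator table; "/" is mapped to floor division here.
_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
}


def tokenize(expr):
    """Split a dimension expression into operands and operators."""
    expr = expr.rstrip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = _TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at {pos} in {expr!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def eval_dim_expr(expr, make_var):
    """Fold `expr` strictly left to right, ignoring operator precedence.

    `make_var` maps a variable name to the object that represents it.
    """
    tokens = tokenize(expr)
    # A valid expression alternates operand, operator, operand, ...
    if len(tokens) % 2 == 0:
        raise ValueError(f"malformed expression: {expr!r}")

    def operand(tok):
        if tok in _OPS:
            raise ValueError(f"expected an operand, got {tok!r}")
        return int(tok) if tok.isdigit() else make_var(tok)

    result = operand(tokens[0])
    for i in range(1, len(tokens), 2):
        op, rhs = tokens[i], tokens[i + 1]
        if op not in _OPS:
            raise ValueError(f"expected an operator, got {op!r}")
        result = _OPS[op](result, operand(rhs))
    return result


# Stand-in bindings: integers instead of symbolic size variables.
env = {"past_sequence_length": 3, "sequence_length": 5, "a": 2}
total = eval_dim_expr("past_sequence_length + sequence_length", env.__getitem__)
no_precedence = eval_dim_expr("1 + a * 3", env.__getitem__)  # (1 + a) * 3
```

With these bindings `total` is 8, and `no_precedence` is 9 rather than 7 because the addition is applied before the multiplication. In the importer, `make_var` would presumably return a cached symbolic variable per name, so that the same name in different shapes maps to the same variable.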
   
   # Test
   I added regression tests to verify that ONNX shape dimension expressions 
are evaluated correctly.
   
   # Additional small fixes
   When PrimValues are encountered in BinaryBase, their values are not always 
fully extracted before they are converted to numpy arrays. I added a check 
that extracts the value from IntImm and FloatImm types before the conversion 
to numpy arrays.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
