PatrikPerssonInceptron opened a new pull request, #17505: URL: https://github.com/apache/tvm/pull/17505
# Problem

The shape names in an ONNX model can contain expressions, such as the shape `int64[batch_size,past_sequence_length + sequence_length]` of an LLM's attention mask. Here the second dimension is the expression `past_sequence_length + sequence_length`, where `past_sequence_length` and `sequence_length` should be individual variables added together. Currently, however, translating the graph creates a single new variable named `"past_sequence_length + sequence_length"` instead.

# Fix

I added a simple parser that creates individual size variables for the variable names and generates the resulting prim expression. Note that, to keep the parser simple, it evaluates expressions left to right and does not account for operator precedence.

# Test

I added regression tests to verify that ONNX shape dim expressions are evaluated correctly.

# Additional small fixes

When PrimValues are encountered in BinaryBase, their values are not always fully extracted before being converted to numpy arrays. I added a check that extracts the value from IntImm and FloatImm types before converting them to numpy arrays.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
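To illustrate the left-to-right evaluation described under Fix, here is a minimal standalone sketch. It is not the code from this PR: the names `tokenize`, `parse_dim_expr`, and `make_var` are hypothetical, the operator set is an assumption (the description does not list the supported operators), and plain Python integers stand in for size variables so the example runs without TVM.

```python
import operator
import re

# Assumed operator set; the PR description does not say which operators are supported.
_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
}
# One token is an identifier, an integer literal, or a single operator character.
_TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[-+*/])")


def tokenize(expr):
    """Split a dim expression such as 'a + b' into ['a', '+', 'b']."""
    expr = expr.rstrip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = _TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos} in {expr!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_dim_expr(expr, make_var):
    """Fold operands strictly left to right, ignoring operator precedence.

    make_var is a hypothetical callback that maps a variable name to whatever
    object represents a size variable (here: an int looked up in a dict).
    """
    tokens = tokenize(expr)
    if len(tokens) % 2 == 0:
        raise ValueError(f"malformed dim expression: {expr!r}")

    def operand(tok):
        if tok in _OPS:
            raise ValueError(f"expected operand, got operator {tok!r}")
        return int(tok) if tok.isdigit() else make_var(tok)

    result = operand(tokens[0])
    # Remaining tokens come in (operator, operand) pairs.
    for op, rhs in zip(tokens[1::2], tokens[2::2]):
        if op not in _OPS:
            raise ValueError(f"expected operator, got {op!r}")
        result = _OPS[op](result, operand(rhs))
    return result


sizes = {"past_sequence_length": 4, "sequence_length": 3}
print(parse_dim_expr("past_sequence_length + sequence_length", sizes.__getitem__))  # 7

abc = {"a": 1, "b": 2, "c": 3}
print(parse_dim_expr("a + b * c", abc.__getitem__))  # 9, i.e. (a + b) * c
```

The second call shows the consequence of ignoring precedence: `a + b * c` is folded as `(a + b) * c`. Taking the variable factory as a callback keeps the sketch independent of how size variables are actually constructed in the frontend.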
