mengshyu commented on PR #16972:
URL: https://github.com/apache/tvm/pull/16972#issuecomment-2102831200

   Hi @krishnaraj36 
   
   After verifying on Apple Metal (M2), performance appears to have declined with
the patch: decode throughput is slightly lower and prefill throughput drops
substantially. Could you check it again on a different platform? Thanks.
   
   - Baseline
   commit:
       mlc llm:33c15e72a3567292cba577ea7f89652ec9f2bd6e
       relax:ced07e88781c0d6416e276d9cd084bb46aaf3da5
   
       - model: gemma-2b-it
       Statistics:
           ----------- prefill -----------
           throughput: 554.084 tok/s
           total tokens: 7 tok
           total time: 0.013 s
           ------------ decode ------------
           throughput: 151.030 tok/s
           total tokens: 128 tok
           total time: 0.848 s
   
   
       - model: Phi-3-mini-4k-instruct
       Statistics:
           ----------- prefill -----------
           throughput: 350.196 tok/s
           total tokens: 7 tok
           total time: 0.020 s
           ------------ decode ------------
           throughput: 106.038 tok/s
           total tokens: 128 tok
           total time: 1.207 s
   
   
   
   
   - Opt
   commit:
       mlc llm:33c15e72a3567292cba577ea7f89652ec9f2bd6e
       relax:ced07e88781c0d6416e276d9cd084bb46aaf3da5 + patch
   
       - model: gemma-2b-it
       Statistics:
           ----------- prefill -----------
           throughput: 50.661 tok/s
           total tokens: 7 tok
           total time: 0.138 s
           ------------ decode ------------
           throughput: 149.744 tok/s
           total tokens: 128 tok
           total time: 0.855 s
   
   
       - model: Phi-3-mini-4k-instruct
       Statistics:
           ----------- prefill -----------
           throughput: 49.064 tok/s
           total tokens: 7 tok
           total time: 0.143 s
           ------------ decode ------------
           throughput: 104.042 tok/s
           total tokens: 128 tok
           total time: 1.230 s
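
To make the size of the change concrete, the relative difference between the Baseline and Opt throughput can be computed directly from the figures above. This is only a minimal sketch: `RESULTS`, `relative_change`, and `summarize` are illustrative names, not part of mlc-llm or TVM, and the numbers are copied by hand from the statistics in this comment. Note that the prefill runs cover only 7 tokens, so those figures rest on very short timings (0.013 s to 0.143 s).

```python
# Throughput in tok/s, copied from the Baseline and Opt statistics above.
# Keys are (model, phase); values are (baseline, opt).
RESULTS = {
    ("gemma-2b-it", "prefill"): (554.084, 50.661),
    ("gemma-2b-it", "decode"): (151.030, 149.744),
    ("Phi-3-mini-4k-instruct", "prefill"): (350.196, 49.064),
    ("Phi-3-mini-4k-instruct", "decode"): (106.038, 104.042),
}


def relative_change(baseline: float, opt: float) -> float:
    """Return the fractional change from baseline to opt (negative = slower)."""
    return (opt - baseline) / baseline


def summarize(results: dict) -> dict:
    """Map each (model, phase) key to its fractional throughput change."""
    return {key: relative_change(b, o) for key, (b, o) in results.items()}


if __name__ == "__main__":
    for (model, phase), change in summarize(RESULTS).items():
        # Print one line per model and phase, e.g. a signed percentage.
        print(f"{model:24s} {phase:8s} {change:+.1%}")
```

With these inputs, decode throughput changes by roughly one to two percent, while prefill throughput drops by well over eighty percent on both models.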
   

