Hi TVM folks,                                                                 
                                                                       
                                                                                
                                                                       
  I wanted to share a small language-runtime experiment and ask whether systems 
like this fit naturally into the compiler/runtime co-design conversation.       
                                                                                
                                                 
                                                                                
                                                                       
  We built a public demo line called Engram and deployed it on a commodity 
ESP32-C3.                                                                   
                                                                                
                                                                       
  Current public numbers:                                                       
                                                                       
                                                                                
                                                                       
  * Host-side benchmark capability                                              
                                                                       
    * `LogiQA = 0.392523`                                                       
                                                                       
    * `IFEval = 0.780037`                                                       
                                                                       
                                                                                
                                                                       
  * Published board proof                                                       
                                                                       
    * `LogiQA 642 = 249 / 642 = 0.3878504672897196`                             
                                                                       
    * `host_full_match = 642 / 642`                                             
                                                                       
    * runtime artifact size = `1,380,771 bytes`                                 
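
For anyone who wants to check the board-side figures, the arithmetic can be reproduced with a few lines of Python. This is only a sketch over the three numbers quoted above. The variable names are mine, and nothing here comes from the Engram repo.

```python
from fractions import Fraction

# Figures copied from the published board proof above.
# The names below are illustrative, not Engram identifiers.
N_ITEMS = 642        # LogiQA items evaluated on the board
BOARD_CORRECT = 249  # items the board answered correctly
HOST_MATCHES = 642   # items where the board output matched the host output

# Exact ratios first, converted to float only for display.
board_accuracy = Fraction(BOARD_CORRECT, N_ITEMS)
host_match_rate = Fraction(HOST_MATCHES, N_ITEMS)

print(f"board accuracy  = {float(board_accuracy)!r}")
print(f"host full match = {float(host_match_rate)!r}")
```

Keeping the ratios exact until the final conversion avoids any question about rounding when comparing against the published value.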
                                                                       
                                                                                
                                                                       
  Important scope note:                                                         
                                                                       
                                                                                
                                                                       
  This is **not** presented as unrestricted, open-input native LLM generation on an MCU.
                                                                                
                                                                       
  The board-side path is closer to a flash-resident, table-driven runtime with: 
                                                                       
                                                                                
                                                                       
  * packed token weights                                                        
                                                                       
  * hashed lookup structures                                                    
                                                                       
  * fixed compiled probe batches                                                
                                                                       
  * streaming fold / checksum style execution over precompiled structures       
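
To make the shape of that list a bit more concrete, here is a minimal host-side Python sketch of a table-driven probe. It is not the Engram implementation. The 4-bit weight packing, the FNV-1a hash, the open-addressing table layout, and every function name are assumptions chosen purely for illustration.

```python
# Illustrative sketch only; formats and names are hypothetical, not Engram's.

FNV_OFFSET = 0x811C9DC5
FNV_PRIME = 0x01000193


def fnv1a32(data: bytes) -> int:
    """32-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def pack_weights(weights):
    """Pack 4-bit weights (0..15) two per byte, low nibble first."""
    ws = list(weights)
    if len(ws) % 2:
        ws.append(0)  # pad to an even count
    return bytes(ws[i] | (ws[i + 1] << 4) for i in range(0, len(ws), 2))


def unpack_weight(packed: bytes, index: int) -> int:
    """Read the 4-bit weight stored at position `index`."""
    byte = packed[index // 2]
    return (byte >> 4) if index % 2 else (byte & 0x0F)


def build_table(vocab, n_slots):
    """Build an open-addressing table of (hash, weight index) pairs offline."""
    assert n_slots > len(vocab), "need at least one empty slot to stop probing"
    slots = [None] * n_slots
    for idx, tok in enumerate(vocab):
        h = fnv1a32(tok.encode())
        pos = h % n_slots
        while slots[pos] is not None:  # linear probing
            pos = (pos + 1) % n_slots
        slots[pos] = (h, idx)
    return slots


def lookup(slots, token):
    """Return the weight index for `token`, or None if it is not in the table.

    Only 32-bit hashes are compared, so tokens with equal hashes would alias.
    """
    h = fnv1a32(token.encode())
    pos = h % len(slots)
    while slots[pos] is not None:
        if slots[pos][0] == h:
            return slots[pos][1]
        pos = (pos + 1) % len(slots)
    return None


def run_probe(slots, packed, tokens):
    """Stream over tokens, folding weights into a running score and checksum."""
    score, checksum = 0, FNV_OFFSET
    for tok in tokens:
        idx = lookup(slots, tok)
        w = unpack_weight(packed, idx) if idx is not None else 0
        score += w
        checksum = ((checksum ^ w) * FNV_PRIME) & 0xFFFFFFFF
    return score, checksum


vocab = ["yes", "no", "maybe"]
packed = pack_weights([3, 1, 2])
slots = build_table(vocab, n_slots=8)
print(run_probe(slots, packed, ["yes", "maybe"]))
```

In this toy version everything except `run_probe` runs offline, so the device-side loop only reads immutable tables and keeps two integers of state. That is the property I mean by "flash-resident".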
                                                                       
                                                                                
                                                                       
  So this is not a conventional dense graph compiler story. It is closer to a 
task-specialized language runtime whose behavior has been pushed into a compact 
executable form under severe memory constraints.                                
                                                             
                                                                                
                                                                       
  Repo:                                                                         
                                                                       
  https://github.com/Alpha-Guardian/Engram                                      
                                                                       
                                                                                
                                                                       
  What I’m curious about is whether people here would think of systems like this as:

  * outside the normal ML compiler scope

  * an adjacent compiler/runtime co-design problem

  * or evidence that some language-task systems may want a very different compiled execution form than standard graph runtimes
                                                                                
                                                                       
  Would be very interested in any thoughts.




