I've finished a port of Andrej Karpathy's microgpt.py to the D programming language. For those unfamiliar, microgpt.py is a minimal, educational GPT implementation, great for understanding the transformer architecture.

The D port stays faithful to the original: same architecture, same training logic. Just D instead of Python. D's operator overloading and struct semantics make it a surprisingly natural fit for this kind of low-level ML work.
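To give a flavor of what that looks like, here is a minimal sketch of the kind of struct-plus-operator-overloading pattern D enables. The `Value` type and its fields are illustrative assumptions, not code from the gist:

```d
import std.stdio;

// Hypothetical sketch: a scalar value type with overloaded
// arithmetic, in the spirit of an autograd-style "Value" node.
struct Value
{
    double data;

    // One templated opBinary covers both + and * via a mixin.
    Value opBinary(string op)(Value rhs) const
        if (op == "+" || op == "*")
    {
        return Value(mixin("data " ~ op ~ " rhs.data"));
    }
}

void main()
{
    auto a = Value(2.0);
    auto b = Value(3.0);
    writeln((a + b).data); // 5
    writeln((a * b).data); // 6
}
```

Because `Value` is a struct, these stay plain value types with no GC allocation per operation, which is part of what makes D comfortable for this kind of numeric code.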

Find it here: https://gist.github.com/DannyArends/12704c9207797a64338a5be4f1010bcf
