As this thread is already going down the "inner workings of AI" road...
I found this video pretty informative. A retired Microsoft engineer explains
how neural networks work... on a PDP-11, no less!
https://www.youtube.com/watch?v=OUE3FSIk46g
Sent with [Proton Mail](https://proton.me/mail/home) secure email.
On Wednesday, May 27th, 2026 at 5:47 PM, Kirn Gill II via Freedos-devel
<[email protected]> wrote:
> Due to the inherent nature of AI models, such citations are fundamentally
> impossible (at the very least, every four bytes of useful AI model weights
> would need tends of kilobytes of attributional metadata, good luck figuring
> out how to properly cite anything this way) and thus the ONLY sensible thing
> is to shut them out entirely for anything where citations matter.
>
> An AI model does not (internally) use a database, tagged or otherwise, I
> don't know where this myth keeps coming from. The data is converted into
> vectors and translated into intensities (weights) and this is a lossy process.
>
> I'd strongly recommend reading up on their operational mechanisms; it's
> certainly interesting.
>
> Here's a high level crash course:
>
> There's no chat or active work session. Everything going into the AI is
> basically a single flat text file that's dictionary-compressed (and not
> further processed, i.e. tokenized), using a precomputed (token) dictionary.
> There's an array of vectors and the dictionary index for each token serves as
> the array index for the corresponding vector. This vector is then slightly
> modified based on the input token's position within the whole input context
> window. A bunch of matrix multiplication is applied to that. A dot product
> operation called "softmax" is used, resulting in an output array that's as
> wide as the token dictionary. This array is full of various floating-point
> values called "logits." Then, some sampling algorithm either grabs the one
> with the highest number and uses its index as the dictionary index for the
> next bit of text to be used as the next "prediction". Then, that newly
> predicted output token is added to the end of the input tokens for the next
> processing round, over and over, until an "end of sequence" token is emitted
> or the output limit is reached (which usually involves the model getting cut
> off mid-sentence, like The Sopranos.)
>
> And there you have it: explain an LLM badly. It's a big statistical engine
> that works word-by-word. Unfortunately, it cannot provide citations for where
> it gets each word due to how the matrix multiplication/softmax stuff is
> exploited.
>
> Incidentally, this means it doesn't "know" what it's "doing", or that it's
> "doing" anything at all: Tool calls are just output tagged with specific
> output tokens. The illusion of chat, the tool calls, and it all is a metadata
> language called ChatML. That's really all the LLM does, reads a ChatML file
> and adds a new stanza at the end.
>
> --
> Kirn Gill II
> Mobile: [+1 813-300-2330](tel:+18133002330)
> VoIP: [+1 813-704-0420](tel:+18137040420)
> Email: [email protected]
> LinkedIn: http://www.linkedin.com/pub/kirn-gill/32/49a/9a6
>
> On Wed, May 27, 2026 at 4:18 PM Louis Santillan via Freedos-devel
> <[email protected]> wrote:
>
>> I believe AI needs to be treated like a tool like any other search
>> engine or automation or compiler. If a contribution includes work
>> performed with AI, it needs to cite its sources academically (like
>> IEEE, ACM, or CSE style), and legally (for compliance with GPL, LGPL,
>> BSD, MIT, et al). Additionally, the AI contribution needs to attempt
>> to be reproducible and/or its production needs to be documented
>> (Model/version used, chat log, prompts, input artifacts, etc.) and
>> these records should be accompanied with the contribution as part of
>> the 'source'.
>>
>> I think, otherwise, we risk worthwhile contributions and opportunities
>> for contribution. This would be akin to "requiring" every that
>> produces a contribution for FreeDOS must use 'OpenWatcom C' or 'hand
>> write binary machine code' or some other similarly silly requirement.
>> No one questions whether you use Borland or MS or OpenWatcom or GNU
>> assemblers or compilers today. Because they're tools that make
>> producing the work possible.
>>
>> _______________________________________________
>> Freedos-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/freedos-devel
_______________________________________________
Freedos-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/freedos-devel