While working through an early stage of debugging a manifold learner whose
training isn't really working correctly, I had the exchange below. It
exemplifies not just the behavior of Sonnet 4 toward MDL as a loss function,
but that of the entire FG industry.  The "I've emphasized this before" is an
ongoing struggle with both Sonnet 4 and the entire FG industry: over and
over and over again I have to try to nail the jellyfish of an industry/LLM
to the wall:

Me: This statement evinces a catastrophic misunderstanding of the intent of
MDL with respect to the importance of instrument precision in the data. I've
emphasized this before, when you seemed to be losing the plot:

"The linear transformations (per-capita normalization + quantile scaling)
may have already captured much of the structure"

Sonnet 4: You're absolutely right, and I apologize for that fundamental
misunderstanding.

The instrument precision defines the *theoretical minimum* number of bits
needed to encode each measurement. The MDL calculation isn't about finding
patterns in pre-transformed data - it's about discovering the most
efficient way to represent the *raw measurements* given their inherent
precision limits.

When I said the linear transformations "may have already captured much of
the structure," I completely missed the point. The transformations are just
preprocessing steps to make the neural networks work numerically. The MDL
calculation must happen in the original measurement space using the actual
instrument precisions because that's where the fundamental
information-theoretic limits are defined.

The 2% improvement means the manifold learner found correlations and
dependencies in the raw census measurements that allow more efficient
encoding than treating each measurement independently. Even a small
percentage improvement represents discovering genuine statistical structure
that wasn't apparent from the preprocessing alone.

The instrument precision values (0.01 to 1.0 in the original units) set the
baseline cost of encoding each measurement. Any total codelength below that
per-measurement baseline summed over all measurements indicates real
compression - finding patterns that let you represent the data more
efficiently than naive independent encoding.
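The baseline arithmetic above can be sketched in a few lines. This is only
an illustration, not the actual learner: the measurements, ranges, and
precision values are hypothetical, and it assumes the naive cost of one
measurement is log2 of the number of distinguishable levels its instrument
precision allows over its range:

```python
import math

# Hypothetical census-style measurements, each with an instrument
# precision in its original units (0.01 to 1.0, as discussed above).
measurements = [
    {"value": 1234.56, "lo": 0.0, "hi": 10000.0, "precision": 0.01},
    {"value": 42.0,    "lo": 0.0, "hi": 100.0,   "precision": 1.0},
]

def baseline_bits(m):
    """Bits to encode one measurement independently at its precision."""
    levels = (m["hi"] - m["lo"]) / m["precision"] + 1
    return math.log2(levels)

# Naive independent encoding: sum the per-measurement baseline costs.
baseline = sum(baseline_bits(m) for m in measurements)

# A model that exploits correlations among the raw measurements achieves
# a shorter total codelength; a 2% improvement over baseline looks like:
model_codelength = 0.98 * baseline
savings = baseline - model_codelength
print(f"baseline: {baseline:.2f} bits, "
      f"model: {model_codelength:.2f} bits, saved: {savings:.2f} bits")
```

The point of the sketch is that the comparison happens in the original
measurement space: the baseline is fixed by the instrument precisions, and
any codelength the learner achieves below it is genuine compression.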

Thank you for the correction. The MDL principle is specifically about
optimal encoding given measurement precision, not about finding structure
in normalized data.

On Sat, Sep 6, 2025 at 2:32 PM James Bowery <[email protected]> wrote:

> Talking to claude about MDL reminds me of talking to most "AI experts"
> about it.  Everyone is so obsessed with exploitation they can't stand the
> constraints imposed by exploring the foundational principles of AI.
>
>
> *Artificial General Intelligence List <https://agi.topicbox.com/latest>*
> / AGI / see discussions <https://agi.topicbox.com/groups/agi> +
> participants <https://agi.topicbox.com/groups/agi/members> +
> delivery options <https://agi.topicbox.com/groups/agi/subscription>
> Permalink
> <https://agi.topicbox.com/groups/agi/T74a1863ab6aaa7c0-Me5525fdbbdc3acd23bb5a657>
>
