Re: [GIT PULL] bcachefs fixes for 6.15-rc4

Linus Torvalds Fri, 25 Apr 2025 09:36:25 -0700

On Thu, 24 Apr 2025 at 21:52, Kent Overstreet <kent.overstr...@linux.dev> wrote:
>
> And the attitude of "I hate this, so I'm going to partition this off as
> much as I can and spend as little time as I can on this" has all made
> this even worse - the dcache stuff is all half baked.


No. The dcache side is *correct*.

The thing is, you absolutely cannot make the case-insensitive lookup
be the fast case.

So it's partitioned off not because people don't want to deal with it
(which also admittedly _is_ true), but because partitioning off is a
firewall against the code generation garbage case that simply *cannot*
be done well and allows the proper cases to be properly optimized.

Now, if filesystem people were to see the light, and have a proper and
well-designed case insensitivity, that might change. But I've never
seen even a *whiff* of that. I have only seen bad code that
understands neither how UTF-8 works, nor how unicode works (or rather:
how unicode does *not* work - code that uses the unicode comparison
functions without a deeper understanding of what the implications
are).

Your comments blaming unicode are only another sign of that.

Because no, the problem with bad case folding isn't in unicode.

It's in filesystem people who didn't understand - and still don't,
after decades - that you MUST NOT just blindly follow some external
case folding table that you don't understand and that can change over
time.

The "change over time" part is particularly vexing to me, because it
breaks one of the fundamental rules that unicode was *supposed* to
fix: no locale garbage.

And the moment you think you need "unicode versioning", you have
basically now created a locale with a different name, and you MISSED
THE WHOLE %^$*ING POINT OF IT ALL.

And yes, *those* problems come from people thinking it's "somebody
else's problem that they solved for me" without actually understanding
that no, that wasn't the case at all. Many of the unicode rules were
about *glyphs*, and simply cannot be used for filesystems or equality
comparisons.

Which isn't to say that Unicode doesn't have problems, but the real
problem is then using it without understanding the problems.

So the real issue with unicode is that it's very complicated, and it
tried to solve many different problems, and that then resulted in
people not understanding that not all of it was appropriate for
*their* use.

Part of it is the "CS disease": thinking that a generic solution is
always "better". Not so. Being overly generic is often much much
worse than having a targeted solution to an intentionally limited
problem.

   "Everything Should Be Made as Simple as Possible, But Not Simpler".

and involving unicode in case folding is antithetical to that
fundamental concept.

What I personally strongly feel should have been done is to just limit
case folding knowingly to a very strict subset, and people should have
said "we're being backwards compatible with FAT" or something like
that. Instead of extending the problem space to the point where it
becomes a huge problem, re-introduces "locales" in a different guise,
and creates security issues because people don't understand just *how*
big they made the problem space.
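
To make the "very strict subset" idea concrete, here is a minimal sketch
of what such a comparison could look like. It is not code from the kernel
or from any filesystem, and it does not reproduce real FAT semantics
(which depend on OEM code pages). It assumes the subset is ASCII only,
and the names ascii_fold(), ascii_fold_equal() and ascii_fold_equal_str()
are hypothetical. Only the bytes 'A'..'Z' are folded; every other byte,
including every byte of a multi-byte UTF-8 sequence, is compared exactly,
so the result never depends on a Unicode table or its version.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: fold only ASCII 'A'..'Z' to lower case.
 * All other bytes (including every byte >= 0x80, i.e. all UTF-8
 * multi-byte sequences) pass through unchanged.
 */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/*
 * Hypothetical helper: compare two length-delimited names under the
 * ASCII-only folding above. Folding never changes the length, so
 * names of different lengths can never match.
 */
static bool ascii_fold_equal(const char *a, size_t alen,
                             const char *b, size_t blen)
{
    if (alen != blen)
        return false;

    for (size_t i = 0; i < alen; i++) {
        if (ascii_fold((unsigned char)a[i]) !=
            ascii_fold((unsigned char)b[i]))
            return false;
    }
    return true;
}

/* Convenience wrapper for NUL-terminated strings. */
static bool ascii_fold_equal_str(const char *a, const char *b)
{
    return ascii_fold_equal(a, strlen(a), b, strlen(b));
}
```

With this sketch, "README.TXT" and "readme.txt" compare equal, while the
Kelvin sign (U+212A, bytes E2 84 AA) does not match "k", and "É" does not
match "é". That is the trade-off of an intentionally limited rule: fewer
names fold together, but the answer is fixed and needs no versioning.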

Oh well. Rant over.

                Linus
