On Mon, May 25, 2026 at 09:11:25PM +0100, Gavin Smith wrote:
> On Mon, May 25, 2026 at 09:56:35PM +0200, Patrice Dumas wrote:
> > On Mon, May 25, 2026 at 07:27:09PM +0100, Gavin Smith wrote:
> > > I still don't know what is supposed to make two strings sort in a
> > > predictable order if they differ only by case.  I checked that the sort
> > > keys are identical in that case.  Patrice, do you remember anything?
> > 
> > The sorting does not use only the sort key, but also the number of the
> > index entry within its index and the sort order of the index names
> > (needed for merged indices), as seen in the Perl code in Indices.pm,
> > _sort_index_entries.  Therefore, the order should be predictable even if
> > the upper-case and lower-case letters have the same sort key.
> 
> I understand now why the index entries are output in a predictable order.
> However, this means that changing the order of index entries changes
> the order in the index, if entries differ only by letter case.

Indeed.
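
To illustrate the point, here is a minimal Python sketch, not the actual
Indices.pm code.  The Entry type, the use of the upper-cased text as the
sort key, and the order of the two tie-breakers are assumptions made for
the example only.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical stand-in for an index entry."""
    text: str
    index_name: str
    number: int  # position of the entry within its index


def sort_entries(entries):
    # Assumption: the sort key is the upper-cased text, and ties are broken
    # by entry number, then by index name.  The real tie-break order in
    # _sort_index_entries may differ.
    return sorted(entries, key=lambda e: (e.text.upper(), e.number, e.index_name))


# The same two entries, appearing in opposite orders in the document.
first = sort_entries([Entry("foo", "cp", 1), Entry("Foo", "cp", 2)])
second = sort_entries([Entry("Foo", "cp", 1), Entry("foo", "cp", 2)])

print([e.text for e in first])   # ['foo', 'Foo']
print([e.text for e in second])  # ['Foo', 'foo']
```

The output is deterministic in both cases, but the relative order of "foo"
and "Foo" follows the order in which the entries appear.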

> > That being said, I do not know exactly why the strings are upper-cased
> > before being sorted.  Maybe this is relevant if there is no
> > Unicode::Collate sorting (presumably, the lowercase/uppercase sorting is
> > done well with Unicode::Collate), as it allows the upper-case and
> > lower-case letters to sort near each other in that case.
> 
> Yes, exactly, although it wouldn't make upper case and lower case variants
> sort in a consistent order.  There may be ways to make that happen using
> strcmp comparison, something like:
> 
> sort key = uppercase(index entry) . '\x01' . index entry
> 
> - i.e., concatenate the uppercased index entry with the original index
> entry, with a low-valued byte in between.  But it is not that important.

Is the '\x01' really needed?
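
A quick way to check is to compare the two variants on a small list.  The
sketch below is in Python rather than Perl or C, on the assumption that
comparing bytes objects behaves like strcmp for strings without embedded
NUL bytes; the function names are made up for the example.

```python
def key_with_sep(entry: str) -> bytes:
    # uppercase(entry) . "\x01" . entry, as proposed in the quoted message
    return entry.upper().encode("utf-8") + b"\x01" + entry.encode("utf-8")


def key_without_sep(entry: str) -> bytes:
    # Same key with the separator byte left out
    return entry.upper().encode("utf-8") + entry.encode("utf-8")


entries = ["ab", "a", "B", "b", "A"]

with_sep = sorted(entries, key=key_with_sep)
without_sep = sorted(entries, key=key_without_sep)

print(with_sep)     # ['A', 'a', 'ab', 'B', 'b']
print(without_sep)  # ['A', 'ab', 'a', 'B', 'b']
```

Without the separator, the key for "a" is "Aa" and the key for "ab" is
"ABab", so "ab" sorts before its own prefix "a".  With the separator, the
low byte ends the uppercased part before any letter is compared, which
keeps prefixes first, provided no entry contains a byte as low as 0x01.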

Anyway, should this be added to the TODO?  Or do we consider that it is
OK, in which case I can simply add a comment in the code?

-- 
Pat
