Eli,

You're not missing anything. This is a bug in the documentation of decomps.txt.

Initially, added decompositions for the DUCET default weights were all tagged as <sort>. This results in a distinct *tertiary* weight in the initial collation weight values in DUCET.

Later on, there turned up cases where an added decomposition for the DUCET input worked better *without* a distinct tertiary weight. In particular, this applies to the large collection of combining marks whose secondary weights are now collapsed into a smaller set of distinct values. It also applies to the o with stroke character you cite below.

The documentation for decomps.txt just needs to be updated to reflect that new pattern.

--Ken

On 2/21/2016 8:32 AM, Eli Zaretskii wrote:
# 3. In some cases a new decomposition is added for a character which
# has no decomposition mapping in UnicodeData.txt. In this third case,
# a new decomposition tag "<sort>" is introduced, to distinguish these
# introduced decompositions from those derived from UnicodeData.txt.

However, I see in decomps.txt entries that seem to belong to neither of the 3 classes described above. Here are 2 notable examples:

00F8;;006F 0338 # LATIN SMALL LETTER O WITH STROKE => LATIN SMALL LETTER O + COMBINING LONG SOLIDUS OVERLAY
0142;;006C 0335 # LATIN SMALL LETTER L WITH STROKE => LATIN SMALL LETTER L + COMBINING SHORT STROKE OVERLAY

In both these cases, UnicodeData.txt defines no decomposition properties, but the "<sort>" tag I expected to see is absent from decomps.txt.

Is there something I'm missing here?
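
For anyone who wants to find such entries mechanically, here is a minimal sketch in Python. It assumes every data line has the three-field shape of the two entries quoted above (code point; tag; decomposition, followed by an optional "#" comment). The helper names parse_entry and classify and the result labels are made up for illustration. It also relies on Python's unicodedata module, whose Unicode version may differ from the one a given decomps.txt was built against, so treat the results as indicative only.

```python
import unicodedata

def parse_entry(line):
    """Return (code_point, tag, decomposition) or None for blank/comment lines.

    Assumed layout (taken from the quoted entries): CODE;TAG;DECOMP # comment
    """
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    fields = [field.strip() for field in data.split(";")]
    if len(fields) != 3:
        raise ValueError(f"unexpected field count: {line!r}")
    cp, tag, decomp = fields
    return int(cp, 16), tag, [int(h, 16) for h in decomp.split()]

def classify(line):
    """Label an entry by whether UnicodeData already decomposes the character."""
    entry = parse_entry(line)
    if entry is None:
        return None
    cp, tag, _ = entry
    # unicodedata.decomposition() returns "" when UnicodeData.txt has no mapping.
    if unicodedata.decomposition(chr(cp)):
        return "has UnicodeData decomposition"
    if tag == "<sort>":
        return "added, tagged <sort>"
    if tag == "":
        return "added, untagged"
    return "added, other tag"

# The two entries quoted in the original question.
SAMPLES = [
    "00F8;;006F 0338 # LATIN SMALL LETTER O WITH STROKE",
    "0142;;006C 0335 # LATIN SMALL LETTER L WITH STROKE",
]

for sample in SAMPLES:
    print(sample.split(";")[0], classify(sample))
# prints:
# 00F8 added, untagged
# 0142 added, untagged
```

Both quoted entries come out as "added, untagged", which is the pattern the current decomps.txt documentation does not describe.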

