On Tue, Aug 12, 2025 at 9:09 AM Chao Li <li.evan.c...@gmail.com> wrote:

[bringing this back to the original thread]

> So, I compared 2000 ucm with 2005 ucm also compared 2005 ucm with 2022 ucm. 
> Then I found that some changed in 2005 is reverted in 2022, that why diff 
> between 2000 and 2022 is small. For example, the following mappings

Yes, this was mentioned in the "disruptive changes" document linked in
my first email in this thread:

"The 2005 edition included 6 characters with double mappings. The 2022
edition removes the double mappings.
The 2005 edition included 9 characters from the CJK Compatibility
Ideographs block. In Unicode/10646, these all have canonical
decomposition mappings to characters in the URO. In the 2022 edition,
these nine compatibility characters are removed."

> So, for how to create patch 2, I think we have 3 options:
>
> 1. As planned, update to the latest version of 2000 ucm, then skip 2005 and 
> directly upgrade to 2022 in patch 3. This way, we just honor 2000 ucm 
> regardless that the change is actually introduced by 2005.
>
> 2. Skip the latest version of 2000 ucm and upgrade to 2005 ucm. This way will 
> clearly show the upgrade path 2000->2005->2022. Downside is that 2005 
> introduced some changes that are reverted in 2022, which will cause some 
> unnecessary changes in map files.
>
> 3. Skip patch 2, directly go to patch 3. So that, patch 3 will include 
> changes introduced by both 2005 and 2022. This way makes minimum changes to 
> map files.

#3 is what I had in mind to begin with unless we found some reason not
to. Minimizing churn is a lucky side effect that reinforces that
choice.
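
For anyone who wants to check the churn argument themselves, here is a
minimal sketch. It assumes the usual ucm mapping-line layout
("<UXXXX> \xNN... |flag"); the helper names and the three tiny sample
editions are hypothetical placeholders, not taken from the real files.

```python
import re

# Matches a ucm mapping line such as "<U0041> \x41 |0" (assumed layout).
UCM_LINE = re.compile(
    r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)"
)


def parse_ucm(text):
    """Return the set of (code point, byte sequence, flag) mappings."""
    mappings = set()
    for line in text.splitlines():
        m = UCM_LINE.match(line.strip())
        if m:  # comments, headers and blank lines simply do not match
            mappings.add((int(m.group(1), 16), m.group(2).lower(), int(m.group(3))))
    return mappings


def churn(old, new):
    """Count mapping lines added or removed between two editions."""
    return len(old ^ new)


# Placeholder data: one mapping is added in "2005" and dropped in "2022".
ED_2000 = parse_ucm(r"""
<U0041> \x41 |0
<U0042> \x42 |0
""")
ED_2005 = parse_ucm(r"""
<U0041> \x41 |0
<U0042> \x42 |0
<U0043> \x43 |0
""")
ED_2022 = parse_ucm(r"""
<U0041> \x41 |0
<U0042> \x42 |0
""")

# Option 2 touches the reverted mapping twice; option 3 never sees it.
stepwise = churn(ED_2000, ED_2005) + churn(ED_2005, ED_2022)
direct = churn(ED_2000, ED_2022)
print(f"2000->2005->2022: {stepwise} changed lines, 2000->2022: {direct}")
```

Run against the real ucm files, the same comparison would show how much
of the 2005 delta is undone again by 2022.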

Before getting to that, I thought I'd bring this up to the community:

+# Copyright (C) 2000-2009, International Business Machines Corporation and others.
+# All Rights Reserved.

The previous XML file didn't contain a copyright notice -- does anyone
want to make a case for not checking unicode-org's source file into
our tree because of this? The 2022 update changes it to

# Copyright (C) 2016 and later: Unicode, Inc. and others.
# License & terms of use: http://www.unicode.org/copyright.html
# Copyright (C) 2000-2012, International Business Machines Corporation and others.
# All Rights Reserved.

...and the above links to https://www.unicode.org/license.txt

--
John Naylor
Amazon Web Services

