Hi Denis, Colleagues,

As you requested, the Database team will prepare a thorough investigation (impact analysis) of UTF-8 in the RIPE Database, as a starting point for further discussion.

Regards
Ed Shryane
RIPE NCC

> On 5 Aug 2020, at 14:46, [email protected] wrote:
>
> Colleagues
>
> We have a problem with UTF-8. Many people keep saying you want it, we should
> have it, let's do it... But every time we get to these difficult,
> non-technical questions everyone goes silent. This is why we have never
> implemented UTF-8 since it was first mentioned many years ago. No one in the
> community seems to know how to answer these questions.
>
> So I have a suggestion. The RIPE NCC has the manpower with the expertise to
> investigate these issues. I propose we put a task on the RIPE NCC to do a
> thorough investigation of UTF-8 in the RIPE Database from all possible angles
> and report back to the community. This can be a starting point to a more
> meaningful discussion.
>
> We need to know what impact having non-Latin1 characters in different parts
> of the data set will have on the RIPE Registry, the RIPE NCC members, the
> different user groups of the RIPE Database and the social, legal and
> political impact of such a change. Which parts of the data set
> can/should/shouldn't be allowed to be in other character sets? Who really
> needs access to this data and what parts of it need to be understandable or
> interpreted? Which does bring into question the whole purpose of the RIPE
> Database and the data contained therein.
>
> Thoughts???
>
> cheers
> denis
>
> co-chair DB-WG
>
> On Friday, 31 July 2020, 20:20:10 CEST, Leo Vegoda <[email protected]> wrote:
>
> Hi Denis,
>
> These are good questions. As so many of the answers lie with the RIPE
> NCC or the NRO, I suppose we need input from them to proceed further.
>
> Kind regards,
>
> Leo
>
> On Wed, Jul 29, 2020 at 3:47 PM [email protected] <[email protected]> wrote:
> >
> > Hi Leo
> >
> > Some of the questions that need to be answered include:
> >
> > - who needs to be able to read/understand/interpret which parts of the data
> >   in the RIPE Database (maybe both the community and the NCC need input to
> >   answer this)?
> > - is any of the data contained in the RIPE Database essential for the
> >   operation of the registry and not duplicated anywhere else (maybe the NCC
> >   and the NCC Services WG need input to answer this)?
> > - is any of the data important to LEAs and governments, is that a
> >   consideration, do they have the resources to understand the data in any
> >   format (community and LEAs input needed for this one)?
> > - One of the mission statements of the NRO is "Providing and promoting a
> >   coordinated Internet number registry system" so if we are going to
> >   internationalise the public face of the registry should it be
> >   coordinated (is that a community, RIR or NRO question)?
> >
> > cheers
> > denis
> >
> > co-chair DB-WG
> >
> > On Wednesday, 29 July 2020, 21:09:55 CEST, Leo Vegoda <[email protected]> wrote:
> >
> > Hi Denis,
> >
> > I agree that this is a registry issue and not just a database issue,
> > which is why I sent the message I did on 8 July.
> >
> > I'd like to understand how much of this work should be led by the RIPE
> > NCC versus the community. Also, because of the breadth of the issues,
> > should the discussion be here or on another list?
> >
> > Kind regards,
> >
> > Leo Vegoda
> >
> > On Wed, Jul 29, 2020 at 10:45 AM [email protected] <[email protected]> wrote:
> > >
> > > Hi Leo
> > >
> > > As I have said many times, internationalising the RIPE Database is not a
> > > technical issue, it is a registry issue.
> > > I think it does need a separate process from the database
> > > requirements. Especially if we consider it as a cross registry issue.
> > >
> > > Incidentally I did suggest on this mailing list several months ago that
> > > the requirements task force considers the issue of UTF-8. No one from the
> > > task force has yet replied to me on that or any other comment I have made
> > > about the requirements.
> > >
> > > cheers
> > > denis
> > >
> > > co-chair DB-WG
> > >
> > > On Wednesday, 29 July 2020, 18:20:14 CEST, Leo Vegoda <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > Thanks for providing the impact analysis for this initial change.
> > >
> > > What should the process be for introducing greater support for
> > > internationalization in the RIPE Database? George, Cynthia and others
> > > have made good points about the need to improve internationalization
> > > of more than just e-mail addresses. Is that support something that
> > > should be handled through the process that follows the final report of
> > > the Database TF or does it need to be addressed separately?
> > >
> > > Thanks,
> > >
> > > Leo
> > >
> > > On Wed, Jul 29, 2020 at 8:03 AM Edward Shryane via db-wg <[email protected]> wrote:
> > > >
> > > > Dear Colleagues,
> > > >
> > > > Here is the impact analysis for the NWI-11 implementation.
> > > >
> > > > The Database team plans to implement NWI-11 as per the Solution
> > > > Definition:
> > > > https://www.ripe.net/ripe/mail/archives/db-wg/2020-June/006525.html
> > > >
> > > > (1) Impact to Whois Update
> > > >
> > > > The implementation will automatically apply Punycode encoding (as per
> > > > RFC 5891) to the domain part of an email address during Whois update.
> > > >
> > > > The encoding is only applied to an IDN domain name, and changes the
> > > > current behaviour as follows:
> > > > - ASCII encoded values will not be affected (as before).
> > > > - Non-ASCII but latin-1 encoded values will be encoded as Punycode.
> > > > - Non-latin-1 encoded values (e.g. UTF-8) will also be encoded as
> > > >   Punycode. These values previously were transformed to latin-1, with a
> > > >   '?' substitution.
> > > >
> > > > The local part of an email address must only contain ASCII characters.
> > > > If non-ASCII characters are found in the local part, the address is
> > > > rejected as invalid.
> > > >
> > > > This change will only affect attributes with an email address syntax
> > > > (i.e. abuse-mailbox, e-mail, irt-nfy, mnt-nfy, notify, ref-nfy, upd-to).
> > > >
> > > > If an email address is converted to Punycode, a warning will be
> > > > included in the update response.
> > > >
> > > > Any Punycode conversion failure will result in the attribute value
> > > > being rejected as invalid. A workaround in this case is to encode the
> > > > value before submitting the update.
> > > >
> > > > (2) Impact to Whois Query
> > > >
> > > > When querying the RIPE database, any Punycode encoded email address is
> > > > returned as-is (i.e. it is not decoded).
> > > >
> > > > (3) Impact to Existing Data
> > > >
> > > > We will perform a cleanup to convert any existing non-ASCII (but
> > > > latin-1 encoded) IDN domain names to Punycode in attributes with an
> > > > email address syntax. This affects very few objects. The maintainer(s)
> > > > will be notified by email beforehand.
> > > >
> > > > (4) Impact to Whois Documentation
> > > >
> > > > We will update the database documentation with details of this
> > > > behaviour change.
> > > >
> > > > (5) Release Timeline
> > > >
> > > > We expect the NWI-11 implementation to take about 1 month (including
> > > > code changes and testing), and will include the feature in the Whois
> > > > 1.98 release.
> > > >
> > > > As usual, we will deploy the release to the Release Candidate
> > > > environment for 2 weeks before production, to allow for testing.
> > > >
> > > > Regards
> > > > Ed Shryane
> > > > RIPE NCC
> > > >
> > > > On 23 Jul 2020, at 12:00, [email protected] wrote:
> > > >
> > > > Hi Ed
> > > >
> > > > The chairs see there is a consensus to move forward with implementing
> > > > Punycode. Can you present an impact analysis explaining what changes
> > > > you propose, what effect those changes will have on updates and queries
> > > > (by all the different methods), and whether anyone needs to modify
> > > > their software interacting with the database?
> > > >
> > > > cheers
> > > > denis
> > > >
> > > > co-chair DB-WG
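
For readers who want to see the update rules from section (1) of the NWI-11 impact analysis in concrete form, here is a minimal Python sketch. It is not the Whois implementation. The function name `normalise_email` and its return shape are hypothetical, and Python's built-in "idna" codec follows the older IDNA 2003 rules rather than RFC 5891, so edge cases may differ from what the Database team deploys.

```python
from typing import Tuple


def normalise_email(address: str) -> Tuple[str, bool]:
    """Return (address, converted) following the rules in the impact analysis.

    Hypothetical helper for illustration only. `converted` is True when the
    domain part was rewritten to Punycode, which is the case where the
    analysis says a warning is added to the update response.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)

    # The local part must contain only ASCII characters; otherwise reject.
    if not local.isascii():
        raise ValueError("non-ASCII local part: %r" % local)

    try:
        # Assumption: the stdlib codec (IDNA 2003) stands in for RFC 5891.
        # Pure ASCII domains pass through this call unchanged.
        encoded = domain.encode("idna").decode("ascii")
    except UnicodeError as exc:
        # Any conversion failure means the value is rejected as invalid.
        raise ValueError("cannot encode domain: %r" % domain) from exc

    return local + "@" + encoded, encoded != domain


if __name__ == "__main__":
    print(normalise_email("user@example.com"))
    print(normalise_email("user@bücher.example"))
```

An ASCII address comes back untouched with the flag set to False, while an IDN domain such as bücher.example comes back as xn--bcher-kva.example with the flag set to True. Submitting the already encoded form is the workaround the analysis mentions for conversion failures.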
