Re: [db-wg] To internationalise or not, that is the question?

Edward Shryane via db-wg Thu, 06 Aug 2020 01:05:08 -0700

Hi Denis, Colleagues,

As you requested, the Database team will prepare a thorough investigation 
(impact analysis) of UTF-8 in the RIPE Database, as a starting point for 
further discussion.


Regards
Ed Shryane
RIPE NCC


> On 5 Aug 2020, at 14:46, [email protected] wrote:
> 
> Colleagues
> 
> We have a problem with UTF-8. Many people keep saying you want it, we should 
> have it, let's do it... But every time we get to these difficult, non-technical 
> questions everyone goes silent. This is why we have never implemented UTF-8 
> since it was first mentioned many years ago. No one in the community seems to 
> know how to answer these questions.
> 
> So I have a suggestion. The RIPE NCC has the manpower with the expertise to 
> investigate these issues. I propose we put a task on the RIPE NCC to do a 
> thorough investigation of UTF-8 in the RIPE Database from all possible angles 
> and report back to the community. This can be a starting point to a more 
> meaningful discussion.
> 
> We need to know what impact having non-Latin-1 characters in different parts 
> of the data set will have on the RIPE Registry, the RIPE NCC members and the 
> different user groups of the RIPE Database, and what the social, legal and 
> political impact of such a change would be. Which parts of the data set 
> can/should/shouldn't be allowed to be in other character sets? Who really 
> needs access to this data, and what parts of it need to be understandable or 
> interpreted? This brings into question the whole purpose of the RIPE 
> Database and the data contained therein.
> 
> Thoughts???
> 
> cheers
> denis
> 
> co-chair DB-WG
> 
> 
> 
> On Friday, 31 July 2020, 20:20:10 CEST, Leo Vegoda <[email protected]> wrote:
> 
> 
> Hi Denis,
> 
> These are good questions. As so many of the answers lie with the RIPE
> NCC or the NRO, I suppose we need input from them to proceed further.
> 
> Kind regards,
> 
> Leo
> 
> On Wed, Jul 29, 2020 at 3:47 PM [email protected] wrote:
> >
> > Hi Leo
> >
> > Some of the questions that need to be answered include:
> >
> > - Who needs to be able to read/understand/interpret which parts of the data 
> > in the RIPE Database (maybe both the community and the NCC need input to 
> > answer this)?
> > - Is any of the data contained in the RIPE Database essential for the 
> > operation of the registry and not duplicated anywhere else (maybe the NCC 
> > and the NCC Services WG need input to answer this)?
> > - Is any of the data important to LEAs and governments, is that a 
> > consideration, and do they have the resources to understand the data in any 
> > format (community and LEAs input needed for this one)?
> > - One of the mission statements of the NRO is "Providing and promoting a 
> > coordinated Internet number registry system", so if we are going to 
> > internationalise the public face of the registry, should it be 
> > coordinated (is that a community, RIR or NRO question)?
> >
> > cheers
> > denis
> >
> > co-chair DB-WG
> >
> > On Wednesday, 29 July 2020, 21:09:55 CEST, Leo Vegoda <[email protected]> wrote:
> >
> >
> > Hi Denis,
> >
> > I agree that this is a registry issue and not just a database issue,
> > which is why I sent the message I did on 8 July.
> >
> > I'd like to understand how much of this work should be led by the RIPE
> > NCC versus the community. Also, because of the breadth of the issues,
> > should the discussion be here or on another list?
> >
> > Kind regards,
> >
> > Leo Vegoda
> >
> > On Wed, Jul 29, 2020 at 10:45 AM [email protected] wrote:
> > >
> > > Hi Leo
> > >
> > > As I have said many times, internationalising the RIPE Database is not a 
> > > technical issue, it is a registry issue. I think it does need a separate 
> > > process from the database requirements, especially if we consider it a 
> > > cross-registry issue.
> > >
> > > Incidentally, I did suggest on this mailing list several months ago that 
> > > the requirements task force consider the issue of UTF-8. No one from the 
> > > task force has yet replied to me on that or any other comment I have made 
> > > about the requirements.
> > >
> > > cheers
> > > denis
> > >
> > > co-chair DB-WG
> > >
> > > On Wednesday, 29 July 2020, 18:20:14 CEST, Leo Vegoda <[email protected]> wrote:
> > >
> > >
> > > Hi,
> > >
> > > Thanks for providing the impact analysis for this initial change.
> > >
> > > What should the process be for introducing greater support for
> > > internationalization in the RIPE Database? George, Cynthia and others
> > > have made good points about the need to improve internationalization
> > > of more than just e-mail addresses. Is that support something that
> > > should be handled through the process that follows the final report of
> > > the Database TF or does it need to be addressed separately?
> > >
> > > Thanks,
> > >
> > > Leo
> > >
> > > On Wed, Jul 29, 2020 at 8:03 AM Edward Shryane via db-wg <[email protected]> wrote:
> > > >
> > > > Dear Colleagues,
> > > >
> > > > Here is the impact analysis for the NWI-11 implementation.
> > > >
> > > > The Database team plans to implement NWI-11 as per the Solution 
> > > > Definition:
> > > > https://www.ripe.net/ripe/mail/archives/db-wg/2020-June/006525.html
> > > >
> > > > (1) Impact to Whois Update
> > > >
> > > > The implementation will automatically apply Punycode encoding (as per 
> > > > RFC 5891) to the domain part of an email address during Whois update.
> > > >
> > > > The encoding is only applied to an IDN domain name, and changes the 
> > > > current behaviour as follows:
> > > > - ASCII encoded values will not be affected (as before).
> > > > - Non-ASCII but latin-1 encoded values will be encoded as Punycode.
> > > > - Non-latin-1 encoded values (e.g. UTF-8) will also be encoded as 
> > > > Punycode. These values previously were transformed to latin-1, with a 
> > > > '?' substitution.
> > > >
> > > > The local part of an email address must only contain ASCII characters. 
> > > > If non-ASCII characters are found in the local part, the address is 
> > > > rejected as invalid.
> > > >
> > > > This change will only affect attributes with an email address syntax 
> > > > (i.e. abuse-mailbox, e-mail, irt-nfy, mnt-nfy, notify, ref-nfy, upd-to).
> > > >
> > > > If an email address is converted to Punycode, a warning will be 
> > > > included in the update response.
> > > >
> > > > Any Punycode conversion failure will result in the attribute value 
> > > > being rejected as invalid. A workaround in this case is to encode the 
> > > > value before submitting the update.
> > > >
> > > > (2) Impact to Whois Query
> > > >
> > > > When querying the RIPE Database, any Punycode encoded email address is 
> > > > returned as-is (i.e. it is not decoded).
> > > >
> > > > (3) Impact to Existing Data
> > > >
> > > > We will perform a cleanup to convert any existing non-ASCII (but 
> > > > latin-1 encoded) IDN domain names to Punycode in attributes with an 
> > > > email address syntax. This affects very few objects. The maintainer(s) 
> > > > will be notified by email beforehand.
> > > >
> > > > (4) Impact to Whois Documentation
> > > >
> > > > We will update the database documentation with details of this 
> > > > behaviour change.
> > > >
> > > > (5) Release Timeline
> > > >
> > > > We expect the NWI-11 implementation to take about 1 month (including 
> > > > code changes and testing), and will include the feature in the Whois 
> > > > 1.98 release.
> > > >
> > > > As usual, we will deploy the release to the Release Candidate 
> > > > environment for 2 weeks before production, to allow for testing.
> > > >
> > > > Regards
> > > > Ed Shryane
> > > > RIPE NCC
> > > >
> > > >
> > > >
> > > > On 23 Jul 2020, at 12:00, [email protected] wrote:
> > > >
> > > > Hi Ed
> > > >
> > > > The chairs see there is a consensus to move forward with implementing 
> > > > Punycode. Can you present an impact analysis explaining what changes 
> > > > you propose, what effect those changes will have on updates and queries 
> > > > (by all the different methods), and whether anyone needs to modify their 
> > > > software interacting with the database?
> > > >
> > > > cheers
> > > > denis
> > > >
> > > > co-chair DB-WG
> > > >
> > > >
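
The update rules in section (1) of the quoted NWI-11 impact analysis, and the
as-is query behaviour in section (2), can be sketched roughly as follows. This
is an illustration only, not the Whois implementation: the function names
normalise_email and display_form are hypothetical, and Python's built-in
"idna" codec follows the older IDNA 2003 rules rather than RFC 5891, so
results may differ for some edge-case labels.

```python
from typing import Tuple


def normalise_email(address: str) -> Tuple[str, bool]:
    """Return (address, converted) following the rules in the impact analysis.

    Hypothetical helper. Uses the stdlib "idna" codec (IDNA 2003), which is
    an assumption made here to keep the sketch self-contained.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # The local part must be ASCII; otherwise the value is rejected.
    if not local.isascii():
        raise ValueError(f"non-ASCII local part: {address!r}")
    # ASCII-only domains are left untouched.
    if domain.isascii():
        return address, False
    # Non-ASCII domains are Punycode-encoded label by label.
    try:
        encoded = domain.encode("idna").decode("ascii")
    except UnicodeError as exc:
        # Any conversion failure rejects the attribute value.
        raise ValueError(f"Punycode conversion failed: {address!r}") from exc
    # converted=True is where an update response would carry a warning.
    return f"{local}@{encoded}", True


def display_form(address: str) -> str:
    """Decode a Punycode domain for display on the client side.

    Queries return the stored value as-is, so decoding is left to the client.
    """
    local, _, domain = address.rpartition("@")
    return f"{local}@{domain.encode('ascii').decode('idna')}"


print(normalise_email("user@example.net"))     # ('user@example.net', False)
print(normalise_email("user@bücher.example"))  # ('user@xn--bcher-kva.example', True)
print(display_form("user@xn--bcher-kva.example"))  # user@bücher.example
```

A client that already submits the Punycode form passes through unchanged,
which matches the workaround mentioned for conversion failures.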
