Hi Sasha,

> On 25 Nov 2023, at 13:14, Sasha Romijn <[email protected]> wrote:
>
> Hi,
>
> I like this proposal - people should be able to include their names
> accurately in the database. I know there’s an RFC that says everything
> should be ASCII, but I don’t think many implementations have followed that in
> the last decade.
>
> On 24 Nov 2023, at 10:21, Job Snijders via db-wg <[email protected]> wrote:
>> Have the effects of LATIN-1 on downstream applications such as NRTM v3
>> and NRTM v4 been considered?
>
> As far as I know, NRTMv3 has no defined encoding, but, speaking from memory,
> IRRDv4 does a best effort to decode it as UTF-8. There are encoding errors as
> a result, but as they occur in few fields that have loose syntax anyways, the
> impact is small. RIPE already limits personal data anyways, so not sure how
> much of this would be included.
>
> NRTMv4 is explicitly UTF-8, so a LATIN-1 database has to transcode (if that’s
> the term). I haven’t checked whether the current RIPE db implementation does
> this.
>
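To make the transcoding step concrete, here is a minimal Python sketch. It is only an illustration: the function names are hypothetical, and it is not taken from the RIPE database or IRRd code. It shows what converting Latin-1 bytes to UTF-8 looks like, and what one possible "best effort" UTF-8 decode does when it is handed Latin-1 bytes instead.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Transcode Latin-1 (ISO-8859-1) bytes to UTF-8 bytes."""
    # Every byte value maps to a character in Latin-1, so decode() cannot fail.
    return raw.decode("latin-1").encode("utf-8")


def best_effort_utf8(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for any invalid sequence.

    Assumption: this is just one way a client could do a best-effort decode.
    """
    return raw.decode("utf-8", errors="replace")


latin1_bytes = "ä".encode("latin-1")        # the single byte 0xE4
utf8_bytes = latin1_to_utf8(latin1_bytes)   # the two bytes 0xC3 0xA4

print(latin1_bytes.hex(), utf8_bytes.hex())
print(repr(best_effort_utf8(latin1_bytes)))
```

Without the transcoding step, a UTF-8 consumer sees the lone byte 0xE4 as an incomplete sequence, which is the kind of encoding error described for NRTMv3 clients.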

There are not many non-ASCII characters in the snapshot, as the "descr:" and
"remarks:" attributes are dummified. I found an example of an a-umlaut (ä) in
the PGPKEY-AC7C8A10 object, which is stored as Latin-1 in the database and is
correctly encoded in the NRTMv4 snapshot as the UTF-8 bytes 0xC3 0xA4.

Regards
Ed Shryane
RIPE NCC
