Frederik Ramm wrote:
> Hi,
>
>> Any idea what the user name should be? I find it hard to believe that
>> user="jos??????¯®????" (from the API) is correct.
>
> Well on 05 December I did have a problem with the planet diff, quoting
> from old E-Mail:
>
> latest daily planet diff has an UTF-8 problem on line 58267:
> <node id="25254929" timestamp="2007-12-04T17:26:52Z" user="josé" ...
> Seems like the user names don't get encoded properly.
>
> <<<<<<
>
> Username looks conspicuously similar ;)

I remember that email, I was hoping the problem would magically disappear ;-)
Checking the history of that node from the API again gives user="jos逴巊 »H´" (hopefully this is coming through okay, it includes a bunch of Chinese-like characters). I'll check it out in more detail soon.

It does look like it should be user="josé", but given that the API is also returning "interesting" data, it sounds like there's a deeper problem somewhere. Either way, osmosis shouldn't be emitting invalid UTF-8, but fixing it may not be easy. It might have something to do with characters that can't be represented as a single 16-bit character.

If it does turn out to be a problem elsewhere, I can try to put a hack in place to at least emit valid UTF-8, but it will require me doing some more reading of the Unicode standards, which I'm not excited about :-)
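
As a rough illustration of what "at least emit valid UTF-8" could mean, here is a minimal Python sketch. It is not osmosis code (osmosis is written in Java), and the function name sanitize_utf8 is made up for this example. The assumption is that the bad data arrives as raw bytes: valid input is passed through untouched, and anything that fails strict decoding has each invalid sequence replaced with U+FFFD so the output always decodes cleanly.

```python
def sanitize_utf8(raw: bytes) -> bytes:
    """Return raw unchanged if it is valid UTF-8; otherwise replace
    every undecodable byte sequence with U+FFFD and re-encode."""
    try:
        raw.decode("utf-8")  # strict decoding raises on invalid input
        return raw
    except UnicodeDecodeError:
        # errors="replace" substitutes U+FFFD for each invalid sequence
        return raw.decode("utf-8", errors="replace").encode("utf-8")


# "josé" encoded correctly as UTF-8 passes through untouched.
good = "josé".encode("utf-8")

# The same name encoded as Latin-1 is not valid UTF-8 (a lone 0xE9 byte).
bad = "josé".encode("latin-1")

clean_good = sanitize_utf8(good)
clean_bad = sanitize_utf8(bad)
```

This only guarantees well-formed output. It cannot recover the original user name, so it would be a stopgap rather than a fix for whatever is corrupting the data upstream.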

