Re: [OSM-dev] way 27483626 UTF-8 truncation
On Sun, Oct 5, 2008 at 2:12 AM, Matt Amos <[EMAIL PROTECTED]> wrote:
> On Sat, Oct 4, 2008 at 9:36 AM, Florian Lohoff <[EMAIL PROTECTED]> wrote:
> > To get the ROMA database in sync again i replaced the notes by
> > "broken-utf8" - As notes typically do not get rendered that's not a
> > problem for me though. ROMA was down for half a day before i discovered
> > the broken files and fixed them ...
>
> likewise. the easiest way to fix them was hand-editing the change
> files. i don't find it to be particularly onerous - just the price we
> pay for being on the bleeding edge ;-)

The corrupted data in the db has been fixed by TomH and I've re-generated the changeset files. Unfortunately it's too late for you guys now ...

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
Re: [OSM-dev] way 27483626 UTF-8 truncation
On Sat, Oct 4, 2008 at 9:36 AM, Florian Lohoff <[EMAIL PROTECTED]> wrote:
> To get the ROMA database in sync again i replaced the notes by
> "broken-utf8" - As notes typically do not get rendered that's not a
> problem for me though. ROMA was down for half a day before i discovered
> the broken files and fixed them ...

likewise. the easiest way to fix them was hand-editing the change files. i don't find it to be particularly onerous - just the price we pay for being on the bleeding edge ;-)

cheers,

matt
Re: [OSM-dev] way 27483626 UTF-8 truncation
On Sat, Oct 04, 2008 at 06:34:12PM +1000, Brett Henderson wrote:
> Subject: Re: [OSM-dev] way 27483626 UTF-8 truncation
>
> Florian Lohoff wrote:
> >On Sat, Oct 04, 2008 at 03:24:12PM +1000, Brett Henderson wrote:
> >
> >>>Another 2 change files contain utf-8 bugs and osmosis refuses to process
> >>>them:
> >>>
> >>>200810031022-200810031023.osc
> >>>200810031023-200810031024.osc
> >>>
> >>I've tested both of these files and they seem okay. The only problem I
> >>can find is way 27483626 which has a broken "note" tag in file
> >>2008100310-2008100311.osc. Are you sure these files are broken?
> >>
> >
> >wget -O -
> >http://planet.openstreetmap.org/minute/200810031022-200810031023.osc.gz |
> >gzip -d | iconv -f utf8 -t utf8
> >[...]
> >
> >http://planet.openstreetmap.org/minute/200810031023-200810031024.osc.gz |
> >gzip -d | iconv -f utf8 -t utf8
> >[...]
Re: [OSM-dev] way 27483626 UTF-8 truncation
Florian Lohoff wrote:
> On Sat, Oct 04, 2008 at 03:24:12PM +1000, Brett Henderson wrote:
>
>>> Another 2 change files contain utf-8 bugs and osmosis refuses to process
>>> them:
>>>
>>> 200810031022-200810031023.osc
>>> 200810031023-200810031024.osc
>>>
>> I've tested both of these files and they seem okay. The only problem I
>> can find is way 27483626 which has a broken "note" tag in file
>> 2008100310-2008100311.osc. Are you sure these files are broken?
>>
>
> wget -O -
> http://planet.openstreetmap.org/minute/200810031022-200810031023.osc.gz |
> gzip -d | iconv -f utf8 -t utf8
> [...]
>
> http://planet.openstreetmap.org/minute/200810031023-200810031024.osc.gz |
> gzip -d | iconv -f utf8 -t utf8
> [...]
Re: [OSM-dev] way 27483626 UTF-8 truncation
On Sat, Oct 04, 2008 at 03:24:12PM +1000, Brett Henderson wrote:
> >Another 2 change files contain utf-8 bugs and osmosis refuses to process
> >them:
> >
> >200810031022-200810031023.osc
> >200810031023-200810031024.osc
> >
> I've tested both of these files and they seem okay. The only problem I
> can find is way 27483626 which has a broken "note" tag in file
> 2008100310-2008100311.osc. Are you sure these files are broken?

wget -O - http://planet.openstreetmap.org/minute/200810031022-200810031023.osc.gz | gzip -d | iconv -f utf8 -t utf8
[...]

http://planet.openstreetmap.org/minute/200810031023-200810031024.osc.gz | gzip -d | iconv -f utf8 -t utf8
[...]
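The iconv pipeline above reports whether a change file is valid UTF-8. A minimal Python sketch of the same check follows; it also reports the byte offset of the first bad sequence. The function names are hypothetical and not part of osmosis or any OSM tool, and it assumes the gzipped file has already been downloaded into memory.

```python
import gzip


def first_invalid_utf8(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        # Strict decoding fails on the first malformed sequence, much like
        # `iconv -f utf8 -t utf8` stops at the first illegal input.
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def first_invalid_in_gzip(blob: bytes):
    """Hypothetical helper: decompress a gzipped .osc blob and check it."""
    return first_invalid_utf8(gzip.decompress(blob))
```

With a downloaded minute diff this could be called as `first_invalid_in_gzip(open(path, "rb").read())`. A result of `None` means the whole file decoded cleanly.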
Re: [OSM-dev] way 27483626 UTF-8 truncation
Brett Henderson wrote:
> Florian Lohoff wrote:
>> On Fri, Oct 03, 2008 at 01:36:31PM +0100, Matt Amos wrote:
>>
>>> Subject: [OSM-dev] way 27483626 UTF-8 truncation
>>>
>>> i just noticed that the hourly change file
>>> 2008100310-2008100311.osc.gz has an invalid UTF-8 string in the note
>>> tag for way 27483626 (
>>> http://www.openstreetmap.org/browse/way/27483626/history ). i have
>>> truncated it to the nearest word, so this email is just to give
>>> forewarning that hourly or daily diff imports today might have a bit
>>> of trouble.
>>>
>>> it's the same problem as discussed here
>>> http://lists.openstreetmap.org/pipermail/dev/2008-August/011525.html
>>>
>>
>> Another 2 change files contain utf-8 bugs and osmosis refuses to process
>> them:
>>
>> 200810031022-200810031023.osc
>> 200810031023-200810031024.osc
>>
> I've tested both of these files and they seem okay. The only problem
> I can find is way 27483626 which has a broken "note" tag in file
> 2008100310-2008100311.osc. Are you sure these files are broken?
>

I've sent an email to TomH asking if he can fix the problematic tag. If anybody has any other ideas on how to update the db let me know.

Brett
Re: [OSM-dev] way 27483626 UTF-8 truncation
Florian Lohoff wrote:
> On Fri, Oct 03, 2008 at 01:36:31PM +0100, Matt Amos wrote:
>
>> Subject: [OSM-dev] way 27483626 UTF-8 truncation
>>
>> i just noticed that the hourly change file
>> 2008100310-2008100311.osc.gz has an invalid UTF-8 string in the note
>> tag for way 27483626 (
>> http://www.openstreetmap.org/browse/way/27483626/history ). i have
>> truncated it to the nearest word, so this email is just to give
>> forewarning that hourly or daily diff imports today might have a bit
>> of trouble.
>>
>> it's the same problem as discussed here
>> http://lists.openstreetmap.org/pipermail/dev/2008-August/011525.html
>>
>
> Another 2 change files contain utf-8 bugs and osmosis refuses to process
> them:
>
> 200810031022-200810031023.osc
> 200810031023-200810031024.osc
>

I've tested both of these files and they seem okay. The only problem I can find is way 27483626 which has a broken "note" tag in file 2008100310-2008100311.osc. Are you sure these files are broken?
Re: [OSM-dev] way 27483626 UTF-8 truncation
Florian Lohoff wrote:
> On Fri, Oct 03, 2008 at 01:36:31PM +0100, Matt Amos wrote:
>
>> Subject: [OSM-dev] way 27483626 UTF-8 truncation
>>
>> i just noticed that the hourly change file
>> 2008100310-2008100311.osc.gz has an invalid UTF-8 string in the note
>> tag for way 27483626 (
>> http://www.openstreetmap.org/browse/way/27483626/history ). i have
>> truncated it to the nearest word, so this email is just to give
>> forewarning that hourly or daily diff imports today might have a bit
>> of trouble.
>>
>> it's the same problem as discussed here
>> http://lists.openstreetmap.org/pipermail/dev/2008-August/011525.html
>>
>
> Another 2 change files contain utf-8 bugs and osmosis refuses to process
> them:
>
> 200810031022-200810031023.osc
> 200810031023-200810031024.osc
>

Any idea which nodes or ways are broken in these?

This isn't an osmosis bug. The database now has incorrect/corrupted tag data in the history tables that needs to be corrected. Following the URL:
http://www.openstreetmap.org/browse/way/27483626/history
gives random results from the API. If we can identify the broken records we can ask TomH nicely to fix them. I can then move osmosis backwards in time to re-generate the affected time period.

I don't know how this broken data gets created in the first place. There was some discussion about this the last time it happened, I'll have to try to dig up the emails.

It's not simple to fix osmosis to prevent this occurring. Osmosis is reading doubly encoded data from the database and removing the double encoding as it writes to the xml file. It's a hack and there is no simple way of verifying the data before it gets written to the file. I have a local process running at home verifying the output which has detected the problem, but I was asleep at the time it occurred :-)
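The double encoding described above can be illustrated with a small Python sketch. It assumes, purely for illustration, that the intermediate encoding is Latin-1 and that the stored value was cut at a character count. The thread confirms neither, and Brett says the real cause is unknown. All function names are hypothetical and are not osmosis code.

```python
from typing import Optional


def double_encode(text: str) -> str:
    """Simulate double encoding: UTF-8 bytes misread as Latin-1 (assumption)."""
    return text.encode("utf-8").decode("latin-1")


def undo_double_encoding(stored: str) -> str:
    """Reverse the misreading and decode the recovered bytes as UTF-8."""
    return stored.encode("latin-1").decode("utf-8")


def safe_undo(stored: str) -> Optional[str]:
    """Return the repaired text, or None if the stored value is corrupt."""
    try:
        return undo_double_encoding(stored)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


stored = double_encode("caf\u00e9")  # the single "é" is stored as two characters
truncated = stored[:4]               # cutting between them splits the "é" bytes
```

Here `safe_undo(stored)` recovers the original text, while `safe_undo(truncated)` returns `None`, because the recovered bytes end in the middle of a multi-byte sequence. A check of this kind run before writing the XML would flag such a value, though wiring it into the real export path may be harder than this sketch suggests.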
Re: [OSM-dev] way 27483626 UTF-8 truncation
On Fri, Oct 03, 2008 at 01:36:31PM +0100, Matt Amos wrote:
> Subject: [OSM-dev] way 27483626 UTF-8 truncation
>
> i just noticed that the hourly change file
> 2008100310-2008100311.osc.gz has an invalid UTF-8 string in the note
> tag for way 27483626 (
> http://www.openstreetmap.org/browse/way/27483626/history ). i have
> truncated it to the nearest word, so this email is just to give
> forewarning that hourly or daily diff imports today might have a bit
> of trouble.
>
> it's the same problem as discussed here
> http://lists.openstreetmap.org/pipermail/dev/2008-August/011525.html

Another 2 change files contain utf-8 bugs and osmosis refuses to process them:

200810031022-200810031023.osc
200810031023-200810031024.osc

Flo
--
Florian Lohoff [EMAIL PROTECTED] +49-171-2280134
Those who would give up a little freedom to get a little security shall soon have neither - Benjamin Franklin
[OSM-dev] way 27483626 UTF-8 truncation
i just noticed that the hourly change file 2008100310-2008100311.osc.gz has an invalid UTF-8 string in the note tag for way 27483626 ( http://www.openstreetmap.org/browse/way/27483626/history ). i have truncated it to the nearest word, so this email is just to give forewarning that hourly or daily diff imports today might have a bit of trouble.

it's the same problem as discussed here
http://lists.openstreetmap.org/pipermail/dev/2008-August/011525.html

cheers,

matt
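Truncating a broken value "to the nearest word", as described in this message, could be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure Matt actually used: the function name is hypothetical and words are assumed to be separated by single spaces.

```python
def truncate_to_valid_word(raw: bytes) -> str:
    """Keep the valid UTF-8 prefix of raw, cut back to the last whole word."""
    try:
        return raw.decode("utf-8")  # nothing to repair
    except UnicodeDecodeError as err:
        # Everything before err.start is guaranteed to be valid UTF-8.
        valid = raw[:err.start].decode("utf-8")
    # Drop the trailing fragment after the last space, if there is one.
    head, sep, _ = valid.rpartition(" ")
    return head if sep else valid
```

For example, a value ending in a lone lead byte such as `b"nach K\xc3"` would be cut back to `"nach"`, which loses the partial last word but leaves a string that strict UTF-8 consumers can parse.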