On Mon, Dec 21, 2009 at 6:44 AM, Jon Burgess <jburgess...@googlemail.com> wrote:
> On Mon, 2009-12-21 at 01:08 -0500, Anthony wrote:
> > Cool. If anyone familiar with the planet dumper tool is listening...
> >
> > In
> > http://svn.openstreetmap.org/applications/utils/planet.osm/C/output_osm.c
> >
> >     } else if ((*in >= 0) && (*in < 32)) {
> >         escape_tmp[len] = '?';
> >         len++;
> >
> > should be something like
> >
> >     } else if ((*in > 0) && (*in < 32)) {
> >         len+=sprintf(&escape_tmp[len], "&#%d;", *in);
> >
> > "Something like" as in I haven't even checked if that compiles :).
>
> Most of the control characters are not allowed in a valid XML file. It
> makes no difference whether they are present as an ASCII character or as
> the equivalent entity.

Ah yes. Hmm. That said, most of the characters actually in the database are carriage returns, which along with tabs and line feeds (also in the db) are valid in XML. Other characters are present - for instance ASCII 3 in http://www.openstreetmap.org/browse/changeset/1325382 - those will be more of a problem.

Hopefully the database can be cleaned of the rest of the characters, because I'd imagine each dumper is going to have a slightly different way of dealing with them. Until that's done, I guess there's no right answer.

> > Of course, another thing to consider is that 1024 bytes isn't enough
> > for the truly pathological cases. I think you need like 1531 or
> > something to handle that. Fixing this might be enough to properly
> > process the current db, though.
>
> How do you arrive at the 1531 number?

strlen("&quot;")*255+1

Not sure if that's the absolute longest encoded string. But 255 quotes makes a valid key/value, and the planet dumper would truncate it, right?

> > Any chance of adding num_changes?
>
> The current output reflects the same information as the /changeset API
> call. Do you think it should be there too?

Not as a bug, but as a feature request, I guess so.
It's more useful in the dumps than the API (you can use it to make sure you've got everything downloaded), but it'd be useful in the API as well, I suppose. It seems to be in the DB, so there shouldn't be a performance impact, right? I see it's mentioned on http://wiki.openstreetmap.org/wiki/.osm
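
For reference, here is a rough sketch of the escaping idea from the top of this mail, combined with the 1531-byte buffer. It is not the actual output_osm.c code: xml_escape_sketch, MAX_CHARS and ESCAPE_BUF_SIZE are names made up for illustration, and the 255-character limit is the assumption from the arithmetic above. Tab, line feed and carriage return are written as numeric character references; the other control characters still become '?', since they are not valid XML either way.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: a key or value holds at most 255 characters. */
#define MAX_CHARS 255
/* Longest replacement used below is "&quot;" (6 bytes), plus the NUL. */
#define ESCAPE_BUF_SIZE (MAX_CHARS * 6 + 1) /* = 1531 */

/* Hypothetical helper, not the real planet dumper function.
 * Escapes `in` into `out`, stops before overflowing `out_size` bytes,
 * and returns the number of bytes written (excluding the NUL). */
static size_t xml_escape_sketch(const char *in, char *out, size_t out_size)
{
    size_t len = 0;

    if (out_size == 0)
        return 0;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        char tmp[8];
        const char *rep = tmp;

        switch (c) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\t':
        case '\n':
        case '\r':
            /* Valid in XML 1.0; emitted as a numeric character reference. */
            snprintf(tmp, sizeof tmp, "&#%d;", c);
            break;
        default:
            /* Other control characters are not valid XML: keep the '?'. */
            tmp[0] = (c < 32) ? '?' : (char)c;
            tmp[1] = '\0';
            break;
        }

        size_t n = strlen(rep);
        if (len + n + 1 > out_size)
            break; /* truncate rather than overflow */
        memcpy(out + len, rep, n);
        len += n;
    }
    out[len] = '\0';
    return len;
}
```

With a buffer of ESCAPE_BUF_SIZE bytes, a value made of 255 double quotes fits without truncation in this sketch. Whether 6 bytes per character really is the worst case for the actual dumper is still the open question from above.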
_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev