Stefan de Konink wrote:
> As discussed before, it is possible to do a second-pass binary encoding
> with all strings in a distinct table, where the linked list can be
> recovered from storage as an array. This would make a significant
> difference for the tag keys alone.
>
> In this case all string fields can be converted to unsigned long fields;
> for now, 4G distinct fields seems enough :)
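
The second pass quoted above can be sketched roughly as follows. This is only an illustration in Python, not the actual encoder: the function names (build_string_table, pack_ids, unpack_ids) and the sample tag keys are hypothetical, and it assumes each string field becomes a 4-byte unsigned index into a table of distinct strings.

```python
import struct


def build_string_table(strings):
    """Return (table, ids): the distinct strings in first-seen order,
    and for every input string its index into that table."""
    index = {}   # string -> position in table
    table = []   # distinct strings, stored once
    ids = []     # one integer per original string field
    for s in strings:
        i = index.get(s)
        if i is None:
            i = len(table)
            index[s] = i
            table.append(s)
        ids.append(i)
    return table, ids


def pack_ids(ids):
    # Assumption: 4-byte little-endian unsigned ints, enough for
    # roughly 4G distinct strings.
    return struct.pack("<%dI" % len(ids), *ids)


def unpack_ids(blob):
    # Inverse of pack_ids: read the blob back as a list of indices.
    return list(struct.unpack("<%dI" % (len(blob) // 4), blob))


# Hypothetical tag keys; repeated keys share one table entry.
tags = ["highway", "name", "highway", "source", "name"]
table, ids = build_string_table(tags)
blob = pack_ids(ids)
restored = [table[i] for i in unpack_ids(blob)]
```

Each repeated string then costs a fixed 4 bytes in the main file, and its text is stored only once in the table.
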
I now have some more statistics.

The binary file: 418MB
The strings within the binary file: 224MB (\n terminated)
Number of lines: 29688795

This list deduplicated: 19MB
Number of lines: 2087179

So with some quick calculations:
418 - 224 + 90 + 19 =~ 303MB

...now it would be nice to see how these values work out on the full planet :)

Nevertheless, 300MB of binary data directly usable in any application, plus an index generated on demand, doesn't sound bad for an entire country.

Stefan

