On Fri, Feb 29, 2008 at 9:23 AM, Dave Stubbs <[EMAIL PROTECTED]> wrote:
> On Thu, Feb 28, 2008 at 10:47 PM, Jason Reid <[EMAIL PROTECTED]> wrote:
> >
> > David Earl wrote:
> > > How feasible would it be to put a set of attributes either on the top
> > > level element or an element created for the purpose which tells me how
> > > many nodes, ways and relations there are in the file. If you have the
> > > counts to hand at the beginning, great, but if not if you wrote '...
> > > nodecount="000000000000" waycount="000000000000"
> > > relationcount="000000000000"' at the beginning, and then when you've
> > > output the elements and counted them up as you do it, at the end seek
> > > back and replace the zeros with the counts.
> > >
> > > This would enable me and others to do progress reporting on making a
> > > pass through the file. (I can't do it by file size and read position
> > > because the filesize function won't go bigger than 2Gb in PHP, and I
> > > can't count the elements before I start without completely decompressing
> > > the file first, which I no longer have enough free disk to do).
> > >
> > > David
> > >
> > > _______________________________________________
> > > dev mailing list
> > > [email protected]
> > > http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
> > >
> > There is the planet statistics script that I wrote a while back (in
> > python) that I need to get around to popping into SVN, it doesn't count
> > nodes or relations currently, only ways, but it wouldn't be hard to add
> > (plus it would give it something to do since 92.5% of the objects in the
> > dump are nodes and it currently scans over them silently). It could be
> > modified to sit in between the output of the planet script and gzip and
> > calculate as the file is being compressed (the script uses a stream
> > consuming parser to read stdin, in my uses piping from bzcat currently,
> > and could pass the stream back out stdout unmodified)
> >
>
> I think if we wanted counting it would be simpler to just add it to
> the C code rather than pipe through another application which actually
> has the same limitations (no knowledge of counts at the start, and no
> seek).
>
> The other possibility would be to write to a whole sequence of files,
> all compressed, and just tar the results with a stats meta file to
> make a single file for download... most processors could be modified
> to read tarballs quite easily, and if not you could untar them first -
> it would basically be an OSM Jar but with choice of compression. Just
> a random thought... I'm sure you can think of many holes.
>
> Don't forget there's also
> http://www.openstreetmap.org/stats/data_stats.html -- if you just want
> a rough guess at the number of nodes/ways and you are dealing with a
> recent planet, then you could just scrape that to get the numbers.

There's also http://osmxapi.hypercube.telascience.org/total.xml. This is
XML, so it may be easier to handle than data_stats.

> Dave
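
For what it's worth, here is a rough sketch of the placeholder idea David
describes above: write zero-padded count attributes first, count elements
while writing them, then seek back and overwrite the zeros. It is only an
illustration, not the planet dump code. The function name
write_with_counts and the (kind, xml_bytes) input format are made up for
the example, and it assumes a seekable output, so as Dave points out it
would not work when the output is piped straight into gzip.

```python
import io

WIDTH = 12  # twelve digits, as in the quoted proposal
KINDS = ("node", "way", "relation")


def write_with_counts(out, elements):
    """Write elements to a seekable binary stream, then patch in the counts.

    `elements` is an iterable of (kind, xml_bytes) pairs; this input format
    is a hypothetical stand-in for whatever the real dump code produces.
    """
    out.write(b"<osm")
    offsets = {}
    for kind in KINDS:
        out.write(b" " + kind.encode("ascii") + b'count="')
        offsets[kind] = out.tell()          # remember where the zeros start
        out.write(b"0" * WIDTH + b'"')      # fixed-width placeholder
    out.write(b">\n")

    counts = dict.fromkeys(KINDS, 0)
    for kind, xml in elements:
        out.write(xml + b"\n")
        counts[kind] += 1
    out.write(b"</osm>\n")

    end = out.tell()
    for kind in KINDS:
        out.seek(offsets[kind])
        # Same width as the placeholder, so nothing after it shifts.
        out.write(str(counts[kind]).zfill(WIDTH).encode("ascii"))
    out.seek(end)
    return counts


if __name__ == "__main__":
    buf = io.BytesIO()
    write_with_counts(buf, [
        ("node", b'<node id="1"/>'),
        ("node", b'<node id="2"/>'),
        ("way", b'<way id="3"/>'),
    ])
    print(buf.getvalue().decode("ascii").splitlines()[0])
```

A reader would then only need to parse the first line to get totals for
progress reporting, without knowing the file size.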

