Hello,

Hopefully this is my last question regarding v.generalize and speeding up the process.
Context: I have multiple years of data that need to be generalized. For each year, I need a number of different generalizations (specific number TBD).

Question: What would be the best way to do that in parallel? One mapset for each year? Can I run multiple v.generalize processes on the same input with different outputs?

My first thought was to run completely separate GRASS processes for each simplification, but I didn't find a way to make GRASS look somewhere other than .grass / .grass70 for its configuration files.

Thanks again,

F

-=--=-=-
Fábio Augusto Salve Dias
ICMC - USP
http://sites.google.com/site/fabiodias/

On Sun, Jan 11, 2015 at 8:32 PM, Markus Metz <[email protected]> wrote:
> On Sat, Jan 10, 2015 at 7:23 PM, Fábio Dias <[email protected]> wrote:
>>> I have optimized the GRASS vector library in trunk r64032 and added
>>> another topology check to v.generalize in trunk r64033. The profile of
>>> v.generalize now shows that it is limited by disk I/O speed (on my
>>> laptop with a standard laptop-like spinning HDD), which means that the
>>> algorithms are, under the test conditions, close to their optimum.
>>> This picture might change as soon as you use a high-performance server
>>> or an SSD.
>>
>>
>> Then I should do a profile on my current setup.
>
> I have updated v.generalize again in trunk r64067. Please test the
> latest version.
>
>>
>>> [...] the Terraclass
>>> shapefiles are full of errors. If you want to fix these errors, this
>>> will take some time.
>>
>> You know this dataset? The errors are really bugging me. It is mostly
>> due to the process/tools they usually use. We have passed over the
>> request for a more topologically correct approach. Maybe on the next
>> iteration. But I'll create another thread asking advice regarding
>> these errors shortly :)
>
> I know the Terraclass dataset a bit. I used some tiles for testing. I
> was not able to import any of my test tiles without errors (after
> years of thinking about the conversion of non-topological vectors to
> topological vectors). Terraclass data are based on PRODES data, which
> I know pretty well. The PRODES classification also comes as shapefiles
> which are also full of errors, but these I managed to remove by
> carefully choosing the snapping threshold for v.in.ogr.
>
>> By not previously dissolving and further doing v.clean tool=break the
>> original data, I've reduced the processing time from more than 30h for
>> 1% to 24h for 11%. With the latest release, 9% in 18h.
>
> 9% in 18h seems promising.
>
>>
>> However, this whole thing got me thinking about what you said in an
>> earlier message:
>>
>>> The check_topo function can not be executed in parallel because 1)
>>> topology must not be modified for several boundaries in parallel, 2)
>>> data are written to disk, and disk IO is by nature not parallel.
>>
>> Well, disk IO, there's not much we can do about it.
>
> We can here and there sometimes reduce disk IO (which I did in some of
> my recent changes).
>
> Markus M
