On Mon, Apr 22, 2013 at 11:03 AM, Mark Wynter <[email protected]> wrote:
> Thanks Markus.
> Tried the sqlite backend suggestion - no improvement - then read that
> sqlite is the default backend for GRASS 7.
> I suspect the complexity of the input dataset may be the contributing factor.
> For example, I ran v.clean over the already cleaned OSM dataset (2.6M lines),
> and it took only a few minutes since there were no intersections and no
> duplicates to remove.
I tested with an OSM road vector with 2.6M lines; the output has 5.3M
lines: lots of intersections and duplicates, which were cleaned in less
than 15 minutes. I am surprised that you experience slow removal of
duplicates; breaking lines should take much longer.

About why removing duplicates slows down toward the end: when you have
5 lines that could be duplicates, you could check

1 with 2, 3, 4, 5
2 with 1, 3, 4, 5
3 with 1, 2, 4, 5
4 with 1, 2, 3, 5
5 with 1, 2, 3, 4

or check each combination only once:

1 with 2, 3, 4, 5
2 with 3, 4, 5
3 with 4, 5
4 with 5

alternatively

2 with 1
3 with 1, 2
4 with 1, 2, 3
5 with 1, 2, 3, 4

The current implementation uses the latter.

Markus M

>>
>> Something is wrong there. Your dataset has 971074 roads; I tested with
>> an OSM dataset with 2645287 roads, 2.7 times as many as in your
>> dataset. Cleaning these 2645287 lines took me less than 15 minutes. I
>> suspect a slow database backend (dbf). Try to use sqlite as the
>> database backend:
>>
>> db.connect driver=sqlite \
>>   database=$GISDBASE/$LOCATION_NAME/$MAPSET/sqlite/sqlite.db
>>
>> Do not substitute the variables.
>>
>> HTH,
>>
>> Markus M

_______________________________________________
grass-user mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-user
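The "each combination only once, latter variant" scheme described above can be sketched in a few lines of Python. This is an illustration of the pairing pattern only, not the actual v.clean code: the string equality test stands in for a real geometry comparison, and the function name is made up for this example. Because line i is compared against all earlier lines 1..i-1, the per-line work grows as cleaning proceeds, which is why duplicate removal appears to slow down toward the end (n*(n-1)/2 comparisons in total for n distinct lines).

```python
def remove_duplicates(lines):
    """Keep only the first occurrence of each line.

    Each incoming line i is checked against all previously kept
    lines 1..i-1 (the triangular scheme: 2 with 1; 3 with 1, 2; ...).
    Returns the unique lines and the number of comparisons performed.
    """
    kept = []
    checks = 0
    for line in lines:
        is_dup = False
        for earlier in kept:      # i compared with 1..i-1
            checks += 1
            if line == earlier:   # stand-in for a geometry comparison
                is_dup = True
                break
        if not is_dup:
            kept.append(line)
    return kept, checks

unique, n_checks = remove_duplicates(["a", "b", "a", "c", "b"])
# unique is ["a", "b", "c"]; later lines cost more checks than earlier ones
```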
