Re: [GRASS-user] r.watershed speed-up

Markus Metz Tue, 29 Jul 2008 11:46:14 -0700

Dear Chuck,

r.watershed is a much-valued tool in GRASS, and for me the best watershed analysis tool anywhere, not only in GRASS; therefore I have also thought about a way to keep the results identical. I am also aware that the closer the results produced by changes in the algorithm are to the results produced by the original algorithm, the higher the chances that the changes will be accepted by the community.

With regard to your suggestion, I would not adjust DEM values, because in larger regions the minimum possible increment is already present in the data, i.e. there are no gaps in the data distribution that could be filled with adjusted values. One theoretical way out would be to read in DEMs as FCELL or DCELL, but then there is the floating point comparison problem. (I tried it against my better judgement; it doesn't work.)

Regarding the breadth first search, where do you see breadth first <when the DEM values are different>? You lost me there. I don't see differences in how points are searched between the two versions, but maybe I have not fully understood the original algorithm. As far as I have understood it, the list of astar_pts linked through astar_pts.nxt is kept in ascending order of elevation. If there are already points with equal elevation, the new point is inserted after all other points with the same elevation (line 91 in the original do_astar.c), so that the point inserted first (of several points with equal elevation) will be removed first (line 19 in the original do_astar.c). This is still the case in the new algorithm (insertion: line 136, removal: lines 31 and 192) most of the time. If the binary heap becomes fairly large and there are many points with equal elevation, there might be an exception. Please let me know if I got something wrong there!
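The first-in, first-out handling of equal elevations described above can be illustrated with a small sketch. This is not the code from do_astar.c: the function names, the toy points, and the use of Python's bisect and heapq modules are assumptions made purely for illustration. The first function mimics an ascending list in which a new point goes after all points of equal elevation; the second shows one well-known way to get the same removal order from a binary heap, by adding an insertion counter to the sort key (a technique swapped in here, not what the new r.watershed does).

```python
import bisect
import heapq
from itertools import count


def removal_order_sorted_list(points):
    """Ascending list; a new point is placed after all points of equal elevation."""
    elevations, names = [], []
    for name, elev in points:
        # bisect_right returns the position just past any existing ties
        i = bisect.bisect_right(elevations, elev)
        elevations.insert(i, elev)
        names.insert(i, name)
    # Removing from the front of the list yields exactly this order
    return names


def removal_order_counted_heap(points):
    """Binary heap keyed on (elevation, insertion counter); ties leave in insertion order."""
    seq = count()
    heap = []
    for name, elev in points:
        heapq.heappush(heap, (elev, next(seq), name))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Hypothetical points as (name, elevation); several share an elevation
points = [("a", 5), ("b", 3), ("c", 5), ("d", 3), ("e", 5), ("f", 4)]
list_order = removal_order_sorted_list(points)
heap_order = removal_order_counted_heap(points)
```

Both functions return the tied points in the order they were added. A binary heap keyed on elevation alone gives no such guarantee, which matches the occasional exception mentioned above.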

Another possibility to produce exactly the same results as the original version would be to go recursively down the heap and pick the point added earliest from all points with elevation equal to the root point. This is easy to implement, but it would have slowed down the search algorithm somewhat, and I wanted to get something lightning fast.
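A minimal sketch of that recursive descent, assuming an array-based min-heap whose entries are (elevation, added) pairs ordered by elevation only, where "added" is a hypothetical insertion counter. The function name and the sample heap are made up for illustration and are not taken from r.watershed. Because of the heap property, a point with the root's elevation can only sit below other points with that same elevation, so the search needs to follow only those branches.

```python
def earliest_among_root_ties(heap, i=0, root_elev=None):
    """Return the index of the earliest-added entry whose elevation equals the root's."""
    if root_elev is None:
        root_elev = heap[0][0]
    best = i
    for child in (2 * i + 1, 2 * i + 2):
        # Descend only into children that tie with the root elevation
        if child < len(heap) and heap[child][0] == root_elev:
            cand = earliest_among_root_ties(heap, child, root_elev)
            if heap[cand][1] < heap[best][1]:
                best = cand
    return best


# Hand-built heap of (elevation, added) pairs; valid with respect to elevation only
heap = [(3, 7), (3, 5), (4, 0), (3, 2), (5, 1), (4, 3), (6, 4)]
idx = earliest_among_root_ties(heap)
```

The sketch only locates the point. Removing it from the middle of the heap would need an extra sift step, and the search itself visits every tied point, which is the slowdown referred to above.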

I have one main argument why it is not a disaster if the results are not 100% identical: the order in which neighbouring cells are added is, in both versions and with respect to the focus cell:

low, up, left, right, upper right corner, lower left corner, lower right corner, upper left corner

This order is always kept, irrespective of the already established flow direction; thus it is a random order, and there is not really a reason why the algorithm should stick to it. I think a rare replacement of that random order (2% difference in flow direction in Moritz Lennert's test) with another random order (binary heap shuffling) is not a disaster, and the result is still valid. I did build in a check to make the results more similar, but there are still scenarios that this check doesn't catch.
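For reference, the fixed neighbour order listed above can be written as a table of (row, column) offsets. This is only a sketch: it assumes that rows increase downwards, so that "low" means row + 1, and the names NEIGHBOUR_ORDER and neighbours are hypothetical rather than taken from the r.watershed source.

```python
# Offsets as (row change, column change), in the order given in the text
NEIGHBOUR_ORDER = [
    (1, 0),    # low
    (-1, 0),   # up
    (0, -1),   # left
    (0, 1),    # right
    (-1, 1),   # upper right corner
    (1, -1),   # lower left corner
    (1, 1),    # lower right corner
    (-1, -1),  # upper left corner
]


def neighbours(row, col, nrows, ncols):
    """Neighbours of the focus cell inside the region, always in the same fixed order."""
    return [
        (row + dr, col + dc)
        for dr, dc in NEIGHBOUR_ORDER
        if 0 <= row + dr < nrows and 0 <= col + dc < ncols
    ]


centre = neighbours(1, 1, 3, 3)
corner = neighbours(0, 0, 3, 3)
```

The order does not depend on any flow direction already assigned to the focus cell, which is the point made above.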

So my main question to you, the original author of r.watershed, is whether a rare violation of the (in my opinion random) order in which neighbours are added to the list would cause the results to be no longer valid. The other question is whether I should now provide a version that really produces identical results, or whether I should first sort out the problem of how neighbours are (or should be) added to and removed from the list. BTW, I also tried to change the order in which neighbours are added to the list, taking into account the already established flow direction. It produces very straight lines in flat terrain, which is OK in hydrological terms, but some randomness looked better. Flat terrain in the DEM need not be flat in reality, because of problems with DEM resolution and accuracy; randomness produces more natural-looking results there.


Sorry for the long reply!

Regards,

Markus


Charles Ehlschlaeger wrote:

Dear Markus and other r.watershed enthusiasts,
I've thought of a way to make your faster version of r.watershed give
results identical to the older r.watershed. r.watershed.old uses a breadth
first search in section 2. r.watershed.fast is breadth first <when the DEM
values are different>. The trick would be to slightly adjust DEM values
according to when they were added to the heap. Here is some pseudo-code:
cellsAddedToHeap = 0

minCellIncrement = Double.MIN_VALUE  # (java) This constant represents
                                     # the smallest increment a CELL
                                     # can be, if the CELLs are doubles.

# At the time a cell is added to heap, the DEM value placed
# in the heap would be:

DemOfCellToHeap = DEM + (minCellIncrement * cellsAddedToHeap++)
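A short sketch of one floating point issue that an increment of this size can run into; it may or may not be the problem reported in the reply above. The elevation of 100.0 and the counter value 3 are made-up examples, and 5e-324 is used as the Python equivalent of Java's Double.MIN_VALUE (the smallest positive double). math.ulp requires Python 3.9 or later.

```python
import math

smallest = 5e-324            # smallest positive double, like Java's Double.MIN_VALUE
dem = 100.0                  # hypothetical elevation value
cells_added_to_heap = 3      # hypothetical counter value

adjusted = dem + smallest * cells_added_to_heap
absorbed = adjusted == dem   # the increment is lost when added to 100.0

# Smallest step that actually changes a double near this elevation
step = math.ulp(dem)
distinct = dem + step > dem
```

At typical elevation magnitudes the tiny increment is absorbed, so tied cells would still compare equal. An increment large enough to survive depends on the magnitude of each elevation value.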

Sincerely, chuck

_______________________________________________
grass-user mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-user
