>> Richard Barnes: >>In building tile/block managers, I too have found it difficult to use >>iterators or design algorithms without specifically considering both >>tiles/blocks >>and cells. Without doing so, it is very easy to write code >>which is (extremely) cache inefficient.
I think that is also Ari's main objection and a valid one too. It is possible to qualify "extremely" though: for pixel-by-pixel operations it means having to cache a whole row of blocks for each band, rather than a single block (or do some really expensive swapping). So in the worst case, I have blocks that are one full column of the band, then I need to cache the complete dataset. In the best case I have blocks that are a single row and I need to cache only a single block. My personal justification for going this route anyway is that I intend to do operations on multiple bands that may have different block sizes. Now imagine a design the works on the basis of blocks and that keeps blocks in cache until they are no longer needed. In the worst case I would have a combination of bands where one has full rows as blocks and the other has full columns as blocks; I would then need to cache the complete dataset. In the best case I would have two datasets with identical block sizes and I would need to cache only a single block per dataset. So, in both solutions there is a worst case that requires caching the whole dataset and a best case requiring only a single block per dataset. In both solutions the problem can be solved if you can control the block sizes. I can see that row-by-row will be problematic more often than block-by-block, but the extremes are the same. I think another problem people see with the row-by-row solution is that it does not parallelize well. However, I do not think that is a real problem because it is possible to rewrite a large row-by-row operation as a number of row-by-row operations on subsets (possibly blocks, but not necessarily so) of the raster data sets. >> I'm not sure if flow algebras have arisen in the discussion yet, but they >> come to mind when I think of raster algebras. >> They permit operations in >> which the values of "downstream" cells are functions of upstream cells. In >> such a case, efficient calculations are then driven both by blocking and >> >> by the data itself. In recent work, I've found that a number of flow algebra >> functions can be written by considering only one block at a time (link, >> >> link). I'm working on generalizing the concept now and can imagine it >> forming an easy way to quickly add general terrain analysis functionality. Ari has mentioned catchment delineation, so it certainly is an interest. The fill algorithm has been a computational bottleneck for me, so I will read your paper with interest too. And I would be highly interested in your generalized flow algebra, IMO there is great potential for generalized spatial analysis that sits somewhere between the hard-baked tools in GIS software and the low level raster data interface of, say, GDAL. _______________________________________________ gdal-dev mailing list [email protected] http://lists.osgeo.org/mailman/listinfo/gdal-dev
