>> Richard Barnes:
>>In building tile/block managers, I too have found it difficult to use 
>>iterators or design algorithms without specifically considering both 
>>tiles/blocks and cells. Without doing so, it is very easy to write code 
>>which is (extremely) cache inefficient.

I think that is also Ari's main objection, and a valid one too. It is possible
to qualify "extremely" though: for pixel-by-pixel operations it means having to
cache a whole row of blocks for each band, rather than a single block (or do
some really expensive swapping). In the worst case, each block is one full
column of the band, and I need to cache the complete dataset. In the best case,
each block is a single row, and I need to cache only a single block.
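
To put rough numbers on that, here is a minimal Python sketch of the
arithmetic. It is only an illustration: the function names are made up for this
post and are not part of GDAL or any other library, and it assumes blocks form
a regular grid over the band.

```python
import math

def blocks_per_block_row(raster_width, block_width):
    """Number of blocks of one band that a single pixel row crosses.

    These are the blocks that have to stay cached while iterating
    row-by-row through that band.
    """
    return math.ceil(raster_width / block_width)

def fraction_of_band_cached(raster_height, block_height):
    """Fraction of the band held in cache during row-by-row iteration.

    A full row of blocks spans the whole width, so the cached share of
    the band is just the block height over the raster height.
    """
    return min(block_height, raster_height) / raster_height

# Hypothetical 1000 x 800 band (width x height).
# Worst case: blocks are full columns (1 x 800), everything is cached.
worst = (blocks_per_block_row(1000, 1), fraction_of_band_cached(800, 800))
# Best case: blocks are full rows (1000 x 1), one block is cached.
best = (blocks_per_block_row(1000, 1000), fraction_of_band_cached(800, 1))
# Something in between: 256 x 256 tiles.
tiled = (blocks_per_block_row(1000, 256), fraction_of_band_cached(800, 256))
```

With these assumed sizes the full-column layout needs all 1000 blocks, the
full-row layout needs one, and 256 x 256 tiles need 4 blocks, or about a third
of the band.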

My personal justification for going this route anyway is that I intend to do 
operations on multiple bands that may have different block sizes. Now imagine a 
design that works on the basis of blocks and that keeps blocks in cache until 
they are no longer needed.  In the worst case I would have a combination of 
bands where one has full rows as blocks and the other has full columns as 
blocks; I would then need to cache the complete dataset. In the best case I 
would have two datasets with identical block sizes and I would need to cache 
only a single block per dataset.
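
The same kind of back-of-the-envelope check can be done for the block-based
design. The Python sketch below is my own toy model, not anybody's actual
implementation: it iterates over the blocks of one band (the "driver") in
row-major order, assumes every overlapping block of the other band stays in
cache from its first use to its last use, and reports the peak share of the
other band that is resident. All names are hypothetical.

```python
def block_grid(width, height, block_w, block_h):
    """Return block extents (x0, y0, x1, y1) in row-major order."""
    return [
        (x0, y0, min(x0 + block_w, width), min(y0 + block_h, height))
        for y0 in range(0, height, block_h)
        for x0 in range(0, width, block_w)
    ]

def overlaps(a, b):
    """True if two half-open extents share at least one cell."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def peak_cached_fraction(width, height, driver_block, other_block):
    """Peak fraction of the other band resident while walking driver blocks."""
    drivers = block_grid(width, height, *driver_block)
    others = block_grid(width, height, *other_block)

    # Record the first and last driver step at which each other-band
    # block is needed; it is assumed to stay cached in between.
    first_use, last_use = {}, {}
    for step, d in enumerate(drivers):
        for j, o in enumerate(others):
            if overlaps(d, o):
                first_use.setdefault(j, step)
                last_use[j] = step

    peak = 0
    for step in range(len(drivers)):
        resident = sum(
            (o[2] - o[0]) * (o[3] - o[1])
            for j, o in enumerate(others)
            if first_use[j] <= step <= last_use[j]
        )
        peak = max(peak, resident)
    return peak / (width * height)

# Hypothetical 8 x 6 raster (width x height).
# Full rows against full columns: the whole other band ends up cached.
rows_vs_columns = peak_cached_fraction(8, 6, (8, 1), (1, 6))
# Identical 4 x 3 blocks: only one block of the other band at a time.
identical = peak_cached_fraction(8, 6, (4, 3), (4, 3))
```

In this toy setup the rows-against-columns combination gives 1.0 (everything
cached) and identical block sizes give 0.25, which is exactly one of the four
blocks.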

So, in both solutions there is a worst case that requires caching the whole 
dataset and a best case requiring only a single block per dataset. In both 
solutions the problem can be solved if you can control the block sizes. I can 
see that row-by-row will be problematic more often than block-by-block, but the 
extremes are the same.

I think another problem people see with the row-by-row solution is that it does 
not parallelize well. However, I do not think that is a real problem because it 
is possible to rewrite a large row-by-row operation as a number of row-by-row 
operations on subsets (possibly blocks, but not necessarily so) of the raster 
data sets.
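
A minimal sketch of what I mean, in plain Python with nested lists standing in
for raster bands. The windowing scheme, the function names and the choice of a
thread pool are all assumptions made for illustration; in CPython the threads
will not give a real speed-up for pure-Python arithmetic, the point is only
that each subset is an independent row-by-row job.

```python
from concurrent.futures import ThreadPoolExecutor

def windows(width, height, win_w, win_h):
    """Yield (x0, y0, x1, y1) subsets that together cover the raster."""
    for y0 in range(0, height, win_h):
        for x0 in range(0, width, win_w):
            yield x0, y0, min(x0 + win_w, width), min(y0 + win_h, height)

def add_row_by_row(a, b, window):
    """Row-by-row sum of two bands, restricted to one window."""
    x0, y0, x1, y1 = window
    rows = [[a[y][x] + b[y][x] for x in range(x0, x1)] for y in range(y0, y1)]
    return window, rows

def add_in_windows(a, b, win_w, win_h, workers=4):
    """Run the row-by-row operation per window and stitch the results."""
    height, width = len(a), len(a[0])
    out = [[None] * width for _ in range(height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(add_row_by_row, a, b, w)
            for w in windows(width, height, win_w, win_h)
        ]
        for future in futures:
            (x0, y0, x1, _), rows = future.result()
            for dy, row in enumerate(rows):
                out[y0 + dy][x0:x1] = row
    return out

# Two small hypothetical bands, 5 columns by 4 rows.
band_a = [[x + 5 * y for x in range(5)] for y in range(4)]
band_b = [[1] * 5 for _ in range(4)]
result = add_in_windows(band_a, band_b, win_w=2, win_h=3)
```

The windows here do not line up with any block layout, which is the "not
necessarily so" part; choosing them to coincide with blocks would simply be the
cache-friendly special case.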

>> I'm not sure if flow algebras have arisen in the discussion yet, but they
>> come to mind when I think of raster algebras. They permit operations in
>> which the values of "downstream" cells are functions of upstream cells. In
>> such a case, efficient calculations are then driven both by blocking and
>> by the data itself. In recent work, I've found that a number of flow algebra
>> functions can be written by considering only one block at a time (link,
>> link). I'm working on generalizing the concept now and can imagine it
>> forming an easy way to quickly add general terrain analysis functionality.

Ari has mentioned catchment delineation, so it certainly is of interest. The
fill algorithm has been a computational bottleneck for me, so I will read your
paper with interest too. I would also be highly interested in your generalized
flow algebra. IMO there is great potential for generalized spatial analysis
that sits somewhere between the hard-baked tools in GIS software and the
low-level raster data interface of, say, GDAL.
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
