Hi Jeroen,

> Please help me to determine the case when a whole output image is
> needed. IMO input is readonly and output is writeonly. I don't see the
> need atm to support whole output images in a 'per output pixel'
> approach. And every 'per input pixel' approach can be written by a 'per
> output pixel' approach.

In the current nodes the two approaches are mixed. Another problem with the concept of pixel-to-pixel operations is that it tends to be implemented with a lot of overhead: for instance, having three frames on the call stack just to add two pixels, and that for every pixel in the buffer. It is really nasty, and it is why even adding two buffers together is rather inefficient at the moment. Another example would be the filter node, with its pixel_processors for convolution. If you really think about low-level efficiency, down to the level of single instructions, a lot could be done better at the moment.
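To make the overhead point concrete, here is a minimal sketch (not Blender's actual code; PixelOp, AddOp, and both functions are invented for illustration) contrasting a per-pixel operator invoked through a virtual call for every pixel with a single flat loop over the whole buffer:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-pixel operator interface: one virtual dispatch per pixel.
struct PixelOp {
    virtual float execute(float a, float b) const = 0;
    virtual ~PixelOp() = default;
};

// Adding two pixels costs several stack frames through the dispatch.
struct AddOp : PixelOp {
    float execute(float a, float b) const override { return a + b; }
};

// Per-pixel approach: the call overhead is paid for every element.
void add_per_pixel(const PixelOp &op, const std::vector<float> &a,
                   const std::vector<float> &b, std::vector<float> &out) {
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = op.execute(a[i], b[i]);
}

// Whole-buffer approach: one tight loop the compiler can inline and vectorize.
void add_buffer(const std::vector<float> &a, const std::vector<float> &b,
                std::vector<float> &out) {
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = a[i] + b[i];
}
```

Both produce identical results; only the second gives the compiler a chance to turn pixel addition into a handful of vector instructions.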
I also realize that the argument "it would work with the current compositor" is a strong one. But I have some problems with it. First of all, I think a compositor should in principle be able to support all image processing operations. It is a rather bad idea to be stuck with a very limited architecture that already requires a bunch of hacks to implement the functionality of current nodes, such as those doing convolution.

Another problem I see with tiling is that it is a spatial partitioning, so you are stuck in the spatial domain. But there are a lot of possibilities for working in the gradient and frequency domains, including speedups. You won't be able to convert a single tile to the gradient domain, because you can't determine the correct gradient at the tile borders. Working in the frequency domain runs into the same issue, again because of the spatial partitioning.

But back to the simple issue of operations that need full-buffer access. I agree that this could still be done with tiling, because you can simply compute all input tiles and access them when computing one single output tile. Is that roughly how it should work? At least the diagram in your document looks like this. Any other workarounds, like using overlapping tiles for the very special case of a 3x3 kernel convolution, are just hacks and will prevent the implementation of future nodes that have other non pixel-to-pixel operations.

One such future node could be tone mapping. This is a standard feature in Lux, for example, so I guess it's not that absurd to include such features in Blender's compositor. And some tone-mapping algorithms need to operate on the entire image.

In terms of memory usage, caching, etc., if we assume that only reasonably sized buffers are used, let's say up to 64 MB, I also don't see strong benefits in using tiles rather than buffers which hold the entire image.
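As an illustration of why global tone mapping resists tiling, here is a sketch of a Reinhard-style global operator (function names and the key value 0.18 are my own choices, not anything from the compositor): the per-pixel scale depends on the log-average luminance of the entire image, so no tile can be finished before every input pixel has been read.

```cpp
#include <cmath>
#include <vector>

// Full-image statistic: geometric mean of luminance (with a small epsilon
// so log(0) is avoided). This is the part that needs the whole buffer.
float log_average_luminance(const std::vector<float> &lum) {
    const float eps = 1e-4f;
    double sum = 0.0;
    for (float l : lum)
        sum += std::log(eps + l);
    return std::exp(static_cast<float>(sum / lum.size()));
}

// Global operator: every output pixel depends on the full-image average,
// so a tile-local pass cannot produce correct results on its own.
void tonemap_global(std::vector<float> &lum, float key = 0.18f) {
    const float l_avg = log_average_luminance(lum);  // pass over entire image
    for (float &l : lum) {
        const float scaled = key * l / l_avg;
        l = scaled / (1.0f + scaled);  // maps [0, inf) into [0, 1)
    }
}
```

With full-image buffers this is two trivial loops; with tiles you need an extra synchronization point between the statistics pass and the mapping pass.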
But maybe you have to be more specific about the caching scheme you want to use here.

aurel

_______________________________________________
Bf-committers mailing list
[email protected]
http://lists.blender.org/mailman/listinfo/bf-committers
