Hi, I'm working on some low-level pixel processing code (stuff that is not possible to do using standard API or on eg GPU), and I had lots of eg these:
for y in someStartY ..< someStopY { for x in someStartX ..< someStopX { ... pixels[x, y] ... } } So I implemented some (value) types like eg IntPoint2D, IntSize2D, IntRect2D and I made an IntRect2DIterator so that IntRect2D could be a Sequence over its (discrete) points. With this I could rewrite the above like so: for pixelPosAsIntPoint2D in someIntRect2D { ... pixels[pixelPosAsIntPoint2D] ... } For some reason the first version (two nested for loops for x and y) is always a bit faster than the abstracted version no matter how I write it (tried using eg &+ instead of + etc). Is it possible to write as a zero cost abstraction like this, if so, how? If not, why? PS Note that eg this: for y in someStartY ..< someStopY { for x in someStartX ..< someStopX { let pixelPosAsIntPoint2D = IntPoint2D(x: x, y: y) ... pixels[pixelPosAsIntPoint2D] ... } } is exactly as fast as the top example (using just ... pixels[x, y] ...). So the difference in execution time seems to be due to something in the Iterator and not eg the pixel accessing subscript taking the 2d int point type instead of separate x and y ints. Here is one Iterator variant that I have tested: struct Rect2DIntPointIterator : IteratorProtocol, Sequence { let startX, startY, stopX, stopY: Int var px, py: Int init(rect: Rect2DInt) { startX = rect.origin.x startY = rect.origin.y stopX = rect.maxX stopY = rect.maxY px = startX py = startY } mutating func next() -> Point2DInt? { defer { px = px &+ 1 } if px == stopX { px = startX py = py &+ 1 if py == stopY { return nil } } return Point2DInt(x: px, y: py) } } And here are typical execution times from an example test: 2.1 seconds using my Iterator (the fastest I can get it, no matter how I try to rewrite it). 1.5 seconds using nested x, y for loops. I'm pretty sure my testing is done thoroughly (measuring average of many runs, using random data, avoiding dead code elimination, whole module optimization, etc). I have tried profiling the code and looking at the disassmbly but I'm failing to understand what's going on. So the ultimate answer would be in the form of a (2d, Int) Rectangle type whose (2d, Int) Points can be iterated in a for loop, at zero cost compared to doing the same using two nested for loops. Or an explanation of why this is impossible. DS /Jens
_______________________________________________ swift-users mailing list swift-users@swift.org https://lists.swift.org/mailman/listinfo/swift-users