It's quite easy to speed it up, actually. Let's take a look at your transform
iterator:

proc flip(s: seq[string]): seq[string] =
  result = s # copy
  result[0] = s[^1]
  result[^1] = s[0]

proc transpose(s: seq[string]): seq[string] =
  result = s # copy
  for i in 0 .. s.high:
    for j in 0 .. s.high:
      result[j][i] = s[i][j]

iterator transform(s: seq[string]): seq[string] =
  for a in [s, s.flip]: # copy x 3
    let
      b = a.transpose # copy
      c = b.flip # copy
      d = c.transpose # copy
    # x2
    yield a; yield b; yield c; yield d # possibly copy x 4
  # total of: copy x 9, possibly copy x 17

Well, you take the sequence BY VALUE, yet you never operate in place; you
allocate new sequences all the time. And new strings too, I'd add. The bare
minimum would be to get rid of the copies you don't need:

proc flip(s: var seq[string]) =
  swap s[0], s[^1] # in-place

proc transpose(s: var seq[string]) =
  for i in 0 .. s.high:
    for j in i+1 .. s.high:
      swap s[j][i], s[i][j] # in-place

iterator transform(s: var seq[string]): var seq[string] =
  var tmp = s # copy
  tmp.flip # in-place
  template yieldAll(c) =
    yield c; c.transpose # in-place
    yield c; c.flip # in-place
    yield c; c.transpose # in-place
    yield c # in-place
  yieldAll(s) # in-place
  s = tmp # copy
  yieldAll(s) # in-place

Now a new seq is allocated (and filled, i.e. deep-copied) only twice: at var
tmp = s and at s = tmp. You could further reduce that to one by using
shallowCopy in the latter case. Then, if you truly want to work on a single seq
the whole time, you can check whether the following would be faster:

proc flip2(s: var seq[string]) =
  for i in 0 .. s.high:
    swap s[i][0], s[i][^1] # in-place

iterator transform(s: var seq[string]): var seq[string] =
  template yieldAll(c) =
    yield c; c.transpose # in-place
    yield c; c.flip # in-place
    yield c; c.transpose # in-place
    yield c # in-place
  yieldAll(s) # in-place
  s.flip2 # in-place
  s.flip # in-place
  yieldAll(s) # in-place

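If you want to convince yourself that the single-seq variant still visits the same eight boards in the same order as your original, a small check along these lines should do. It is only a sketch: the names flipped, transposed, transformCopy, transformInPlace, collectCopy and collectInPlace are made up here so both variants can live in one file, and the board is assumed to be square. The reason it works is that transpose, flip, transpose applied in a row is the same as flip2, so calling flip2 after the first four yields restores the original board.

```nim
# Copying variant (the original approach), renamed to avoid clashes.
proc flipped(s: seq[string]): seq[string] =
  result = s
  result[0] = s[^1]
  result[^1] = s[0]

proc transposed(s: seq[string]): seq[string] =
  result = s
  for i in 0 .. s.high:
    for j in 0 .. s.high:
      result[j][i] = s[i][j]

iterator transformCopy(s: seq[string]): seq[string] =
  for a in [s, s.flipped]:
    let
      b = a.transposed
      c = b.flipped
      d = c.transposed
    yield a; yield b; yield c; yield d

# In-place variant working on a single seq.
proc flip(s: var seq[string]) =
  swap(s[0], s[^1])            # swap first and last row

proc flip2(s: var seq[string]) =
  for i in 0 .. s.high:
    swap(s[i][0], s[i][^1])    # swap first and last column

proc transpose(s: var seq[string]) =
  for i in 0 .. s.high:
    for j in i+1 .. s.high:
      swap(s[j][i], s[i][j])

iterator transformInPlace(s: var seq[string]): var seq[string] =
  template yieldAll(c: untyped) =
    yield c; c.transpose
    yield c; c.flip
    yield c; c.transpose
    yield c
  yieldAll(s)
  s.flip2                      # undoes transpose + flip + transpose
  s.flip
  yieldAll(s)

# Hypothetical helpers that snapshot every yielded board for comparison.
proc collectCopy(board: seq[string]): seq[seq[string]] =
  for t in transformCopy(board):
    result.add t

proc collectInPlace(board: seq[string]): seq[seq[string]] =
  var work = board             # one copy, so the caller's board stays intact
  for t in transformInPlace(work):
    result.add t               # add copies the current state

let board = @["abc", "def", "ghi"]
doAssert collectCopy(board) == collectInPlace(board)
```

Keep in mind that the in-place iterator leaves the seq you pass in modified once the loop is done, so work on a copy (as collectInPlace does) if you still need the original afterwards.
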
It may be tempting to use a 2D array (or a seq emulating one), as transpose and
reading would benefit from memory locality (but not THAT much, I guess). On the
other hand, flip (but not flip2!) benefits from the rows being scattered in
memory, since swapping two rows of a seq[string] only swaps two string
references, so it's not that obvious. With a 2D array, flip would still be
faster than flip2, thanks to locality (e.g. you can use copyMem instead of a
swap in a loop). Still, I think it might be nice to give a 2D array a try.
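
For reference, here is a rough sketch of what "a seq emulating a 2D array" could look like: one flat string holding all cells in row-major order. The Grid type and the toGrid / toRows helpers are hypothetical names invented for this example, and the board is assumed to be square. The row swap is written as a plain loop over contiguous memory; I have not measured it against the seq[string] version.

```nim
type Grid = object
  n: int          # side length (square board assumed)
  cells: string   # n * n chars, row-major: cell (i, j) is cells[i * n + j]

proc toGrid(rows: seq[string]): Grid =
  # Pack the rows into one contiguous buffer.
  result.n = rows.len
  result.cells = newStringOfCap(rows.len * rows.len)
  for row in rows:
    doAssert row.len == rows.len, "board must be square"
    result.cells.add row

proc transpose(g: var Grid) =
  # Swap each cell above the diagonal with its mirror image.
  for i in 0 ..< g.n:
    for j in i + 1 ..< g.n:
      swap(g.cells[i * g.n + j], g.cells[j * g.n + i])

proc flip(g: var Grid) =
  # Swap the first and last rows; each row is one contiguous run of n chars.
  let last = (g.n - 1) * g.n
  for j in 0 ..< g.n:
    swap(g.cells[j], g.cells[last + j])

proc toRows(g: Grid): seq[string] =
  # Unpack into seq[string], e.g. for printing or comparing.
  for i in 0 ..< g.n:
    result.add g.cells[i * g.n ..< (i + 1) * g.n]
```

Whether this beats the seq[string] version will depend on the board size, so it is worth benchmarking both before committing to one.
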
@Araq Are these seqs copied on yield? It would be nice if they weren't, but the
value semantics (which seem to break with let instead of var, so maybe here
too?) make me doubt it...