A while ago, Henry requested a comparison of interpreter speed.

For this purpose I translated the Fannkuch task from the Benchmark game
from Python to J as literally the same as possible (generators required a
huge cludge of locale based OOP and goto. statements).
Now that J has grown multi-threading capabilities, I completed the
translation, as the python code did also do multi-threading (actually it
seems to spin up multiple processes, rather than threads). The comparison
looks like this (on my modest laptop):
jpjacobs@icarus:~/Projects/J/BenchmarkGame$ time ~/j904/bin/jconsole
fannkuch-redux_nothread.ijs 10
73196
Pfannkuchen(10) = 38

real    1m35,353s
user    1m35,163s
sys     0m0,173s
jpjacobs@icarus:~/Projects/J/BenchmarkGame$ time ~/j904/bin/jconsole
fannkuch-redux_notasks.ijs 10
73196
Pfannkuchen(10) = 38

real    1m30,151s
user    1m30,008s
sys     0m0,138s
jpjacobs@icarus:~/Projects/J/BenchmarkGame$ time ~/j904/bin/jconsole
fannkuch-redux.ijs 10
73196
Pfannkuchen(10) = 38

real    1m2,029s
user    3m19,923s
sys     0m24,551s
jpjacobs@icarus:~/Projects/J/BenchmarkGame$ time python3.9
fannkuch-redux.py 10
73196
Pfannkuchen(10) = 38

real    0m6,753s
user    0m26,127s
sys     0m0,021s

The "nothread" version does not use threads at all; the "notasks" version
spins up the threads, but does not dispatch any tasks, in order to take a
look at perhaps the j engine using threads without explicit user
instruction. The version without underscores uses 4 tasks dispatched on 4
threads.

Summarising, it seems from the real time (actual time used) that creating
threads, but not using them slightly reduces the time taken to +-66%. Using
the treads to execute tasks does reduce the time taken further, but by far
not as low as 25%. This cannot be due to data communication overhead, since
each task receives 3 numbers, and returns 2. I also noticed some
inconsistent results in the parallel version, i.e. the program locking up,
or yielding a checksum that is not correct.
Python still beats J by an order of magnitude, and I suspect that it is due
to not having iterators, but having to emulate them using locales and
jumping around in the next__ functions with goto. . If anyone has a better
way of writing this, I'm curious.

I attached all 4 versions (3 J versions and 1 python version) of the task
to this message (remove the .txt extension before use).

Jan-Pieter
# The Computer Language Benchmarks Game
# https://salsa.debian.org/benchmarksgame-team/benchmarksgame/
#
# contributed by Joerg Baumann
# many thanks to Oleg Mazurov for his helpful description

from sys import argv
from math import factorial
from multiprocessing import cpu_count, Pool
from itertools import islice, starmap

def permutations(n, start, size):
    p = bytearray(range(n))
    count = bytearray(n)

    remainder = start
    for v in range(n - 1, 0, -1):
        count[v], remainder = divmod(remainder, factorial(v))
        for _ in range(count[v]): # never executed in single thread code, 
because at this point count[i] == \x00
            p[:v], p[v] = p[1:v + 1], p[0]

    assert(count[1] == 0)
    assert(size < 2 or (size % 2 == 0))

    if size < 2:
        yield p[:]
    else:
        rotation_swaps = [None] * n
        for i in range(1, n):
            r = list(range(n))
            for v in range(1, i + 1):
                r[:v], r[v] = r[1:v + 1], r[0]
            swaps = []
            for dst, src in enumerate(r):
                if dst != src:
                    swaps.append((dst, src))
            rotation_swaps[i] = tuple(swaps)

        while True:
            yield p[:]
            p[0], p[1] = p[1], p[0]
            yield p[:]
            i = 2
            while count[i] >= i:
                count[i] = 0
                i += 1
            else:
                count[i] += 1
                t = p[:]
                for dst, src in rotation_swaps[i]:
                    p[dst] = t[src]

def alternating_flips_generator(n, start, size):
    maximum_flips = 0
    alternating_factor = 1
    lv=0
    for permutation in islice(permutations(n, start, size), size):
        first = permutation[0]
        if first:
            flips_count = 1
            while True:
                permutation[:first + 1] = permutation[first::-1]
                first = permutation[0]
                if not first: break
                flips_count += 1
            if maximum_flips < flips_count:
                maximum_flips = flips_count
            yield flips_count * alternating_factor
        else:
            yield 0
        alternating_factor = -alternating_factor
        lv+=1
    yield maximum_flips

def task(n, start, size):
    alternating_flips = alternating_flips_generator(n, start, size)
    return sum(islice(alternating_flips, size)), next(alternating_flips)

def fannkuch(n):
    if n < 0:
        for data in islice(permutations(-n, 0, factorial(-n)), factorial(-n)):
            print(''.join(map(lambda n: str(n + 1), data)))
    else:
        assert(n > 0)

        task_count = cpu_count()
        total = factorial(n)
        task_size = (total + task_count - 1) // task_count

        if task_size < 20000:
            task_size = total
            task_count = 1

        assert(task_size % 2 == 0)

        task_args = [(n, i * task_size, task_size) for i in range(task_count)]

        if task_count > 1:
            with Pool() as pool:
                checksums, maximums = zip(*pool.starmap(task, task_args))
        else:
            checksums, maximums = zip(*starmap(task, task_args))

        checksum, maximum = sum(checksums), max(maximums)
        print("{0}\nPfannkuchen({1}) = {2}".format(checksum, n, maximum))

if __name__ == "__main__":
    fannkuch(int(argv[1]))
NB. description here: 
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html#fannkuchredux
NB. input n, for permutation length
NB. output checksum and 'fankuch(n) = result'
NB.  fkTacit=: {{
NB.  cs=: 0 NB. checksum Not Yet Implemented
NB.  fap=: (i.@!@] (flipct@A.)"0 1 >:@i.) NB. for all permutations
NB.  NB. scalar attempt memoizing for avoiding repeated work
NB.  NB. works, but does not count iterations :/
NB.  flipct=: [: 0&{:: [: flip^:(_) 0;]
NB.  flip=: (>:&.>@{. , ((|.@{. , }.)~ {.)&.>@{:)^:(1~:{.&>@{:)
NB.  cs;>./ fap y
NB.  }}
NB. 'cs max'=: fkTacit ". n=: 2{:: ARGV
Note 'Python conversions'
- no bytearrays in J, so use char (if slow, try using ints)
- how to implement generators?
  class with create, use while. #n=:next__c 
)

coclass'permutations' NB. verified, works (at least when starting from 0)
divmod=: <.@% , |~
create=: {{ NB. takes y=n,start,stop
  'nn start size'=: y
  p=: i. nn
  count=: 0#~ nn
  remainder=: start
  for_vv. i.@-&.<: nn do.
    'd remainder'=: remainder divmod !vv
    count=: d vv}count NB. was oneliner in python
    NB. for loop inversing some items never seems to execute... in single 
thread because start is 0
    for. i. vv{count do.
      p=: vv (({.}.),{.@], >:@[ }. ]) p
    end.
  end.
  assert. 0=1{count
  assert. (<&2 +. 0=2&|) size
  state=: 0
}}
next=: {{
select. state NB. for emulating yields
NB. case. 0 do. goto_zero. NB. implied
case. 1 do. goto_one.
case. 2 do. goto_two.
end.
if. size < 2 do.
  p
  return.
end. NB. else unnecessary as the if returns. 
rotation_swaps=:nn#a:
for_i. i.&.<: nn do.
  r =: i. nn
  for_vv. >:i. i do.
    r=: vv (({.}.) , {.@] , >:@[ }. ]) r
  end.
  swaps=:0 2$0
  for_ds. r do. NB. enumerate is just using ds_index,ds in J
    if. ds~:ds_index do.
      swaps=:swaps,(ds_index,ds)
    end.
  end.
  rotation_swaps=: (<swaps) i} rotation_swaps
end.

 NB. while loop from here
  label_while.
  state=:1
  p
  return. NB. first yield after while True
  label_one.

  state=:2
  p=: (0 1{p) (1 0)} p
  return.

  label_two. NB. second yield after while True.
  i=:2 NB. buh?
  while. i<:i{count do.
    count=: 0 i} count
    i=:>:i
  end. NB. while else??
  count=: (>: i{count) i} count
  t=: p
  for_ds. >i{rotation_swaps do. NB. dst first, source last
    p=: (t{~{:ds) ({.ds)}p
  end.
  goto_while.
}}
destroy=:codestroy
cocurrent'base'
permutations_z_=: conew&'permutations'

coclass'altflipsgen'
destroy=:codestroy
create=:{{
  'nn start size'=:y
  maximum_flips=:0
  alternating_factor=:1
  p=: permutations nn,start,size
  NB. for loop replaced by while because of contained yields, keep loopvar lv
  lv=: 0
}}
next=:{{
while. lv<size do. NB. islice limits perm to size calls
  permutation=:next__p''
  first=: {.permutation
  if. first do.
    flips_count=:1
    while. 1 do.
      permutation=: (permutation |.@{.~ >:first) (i.>:first)} permutation
      first=: {.permutation
      if. first=0 do. break. end. NB. quirk of J not: -. 5 is "True"
      flips_count=:>:flips_count
    end.
    if. maximum_flips<flips_count do.
      maximum_flips=: flips_count
    end.
    lv=:>:lv
    ret=: flips_count * alternating_factor
    alternating_factor =: -alternating_factor
    ret
    return.
  else.
    lv=:>:lv
    alternating_factor =: -alternating_factor
    0
    return.
  end.
end.
destroy__p''
maximum_flips
}}
cocurrent'base'
altflipsgen_z_=: conew&'altflipsgen'

task=:{{ NB. y=n start size
'nn start size'=:y
af=.altflipsgen nn,start,size
NB. af returns as last iteration the max flips
ret=. (next__af''),~ +/next__af^:(>:i.size)''
destroy__af''
ret
}}

fannkuch=:{{ NB. y is n
  NB. start 4 threads (== cpu count, ideally)
  (0 T. ''"_)^:] 4
  nn=.y
  if. nn<0 do.
    p=: permutations (,0,!)@- nn
    for. i. !-nn do. NB. test for permutations
      data=: next__p''
      echo ,":"0>: data
    end.
    destroy__p''
  else.
    assert. nn>0
    NB. no parallel impl ATM
    task_count=: 1 T. '' NB. would be handy if there were a cpucount option to 
T.
    total=: !nn
    task_size=: (total+task_count-1) <.@% task_count
    if. task_size < 20000 do.
      task_size=:total
      task_count=:1
    end.
    assert. 0=2|task_size
    task_args =: nn ,. task_size,.~(task_size*i. task_count)
    if. task_count >1 do.
      NB. spin up parallel J tasks
      'checksums maximums'=: |: > task t. ''"1 task_args
    else.
      'checksums maximums'=: |:   task      "1 task_args
    end.
    'checksum maximum'=: (+/checksums),>./maximums
    echo checksum
    echo 'Pfannkuchen(',(":nn),') = ',": maximum
  end.
}}

([: exit@0: [: fannkuch@". 2&{::)^:('fannkuch-redux.ijs'-: 1&{::)^:(1<#) ARGV
NB. description here: 
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html#fannkuchredux
NB. input n, for permutation length
NB. output checksum and 'fankuch(n) = result'
NB.  fkTacit=: {{
NB.  cs=: 0 NB. checksum Not Yet Implemented
NB.  fap=: (i.@!@] (flipct@A.)"0 1 >:@i.) NB. for all permutations
NB.  NB. scalar attempt memoizing for avoiding repeated work
NB.  NB. works, but does not count iterations :/
NB.  flipct=: [: 0&{:: [: flip^:(_) 0;]
NB.  flip=: (>:&.>@{. , ((|.@{. , }.)~ {.)&.>@{:)^:(1~:{.&>@{:)
NB.  cs;>./ fap y
NB.  }}
NB. 'cs max'=: fkTacit ". n=: 2{:: ARGV
Note 'Python conversions'
- no bytearrays in J, so use char (if slow, try using ints)
- how to implement generators?
  class with create, use while. #n=:next__c 
)

coclass'permutations' NB. verified, works (at least when starting from 0)
divmod=: <.@% , |~
create=: {{ NB. takes y=n,start,stop
  'nn start size'=: y
  p=: i. nn
  count=: 0#~ nn
  remainder=: start
  for_vv. i.@-&.<: nn do.
    'd remainder'=: remainder divmod !vv
    count=: d vv}count NB. was oneliner in python
    NB. for loop inversing some items never seems to execute... in single 
thread because start is 0
    for. i. vv{count do.
      p=: vv (({.}.),{.@], >:@[ }. ]) p
    end.
  end.
  assert. 0=1{count
  assert. (<&2 +. 0=2&|) size
  state=: 0
}}
next=: {{
select. state NB. for emulating yields
NB. case. 0 do. goto_zero. NB. implied
case. 1 do. goto_one.
case. 2 do. goto_two.
end.
if. size < 2 do.
  p
  return.
end. NB. else unnecessary as the if returns. 
rotation_swaps=:nn#a:
for_i. i.&.<: nn do.
  r =: i. nn
  for_vv. >:i. i do.
    r=: vv (({.}.) , {.@] , >:@[ }. ]) r
  end.
  swaps=:0 2$0
  for_ds. r do. NB. enumerate is just using ds_index,ds in J
    if. ds~:ds_index do.
      swaps=:swaps,(ds_index,ds)
    end.
  end.
  rotation_swaps=: (<swaps) i} rotation_swaps
end.

 NB. while loop from here
  label_while.
  state=:1
  p
  return. NB. first yield after while True
  label_one.

  state=:2
  p=: (0 1{p) (1 0)} p
  return.

  label_two. NB. second yield after while True.
  i=:2 NB. buh?
  while. i<:i{count do.
    count=: 0 i} count
    i=:>:i
  end. NB. while else??
  count=: (>: i{count) i} count
  t=: p
  for_ds. >i{rotation_swaps do. NB. dst first, source last
    p=: (t{~{:ds) ({.ds)}p
  end.
  goto_while.
}}
destroy=:codestroy
cocurrent'base'
permutations_z_=: conew&'permutations'

coclass'altflipsgen'
destroy=:codestroy
create=:{{
  'nn start size'=:y
  maximum_flips=:0
  alternating_factor=:1
  p=: permutations nn,start,size
  NB. for loop replaced by while because of contained yields, keep loopvar lv
  lv=: 0
}}
next=:{{
while. lv<size do. NB. islice limits perm to size calls
  permutation=:next__p''
  first=: {.permutation
  if. first do.
    flips_count=:1
    while. 1 do.
      permutation=: (permutation |.@{.~ >:first) (i.>:first)} permutation
      first=: {.permutation
      if. first=0 do. break. end. NB. quirk of J not: -. 5 is "True"
      flips_count=:>:flips_count
    end.
    if. maximum_flips<flips_count do.
      maximum_flips=: flips_count
    end.
    lv=:>:lv
    ret=: flips_count * alternating_factor
    alternating_factor =: -alternating_factor
    ret
    return.
  else.
    lv=:>:lv
    alternating_factor =: -alternating_factor
    0
    return.
  end.
end.
destroy__p''
maximum_flips
}}
cocurrent'base'
altflipsgen_z_=: conew&'altflipsgen'

task=:{{ NB. y=n start size
'nn start size'=:y
af=.altflipsgen nn,start,size
NB. af returns as last iteration the max flips
ret=. (next__af''),~ +/next__af^:(>:i.size)''
destroy__af''
ret
}}

fannkuch=:{{ NB. y is n
  nn=.y
  if. nn<0 do.
    p=: permutations (,0,!)@- nn
    for. i. !-nn do. NB. test for permutations
      data=: next__p''
      echo ,":"0>: data
    end.
    destroy__p''
  else.
    assert. nn>0
    NB. no parallel impl ATM
    task_count=: 1
    total=: !nn
    task_size=: (total+task_count-1) <.@% task_count
    if. 1 do. NB. task_size < 20000 do.
      task_size=:total
      task_count=:1
    end.
    assert. 0=2|task_size
    task_args =: nn ,. task_size,.~(task_size*i. task_count)
    if. task_count >1 do.
      NB. spin up parallel J tasks
      'checksums maximums'=: |: > task t. ''"1 task_args
    else.
      'checksums maximums'=: |:   task      "1 task_args
    end.
    'checksum maximum'=: (+/checksums),>./maximums
    echo checksum
    echo 'Pfannkuchen(',(":nn),') = ',": maximum
  end.
}}

([: exit@0: [: fannkuch@". 2&{::)^:('fannkuch-redux_nothread.ijs'-: 
1&{::)^:(1<#) ARGV
NB. description here: 
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html#fannkuchredux
NB. input n, for permutation length
NB. output checksum and 'fankuch(n) = result'
NB.  fkTacit=: {{
NB.  cs=: 0 NB. checksum Not Yet Implemented
NB.  fap=: (i.@!@] (flipct@A.)"0 1 >:@i.) NB. for all permutations
NB.  NB. scalar attempt memoizing for avoiding repeated work
NB.  NB. works, but does not count iterations :/
NB.  flipct=: [: 0&{:: [: flip^:(_) 0;]
NB.  flip=: (>:&.>@{. , ((|.@{. , }.)~ {.)&.>@{:)^:(1~:{.&>@{:)
NB.  cs;>./ fap y
NB.  }}
NB. 'cs max'=: fkTacit ". n=: 2{:: ARGV
Note 'Python conversions'
- no bytearrays in J, so use char (if slow, try using ints)
- how to implement generators?
  class with create, use while. #n=:next__c 
)

coclass'permutations' NB. verified, works (at least when starting from 0)
divmod=: <.@% , |~
create=: {{ NB. takes y=n,start,stop
  'nn start size'=: y
  p=: i. nn
  count=: 0#~ nn
  remainder=: start
  for_vv. i.@-&.<: nn do.
    'd remainder'=: remainder divmod !vv
    count=: d vv}count NB. was oneliner in python
    NB. for loop inversing some items never seems to execute... in single 
thread because start is 0
    for. i. vv{count do.
      p=: vv (({.}.),{.@], >:@[ }. ]) p
    end.
  end.
  assert. 0=1{count
  assert. (<&2 +. 0=2&|) size
  state=: 0
}}
next=: {{
select. state NB. for emulating yields
NB. case. 0 do. goto_zero. NB. implied
case. 1 do. goto_one.
case. 2 do. goto_two.
end.
if. size < 2 do.
  p
  return.
end. NB. else unnecessary as the if returns. 
rotation_swaps=:nn#a:
for_i. i.&.<: nn do.
  r =: i. nn
  for_vv. >:i. i do.
    r=: vv (({.}.) , {.@] , >:@[ }. ]) r
  end.
  swaps=:0 2$0
  for_ds. r do. NB. enumerate is just using ds_index,ds in J
    if. ds~:ds_index do.
      swaps=:swaps,(ds_index,ds)
    end.
  end.
  rotation_swaps=: (<swaps) i} rotation_swaps
end.

 NB. while loop from here
  label_while.
  state=:1
  p
  return. NB. first yield after while True
  label_one.

  state=:2
  p=: (0 1{p) (1 0)} p
  return.

  label_two. NB. second yield after while True.
  i=:2 NB. buh?
  while. i<:i{count do.
    count=: 0 i} count
    i=:>:i
  end. NB. while else??
  count=: (>: i{count) i} count
  t=: p
  for_ds. >i{rotation_swaps do. NB. dst first, source last
    p=: (t{~{:ds) ({.ds)}p
  end.
  goto_while.
}}
destroy=:codestroy
cocurrent'base'
permutations_z_=: conew&'permutations'

coclass'altflipsgen'
destroy=:codestroy
create=:{{
  'nn start size'=:y
  maximum_flips=:0
  alternating_factor=:1
  p=: permutations nn,start,size
  NB. for loop replaced by while because of contained yields, keep loopvar lv
  lv=: 0
}}
next=:{{
while. lv<size do. NB. islice limits perm to size calls
  permutation=:next__p''
  first=: {.permutation
  if. first do.
    flips_count=:1
    while. 1 do.
      permutation=: (permutation |.@{.~ >:first) (i.>:first)} permutation
      first=: {.permutation
      if. first=0 do. break. end. NB. quirk of J not: -. 5 is "True"
      flips_count=:>:flips_count
    end.
    if. maximum_flips<flips_count do.
      maximum_flips=: flips_count
    end.
    lv=:>:lv
    ret=: flips_count * alternating_factor
    alternating_factor =: -alternating_factor
    ret
    return.
  else.
    lv=:>:lv
    alternating_factor =: -alternating_factor
    0
    return.
  end.
end.
destroy__p''
maximum_flips
}}
cocurrent'base'
altflipsgen_z_=: conew&'altflipsgen'

task=:{{ NB. y=n start size
'nn start size'=:y
af=.altflipsgen nn,start,size
NB. af returns as last iteration the max flips
ret=. (next__af''),~ +/next__af^:(>:i.size)''
destroy__af''
ret
}}

fannkuch=:{{ NB. y is n
  NB. start 4 threads (== cpu count, ideally)
  (0 T. ''"_)^:] 4
  nn=.y
  if. nn<0 do.
    p=: permutations (,0,!)@- nn
    for. i. !-nn do. NB. test for permutations
      data=: next__p''
      echo ,":"0>: data
    end.
    destroy__p''
  else.
    assert. nn>0
    NB. no parallel impl ATM
    task_count=: 1 T. '' NB. would be handy if there were a cpucount option to 
T.
    total=: !nn
    task_size=: (total+task_count-1) <.@% task_count
    if. 1 do. NB. task_size < 20000 do.
      task_size=:total
      task_count=:1
    end.
    assert. 0=2|task_size
    task_args =: nn ,. task_size,.~(task_size*i. task_count)
    if. task_count >1 do.
      NB. spin up parallel J tasks
      'checksums maximums'=: |: > task t. ''"1 task_args
    else.
      'checksums maximums'=: |:   task      "1 task_args
    end.
    'checksum maximum'=: (+/checksums),>./maximums
    echo checksum
    echo 'Pfannkuchen(',(":nn),') = ',": maximum
  end.
}}

([: exit@0: [: fannkuch@". 2&{::)^:('fannkuch-redux_notasks.ijs'-: 
1&{::)^:(1<#) ARGV
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to