Hi,
I converted one of the BenchmarkGame problem's python solution (
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html)
to J.
I just transliterated the Python solution (including all the stinking
loops, ugly workarounds for generators etc), sticking to the original as
much as possible, so I am sure a faster solution must exist in J, and I
also did not implement the parallel part of the python solution once
arguments get large: this solution works till permutations of length 8.
This is mainly because I wrote this on my phone and neither J's nor
Python's parallelism is supported on my OS.
The scripts used are in attachment (strip the .txt from the names)
Results (abbreviated output; ran in powershell on Windows 10 on an Intel
core i7-8665U):
PS C:\Users\myuser\j903-user\temp> Measure-Command { For($i=0; $i -lt 100;
$i++) {..\..\j903\bin\jconsole.exe fannkuch-redux_noai.ijs 8}}
TotalMilliseconds : 53978,1287
PS C:\Users\myuser\j903-user\temp> Measure-Command { For($i=0; $i -lt 100;
$i++) {python fannkuch-redux.py 8}}
TotalMilliseconds : 50086,8826
PS C:\Users\myuser\j903-user\temp> python --version
Python 3.8.3
PS C:\Users\myuser\j903-user\temp> ..\..\j903\bin\jconsole.exe -js 'exit 0:
echo JVERSION'
Engine: j903/j64avx2/windows
Beta-o: commercial/2021-08-24T18:57:03
Library: 9.03.06
Platform: Win 64
Installer: J903 install
InstallPath: c:/users/myuser/j903
Contact: www.jsoftware.com
100 * 53978.1287 % 50086.8826
107.769
Jconsole takes 7.8% longer than Python averaged over 100 runs.
Which is quite good, I think, when compared with j903-beta a (most recent
available for my phone, Android 32 (armeabi-v7a)) and python 3.9.6; there
50 runs of J took 2m19.879s vs. python 0m49.456s.
That is, J takes 182.8% longer than Python, averaged over 50 runs!
I'm also curious to see how the best J solution ranks compared to the
above, as well as the paralnel version.
Best regards,
Jan-Pieter
On Sun, Aug 29, 2021, 04:59 Henry Rich <[email protected]> wrote:
> In beta-o I have rewritten the parser one more time. I don't see how it
> can get much faster. In a weakly-typed language like J there is only so
> much analysis that can be done before executing a sentence. I wonder
> how J performs compared to other popular languages, especially Python
> which is what the cool kids are using. If anyone is willing to do the
> comparison, I'd be obliged.
>
> I am thinking of measuring the execution time of a simple program, one
> that does scalar-only operations so we are focusing on the parser. A
> sample J program would be:
>
> v =: {{ totl =. 0 for_i. y do. totl =. totl + i * i_index end. totl }}
> 10 (6!:2) 'v a' [ a =. i. 1e6
>
> Large improvement would require heroic intervention: parsing & caching
> sentences on the fly & watching assignments to make sure the cached
> parse is valid. I would rather avoid that.
>
> Henry Rich
>
> --
> This email has been checked for viruses by AVG.
> https://www.avg.com
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
# The Computer Language Benchmarks Game
# https://salsa.debian.org/benchmarksgame-team/benchmarksgame/
#
# contributed by Joerg Baumann
# many thanks to Oleg Mazurov for his helpful description
from sys import argv
from math import factorial
from multiprocessing import cpu_count, Pool
from itertools import islice, starmap
def permutations(n, start, size):
p = bytearray(range(n))
count = bytearray(n)
remainder = start
for v in range(n - 1, 0, -1):
count[v], remainder = divmod(remainder, factorial(v))
for _ in range(count[v]): # never executed in single thread code,
because at this point count[i] == \x00
p[:v], p[v] = p[1:v + 1], p[0]
assert(count[1] == 0)
assert(size < 2 or (size % 2 == 0))
if size < 2:
yield p[:]
else:
rotation_swaps = [None] * n
for i in range(1, n):
r = list(range(n))
for v in range(1, i + 1):
r[:v], r[v] = r[1:v + 1], r[0]
swaps = []
for dst, src in enumerate(r):
if dst != src:
swaps.append((dst, src))
rotation_swaps[i] = tuple(swaps)
while True:
yield p[:]
p[0], p[1] = p[1], p[0]
yield p[:]
i = 2
while count[i] >= i:
count[i] = 0
i += 1
else:
count[i] += 1
t = p[:]
for dst, src in rotation_swaps[i]:
p[dst] = t[src]
def alternating_flips_generator(n, start, size):
maximum_flips = 0
alternating_factor = 1
lv=0
for permutation in islice(permutations(n, start, size), size):
first = permutation[0]
if first:
flips_count = 1
while True:
permutation[:first + 1] = permutation[first::-1]
first = permutation[0]
if not first: break
flips_count += 1
if maximum_flips < flips_count:
maximum_flips = flips_count
yield flips_count * alternating_factor
else:
yield 0
alternating_factor = -alternating_factor
lv+=1
yield maximum_flips
def task(n, start, size):
alternating_flips = alternating_flips_generator(n, start, size)
return sum(islice(alternating_flips, size)), next(alternating_flips)
def fannkuch(n):
if n < 0:
for data in islice(permutations(-n, 0, factorial(-n)), factorial(-n)):
print(''.join(map(lambda n: str(n + 1), data)))
else:
assert(n > 0)
task_count = cpu_count()
total = factorial(n)
task_size = (total + task_count - 1) // task_count
if task_size < 20000:
task_size = total
task_count = 1
assert(task_size % 2 == 0)
task_args = [(n, i * task_size, task_size) for i in range(task_count)]
if task_count > 1:
with Pool() as pool:
checksums, maximums = zip(*pool.starmap(task, task_args))
else:
checksums, maximums = zip(*starmap(task, task_args))
checksum, maximum = sum(checksums), max(maximums)
print("{0}\nPfannkuchen({1}) = {2}".format(checksum, n, maximum))
if __name__ == "__main__":
fannkuch(int(argv[1]))
NB. description here:
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html#fannkuchredux
NB. input n, for permutation length
NB. output checksum and 'fankuch(n) = result'
NB. fkTacit=: {{
NB. cs=: 0 NB. checksum Not Yet Implemented
NB. fap=: (i.@!@] (flipct@A.)"0 1 >:@i.) NB. for all permutations
NB. NB. scalar attempt memoizing for avoiding repeated work
NB. NB. works, but does not count iterations :/
NB. flipct=: [: 0&{:: [: flip^:(_) 0;]
NB. flip=: (>:&.>@{. , ((|.@{. , }.)~ {.)&.>@{:)^:(1~:{.&>@{:)
NB. cs;>./ fap y
NB. }}
NB. 'cs max'=: fkTacit ". n=: 2{:: ARGV
Note 'Python conversions'
- no bytearrays in J, so use char (if slow, try using ints)
- how to implement generators?
class with create, use while. #n=:next__c
)
coclass'permutations' NB. verified, works (at least when starting from 0)
divmod=: <.@% , |~
create=: {{ NB. takes y=n,start,stop
'nn start size'=: y
p=: i. nn
count=: 0#~ nn
remainder=: start
for_vv. i.@-&.<: nn do.
'd remainder'=: remainder divmod !vv
count=: d vv}count NB. was oneliner in python
NB. for loop inversing some items never seems to execute... in single
thread because start is 0
for. i. vv{count do.
p=: vv (({.}.),{.@], >:@[ }. ]) p
end.
end.
assert. 0=1{count
assert. (<&2 +. 0=2&|) size
state=: 0
}}
next=: {{
select. state NB. for emulating yields
NB. case. 0 do. goto_zero. NB. implied
case. 1 do. goto_one.
case. 2 do. goto_two.
end.
if. size < 2 do.
p
return.
end. NB. else unnecessary as the if returns.
rotation_swaps=:nn#a:
for_i. i.&.<: nn do.
r =: i. nn
for_vv. >:i. i do.
r=: vv (({.}.) , {.@] , >:@[ }. ]) r
end.
swaps=:0 2$0
for_ds. r do. NB. enumerate is just using ds_index,ds in J
if. ds~:ds_index do.
swaps=:swaps,(ds_index,ds)
end.
end.
rotation_swaps=: (<swaps) i} rotation_swaps
end.
NB. while loop from here
label_while.
state=:1
p
return. NB. first yield after while True
label_one.
state=:2
p=: (0 1{p) (1 0)} p
return.
label_two. NB. second yield after while True.
i=:2 NB. buh?
while. i<:i{count do.
count=: 0 i} count
i=:>:i
end. NB. while else??
count=: (>: i{count) i} count
t=: p
for_ds. >i{rotation_swaps do. NB. dst first, source last
p=: (t{~{:ds) ({.ds)}p
end.
goto_while.
}}
destroy=:codestroy
cocurrent'base'
permutations_z_=: conew&'permutations'
coclass'altflipsgen'
destroy=:codestroy
create=:{{
'nn start size'=:y
maximum_flips=:0
alternating_factor=:1
p=: permutations nn,start,size
NB. for loop replaced by while because of contained yields, keep loopvar lv
lv=: 0
}}
next=:{{
while. lv<size do. NB. islice limits perm to size calls
permutation=:next__p''
first=: {.permutation
if. first do.
flips_count=:1
while. 1 do.
permutation=: (permutation |.@{.~ >:first) (i.>:first)} permutation
first=: {.permutation
if. first=0 do. break. end. NB. quirk of J not: -. 5 is "True"
flips_count=:>:flips_count
end.
if. maximum_flips<flips_count do.
maximum_flips=: flips_count
end.
lv=:>:lv
ret=: flips_count * alternating_factor
alternating_factor =: -alternating_factor
ret
return.
else.
lv=:>:lv
alternating_factor =: -alternating_factor
0
return.
end.
end.
NB. destroy__p''
maximum_flips
}}
cocurrent'base'
altflipsgen_z_=: conew&'altflipsgen'
task=:{{ NB. y=n start size
'nn start size'=:y
af=.altflipsgen nn,start,size
NB. af returns as last iteration the max flips
ret=. (next__af''),~ +/next__af^:(>:i.size)''
NB. destroy__af''
ret
}}
fannkuch=:{{ NB. y is n
nn=.y
if. nn<0 do.
p=: permutations (,0,!)@- nn
for. i. !-nn do. NB. test for permutations
data=: next__p''
echo ,":"0>: data
end.
NB. destroy__p''
else.
assert. nn>0
NB. no parallel impl ATM
task_count=: 1
total=: !nn
task_size=: (total+task_count-1) <.@% task_count
if. task_size < 20000 do.
task_size=total
task_count=1
end.
assert. 0=2|task_size
task_args =: nn ,. task_size,.~(task_size*i. task_count)
if. task_count >1 do.
NB. spin up parallel J
else.
'checksums maximums'=:|:task"1 task_args
'checksum maximum'=: (+/checksums),>./maximums
end.
echo checksum
echo 'Pfannkuchen(',(":nn),') = ',": maximum
end.
}}
([: exit@0: [: fannkuch@". 2&{::)^:('fannkuch-redux_noai.ijs'-: 1&{::)^:(1<#)
ARGV
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm