Hi,

I converted one of the BenchmarkGame problem's python solution (
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html)
to J.
I just transliterated the Python solution (including all the stinking
loops, ugly workarounds for generators etc), sticking to the original as
much as possible, so I am sure a faster solution must exist in J, and I
also did not implement the parallel part of the python solution once
arguments get large: this solution works till permutations of length 8.
This is mainly because I wrote this on my phone and neither J's nor
Python's parallelism is supported on my OS.

The scripts used are in attachment (strip the .txt from the names)

Results (abbreviated output; ran in powershell on Windows 10 on an Intel
core i7-8665U):

PS C:\Users\myuser\j903-user\temp> Measure-Command { For($i=0; $i -lt 100;
$i++) {..\..\j903\bin\jconsole.exe fannkuch-redux_noai.ijs 8}}

TotalMilliseconds : 53978,1287

PS C:\Users\myuser\j903-user\temp> Measure-Command { For($i=0; $i -lt 100;
$i++) {python fannkuch-redux.py 8}}

TotalMilliseconds : 50086,8826

PS C:\Users\myuser\j903-user\temp> python --version
Python 3.8.3

PS C:\Users\myuser\j903-user\temp> ..\..\j903\bin\jconsole.exe -js 'exit 0:
echo JVERSION'
Engine: j903/j64avx2/windows
Beta-o: commercial/2021-08-24T18:57:03
Library: 9.03.06
Platform: Win 64
Installer: J903 install
InstallPath: c:/users/myuser/j903
Contact: www.jsoftware.com

   100 * 53978.1287 % 50086.8826
107.769

Jconsole takes 7.8% longer than Python averaged over 100 runs.

Which is quite good, I think, when compared with j903-beta a (most recent
available for my phone, Android 32 (armeabi-v7a)) and python 3.9.6; there
50 runs of J took 2m19.879s vs. python 0m49.456s.

That is, J takes 182.8% longer than Python, averaged over 50 runs!

I'm also curious to see how the best J solution ranks compared to the
above, as well as the paralnel version.


Best regards,

Jan-Pieter

On Sun, Aug 29, 2021, 04:59 Henry Rich <[email protected]> wrote:

> In beta-o I have rewritten the parser one more time.  I don't see how it
> can get much faster.  In a weakly-typed language like J there is only so
> much analysis that can be done before executing a sentence.  I wonder
> how J performs compared to other popular languages, especially Python
> which is what the cool kids are using.  If anyone is willing to do the
> comparison, I'd be obliged.
>
> I am thinking of measuring the execution time of a simple program, one
> that does scalar-only operations so we are focusing on the parser.  A
> sample J program would be:
>
> v =: {{ totl =. 0 for_i. y do. totl =. totl + i * i_index end. totl }}
> 10 (6!:2) 'v a' [ a =. i. 1e6
>
> Large improvement would require heroic intervention: parsing & caching
> sentences on the fly &  watching assignments to make sure the cached
> parse is valid.  I would rather avoid that.
>
> Henry Rich
>
> --
> This email has been checked for viruses by AVG.
> https://www.avg.com
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
# The Computer Language Benchmarks Game
# https://salsa.debian.org/benchmarksgame-team/benchmarksgame/
#
# contributed by Joerg Baumann
# many thanks to Oleg Mazurov for his helpful description

from sys import argv
from math import factorial
from multiprocessing import cpu_count, Pool
from itertools import islice, starmap

def permutations(n, start, size):
    p = bytearray(range(n))
    count = bytearray(n)

    remainder = start
    for v in range(n - 1, 0, -1):
        count[v], remainder = divmod(remainder, factorial(v))
        for _ in range(count[v]): # never executed in single thread code, 
because at this point count[i] == \x00
            p[:v], p[v] = p[1:v + 1], p[0]

    assert(count[1] == 0)
    assert(size < 2 or (size % 2 == 0))

    if size < 2:
        yield p[:]
    else:
        rotation_swaps = [None] * n
        for i in range(1, n):
            r = list(range(n))
            for v in range(1, i + 1):
                r[:v], r[v] = r[1:v + 1], r[0]
            swaps = []
            for dst, src in enumerate(r):
                if dst != src:
                    swaps.append((dst, src))
            rotation_swaps[i] = tuple(swaps)

        while True:
            yield p[:]
            p[0], p[1] = p[1], p[0]
            yield p[:]
            i = 2
            while count[i] >= i:
                count[i] = 0
                i += 1
            else:
                count[i] += 1
                t = p[:]
                for dst, src in rotation_swaps[i]:
                    p[dst] = t[src]

def alternating_flips_generator(n, start, size):
    maximum_flips = 0
    alternating_factor = 1
    lv=0
    for permutation in islice(permutations(n, start, size), size):
        first = permutation[0]
        if first:
            flips_count = 1
            while True:
                permutation[:first + 1] = permutation[first::-1]
                first = permutation[0]
                if not first: break
                flips_count += 1
            if maximum_flips < flips_count:
                maximum_flips = flips_count
            yield flips_count * alternating_factor
        else:
            yield 0
        alternating_factor = -alternating_factor
        lv+=1
    yield maximum_flips

def task(n, start, size):
    alternating_flips = alternating_flips_generator(n, start, size)
    return sum(islice(alternating_flips, size)), next(alternating_flips)

def fannkuch(n):
    if n < 0:
        for data in islice(permutations(-n, 0, factorial(-n)), factorial(-n)):
            print(''.join(map(lambda n: str(n + 1), data)))
    else:
        assert(n > 0)

        task_count = cpu_count()
        total = factorial(n)
        task_size = (total + task_count - 1) // task_count

        if task_size < 20000:
            task_size = total
            task_count = 1

        assert(task_size % 2 == 0)

        task_args = [(n, i * task_size, task_size) for i in range(task_count)]

        if task_count > 1:
            with Pool() as pool:
                checksums, maximums = zip(*pool.starmap(task, task_args))
        else:
            checksums, maximums = zip(*starmap(task, task_args))

        checksum, maximum = sum(checksums), max(maximums)
        print("{0}\nPfannkuchen({1}) = {2}".format(checksum, n, maximum))

if __name__ == "__main__":
    fannkuch(int(argv[1]))
NB. description here: 
https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html#fannkuchredux
NB. input n, for permutation length
NB. output checksum and 'fankuch(n) = result'
NB.  fkTacit=: {{
NB.  cs=: 0 NB. checksum Not Yet Implemented
NB.  fap=: (i.@!@] (flipct@A.)"0 1 >:@i.) NB. for all permutations
NB.  NB. scalar attempt memoizing for avoiding repeated work
NB.  NB. works, but does not count iterations :/
NB.  flipct=: [: 0&{:: [: flip^:(_) 0;]
NB.  flip=: (>:&.>@{. , ((|.@{. , }.)~ {.)&.>@{:)^:(1~:{.&>@{:)
NB.  cs;>./ fap y
NB.  }}
NB. 'cs max'=: fkTacit ". n=: 2{:: ARGV
Note 'Python conversions'
- no bytearrays in J, so use char (if slow, try using ints)
- how to implement generators?
  class with create, use while. #n=:next__c 
)

coclass'permutations' NB. verified, works (at least when starting from 0)
divmod=: <.@% , |~
create=: {{ NB. takes y=n,start,stop
  'nn start size'=: y
  p=: i. nn
  count=: 0#~ nn
  remainder=: start
  for_vv. i.@-&.<: nn do.
    'd remainder'=: remainder divmod !vv
    count=: d vv}count NB. was oneliner in python
    NB. for loop inversing some items never seems to execute... in single 
thread because start is 0
    for. i. vv{count do.
      p=: vv (({.}.),{.@], >:@[ }. ]) p
    end.
  end.
  assert. 0=1{count
  assert. (<&2 +. 0=2&|) size
  state=: 0
}}
next=: {{
select. state NB. for emulating yields
NB. case. 0 do. goto_zero. NB. implied
case. 1 do. goto_one.
case. 2 do. goto_two.
end.
if. size < 2 do.
  p
  return.
end. NB. else unnecessary as the if returns. 
rotation_swaps=:nn#a:
for_i. i.&.<: nn do.
  r =: i. nn
  for_vv. >:i. i do.
    r=: vv (({.}.) , {.@] , >:@[ }. ]) r
  end.
  swaps=:0 2$0
  for_ds. r do. NB. enumerate is just using ds_index,ds in J
    if. ds~:ds_index do.
      swaps=:swaps,(ds_index,ds)
    end.
  end.
  rotation_swaps=: (<swaps) i} rotation_swaps
end.

 NB. while loop from here
  label_while.
  state=:1
  p
  return. NB. first yield after while True
  label_one.

  state=:2
  p=: (0 1{p) (1 0)} p
  return.

  label_two. NB. second yield after while True.
  i=:2 NB. buh?
  while. i<:i{count do.
    count=: 0 i} count
    i=:>:i
  end. NB. while else??
  count=: (>: i{count) i} count
  t=: p
  for_ds. >i{rotation_swaps do. NB. dst first, source last
    p=: (t{~{:ds) ({.ds)}p
  end.
  goto_while.
}}
destroy=:codestroy
cocurrent'base'
permutations_z_=: conew&'permutations'

coclass'altflipsgen'
destroy=:codestroy
create=:{{
  'nn start size'=:y
  maximum_flips=:0
  alternating_factor=:1
  p=: permutations nn,start,size
  NB. for loop replaced by while because of contained yields, keep loopvar lv
  lv=: 0
}}
next=:{{
while. lv<size do. NB. islice limits perm to size calls
  permutation=:next__p''
  first=: {.permutation
  if. first do.
    flips_count=:1
    while. 1 do.
      permutation=: (permutation |.@{.~ >:first) (i.>:first)} permutation
      first=: {.permutation
      if. first=0 do. break. end. NB. quirk of J not: -. 5 is "True"
      flips_count=:>:flips_count
    end.
    if. maximum_flips<flips_count do.
      maximum_flips=: flips_count
    end.
    lv=:>:lv
    ret=: flips_count * alternating_factor
    alternating_factor =: -alternating_factor
    ret
    return.
  else.
    lv=:>:lv
    alternating_factor =: -alternating_factor
    0
    return.
  end.
end.
NB. destroy__p''
maximum_flips
}}
cocurrent'base'
altflipsgen_z_=: conew&'altflipsgen'

task=:{{ NB. y=n start size
'nn start size'=:y
af=.altflipsgen nn,start,size
NB. af returns as last iteration the max flips
ret=. (next__af''),~ +/next__af^:(>:i.size)''
NB. destroy__af''
ret
}}

fannkuch=:{{ NB. y is n
  nn=.y
  if. nn<0 do.
    p=: permutations (,0,!)@- nn
    for. i. !-nn do. NB. test for permutations
      data=: next__p''
      echo ,":"0>: data
    end.
    NB. destroy__p''
  else.
    assert. nn>0
    NB. no parallel impl ATM
    task_count=: 1
    total=: !nn
    task_size=: (total+task_count-1) <.@% task_count
    if. task_size < 20000 do.
      task_size=total
      task_count=1
    end.
    assert. 0=2|task_size
    task_args =: nn ,. task_size,.~(task_size*i. task_count)
    if. task_count >1 do.
      NB. spin up parallel J
    else.
      'checksums maximums'=:|:task"1 task_args
      'checksum maximum'=: (+/checksums),>./maximums
    end.
    echo checksum
    echo 'Pfannkuchen(',(":nn),') = ',": maximum
  end.
}}

([: exit@0: [: fannkuch@". 2&{::)^:('fannkuch-redux_noai.ijs'-: 1&{::)^:(1<#) 
ARGV
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to