In this particular case, the performance would be doubled if the copying
was eliminated. If that is done, along with an optimisation of the
reduction operators, you would see at least a 10× improvement.

In my previous tests I even higher improvements (100× or so) for specific
cases. Unfortunately my copy-on-write algorithm didn't actually work
properly so it needed to be rolled back. The potential is there, however.

Regards, Elias
On 28 Aug 2015 20:46, "Mike Duvos" <[email protected]> wrote:

> Hi Elias,
>
> That's very interesting.
>
> How fast do you think GNU APL could be made if the gratuitous array
> copying were completely eliminated?  I'm guessing it's always going to be
> more of an educational tool than a competitor to expensive commercial APL
> systems, which special case everything by type and rank, and do extensive
> idiom recognition.
>
> Regards,
>
> Mike
>
>
>
> On Thu, Aug 27, 2015 at 11:54 PM, Elias Mårtenson <[email protected]>
> wrote:
>
>> Thanks. I've run the test case with Callgrind now (it took over 3 hours).
>>
>> Like the previous test, we see that the reduce operator takes an
>> excessive amount of time. Specifically, the call to Prefix::reduce_A_F_B()
>> was called 67157 times and took 50% CPU time, of which 30% of the total
>> time (i.e. 60% of the reduction) was doing the ⍴-reduction and 11.5% was
>> doing a ↑-reduction.
>>
>> Out of the other half of processing time, 47.24% was spent doing
>> 6210421621 calls to Cell::init(). This was mostly caused by 105552 calls to
>> Value::clone(), creating what I believe is mostly unnecessary copies of the
>> an array with an average of 6210421621/105552≈59000 elements.
>>
>> Regards,
>> Elias
>>
>>
>> On 28 August 2015 at 10:49, Mike Duvos <[email protected]> wrote:
>>
>>> Hi Elias,
>>>
>>>       )IN PRIMESATF
>>>
>>> Should load PRIMESATF.atf.  You need to have it in your workspaces
>>> directory.  You can see where that is with )LIBS
>>>
>>> Regards,
>>>
>>> Mike
>>>
>>>
>>>
>>> On Thu, Aug 27, 2015 at 7:45 PM, Elias Mårtenson <[email protected]>
>>> wrote:
>>>
>>>> I have never used an ATF file, how do I load them?
>>>>
>>>> Generally, I just edit my APL files as plain .apl files. Makes the
>>>> Emacs navigate-to-definition easy to use too.
>>>>
>>>> Regards,
>>>> Elias
>>>>
>>>> On 28 August 2015 at 10:42, Mike Duvos <[email protected]> wrote:
>>>>
>>>>> Hi Elias,
>>>>>
>>>>> I am apparently still having problems with pasting Unicode dropping
>>>>> less than or equal to and greater than or equal to.  I've attached the 
>>>>> .atf
>>>>>
>>>>> Regards,
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Aug 27, 2015 at 7:37 PM, Elias Mårtenson <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I wanted to test this thing myself, but I'm getting this error when
>>>>>> testing it:
>>>>>>
>>>>>> *      TIME 'PRIMES←SIEVE 100000'*
>>>>>> DOMAIN ERROR
>>>>>> SIEVE[3]  →((⍴B)P←B⍳1)/L2
>>>>>>            ^           ^
>>>>>>
>>>>>> Regards,
>>>>>> Elias
>>>>>>
>>>>>> On 28 August 2015 at 10:30, Mike Duvos <[email protected]> wrote:
>>>>>>
>>>>>>> Here is a function that finds all the Primes less than N, by
>>>>>>> clearing bits in a boolean vector.
>>>>>>>
>>>>>>>       )CLEAR
>>>>>>> CLEAR WS
>>>>>>>
>>>>>>>      ⎕IO←0
>>>>>>>
>>>>>>>     ∇
>>>>>>> [0]   Z←SIEVE N;B;K;P
>>>>>>> [1]   Z←B←0 0,(¯2+N)⍴0=K←0
>>>>>>> [2]  L1:→((⍴B)P←B⍳1)/L2
>>>>>>> [3]   B←B∧(⍴B)⍴∼P↑1
>>>>>>> [4]   Z[K]←P◊K←K+1
>>>>>>> [5]   →L1
>>>>>>> [6]  L2:Z←K↑Z
>>>>>>>     ∇
>>>>>>>
>>>>>>> And our timing function, which we have used previously.
>>>>>>>
>>>>>>>     ∇
>>>>>>> [0]   TIME X;TS
>>>>>>> [1]   TS←⎕TS
>>>>>>> [2]   ⍎X
>>>>>>> [3]   (⍕(24 60 60 1000⊥¯4↑⎕TS-TS)÷1000),' Seconds.'
>>>>>>>     ∇
>>>>>>>
>>>>>>> [IBM APL2]
>>>>>>>
>>>>>>>       TIME 'PRIMES←SIEVE 100000'
>>>>>>> 1.865 Seconds.
>>>>>>>
>>>>>>> [GNU APL]
>>>>>>>
>>>>>>>       TIME 'PRIMES←SIEVE 100000'
>>>>>>> 132.056 Seconds.
>>>>>>>
>>>>>>>       10 10⍴PRIMES
>>>>>>>   2   3   5   7  11  13  17  19  23  29
>>>>>>>  31  37  41  43  47  53  59  61  67  71
>>>>>>>  73  79  83  89  97 101 103 107 109 113
>>>>>>> 127 131 137 139 149 151 157 163 167 173
>>>>>>> 179 181 191 193 197 199 211 223 227 229
>>>>>>> 233 239 241 251 257 263 269 271 277 281
>>>>>>> 283 293 307 311 313 317 331 337 347 349
>>>>>>> 353 359 367 373 379 383 389 397 401 409
>>>>>>> 419 421 431 433 439 443 449 457 461 463
>>>>>>> 467 479 487 491 499 503 509 521 523 541
>>>>>>>
>>>>>>>       10 10 ⍴¯100↑PRIMES
>>>>>>> 98897 98899 98909 98911 98927 98929 98939 98947 98953 98963
>>>>>>> 98981 98993 98999 99013 99017 99023 99041 99053 99079 99083
>>>>>>> 99089 99103 99109 99119 99131 99133 99137 99139 99149 99173
>>>>>>> 99181 99191 99223 99233 99241 99251 99257 99259 99277 99289
>>>>>>> 99317 99347 99349 99367 99371 99377 99391 99397 99401 99409
>>>>>>> 99431 99439 99469 99487 99497 99523 99527 99529 99551 99559
>>>>>>> 99563 99571 99577 99581 99607 99611 99623 99643 99661 99667
>>>>>>> 99679 99689 99707 99709 99713 99719 99721 99733 99761 99767
>>>>>>> 99787 99793 99809 99817 99823 99829 99833 99839 99859 99871
>>>>>>> 99877 99881 99901 99907 99923 99929 99961 99971 99989 99991
>>>>>>>
>>>>>>> So on that function, GNU APL is 70.807 times as slow as APL2, so
>>>>>>> obviously some performance issues remain.
>>>>>>>
>>>>>>> Function PD returns a list of the primes not greater than the square
>>>>>>> root of a number which divide it evenly.  If there are none, the number 
>>>>>>> is
>>>>>>> prime, and it returns the number.  FACTOR calls PD repeatedly to get the
>>>>>>> full prime factorization of its argument.   FFMT factors a list of 
>>>>>>> numbers,
>>>>>>> and returns the numbers and their factorizations printed out neatly.
>>>>>>>
>>>>>>>     ∇
>>>>>>> [0]   Z←PD X;Q
>>>>>>> [1]   →(0≠⍴Z←(Q=⌊Q←X÷Z)/Z←(PRIMES⌈X⋆0.5)/PRIMES)/0
>>>>>>> [2]   Z←,X
>>>>>>>     ∇
>>>>>>>
>>>>>>>     ∇
>>>>>>> [0]   Z←FACTOR X;Q
>>>>>>> [1]   Z←''
>>>>>>> [2]  L1:→(1=Q←⌊X÷×/Z)/L2
>>>>>>> [3]   Z←Z,PD Q
>>>>>>> [4]   →L1
>>>>>>> [5]  L2:Z←Z[⍋Z]
>>>>>>>     ∇
>>>>>>>
>>>>>>>     ∇
>>>>>>> [0]   Z←FFMT X
>>>>>>> [1]   Z←FACTOR¨X←,X
>>>>>>> [2]   Z←(((⍴X),1)⍴X),Z
>>>>>>> [3]   Z←('Number' 'Prime Factorization'),[0]Z
>>>>>>> [4]
>>>>>>>     ∇
>>>>>>>
>>>>>>> Now lets get some random data, being careful to call Roll and not
>>>>>>> the system-destroying Deal.
>>>>>>>
>>>>>>>       4 5⍴DATA←?20⍴¯1+2*31
>>>>>>> 1979327593 1319354819 771200257 1811210650  789012101
>>>>>>> 1029042612  152268513  20570707  832781767  898376923
>>>>>>> 1725231712  117387147 931909676  752862863 1800558041
>>>>>>>  736032325  151634070 731135336 1798144557 1223769030
>>>>>>>
>>>>>>>       FFMT DATA
>>>>>>>      Number  Prime Factorization
>>>>>>>  1979327593  14747 134219
>>>>>>>  1319354819  17 23 101 33409
>>>>>>>   771200257  17 17 1117 2389
>>>>>>>  1811210650  2 5 5 31 1168523
>>>>>>>   789012101  101 7812001
>>>>>>>  1029042612  2 2 3 3 13 29 75821
>>>>>>>   152268513  3 137 370483
>>>>>>>    20570707  20570707
>>>>>>>   832781767  47 199 269 331
>>>>>>>   898376923  898376923
>>>>>>>  1725231712  2 2 2 2 2 53913491
>>>>>>>   117387147  3 23 1701263
>>>>>>>   931909676  2 2 23 23 37 11903
>>>>>>>   752862863  752862863
>>>>>>>  1800558041  6841 263201
>>>>>>>   736032325  5 5 7 29 145031
>>>>>>>   151634070  2 3 3 5 7 233 1033
>>>>>>>   731135336  2 2 2 7247 12611
>>>>>>>  1798144557  3 11 1667 32687
>>>>>>>  1223769030  2 3 5 11 1471 2521
>>>>>>>
>>>>>>> That ran in a reasonable amount of time.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to