I see that J allowed one to do automatic memoization by coding a M. in the header. I have used it a couple times. I was working on euler 66, which is the one where you have to solve for the lowest solution for non-square values of D between 1 and 1000, and you pick the largest x and the answer is the D associated with the largest minimum x. (ignoring the implicit solution where x=1 and y = 0).
x^2 - Dy^2 = 1 So I dug around on the net and found that these are called Pell Equations, and they are solved by deriving the generalized periodic continued fraction for the square root of D. Mathworld had a page that listed a specific scheme for doing the job, there were a bunch of recursive piecewise functions that were easily restatable in J. http://mathworld.wolfram.com/PellEquation.html I wrote the program, and when it ran successfully the first time, it took 15 minutes to run to completion (bugs were mostly related to my failure to use extended arithmetic in all needed places). Because the functions that calculate Then I started working on the program. Remember that there is a one minute guideline in Euler. I also noted that many people had reported very fast times for this problem using approximately the same scheme. First I made a new version with a bunch of stuff stripped out. I accepted order of magnitude results as good enough. I stopped calculating q (y) and checking the equation. Then I tried reordering statements, changing if.s to select.s. I got the run time down to 1.5 minutes, but, after one small change, I forgot to reload the code and my next run took six minutes. I began to suspect that the programs were running slower and slower the longer that they were run and I began to suspect the memoization. Now, make no mistake, these functions need memoization. Without memoization, running d from 1 to 100 takes many minutes. I started a test a couple minutes ago, and it has not completed as I write this, while with memoization, I was able to get this same stuff to run from 1 to 1000 in 1.5 minutes. But by reloading my code (which I assumes clears memoization) every 100 ticks of the main loop, I could get the code to finish in a couple seconds for the fast version and under 15 seconds for the slow version. I actually stuck a load'file' to reload the file the scripts were from in the main loop, to run every hundred ticks of the main loop. In fact, the code runs slightly faster when I reload it from the file every 50 ticks of the main loop. I assume that reading from the O/S is very expensive. But it is a win relative to searching 1000 entries in the memoization table 1000 times. If I could, I'd clear the memoization every cycle or two of the main loop. x is d, passed to every routine, and y is the level of the function, they recur over each other down to where most of them originate as 1, 0 or the floor of the square root of D. This is an example of three of the routines, they somehow work together to calculate the a term, which is used to calculate r. r is one less than the first term of a which is double the value of the zero term and r, possibly transformed, tells you which p and q (lower case, not these) terms to use for the minimum values for x and y. As you can see, it calls itself at level zero, which just takes the floor of the square root of the left term, and two other routines, that end up calling themselves and each other and a again, descending in value of the right argument. Lots of calls to calculate, say, a_10 since you actually need (for this function) to calculate a i.r without knowing r in advance. That actually makes good use of memoization (I presume) e66a =: dyad define M. if. y = 0 do. x: <.@%: x return. end. x: <. ((x e66a 0)+(x e66P y))% (x e66Q y) ) e66P =: dyad define M. if. y = 0 do. x: 0 return. end. if. y = 1 do. x e66a 0 return. end. x: ((x e66a <:y)*(x e66Q <:y))-(x e66P <:y) ) e66Q =: dyad define M. if. y = 0 do. x: 1 return. end. if. y = 1 do. x: x - *: x e66a 0 return. end. x: (x-*: x e66P y)%(x e66Q <: y) ) The "fast" version uses selects, no significant difference. The point is that once I am done making calls about 5 e66a"0 i.10 I will never make another call with 5 as the left argument. and it seems that the function calls slow way way way down. First time from 1-1000, the "fast" version ran in 1.5 minutes (the unmemoized version has not finished >:i.100). The second time it ran in six minutes. Just to make myself happy, I stuck a "load" statement in my main routine to be run every 100 times through the loop. The "fast version" ran in 6 seconds. The slow version that used extended arithmetic for everything, that I had never seen run in under three minutes over the space 1<:x<:1000 ran in 12 seconds. What I was saving by avoiding arithmetic was another thousand trips through the memo lists. Of the 1.5 minutes my routine took to run, it seems that 96% of the time I was looking through memoized stuff that I never wanted to see again. The only possible thing I can imagine that the load of the same code is doing is getting the compiler to clear the memoization tables. Calculating a is so expensive that it makes a noticable difference if I calculate the terms in a while loop, stopping on the exact term that is the last one needed, rather than calculating them 3 or 5 at a time. But it is not the properly memoized calculation of a that it so expensive, it is the poor memo lookup. I have manually memoized functions many times, manually, mostly in Perl and D. Both of those languages have hashes - you can make the arguments into a key, somehow, and build a hash that is indexed by the tree. The tree lookup slows down the entry to the function, but not too much, and furthermore, if you put a thousand times as much into the tree, it only slows the lookup a little. I don't see that here - it is so expensive to load the tree that it is by far cheaper to reread everything from disk. I am beginning to suspect that the M. implementation does not store things into trees or hashes or anything efficient for lookup, it stores them into maybe linked lists that are searched sequentially, or maybe some other scheme is used that slows the lookup to a crawl when you get more than a few thousand items in the list. The highest value of r was in the 70's, about half had a value of 10, so let's say that a was called about 20 times per tick. This means that every 100 main loop ticks, I added 2000 entries to the memoization table - and it started to noticeably slow down, while by the time I had added 10000 entries to the table, it was crawling, and at about 100000, it was slower than running the original code the first time. See, when I run the same function twice in a row, without looking anything up, it would seem to me that if things are completely memoized, that every possible value I am looking up is in the memo tables. But the reality is that the function runs slower. Six minutes vs. 1.5 minutes. I have not looked in the source, but I am gonna guess that the memoization is a linear lookup in an unordered list, or if it is using any sort of indexing, it gets overwhelmed by a few thousand input records, and it reverts to some sort of unordered overflow area that has to be searched. So I guess I would like to know if there is any work going on in the memoization area, and if it would be possible to add code so that the memoization would be cleared? Like, a foreign conjunction, or some trick? Can I define any function and clear memoization? (I tested that, does not work, even when the function you define also uses memoization.) These functions really need memoization, the routine has not yet finished the space over 1<:x<:100 without memoization, it has been an hour now that process explorer reports that J has been using one of my CPUs. I killed it before it finished. But then memoization is the biggest performance hit to my finished code.. -- Of course I can ride in the carpool lane, officer. Jesus is my constant companion. ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
