On Tue, Oct 18, 2011 at 12:24, <[email protected]> wrote:
> Send Programming mailing list submissions to > [email protected] > > To subscribe or unsubscribe via the World Wide Web, visit > http://jsoftware.com/cgi-bin/mailman/listinfo/programming > or, via email, send a message with subject or body 'help' to > [email protected] > > You can reach the person managing the list at > [email protected] > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Programming digest..." > > > Today's Topics: > > 1. memoization good and bad for performance, can you flush it? > (Nick Simicich) > 2. Re: memoization good and bad for performance, can you flush > it? (Raul Miller) > 3. Re: Repeated application of a verb (Raul Miller) > 4. Exact surds (David Vaughan) > 5. Re: Exact surds (Raul Miller) > 6. Re: Exact surds (David Vaughan) > 7. Re: Exact surds (Raul Miller) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 18 Oct 2011 08:43:47 -0400 > From: Nick Simicich <[email protected]> > Subject: [Jprogramming] memoization good and bad for performance, can > you flush it? > To: Programming forum <[email protected]> > Message-ID: > <CA+2d-ZwBSDyzvm2eEpcai_DttU5rGf=wd-svyz-x6vyot1a...@mail.gmail.com > > > Content-Type: text/plain; charset=ISO-8859-1 > > I see that J allowed one to do automatic memoization by coding a M. in the > header. I have used it a couple times. I was working on euler 66, which > is > the one where you have to solve for the lowest solution for non-square > values of D between 1 and 1000, and you pick the largest x and the answer > is > the D associated with the largest minimum x. (ignoring the implicit > solution > where x=1 and y = 0). > > x^2 - Dy^2 = 1 > > So I dug around on the net and found that these are called Pell Equations, > and they are solved by deriving the generalized periodic continued fraction > for the square root of D. Mathworld had a page that listed a specific > scheme for doing the job, there were a bunch of recursive piecewise > functions that were easily restatable in J. > > http://mathworld.wolfram.com/PellEquation.html > > > I wrote the program, and when it ran successfully the first time, it took > 15 > minutes to run to completion (bugs were mostly related to my failure to use > extended arithmetic in all needed places). Because the functions that > calculate > > Then I started working on the program. Remember that there is a one minute > guideline in Euler. I also noted that many people had reported very fast > times for this problem using approximately the same scheme. First I made a > new version with a bunch of stuff stripped out. I accepted order of > magnitude results as good enough. I stopped calculating q (y) and checking > the equation. Then I tried reordering statements, changing if.s to > select.s. > > I got the run time down to 1.5 minutes, but, after one small change, I > forgot to reload the code and my next run took six minutes. > > I began to suspect that the programs were running slower and slower the > longer that they were run and I began to suspect the memoization. > > Now, make no mistake, these functions need memoization. Without > memoization, running d from 1 to 100 takes many minutes. I started a test > a > couple minutes ago, and it has not completed as I write this, while with > memoization, I was able to get this same stuff to run from 1 to 1000 in 1.5 > minutes. > > But by reloading my code (which I assumes clears memoization) every 100 > ticks of the main loop, I could get the code to finish in a couple seconds > for the fast version and under 15 seconds for the slow version. I actually > stuck a load'file' to reload the file the scripts were from in the main > loop, to run every hundred ticks of the main loop. In fact, the code runs > slightly faster when I reload it from the file every 50 ticks of the main > loop. I assume that reading from the O/S is very expensive. But it is a > win > relative to searching 1000 entries in the memoization table 1000 times. > > If I could, I'd clear the memoization every cycle or two of the main loop. > x is d, passed to every routine, and y is the level of the function, they > recur over each other down to where most of them originate as 1, 0 or the > floor of the square root of D. This is an example of three of the > routines, > they somehow work together to calculate the a term, which is used to > calculate r. r is one less than the first term of a which is double the > value of the zero term and r, possibly transformed, tells you which p and q > (lower case, not these) terms to use for the minimum values for x and y. > As > you can see, it calls itself at level zero, which just takes the floor of > the square root of the left term, and two other routines, that end up > calling themselves and each other and a again, descending in value of the > right argument. Lots of calls to calculate, say, a_10 since you actually > need (for this function) to calculate a i.r without knowing r in advance. > That actually makes good use of memoization (I presume) > > e66a =: dyad define M. > if. y = 0 do. x: <.@%: x return. end. > x: <. ((x e66a 0)+(x e66P y))% (x e66Q y) > ) > e66P =: dyad define M. > if. y = 0 do. x: 0 return. end. if. y = 1 do. x e66a 0 return. end. > x: ((x e66a <:y)*(x e66Q <:y))-(x e66P <:y) > ) > e66Q =: dyad define M. > if. y = 0 do. x: 1 return. end. if. y = 1 do. x: x - *: x e66a 0 return. > end. > x: (x-*: x e66P y)%(x e66Q <: y) > ) > > The "fast" version uses selects, no significant difference. The point is > that once I am done making calls about 5 e66a"0 i.10 I will never make > another call with 5 as the left argument. and it seems that the function > calls slow way way way down. First time from 1-1000, the "fast" version > ran > in 1.5 minutes (the unmemoized version has not finished >:i.100). The > second time it ran in six minutes. Just to make myself happy, I stuck a > "load" statement in my main routine to be run every 100 times through the > loop. The "fast version" ran in 6 seconds. The slow version that used > extended arithmetic for everything, that I had never seen run in under > three > minutes over the space 1<:x<:1000 ran in 12 seconds. What I was saving by > avoiding arithmetic was another thousand trips through the memo lists. > > Of the 1.5 minutes my routine took to run, it seems that 96% of the time I > was looking through memoized stuff that I never wanted to see again. The > only possible thing I can imagine that the load of the same code is doing > is > getting the compiler to clear the memoization tables. > > Calculating a is so expensive that it makes a noticable difference if I > calculate the terms in a while loop, stopping on the exact term that is the > last one needed, rather than calculating them 3 or 5 at a time. But it is > not the properly memoized calculation of a that it so expensive, it is the > poor memo lookup. > > > I have manually memoized functions many times, manually, mostly in Perl and > D. Both of those languages have hashes - you can make the arguments into a > key, somehow, and build a hash that is indexed by the tree. The tree > lookup > slows down the entry to the function, but not too much, and furthermore, if > you put a thousand times as much into the tree, it only slows the lookup a > little. I don't see that here - it is so expensive to load the tree that > it > is by far cheaper to reread everything from disk. > > I am beginning to suspect that the M. implementation does not store things > into trees or hashes or anything efficient for lookup, it stores them into > maybe linked lists that are searched sequentially, or maybe some other > scheme is used that slows the lookup to a crawl when you get more than a > few > thousand items in the list. The highest value of r was in the 70's, about > half had a value of 10, so let's say that a was called about 20 times per > tick. This means that every 100 main loop ticks, I added 2000 entries to > the memoization table - and it started to noticeably slow down, while by > the > time I had added 10000 entries to the table, it was crawling, and at about > 100000, it was slower than running the original code the first time. > > See, when I run the same function twice in a row, without looking anything > up, it would seem to me that if things are completely memoized, that every > possible value I am looking up is in the memo tables. But the reality is > that the function runs slower. Six minutes vs. 1.5 minutes. > > I have not looked in the source, but I am gonna guess that the memoization > is a linear lookup in an unordered list, or if it is using any sort of > indexing, it gets overwhelmed by a few thousand input records, and it > reverts to some sort of unordered overflow area that has to be searched. > > So I guess I would like to know if there is any work going on in the > memoization area, and if it would be possible to add code so that the > memoization would be cleared? Like, a foreign conjunction, or some trick? > Can I define any function and clear memoization? (I tested that, does not > work, even when the function you define also uses memoization.) These > functions really need memoization, the routine has not yet finished the > space over 1<:x<:100 without memoization, it has been an hour now that > process explorer reports that J has been using one of my CPUs. I killed it > before it finished. But then memoization is the biggest performance hit to > my finished code.. > > > -- > Of course I can ride in the carpool lane, officer. Jesus is my constant > companion. > > > ------------------------------ > > Message: 2 > Date: Tue, 18 Oct 2011 09:11:01 -0400 > From: Raul Miller <[email protected]> > Subject: Re: [Jprogramming] memoization good and bad for performance, > can you flush it? > To: Programming forum <[email protected]> > Message-ID: > <CAD2jOU9QvcgS15cD=6a4slobkzxmzw-_cerwcq4gqv8xq46...@mail.gmail.com > > > Content-Type: text/plain; charset=ISO-8859-1 > > On Tue, Oct 18, 2011 at 8:43 AM, Nick Simicich <[email protected]> wrote: > > Can I define any function and clear memoization? (I tested that, does not > > work, even when the function you define also uses memoization.) > > I am surprised that function redefinition does not clear the > memoization, but maybe the memo is keyed by the function definition? > > Anyways, I would try updating the definition with an extraneous > constant. Since J does not do data flow analysis, that should be > sufficient. > > In other words, if you have something like: > F=: f&([ 23&]) M. > > then changing the number should get you a fresh set of memo buckets > (if my hypothesis about M. implementation is correct). > > -- > Raul > > > ------------------------------ > > Message: 3 > Date: Tue, 18 Oct 2011 10:26:33 -0400 > From: Raul Miller <[email protected]> > Subject: Re: [Jprogramming] Repeated application of a verb > To: Programming forum <[email protected]> > Message-ID: > <CAD2jOU-5FmzTWCDMuVSuJhJ9=wrMvxwEEeO=OP+XvqXnup=u...@mail.gmail.com > > > Content-Type: text/plain; charset=ISO-8859-1 > > I was asked to explain a statement I had made earlier: > > On Sat, Oct 8, 2011 at 8:04 AM, Raul Miller <[email protected]> wrote: > > Depending on what specifically you are trying to do, ^: may or may not > > be the right thing to use. > > > > The issue is that ^: assumes that it is dealing with a mathematical > > function. ?If its function gives the same result twice in a row, it > > will stop right there. > > > > This is how ^:(test)^:_ works something like a while loop -- when > > (test) returns false, the inner ^: becomes an identity function and > > the outer ^: knows that it's time to stop. > > This is probably easiest to clarify using examples. > > <.@-:^:_]20 > 0 > > If I repeatedly divide a number in half and find the next lower > integer, I get 0. I can see the intermediate results if I replace ^:_ > with ^:a: > > <.@-:^:a:]20 > 20 10 5 2 1 0 > > This works because ^: stops when it sees the same value twice in a row. > > Now, for contrast, let's consider: > > ? bind 3^:a:]0 > 0 1 2 0 1 0 1 0 > ? bind 3^:a:]0 > 0 1 2 1 > ? bind 3^:a:]0 > 0 > ? bind 3^:a:]0 > 0 2 0 2 0 2 1 > > Here, ? bind 3 is not a function. It arbitrarily picks a result which > will be 0, 1 or 2. So this mechanism stops, arbitrarily, when ? has > given us the same value twice in a row. > > Or, here's a different example: > > require'misc' > prompt bind 'type something: '^:a: '' > type something: this is something > type something: so is this > type something: and this > type something: and this > > this is something > so is this > and this > > Here, when I entered the same thing twice in a row, J decided that the > "function" I was using was done -- after all, it gave the same result > twice in a row. > > So, anyways, this means that the mechanism > F ^:test^:_ > will terminate prematurely if F is not a function and it produces the > same result twice in a row, even when the result of test was a true > value. > > If F was a function and it produced the same result twice in a row, > this would not matter: If F was a function and it produced the same > result twice in a row, it would produce the same result an infinite > number of times in a row. So either the result is correct or the > result is irrelevant. But if you are looking for F to do > "non-functional things" you might be upset if your loop terminated > prematurely (and the simple workaround is to use some other mechanism > to implement your "while" loop -- but you can also arrange so that F > never produces the same result twice in a row). > > I hope this makes sense. > > -- > Raul > > > ------------------------------ > > Message: 4 > Date: Tue, 18 Oct 2011 16:30:17 +0100 > From: David Vaughan <[email protected]> > Subject: [Jprogramming] Exact surds > To: Programming forum <[email protected]> > Message-ID: <[email protected]> > Content-Type: text/plain; charset=us-ascii > > Can J represent surds exactly instead of reverting to floating point? It > would be more convenient if I could do that rather than storing the squared > version. > > ------------------------------ > > Message: 5 > Date: Tue, 18 Oct 2011 11:50:15 -0400 > From: Raul Miller <[email protected]> > Subject: Re: [Jprogramming] Exact surds > To: Programming forum <[email protected]> > Message-ID: > <cad2jou-nsgawxuqorqm9em4hrj2msb9s955qr9f1cxipnwx...@mail.gmail.com > > > Content-Type: text/plain; charset=ISO-8859-1 > > On Tue, Oct 18, 2011 at 11:30 AM, David Vaughan > <[email protected]> wrote: > > Can J represent surds exactly instead of reverting to floating point? It > would be more convenient if I could do that rather than storing the squared > version. > > Yes, but by generating a verb which would generate the floating point > value and not as a number. > > cubeRootOf5=: %:&5 bind 3 > > Of course you can manipulate the resulting numbers but those results > are still floating point: > > (cubeRootOf5 ^ 3:) '' > 5 > 5-(cubeRootOf5 ^ 3:) '' > 8.88178e_16 > > That said, in some cases, J can clean up floating point discrepancies > > 5-x:(cubeRootOf5 ^ 3:) '' > 0 > > but that's from direct manipulation of the values and does not care > about delayed evaluation: > > 5-x:((3 %:5) ^ 3:) '' > 0 > > Anyways, J implements the data structures to support symbolic > manipulation but it does not have a particularly large library of > prebuilt symbolic manipulation routines. > > FYI, > > -- > Raul > > > ------------------------------ > > Message: 6 > Date: Tue, 18 Oct 2011 17:09:08 +0100 > From: David Vaughan <[email protected]> > Subject: Re: [Jprogramming] Exact surds > To: Programming forum <[email protected]> > Message-ID: <[email protected]> > Content-Type: text/plain; charset=us-ascii > > I'm hoping to write a verb to compute the continued fraction expansion of a > square root. I've done it in Java though there are accuracy issues with > floating points and speed issues with using exact numbers (you end up > multiplying massive numbers in some cases). Is J equipped to deal with these > problems or should I be looking for an alternative method entirely? > > On 18 Oct 2011, at 16:50, Raul Miller wrote: > > > On Tue, Oct 18, 2011 at 11:30 AM, David Vaughan > > <[email protected]> wrote: > >> Can J represent surds exactly instead of reverting to floating point? It > would be more convenient if I could do that rather than storing the squared > version. > > > > Yes, but by generating a verb which would generate the floating point > > value and not as a number. > > > > cubeRootOf5=: %:&5 bind 3 > > > > Of course you can manipulate the resulting numbers but those results > > are still floating point: > > > > (cubeRootOf5 ^ 3:) '' > > 5 > > 5-(cubeRootOf5 ^ 3:) '' > > 8.88178e_16 > > > > That said, in some cases, J can clean up floating point discrepancies > > > > 5-x:(cubeRootOf5 ^ 3:) '' > > 0 > > > > but that's from direct manipulation of the values and does not care > > about delayed evaluation: > > > > 5-x:((3 %:5) ^ 3:) '' > > 0 > > > > Anyways, J implements the data structures to support symbolic > > manipulation but it does not have a particularly large library of > > prebuilt symbolic manipulation routines. > > > > FYI, > > > > -- > > Raul > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm > > > > ------------------------------ > > Message: 7 > Date: Tue, 18 Oct 2011 12:23:11 -0400 > From: Raul Miller <[email protected]> > Subject: Re: [Jprogramming] Exact surds > To: Programming forum <[email protected]> > Message-ID: > <CAD2jOU_yZU4cn-g_ir1F6Vk0ipohQ_1o6Zm4JSFFV=edfmx...@mail.gmail.com > > > Content-Type: text/plain; charset=ISO-8859-1 > > The best I can think of at the moment is using x: on the floating > point value for an initial value and then newton's method to gain the > extra precision you want. > > -- > Raul > > On Tue, Oct 18, 2011 at 12:09 PM, David Vaughan > <[email protected]> wrote: > > I'm hoping to write a verb to compute the continued fraction expansion of > a square root. I've done it in Java though there are accuracy issues with > floating points and speed issues with using exact numbers (you end up > multiplying massive numbers in some cases). Is J equipped to deal with these > problems or should I be looking for an alternative method entirely? > > > > On 18 Oct 2011, at 16:50, Raul Miller wrote: > > > >> On Tue, Oct 18, 2011 at 11:30 AM, David Vaughan > >> <[email protected]> wrote: > >>> Can J represent surds exactly instead of reverting to floating point? > It would be more convenient if I could do that rather than storing the > squared version. > >> > >> Yes, but by generating a verb which would generate the floating point > >> value and not as a number. > >> > >> cubeRootOf5=: %:&5 bind 3 > >> > >> Of course you can manipulate the resulting numbers but those results > >> are still floating point: > >> > >> ? (cubeRootOf5 ^ 3:) '' > >> 5 > >> ? 5-(cubeRootOf5 ^ 3:) '' > >> 8.88178e_16 > >> > >> That said, in some cases, J can clean up floating point discrepancies > >> > >> ? 5-x:(cubeRootOf5 ^ 3:) '' > >> 0 > >> > >> but that's from direct manipulation of the values and does not care > >> about delayed evaluation: > >> > >> ? 5-x:((3 %:5) ^ 3:) '' > >> 0 > >> > >> Anyways, J implements the data structures to support symbolic > >> manipulation but it does not have a particularly large library of > >> prebuilt symbolic manipulation routines. > >> > >> FYI, > >> > >> -- > >> Raul > >> ---------------------------------------------------------------------- > >> For information about J forums see http://www.jsoftware.com/forums.htm > > > > ---------------------------------------------------------------------- > > For information about J forums see http://www.jsoftware.com/forums.htm > > > > > ------------------------------ > > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm > > End of Programming Digest, Vol 73, Issue 45 > ******************************************* > -- C. Andrew Shepp Shepp & Associates LLC www.linkedin.com/in/andrewshepp Mail and attachments are private communications only voice +1 314 3243139 ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
