Bill's ideas are much more likely to be relevant than the following, but sticking with the "24 hour" bit, and even though i. or ~. is highly suspect, some thoughts:
A .0001% branch would normally show a lot of variance in when it is hit, even if the expected hit rate is once every 24 hours. For crashes that recur regularly at close to 24 hours to be caused by a rarely taken branch, it would have to be the kind of branch that has a 0% chance at program start and then grows to close to 1% over time, perhaps as a result of large data/value sizes.

For i. and ~., speculating uninformedly, perhaps there is a data size that causes everything to miss the L3 cache and results in a large jump in timings.

----- Original Message -----
From: bill lam <[email protected]>
To: Programming forum <[email protected]>
Cc:
Sent: Tuesday, August 19, 2014 10:35:41 AM
Subject: Re: [Jprogramming] stalled J processes

I hoped you were not doing ~. or i. on floating point numbers, the comparison tolerance can beat you even on another faster machine.

On Aug 19, 2014 10:26 PM, "Raul Miller" <[email protected]> wrote:
> I am doing intensive calculation.
>
> Here is what top says about this j process:
>
>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
> 26894 ubuntu    20   0 9846m 9.1g 2276 R 100.0 30.9 597:01.69 ijconsole
>
> And actually... J has not been running for 597 hours. The machine has only been running for 3 days. So I can see something is wrong outside J itself, and perhaps I need to switch to a different machine. (I wish there were a way of reporting the machine as having signs of being defective, but I don't know how to do that.)
>
> Still, at 9 gigs of ram, it's not even close to running out of memory (this is an amazon ec2 m3.2xlarge). But I guess there's a reason these machine are cheap.
>
> Thanks,
>
> --
> Raul
>
> On Tue, Aug 19, 2014 at 10:17 AM, bill lam wrote:
> > J seldom stalled for me unless memory full or doing intensive computation. Your trace showed it run = i. or ~. that may take a long time to completion. You may try setting a much lower value for memory limit and execution time limit to force it break sooner.
> >
> > Tue, 19 Aug 2014, Raul Miller wrote:
> > > I am using require 'task' which I believe uses 15!:0.
> > >
> > > I believe I have switched to using 2!:0 for all my external process needs.
> > >
> > > And, of course, the machine is hooked to the internet, which is "unsafe" and relies on people being well behaved or something approximating that.
> > >
> > > So, yes, I am "doing something unsafe".
> > >
> > > As for unlikely... I've probably used 25+ years of cpu time on this project. Unlikely is approximately the same thing as inevitable in my book.
> > >
> > > None of which really helps isolate this particular problem.
> > >
> > > That said, I can try removing the require 'task' bit. But keep in mind that an experiment costs on the order of 24 hours (and the price of a meal) so if there are other things that seem like "no-brainers" I guess I really ought to consider them.
> > >
> > > Thanks,
> > >
> > > --
> > > Raul
> > >
> > > On Tue, Aug 19, 2014 at 9:46 AM, Joe Bogner wrote:
> > > > Does it do anything 'unsafe'? Dynamic memory allocation like memw or 15!:0? I've had crashes, not stalls. The only stalls I've had are on windows due to deadlocks. As far as I can remember, J is single threaded so a deadlock seems unlikely unless you have logic that is waiting on some other resource (socket, file, etc)
> > > >
> > > > Here's a stackoverflow post that might help with troubleshooting:
> > > > http://stackoverflow.com/questions/7785692/program-stalls-during-long-runs
> > > >
> > > > On Tue, Aug 19, 2014 at 9:39 AM, Raul Miller wrote:
> > > > > I have a J program that keeps stalling. My impression is that it has been stalling in a random location, but I might be wrong about that.
> > > > >
> > > > > (1) It takes almost 24 hours to get to the point where it stalls, so so far my tests have been few.
> > > > >
> > > > > (2) When it stalls, it ignores signals for attention, so debugging is hard.
> > > > >
> > > > > (3) I have had other problems (for example with ioerrors), which also makes this difficult to isolate.
> > > > >
> > > > > (4) Here's a stack trace of my current example of this flaw:
> > > > >
> > > > > #0 0x00007fb2da462cf9 in ?? () from /usr/lib/x86_64-linux-gnu/libj.so.8.0.2
> > > > > #1 0x00007fb2da4626b6 in jtequ () from /usr/lib/x86_64-linux-gnu/libj.so.8.0.2
> > > > > #2 0x00007fb2da5278bf in ?? () from /usr/lib/x86_64-linux-gnu/libj.so.8.0.2
> > > > > #3 0x00007fb2da514c39 in jtindexofsub () from /usr/lib/x86_64-linux-gnu/libj.so.8.0.2
> > > > > #4 0x00007fb2da5165e2 in jtnub () from /usr/lib/x86_64-linux-gnu/libj.so.8.0.2
> > > > >
> > > > > I am looking for a way of resolving this issue (quickly, because other people are waiting on me).
> > > > >
> > > > > So far, the best I can think of is that I need to go through a copy of J source and remove every potential flaw I can find - starting with the implementation of jtequ.
> > > > >
> > > > > But perhaps also there are things I can look for in gdb? Unfortunately, my knowledge of intel machine language is woefully incomplete, so I do not know specifically what I should be looking for.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > --
> > > > > Raul
> > > > > ----------------------------------------------------------------------
> > > > > For information about J forums see http://www.jsoftware.com/forums.htm
> >
> > --
> > regards,
> > ====================================================
> > GPG key 1024D/4434BAB3 2008-08-24
> > gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
> > gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3
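
P.S. To put rough numbers on the variance point at the top of this message, here is a minimal Python sketch. It assumes the rare branch has a constant, memoryless chance of being hit, so that the time to the first hit is exponentially distributed with a 24 hour mean. The 22-26 hour window, the three runs, and the function name are my own illustrative choices, not anything measured from Raul's program.

```python
import math

MEAN_HOURS = 24.0  # assumed mean time until the rare branch is first hit


def prob_first_hit_between(lo_hours, hi_hours, mean_hours=MEAN_HOURS):
    """P(lo <= T <= hi) when T is exponential with the given mean.

    Uses the survival function P(T > t) = exp(-t / mean).
    """
    return math.exp(-lo_hours / mean_hours) - math.exp(-hi_hours / mean_hours)


# For an exponential distribution the standard deviation equals the mean,
# so a "once per 24 hours" branch has a 24 hour spread in when it fires.
std_hours = MEAN_HOURS

# Chance that a single run stalls "near 24 hours" (hypothetical window).
p_window = prob_first_hit_between(22.0, 26.0)

# Chance that three independent runs (hypothetical count) all do so.
p_three_runs = p_window ** 3

print(f"std of time to first hit: {std_hours:.0f} h")
print(f"P(first hit in 22-26 h): {p_window:.3f}")
print(f"P(three runs all in 22-26 h): {p_three_runs:.5f}")
```

Under those assumptions a single run lands in the window only about 6% of the time, and several runs in a row doing so is far less likely, which is why a branch whose probability grows with data size seems a better fit for regular near-24-hour stalls.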
