Re: [R] R seems to stall after several hours on a long series of analyses... where to start?
Hi,

I saw something similar when I had R look in a file every half minute to see whether there was a request to do something and, if so, do that something and empty the file. (This was my way of testing whether I could do an interactive web page; somehow I managed to get the web page to write the requests to the file that R would look in, and R would update a graph that was visible on that same web page.) Anyway, this ran smoothly for a while (40 minutes, I think), then it just stopped. When I examined the situation, R suddenly woke up and continued its task as if nothing had happened (which was quite correct). My amateur interpretation was that the system put R to sleep, since it appeared to be inactive according to the system. When I switched to R, it became interactive and was given CPU time again. Maybe this gives some inspiration for solving the problem. The system was Windows NT, R version 1.8, I think.

Kind regards,
Sixten

David L. Van Brunt, Ph.D. [EMAIL PROTECTED] wrote on 2005-11-07 16:09:
> Great suggestions, all. I do have a timer in there, and it looks like the
> time to complete a loop is not increasing as it goes.
> [rest of the quoted thread snipped; the full messages appear below]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
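The polling setup Sixten describes might look roughly like this; the half-minute interval is from his description, while the file name and request handling are an illustrative sketch, not his actual code:

```r
# Sketch of the polling loop: check a request file every half minute;
# if it is non-empty, handle the request and then empty the file.
req.file <- "requests.txt"                    # illustrative name
repeat {
    if (file.exists(req.file) && file.info(req.file)$size > 0) {
        req <- readLines(req.file)            # read the pending request(s)
        # ... do the requested work and update the graph here ...
        writeLines(character(0), req.file)    # empty the file
    }
    Sys.sleep(30)                             # sleep half a minute between checks
}
```

Note that the loop runs forever by design; a process like this that spends most of its time sleeping is exactly the kind the OS may deprioritize as "inactive".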
[R] R seems to stall after several hours on a long series of analyses... where to start?
Not sure where to even start on this; I'm hoping there's some debugging I can do. I have a loop that cycles through several different data sets (same structure, different info), performing randomForest growth and predictions and saving out the predictions for later study. I get about 5 hours in (9% of the planned iterations... yikes!) and R just freezes.

This happens in both interactive and batch-mode execution (I can see from the .Rout file that it gets about 9% through in batch mode, and about 6% in interactive mode... does that suggest memory problems?). I'm thinking of re-executing this same code on a different platform to see if that's the issue (currently using OS X). Any other suggestions on where to look, or what to try, to get more information? Sorry to be so vague... it's a LOT of code, and it runs fine without error for many iterations, so I didn't think the problem was syntax.

--
David L. Van Brunt, Ph.D.
mailto:[EMAIL PROTECTED]
Re: [R] R seems to stall after several hours on a long series of analyses... where to start?
David L. Van Brunt, Ph.D. wrote:
> Not sure where to even start on this; I'm hoping there's some debugging I
> can do. [rest of the quoted message snipped]

You could try running an external debugger to see whether R appears to be stuck in a loop. I don't know what the OS X debuggers are like, but on Windows you can see routine names even without debugging information. Recompiling R with debugging info will make the results a lot easier to interpret.

Duncan Murdoch
Re: [R] R seems to stall after several hours on a long series of analyses... where to start?
Great suggestions, all. I do have a timer in there, and it looks like the time to complete a loop is not increasing as it goes. From your comments, I take it that suggests there is not a memory leak.

I could try scripting the loop from the shell rather than R to see if that works, but I will do that as a last resort, as it will require a good deal of re-writing (the loop follows some setup code that builds a pretty large data set... the loop then slaps several new columns on a copy of that data set and analyses that). I'll still try the other platform as well, to see if the same problem occurs there.

On 11/7/05, jim holtman [EMAIL PROTECTED] wrote:

> Here is some code that I use to track the progress of my scripts. It
> prints out the total CPU time and the memory that is being used. You call
> it with my.stats("message") to print out "message" on the console. Also,
> have you profiled your code to see where the time is being spent? Can you
> break it up into multiple runs so that you can start with a fresh version
> of memory?
>
> ==script===
> my.stats <- local({
>     # local variables to hold the last times
>     # first two are the elapsed and CPU times from the last report
>     lastTime <- lastCPU <- 0
>     function(text = "stats", reset = FALSE) {
>         procTime <- proc.time()[1:3]   # get current metrics
>         if (reset) {                   # mark timing from this point
>             lastTime <<- procTime[3]
>             lastCPU  <<- procTime[1] + procTime[2]
>         } else {
>             cat(text, "-", sys.call(sys.parent())[[1]], ":",
>                 round((procTime[1] + procTime[2]) - lastCPU, 1),
>                 round(procTime[3] - lastTime, 1), ":", procTime, ":",
>                 round(memory.size() / 2^20, 1), "MB\n")
>             invisible(flush.console()) # force a write to the console
>         }
>     }
> })
>
> ==here is some sample output===
> my.stats(reset = TRUE)   # reset counters
> x <- runif(1e6)          # generate 1M random numbers
> my.stats("random")
> random - my.stats : 0.3 31.8 96.17 11.7 230474.9 : 69.5 MB
> y <- x * x + sqrt(x)     # just some calculation
> my.stats("calc")
> calc - my.stats : 0.7 71.2 96.52 11.74 230514.3 : 92.4 MB
>
> You can see that memory is growing. The first number is the CPU time and
> the second is the elapsed time.
>
> HTH
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 247 0281
>
> What is the problem you are trying to solve?

--
David L. Van Brunt, Ph.D.
mailto:[EMAIL PROTECTED]
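One caveat for David's case: memory.size() exists only in R for Windows, so Jim's function would error on OS X. A minimal substitute can be built around gc(), whose returned matrix reports current usage in Mb; the function name mem.used below is illustrative, not from the thread:

```r
# Cross-platform memory check: gc() returns a matrix whose second column
# is the memory currently in use, in Mb, for Ncells and Vcells.
mem.used <- function(label = "mem") {
    g <- gc()                       # run a garbage collection, get usage
    cat(label, ":", round(sum(g[, 2]), 1), "Mb in use after gc\n")
}

mem.used("before")
x <- runif(1e6)                     # allocate roughly 8 Mb of doubles
mem.used("after")
```

Logging this once per loop iteration would show directly whether memory use climbs as the run progresses.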
Re: [R] R seems to stall after several hours on a long series of analyses... where to start?
I'll try the memory stats function first and see what I get. I do have top on OS X, so I'll try watching that more closely as well. Great suggestions here.

On 11/7/05, Paul Gilbert [EMAIL PROTECTED] wrote:

> David L. Van Brunt, Ph.D. wrote:
> > Great suggestions, all. I do have a timer in there, and it looks like
> > the time to complete a loop is not increasing as it goes. From your
> > comments, I take it that suggests there is not a memory leak.
> >
> > I could try scripting the loop from the shell rather than R, to see if
> > that works, but will do that as a last resort as it will require a good
> > deal of re-writing (the loop follows some setup code that builds a
> > pretty large data set... the loop then slaps several new columns on a
> > copy of that data set, and analyses that...)
>
> You may find it is better to make a new array for these columns. R tends
> to make copies when you do this sort of thing, with the result that you
> have multiple copies of your original data set. Also, define the array to
> be the final size, with NA values, rather than appending rows or columns.
>
> > I'll still try the other platform as well, see if the same problem
> > occurs there.
>
> I'm curious to hear what you find. I doubt you will find a big
> difference between platforms, but you will find a big difference on a
> machine with more physical memory.
>
> Paul
>
> [rest of the quoted thread snipped]

--
David L. Van Brunt, Ph.D.
mailto:[EMAIL PROTECTED]
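Paul's advice, define the array at its final size rather than appending columns, can be sketched as follows; the sizes and variable names are illustrative, not taken from David's actual code:

```r
n.rows <- 1000
n.cols <- 50

# Appending columns (what Paul warns against): each cbind() copies the
# whole object, so memory churn and run time grow with every iteration.
grown <- matrix(numeric(0), nrow = n.rows, ncol = 0)
for (i in 1:n.cols) grown <- cbind(grown, rnorm(n.rows))

# Preallocating (what Paul suggests): create the full-size array once,
# filled with NA, and write each column in place.
prealloc <- matrix(NA_real_, nrow = n.rows, ncol = n.cols)
for (i in 1:n.cols) prealloc[, i] <- rnorm(n.rows)
```

The same idea applies to the loop in David's script: putting the new columns in a separate preallocated matrix, instead of binding them onto a copy of the large data set, avoids holding multiple copies of the original data in memory.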