Re: [R] runtime on ising model
Mike, I'm not sure what you mean about removing foo, but I think the method is sound for diagnosing a program issue and the results speak for themselves.

I did invert my if statement at the suggestion of a CS professor (who also suggested recoding in C, but I'm in an applied math program and haven't had the time to take programming courses, which I know would be helpful). Anyway, with the statement as:

    if( !(k %in% c(10^4,10^5,10^6,10^7)) ){
      # do nothing
    } else {
      q <- q+1
      Out[[q]] <- M
    }

run times were back to around 20 minutes.

So as best I can tell, something happens in the if statement that causes the computer to work ahead, as the professor suggests. I'm no expert on R (and have no desire to try looking at the R source code; it would only confuse me), but if anyone can offer guidance on how the if statement works (Does R try to work ahead? Under what conditions does it do so, so that I can try to exploit this behavior?) I would greatly appreciate it. If it would require too much knowledge of the computer system to understand, I doubt I would be able to make use of it, but maybe someone else could benefit.

On Tue, Oct 26, 2010 at 3:24 PM, Mike Marchywka marchy...@hotmail.com wrote:

> I haven't looked at this since some passing interest in magnetics decades ago, something about 8-tracks and cassettes, but you have to be careful with conclusions like "I removed foo and the problem went away, therefore the problem was foo." Performance issues are often caused by memory, not CPU limitations. Removing anything with a big memory footprint could speed things up. IO can be a real bottleneck. If you are talking about things on minute timescales, look at task manager and see if you are even CPU limited. Look for page faults or IO etc.
>
> If you really need performance and have a task which is relatively simple, don't ignore C++ as a way to generate data points and then import these into R for analysis. In short, just because you are focusing on math, it doesn't mean the computer is limited by that.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
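On the question of whether R "works ahead": the following is a minimal sketch, not taken from the thread's code, that checks when the body of an if is actually evaluated. The names `evaluations`, `expensive_body`, and the small `checkpoints` values are hypothetical and chosen only so the example runs quickly.

```r
# Hypothetical illustration: count how often the if-body actually runs.
evaluations <- 0L
expensive_body <- function() {
  evaluations <<- evaluations + 1L   # side effect records each evaluation
  invisible(NULL)
}

checkpoints <- c(10, 100, 1000)      # small stand-ins for 10^4, ..., 10^7
for (k in 1:1000) {
  if (k %in% checkpoints) {
    expensive_body()                 # reached only when the condition is TRUE
  }
}

evaluations                          # one evaluation per checkpoint
```

If the body were evaluated speculatively, the counter would grow on every iteration; here it only grows on the three steps where the condition holds.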
Re: [R] runtime on ising model
On Oct 28, 2010, at 11:52 AM, Michael D wrote:

> I did invert my if statement at the suggestion of a CS professor (who also suggested recoding in C, but I'm in an applied math program and haven't had the time to take programming courses, which I know would be helpful). Anyway, with the statement as:
>
>     if( !(k %in% c(10^4,10^5,10^6,10^7)) ){
>       # do nothing
>     } else {
>       q <- q+1
>       Out[[q]] <- M
>     }
>
> run times were back to around 20 minutes.

Have you tried replacing all of those 10^x operations with their integer equivalents, c(10000L, 100000L, 1000000L)? Each time through the loop you are unnecessarily calling the ^ function 4 times. You could also omit the last one, 10^7, during testing, since M at the last iteration (k=10^7) would be the final value and you could just assign the state of M at the end. So we have eliminated 4*10^7 unnecessary ^ calls and 10^7 unnecessary comparisons. (The CS professor is perhaps used to having the C compiler do all thinking of this sort for him.)

--
David
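David's two suggestions (build the checkpoint values once, outside the loop, and store the final state after the loop instead of testing for it) could look roughly like this. It is only a sketch: `n_steps`, `state`, and `saved` are hypothetical stand-ins, and the step counts are shrunk so it runs instantly.

```r
# Toy sizes so the sketch runs quickly; the thread uses R = 10^7 steps.
n_steps <- 1000L
checkpoints <- c(10L, 100L)          # built once, outside the loop; final step omitted

state <- 0L                          # stand-in for the spin matrix M
saved <- list()
for (k in seq_len(n_steps)) {
  state <- state + 1L                # stand-in for one Metropolis update
  if (k %in% checkpoints) {
    saved[[length(saved) + 1L]] <- state
  }
}
saved[[length(saved) + 1L]] <- state # final state stored after the loop

unlist(saved)
```

The loop body no longer calls `^` at all, and the last checkpoint costs nothing per iteration because it is handled once after the loop finishes.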
Re: [R] runtime on ising model
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
Sent: Thursday, October 28, 2010 9:20 AM
To: Michael D
Cc: r-help@r-project.org
Subject: Re: [R] runtime on ising model

> On Oct 28, 2010, at 11:52 AM, Michael D wrote:
>
>> I did invert my if statement at the suggestion of a CS professor [...] Anyway, with the statement as:
>>
>>     if( !(k %in% c(10^4,10^5,10^6,10^7)) ){
>>       # do nothing
>>     } else {
>>       q <- q+1
>>       Out[[q]] <- M
>>     }
>>
>> run times were back to around 20 minutes.

Did that one change really make a difference? R does not evaluate anything in the if or else clauses of an if statement before evaluating the condition.

> Have you tried replacing all of those 10^x operations with their integer equivalents, c(10000L, 100000L, 1000000L)? Each time through the loop you are unnecessarily calling the ^ function 4 times. You could also omit the last one, 10^7, during testing, since M at the last iteration (k=10^7) would be the final value and you could just assign the state of M at the end. So we have eliminated 4*10^7 unnecessary ^ calls and 10^7 unnecessary comparisons. (The CS professor is perhaps used to having the C compiler do all thinking of this sort for him.)

%in% is a relatively expensive function. Use == if you can. E.g., compare the following 2 ways of stashing something at times 1e4, 1e5, and 1e6:

    # The definition of 'set' is missing from the archived message;
    # this value is assumed from the times stated above.
    set <- c(1e4, 1e5, 1e6)
    system.time({z <- integer()
      for(k in seq_len(1e6)) if(k %in% set) z[length(z)+1] <- k
      print(z)})
    [1]   10000  100000 1000000
       user  system elapsed
     46.790   0.023  46.844

    system.time({z <- integer()
      nextCheckPoint <- 10^4
      for(k in seq_len(1e6))
        if( k == nextCheckPoint ) {
          nextCheckPoint <- nextCheckPoint * 10
          z[length(z)+1] <- k
        }
      print(z)})
    [1]   10000  100000 1000000
       user  system elapsed
      4.529   0.013   4.545

With such a large number of iterations it pays to remove unneeded function calls in arithmetic expressions. R does not optimize them out; it is up to you to do that. E.g.,

    system.time(for(i in seq_len(1e6)) sign(pi)*(-1))
       user  system elapsed
      6.802   0.014   6.818
    system.time(for(i in seq_len(1e6)) -sign(pi))
       user  system elapsed
      3.896   0.011   3.911

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
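Applied to the storage step of the simulation, the single-comparison checkpoint pattern might be sketched as follows. This is an illustration under assumptions, not the poster's code: the 4x4 matrix, the random spin flip standing in for the Metropolis update, and the names `n_saves` and `next_checkpoint` are all hypothetical.

```r
set.seed(1)                                   # reproducible toy example
n_steps <- 1000L
M <- matrix(sample(c(-1, 1), 16, replace = TRUE), nrow = 4)

n_saves <- 3L                                 # steps 10, 100, 1000 in this toy run
Out <- vector("list", n_saves)                # preallocated storage for saved states
q <- 0L
next_checkpoint <- 10                         # first step at which to save

for (k in seq_len(n_steps)) {
  i <- sample.int(4L, 1L)
  j <- sample.int(4L, 1L)
  M[i, j] <- -M[i, j]                         # stand-in update: flip one spin
  if (k == next_checkpoint) {                 # one scalar comparison per step
    q <- q + 1L
    Out[[q]] <- M
    next_checkpoint <- next_checkpoint * 10   # advance to the next power of ten
  }
}
```

Preallocating `Out` with `vector("list", n_saves)` is an extra choice made here; it avoids growing the list, although with only a handful of saves the effect is likely negligible.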
Re: [R] runtime on ising model
On Oct 28, 2010, at 12:20 PM, David Winsemius wrote:

> Have you tried replacing all of those 10^x operations with their integer equivalents, c(10000L, 100000L, 1000000L)? Each time through the loop you are unnecessarily calling the ^ function 4 times. You could also omit the last one, 10^7, during testing, since M at the last iteration (k=10^7) would be the final value and you could just assign the state of M at the end. So we have eliminated 4*10^7 unnecessary ^ calls and 10^7 unnecessary comparisons.

Bill Dunlap's suggestion to use == instead of %in% cut the time to 1/3 of what it had been, even after the pre-calculation of the integer values (which only improved the looping times by 30%). The combination of the two with:

    if (k == 10000L | k == 100000L | k == 1000000L) { ... }

resulted in an improvement by a factor of 12.006/2.523, or 475%, for the interim checking and printing operation using Bill's test suite.

--
David
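Before swapping one condition for the other, it may be worth confirming that they select exactly the same steps. A minimal check, assuming the checkpoints are 10^4, 10^5, and 10^6 as in Bill's test (the names `via_in` and `via_equal` are made up for this sketch):

```r
set <- c(10000L, 100000L, 1000000L)
k <- seq_len(1000000L)                         # every step index up to 10^6

via_in    <- k %in% set                        # original membership test
via_equal <- k == 10000L | k == 100000L | k == 1000000L

identical(via_in, via_equal)                   # do both tests agree at every step?
which(via_equal)                               # the steps at which a save would occur
```

This is vectorised over all steps purely for checking; inside the simulation loop the comparison is still made on one scalar `k` at a time.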
Re: [R] runtime on ising model
Date: Thu, 28 Oct 2010 09:58:40 -0700
From: wdun...@tibco.com
To: dwinsem...@comcast.net; mike...@gmail.com
CC: r-help@r-project.org
Subject: Re: [R] runtime on ising model

>>> On Oct 28, 2010, at 11:52 AM, Michael D wrote:
>>>
>>> Mike, I'm not sure what you mean about removing foo but I think the method is sound in diagnosing a program issue and the results speak for themselves.

Agreed on the first part but not the second: empirical debugging rarely produces compelling results in isolation. As a collection of symptoms it is fine, but not conclusive. If you learn C++ you will find out about all kinds of things, like memory corruption, that never make sense :) Here, the big concern is issues with memory, as you never determined that you are CPU limited, although based on others' comments you likely are in any case.

>>> I did invert my if statement at the suggestion of a CS professor [...] Anyway, with the statement as:
>>>
>>>     if( !(k %in% c(10^4,10^5,10^6,10^7)) ){
>>>       # do nothing
>>>     } else {
>>>       q <- q+1
>>>       Out[[q]] <- M
>>>     }
>>>
>>> run times were back to around 20 minutes.
>
> Did that one change really make a difference? R does not evaluate anything in the if or else clauses of an if statement before evaluating the condition.

What is at issue here? That is, the OP claimed that inverting polarity sped things up, suggesting that the branch mattered. AFAIK he never actually proved which branch was taken. This could imply many things or nothing: one branch may be slow, or cause a page fault, or the test may fail fast but succeed slowly (testing a huge array for equality, for example).
[R] runtime on ising model
So I'm in a stochastic simulations class and I am having issues with the amount of time it takes to run the Ising model. I usually don't like to attach the code I'm running, since it will probably make me look like a fool, but I figure it's the best way I can find any bits where I can speed up run time.

As for the goals of the exercise: I need the state of the system at time=1, 10k, 100k, 1mill, and 10mill, and the percentage of vertices with positive spin at all t.

Just to be clear, I'm not expecting anyone to tell me how to program this model, because I know what I have works for this exercise, but it takes far too long to run and I'd like to speed it up by replacing slow operations wherever possible.

Thanks in advance for any help. Code follows:

    N <- 200   # Size of model
    R <- 10^7  # Runs
    T <- 1     # Temperature

    S <- matrix( sample(c(-1,1),N^2,replace=TRUE), nrow=N, ncol=N)
    M <- cbind(rep(0,N+2), rbind(rep(0,N), S, rep(0,N)), rep(0,N+2))

    q <- 1
    Out <- c()
    Out[[q]] <- M

    pos <- rep(0,R)
    pos[1] <- sum(M==1)

    for(k in 1:R){
      ## Pick random vertex
      U <- sample(0:(N^2-1),1)
      i <- floor(U/N) + 2
      j <- U%%(N) + 2

      ## Calculate Energy
      Nei <- c(M[i-1,j],M[i+1,j],M[i,j-1],M[i,j+1])
      Ei <- 2 * sum( M[i,j] != Nei)
      Ej <- 2 * sum( sign(M[i,j])*(-1) != Nei)

      ## Accept Criteria Check
      if( Ej < Ei ){
        M[i,j] <- sign(M[i,j])*(-1)
        pos[k+1] <- M[i,j]
      } else {
        if( runif(1) < exp(-1/T*(Ej-Ei)) ){
          M[i,j] <- sign(M[i,j])*(-1)
          pos[k+1] <- M[i,j]
        }
      }

      ## Store state at time 10^4, 10^5, 10^6, 10^7
      if( k %in% c(10^4,10^5,10^6,10^7) ){
        q <- q+1
        Out[[q]] <- M
      }
    }

    ## Output
    image(Out[[1]])
    image(Out[[2]])
    image(Out[[3]])
    image(Out[[4]])
    image(Out[[5]])
    plot( cumsum(pos)/N^2, pch='.')
Re: [R] runtime on ising model
On 10/26/2010 04:50 PM, Michael D wrote:

> So I'm in a stochastic simulations class and I am having issues with the amount of time it takes to run the Ising model.
> [...]
> Just to be clear, I'm not expecting anyone to tell me how to program this model, because I know what I have works for this exercise, but it takes far too long to run and I'd like to speed it up by replacing slow operations wherever possible.

Hi Michael,

One bottleneck is probably the sampling. If it doesn't grab too much memory, setting up a vector of the samples (maybe a million at a time if 10 million is too big; you might be able to rewrite your sample vector when you store the state) and using k (and an offset if you don't have one big vector) to index it will give you some speed.

Jim
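Jim's idea of drawing the random vertices in bulk and indexing them with k and an offset could be sketched like this. It is a rough illustration, not the poster's implementation: `chunk`, `U_chunk`, and `visited` are hypothetical names, the chunk size is arbitrary, and the run length is shortened so the example finishes quickly.

```r
N <- 200L                      # lattice size, as in the original post
n_steps <- 5000L               # toy run length; the thread uses 10^7
chunk <- 1000L                 # hypothetical chunk size (Jim suggests up to a million)

set.seed(42)
visited <- integer(n_steps)    # records the sampled site, for illustration only
for (k in seq_len(n_steps)) {
  offset <- (k - 1L) %% chunk              # position inside the current chunk
  if (offset == 0L) {
    # Refill: draw a whole chunk of site indices in one vectorised call.
    U_chunk <- sample.int(N^2, chunk, replace = TRUE) - 1L
  }
  U <- U_chunk[offset + 1L]
  i <- U %/% N + 2L                        # row in the zero-padded matrix
  j <- U %% N + 2L                         # column in the zero-padded matrix
  visited[k] <- U
}
```

Note that `replace = TRUE` is needed so each step is an independent draw, matching the one-at-a-time `sample()` call in the original loop. The random number stream will differ from the original code, so individual runs will not be reproduced step for step.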
Re: [R] runtime on ising model
I have an update on where the issue is coming from. I commented out the code for pos[k+1] <- M[i,j], the if statement for time = 10^4, 10^5, 10^6, 10^7, and the storage, and everything ran fast(er). Next I added back in the pos statements and runtimes were still good (around 20 minutes).

So I'm left with something causing problems in:

    ## Store state at time 10^4, 10^5, 10^6, 10^7
    if( k %in% c(10^4,10^5,10^6,10^7) ){
      q <- q+1
      Out[[q]] <- M
    }

Would there be any reason R is executing the statements inside the if before getting to the logical check? Maybe R is written to hope for the best outcome (TRUE) and will just throw out its work if the logic comes up FALSE? I guess I can always break the for loop up into four parts and store the state at the end of each, but that's an unsatisfying solution to me.

Jim, I like the suggestion of just pulling one big sample, but since I can get the runtimes under 30 minutes just by removing the storage piece, I doubt I would see any noticeable changes by pulling large sample vectors.

Thanks,
Michael

On Tue, Oct 26, 2010 at 6:22 AM, Jim Lemon j...@bitwrit.com.au wrote:

> Hi Michael,
> One bottleneck is probably the sampling. If it doesn't grab too much memory, setting up a vector of the samples (maybe a million at a time if 10 million is too big; you might be able to rewrite your sample vector when you store the state) and using k (and an offset if you don't have one big vector) to index it will give you some speed.
>
> Jim
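For what it's worth, the "break the loop into parts" fallback mentioned above need not mean four copies of the loop; an outer loop over segments does the same thing. A minimal sketch with toy checkpoint values, where `starts` and `state` are hypothetical stand-ins:

```r
checkpoints <- c(10L, 100L, 1000L)    # toy stand-ins for 10^4, ..., 10^7
starts <- c(1L, head(checkpoints, -1L) + 1L)   # first step of each segment

state <- 0L                           # stand-in for the spin matrix M
Out <- vector("list", length(checkpoints))
for (s in seq_along(checkpoints)) {
  for (k in starts[s]:checkpoints[s]) {
    state <- state + 1L               # stand-in for one update step
  }
  Out[[s]] <- state                   # saved once per segment, no per-step test
}

unlist(Out)
```

The inner loop contains no checkpoint test at all, so whatever the test costs per iteration is removed entirely rather than reduced.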
Re: [R] runtime on ising model
Date: Tue, 26 Oct 2010 12:53:14 -0400
From: mike...@gmail.com
To: j...@bitwrit.com.au
CC: r-help@r-project.org
Subject: Re: [R] runtime on ising model

> I have an update on where the issue is coming from. I commented out the code for pos[k+1] <- M[i,j], the if statement for time = 10^4, 10^5, 10^6, 10^7, and the storage, and everything ran fast(er). Next I added back in the pos statements and runtimes were still good (around 20 minutes).
>
> So I'm left with something causing problems in:
>
>     ## Store state at time 10^4, 10^5, 10^6, 10^7
>     if( k %in% c(10^4,10^5,10^6,10^7) ){
>       q <- q+1
>       Out[[q]] <- M
>     }

I haven't looked at this since some passing interest in magnetics decades ago, something about 8-tracks and cassettes, but you have to be careful with conclusions like "I removed foo and the problem went away, therefore the problem was foo." Performance issues are often caused by memory, not CPU limitations. Removing anything with a big memory footprint could speed things up. IO can be a real bottleneck. If you are talking about things on minute timescales, look at task manager and see if you are even CPU limited. Look for page faults or IO etc.

If you really need performance and have a task which is relatively simple, don't ignore C++ as a way to generate data points and then import these into R for analysis. In short, just because you are focusing on math, it doesn't mean the computer is limited by that.
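One rough way to check from inside R whether a piece of code is CPU limited is to compare CPU time with elapsed time from `system.time()`. The sketch below assumes a hypothetical function `work()` standing in for a slice of the simulation loop; it is not a substitute for watching page faults or IO in the operating system's own tools.

```r
work <- function(n) {                 # hypothetical stand-in for part of the simulation
  total <- 0
  for (k in seq_len(n)) total <- total + sqrt(k)
  total
}

timing  <- system.time(result <- work(200000L))
cpu     <- timing[["user.self"]] + timing[["sys.self"]]
elapsed <- timing[["elapsed"]]

# If elapsed is much larger than cpu, the process spent time waiting
# (paging, I/O, other processes) rather than computing.
cat("cpu:", cpu, "elapsed:", elapsed, "\n")
```

When CPU and elapsed times are close, the code is probably compute bound, and per-iteration savings such as those discussed in this thread are where to look; a large gap points toward memory or IO instead.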