On Oct 28, 2010, at 11:52 AM, Michael D wrote:

Mike, I'm not sure what you mean about removing foo but I think the method is sound in diagnosing a program issue and the results speak for themselves.

I did invert my if statement at the suggestion of a CS professor (who also suggested recoding in C, but I'm in an applied math program and haven't had
the time to take programming courses, which I know would be helpful).

Anyway, with the statement as:

if( !(k %in% c(10^4,10^5,10^6,10^7)) ){
  #do nothing
} else {
  q <- q+1
  Out[[q]] <- M
}

run times were back to around 20 minutes.

Have you tried replacing all of those 10^x operations with their integer equivalents, c(10000L, 100000L, 1000000L)? Each time through the loop you are unnecessarily calling the "^" function 4 times. You could also omit the last one, 10^7, during testing, since M at the last iteration (k=10^7) would be the final value and you could just assign the state of M at the end. So we have eliminated 4*10^7 unnecessary "^" calls and 10^7 unnecessary comparisons. (The CS professor is perhaps used to having the C compiler do all the thinking of this sort for him.)
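A minimal sketch of the suggestion above, keeping the names k, q, M and Out from the original post. The lattice size, the shortened run, the scaled-down checkpoints and the one-line "flip a random site" update are all hypothetical stand-ins so the snippet finishes quickly; it is not the poster's actual model.

```r
# Hypothetical stand-ins: a small lattice and a short run.
set.seed(1)
n <- 10L
n_steps <- 100000L
M <- matrix(sample(c(-1L, 1L), n * n, replace = TRUE), n, n)

# Build the checkpoint vector once, as integers, outside the loop.
# These two values stand in for 10^4, 10^5 and 10^6 in the real run.
checkpoints <- c(1000L, 10000L)
Out <- vector("list", length(checkpoints) + 1L)
q <- 0L

for (k in seq_len(n_steps)) {
  # Placeholder update: flip one random site (not the real Ising step).
  i <- sample.int(n, 1L)
  j <- sample.int(n, 1L)
  M[i, j] <- -M[i, j]

  # No "^" calls here, only a lookup in a precomputed integer vector.
  if (k %in% checkpoints) {
    q <- q + 1L
    Out[[q]] <- M
  }
}

# The final state needs no test inside the loop at all.
Out[[q + 1L]] <- M
```

Whether this change alone accounts for the runtime difference is not established in the thread; it only removes work that is certainly unnecessary.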

--
David

So as best I can tell, something happens in the if statement causing the
computer to work ahead, as the professor suggests. I'm no expert on R (and
have no desire to try looking at the R source code; it would only confuse
me), but if anyone can offer guidance on how the if statement works (Does R
try to work ahead? Under what conditions does it try to "work ahead", so I
can try to exploit this behavior?) I would greatly appreciate it.
If it would require too much knowledge of the computer system to understand,
I doubt I would be able to make use of it, but maybe someone else could
benefit.
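On the "work ahead" question: at the level of R code, if evaluates its condition first and then evaluates only the branch that was selected, so the body does not run before the test. A small sketch that makes this visible by counting evaluations; check(), cond_runs and body_runs are hypothetical names introduced only for this illustration.

```r
body_runs <- 0L
cond_runs <- 0L

# Wrap the condition in a function so each evaluation can be counted.
check <- function(k) {
  cond_runs <<- cond_runs + 1L
  k %in% c(3L, 7L)
}

for (k in 1:10) {
  if (check(k)) {
    # Only reached when the condition has already come back TRUE.
    body_runs <- body_runs + 1L
  }
}

cond_runs  # 10: the condition is evaluated on every iteration
body_runs  # 2:  the body runs only for k = 3 and k = 7
```

This says nothing about what the CPU does underneath, only that the interpreter never discards work from a branch it did not take.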

On Tue, Oct 26, 2010 at 3:24 PM, Mike Marchywka <marchy...@hotmail.com> wrote:

----------------------------------------
Date: Tue, 26 Oct 2010 12:53:14 -0400
From: mike...@gmail.com
To: j...@bitwrit.com.au
CC: r-help@r-project.org
Subject: Re: [R] runtime on ising model

I have an update on where the issue is coming from.

I commented out the code for "pos[k+1] <- M[i,j]" and the if statement
for time = 10^4, 10^5, 10^6, 10^7 and the storage, and everything ran
fast(er).
Next I added back in the "pos" statements and still runtimes were good
(around 20 minutes).

So I'm left with something causing problems in:

I haven't looked at this since some passing interest in magnetics
decades ago, something about 8-tracks and cassettes, but you have
to be careful with conclusions like " I removed foo and problem
went away therefore problem was foo." Performance issues are often
caused by memory, not CPU limitations. Removing anything with a big
memory footprint could speed things up. IO can be a real bottleneck.
If you are talking about things on minute timescales, look at task
manager and see if you are even CPU limited. Look for page faults
or IO etc. If you really need performance and have a task which
is relatively simple, don't ignore c++ as a way to generate data
points and then import these into R for analysis.

In short, just because you are focusing on math it doesn't mean
the computer is limited by that.
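One way to act on this advice from inside R is to time the run with and without the suspect block and compare CPU time against wall-clock time. The sketch below is only an illustration: run_sim(), its arguments, the checkpoint values and the placeholder update are hypothetical, not the poster's code.

```r
# Hypothetical stand-in for one run of the simulation; 'store' toggles
# the checkpoint-saving block that is under suspicion.
run_sim <- function(n_steps, store, n = 20L) {
  M <- matrix(1L, n, n)
  checkpoints <- c(100L, 1000L, 10000L)
  Out <- list()
  for (k in seq_len(n_steps)) {
    i <- sample.int(n, 1L)
    j <- sample.int(n, 1L)
    M[i, j] <- -M[i, j]          # placeholder update, not the real model
    if (store && k %in% checkpoints) {
      Out[[length(Out) + 1L]] <- M
    }
  }
  list(M = M, Out = Out)
}

set.seed(42)
t_without <- system.time(res_without <- run_sim(50000L, store = FALSE))
t_with    <- system.time(res_with    <- run_sim(50000L, store = TRUE))

# "user.self" is CPU time spent in R; "elapsed" is wall-clock time.
# Elapsed time far above CPU time hints at waiting on paging or IO.
cat("without storage: cpu", t_without[["user.self"]],
    "elapsed", t_without[["elapsed"]], "\n")
cat("with storage:    cpu", t_with[["user.self"]],
    "elapsed", t_with[["elapsed"]], "\n")

# Rough memory footprint of the stored states.
print(object.size(res_with$Out), units = "Kb")
```

R also ships a sampling profiler, Rprof() with summaryRprof() in the utils package, which may give a finer breakdown than two whole-run timings.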



## Store state at time 10^4, 10^5, 10^6, 10^7
if( k %in% c(10^4,10^5,10^6,10^7) ){
  q <- q+1
  Out[[q]] <- M
}

Would there be any reason R is executing the statements inside the "if"
before getting to the logical check?
Maybe R is written to hope for the best outcome (TRUE) and will just
throw out its work if the logic comes up FALSE?
I guess I can always break the for loop up into four parts and store the
state at the end of each, but that's an unsatisfying solution to me.
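Splitting the run does not have to mean four copies of the loop body. A sketch of one way to do it, with an outer loop over the checkpoints and the state stored once per segment; the scaled-down checkpoint values, the lattice size and the placeholder update are assumptions made so the example runs quickly.

```r
set.seed(7)
n <- 10L
M <- matrix(sample(c(-1L, 1L), n * n, replace = TRUE), n, n)

# Scaled-down stand-ins for 10^4, 10^5, 10^6, 10^7.
checkpoints <- c(100L, 1000L, 10000L, 100000L)
Out <- vector("list", length(checkpoints))

start <- 1L
for (q in seq_along(checkpoints)) {
  # Inner loop: no if statement at all, just the updates for this segment.
  for (k in start:checkpoints[q]) {
    i <- sample.int(n, 1L)
    j <- sample.int(n, 1L)
    M[i, j] <- -M[i, j]          # placeholder update, not the real model
  }
  # Store the state once, at the end of the segment.
  Out[[q]] <- M
  start <- checkpoints[q] + 1L
}
```

The loop body is written once, and the membership test that ran on every iteration is gone entirely.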


Jim, I like the suggestion of just pulling one big sample, but since I
can get the runtimes under 30 minutes just by removing the storage piece,
I doubt I would see any noticeable changes by pulling large sample vectors.

Thanks,
Michael

On Tue, Oct 26, 2010 at 6:22 AM, Jim Lemon wrote:

On 10/26/2010 04:50 PM, Michael D wrote:

So I'm in a stochastic simulations class and I'm having issues with the
amount of time it takes to run the Ising model.

I usually don't like to attach the code I'm running, since it will
probably make me look like a fool, but I figure it's the best way I can
find any bits where I can speed up run time.

As for the goals of the exercise:
I need the state of the system at time=1, 10k, 100k, 1mill, and 10mill
and the percentage of vertices with positive spin at all t

Just to be clear, I'm not expecting anyone to tell me how to program
this model, because I know what I have works for this exercise, but it
takes far too long to run and I'd like to speed it up by replacing slow
operations wherever possible.

Hi Michael,
One bottleneck is probably the sampling. If it doesn't grab too much
memory, setting up a vector of the samples (maybe a million at a time
if 10 million is too big - might be able to rewrite your sample vector
when you store the state) and using k (and an offset if you don't have
one big vector) to index it will give you some speed.
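A sketch of that idea: draw the site indices in batches and index them with k minus an offset, instead of calling the sampler once per step. The batch size, the lattice size, the placeholder update and the use of pos for the fraction of +1 spins are assumptions made for illustration; the original code was not included in the thread.

```r
set.seed(3)
n <- 10L
n_steps <- 250000L
chunk <- 100000L                 # draws generated per batch (hypothetical)

M <- matrix(1L, n, n)
n_pos <- sum(M == 1L)            # running count of +1 spins
pos <- numeric(n_steps + 1L)     # preallocated, one entry per time point
pos[1L] <- n_pos / (n * n)

offset <- 0L
rows <- cols <- integer(0)

for (k in seq_len(n_steps)) {
  # Refill the batch when the current one is used up.
  if (k - offset > length(rows)) {
    offset <- k - 1L
    m <- min(chunk, n_steps - offset)
    rows <- sample.int(n, m, replace = TRUE)
    cols <- sample.int(n, m, replace = TRUE)
  }
  i <- rows[k - offset]
  j <- cols[k - offset]

  M[i, j] <- -M[i, j]            # placeholder update, not the real model
  n_pos <- n_pos + M[i, j]       # +1 if the site became +1, -1 otherwise
  pos[k + 1L] <- n_pos / (n * n)
}
```

Keeping a running count in n_pos also avoids rescanning the whole lattice at every step just to get the percentage of positive spins.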

Jim



[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



David Winsemius, MD
West Hartford, CT
