Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-02-03 Thread Jeff Newmiller
Your code seems to be attempting to modify global variables from within 
functions... R purposely makes this hard to do. Don't fight it. Instead, 
use function arguments and produce function outputs with your functions.
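
As a minimal sketch of that advice (the names counter, bumpLocal and bump are 
illustrative only and do not come from the code in this thread), compare an 
attempt to change a global from inside a function with the 
argument-in, value-out style:

```r
# Hypothetical example: 'counter', 'bumpLocal' and 'bump' are made-up names.
counter <- 0

# Assigning to 'counter' inside a function creates a new local variable;
# the global 'counter' is left alone.
bumpLocal <- function() {
  counter <- counter + 1   # local copy only
  invisible( NULL )
}
bumpLocal()
counter   # still 0

# Preferred: pass the value in, return the new value, and reassign it
# at the call site.
bump <- function( x ) {
  x + 1
}
counter <- bump( counter )
counter   # now 1
```

The second form keeps every change visible at the point where the function 
is called, which is what the proof-of-concept code in this message relies on.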


Also, the ifelse function does not control flow of execution of code... it 
selects values between two vectors according to the state of the logical 
input vector. Note that both candidate vectors must be computed in full 
before ifelse can do its magic, so ifelse can be significantly slower than 
assigning into an indexed vector if only a small fraction of the vector 
will be changing.
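
A small sketch of the difference, using made-up data (the vectors age and 
isAlive and the value of timeStep are assumptions chosen for illustration):

```r
# Illustrative data; the names mirror the columns used in this thread.
age      <- c( 10, 20, 30, 40 )
isAlive  <- c( TRUE, FALSE, TRUE, FALSE )
timeStep <- 1

# ifelse builds a complete new vector: 'age + timeStep' and 'age' are each
# computed for the full length before values are selected from them.
ageA <- ifelse( isAlive, age + timeStep, age )

# Indexed assignment computes and stores only the selected elements.
ageB <- age
ageB[ isAlive ] <- ageB[ isAlive ] + timeStep

identical( ageA, ageB )   # TRUE
```

Both forms give the same answer here; the indexed form simply does less 
work when few elements are selected.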


Below is some proof-of-concept code. It mostly modifies values in-place 
within the data frame rather than using ifelse.


You might want to read the Intro to R document available through the R 
console via:


RShowDoc("R-intro")

to look up numeric indexing and logical indexing syntax while reading 
through this.
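
For orientation, here is a short sketch of the two indexing styles on a 
made-up named vector x (the data are purely illustrative):

```r
# Illustrative vector; the values have no meaning beyond the example.
x <- c( a = 5, b = 12, c = 7, d = 20 )

x[ c( 1, 3 ) ]      # numeric indexing: elements 1 and 3
x[ -1 ]             # negative index: everything except element 1
x[ x > 6 ]          # logical indexing: elements where the condition is TRUE
which( x > 6 )      # convert a logical vector into numeric positions

# Either kind of index can appear on the left side of an assignment.
x[ x > 10 ] <- 10   # cap values at 10
```

The code that follows uses the logical form heavily, for example 
DF$isAlive[ maxAge <= DF$age ] <- FALSE.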


#
makeNewWomen <- function( nWomen ) {
  data.frame( isAlive = rep_len( TRUE, nWomen )
, isPregnant = rep_len( FALSE, nWomen )
, nChildren = rep_len( 0L, nWomen )
, age = rep_len( 0, nWomen )
, dateOfPregnancy = rep_len( 0, nWomen )
, endDateLastPregnancy = rep_len( 0.0, nWomen )
)
}

updateWomen <- function( DF
   , jd
   , maxAge
   , timeStep
   , pregProb
   , gestation
   , minBirthAge
   , maxBirthAge
   ) {
  DF$isAlive[ maxAge <= DF$age ] <- FALSE
  fertileIdx <- with( DF, isAlive & !isPregnant &
                          minBirthAge <= age & age <= maxBirthAge )
  conceiveIdx <- fertileIdx
  conceiveIdx[ conceiveIdx ] <- sample( c( FALSE, TRUE )
  , size = sum( fertileIdx )
  , replace = TRUE
  , prob = c( 1-pregProb, pregProb )
  )
  DF$isPregnant[ conceiveIdx ] <- TRUE
  DF$dateOfPregnancy[ conceiveIdx ] <- jd
  birthIdx <- with( DF, isAlive & isPregnant &
                        ( dateOfPregnancy + gestation ) <= jd )
  femalechild <- sample( c( FALSE, TRUE )
   , size = sum( birthIdx )  # random within birthing group
   , replace = TRUE
   , prob = c( 0.5, 0.5 )
   )
  DF$isPregnant[ birthIdx ] <- FALSE # pregnancy over
  birthIdx[ birthIdx ] <- femalechild # track births further only where female
  # DF$age <- ifelse( DF$isAlive
  # , DF$age + timeStep
  # , DF$age
  # )
  DF$age[ DF$isAlive ] <- DF$age[ DF$isAlive ] + timeStep
  numNotAlive <- sum( !DF$isAlive )
  numBirths <- sum( birthIdx )
  if ( 0 < numBirths ) {
    # if needed, start female babies in existing or new rows
    if ( 0 < numNotAlive ) {
      reuseidx <- which( !DF$isAlive )
      if ( numBirths <= numNotAlive ) {
        # can fit all new births into existing DF
        reuseidx <- reuseidx[ seq.int( numBirths ) ]
        DF[ reuseidx, ] <- makeNewWomen( numBirths )
      } else {
        DF[ reuseidx, ] <- makeNewWomen( length( reuseidx ) )
        DF <- rbind( DF
                   , makeNewWomen( numBirths - length( reuseidx ) )
                   )
      }
    } else { # no empty rows in DF
      DF <- rbind( DF
                 , makeNewWomen( numBirths )
                 )
    }
  }
  DF  # return the updated data frame to the caller
}

calculatePopulation <- function( nWomen
   , maxDate
   , dpy
   , pregProb
   , maxAge
   , timeStep
   , gestation
   , minBirthAge
   , maxBirthAge
   , prealloc
   ) {
  jd <- 0
  nextSampleJd <- jd + dpy
  numSamples <- maxDate %/% dpy
  result <- data.frame( jd = rep( NA, numSamples )
  , NAlive = rep( NA, numSamples )
  , NPreg = rep( NA, numSamples )
  , NNotAlive = rep( NA, numSamples )
  )
  i <- 1L
  DF <- makeNewWomen( prealloc )
  DF$isAlive <- seq.int( prealloc ) <= nWomen # leave most entries "dead"
  while( jd < maxDate ) {
    DF <- updateWomen( DF
                     , jd
                     , maxAge
                     , timeStep
                     , pregProb
                     , gestation
                     , minBirthAge
                     , maxBirthAge
                     )
    if ( nextSampleJd <= jd ) {
      result$jd[ i ] <- jd
      result$NAlive[ i ] <- sum( DF$isAlive )
      result$NPreg[ i ] <- sum( DF$isPregnant )
      result$NNotAlive[ i ] <- sum( !DF$isAlive )  # index by i, like the other columns
      nextSampleJd <- 

Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-02-02 Thread Alan Feuerbacher

On 1/29/2019 11:50 PM, Jeff Newmiller wrote:

On Tue, 29 Jan 2019, Alan Feuerbacher wrote:

After my failed attempt at using Octave, I realized that most likely 
the main contributing factor was that I was not able to figure out an 
efficient data structure to model one person. But C lent itself 
perfectly to my idea of how to go about programming my simulation. So 
here's a simplified pseudocode sort of example of what I did:


Don't model one person... model an array of people.


To model a single reproducing woman I used this C construct:

typedef struct woman {
 int isAlive;
 int isPregnant;
 double age;
 . . .
} WOMAN;


# e.g.
Nwomen <- 100
women <- data.frame( isAlive = rep( TRUE, Nwomen )
    , isPregnant = rep( FALSE, Nwomen )
    , age = rep( 20, Nwomen )
    )

Then I allocated memory for a big array of these things, using the C 
malloc() function, which gave me the equivalent of this statement:


WOMAN women[NWOMEN];  /* An array of NWOMEN woman-structs */

After some initialization I set up two loops:

for( j=0; j

for ( j in seq.int( numberOfYears ) ) {
   # let vectorized data storage automatically handle the other for loop
   women <- updateWomen( women )
}

The function updateWomen() figures out things like whether the woman 
becomes pregnant or gives birth on a given day, dies, etc.


You can use your "fixed size" allocation strategy with flags indicating 
whether specific rows are in use, or you can only work with valid rows 
and add rows as needed for children... best to compute a logical vector 
that identifies all of the birthing mothers as a subset of the data 
frame, and build a set of children rows using the birthing mothers data 
frame as input, and then rbind the new rows to the updated women 
dataframe as appropriate. The most clear approach for individual 
decision calculations is the use of the vectorized "ifelse" function, 
though under certain circumstances putting an indexed subset on the left 
side of an assignment can modify memory "in place" (the 
functional-programming restriction against this is probably a foreign 
idea to a dyed-in-the-wool C programmer, but R usually prevents you from 
modifying the variable that was input to a function, automatically 
making a local copy of the input as needed in order to prevent such 
backwash into the caller's context).


Hi Jeff,

I'm well along in implementing your suggestions, but I don't understand 
the last paragraph. Here is part of the experimenting I've done so far:


*===*===*===*===*===*===*
updatePerson <- function() {
  ifelse( women$isAlive,
    {
      # Check whether to kill off this person, if she's pregnant whether
      # to give birth, whether to make her pregnant again.
      women$age = women$age + timeStep
      # Check if the person has reached maxAge
    }
  )
}

calculatePopulation <- function() {
  lastDate = 0
  jd = 0
  while( jd < maxDate ) {
    for( i in seq_len( nWomen ) ) {
      updatePerson();
    }
    todaysDateInt = floor(jd/dpy)
    NAlive[todaysDateInt] = nWomen - nDead
    # Do various other things
    todaysDate = todaysDate + timeStep
    jd = jd + timeStep
  }
}

nWomen <- 5
numberOfYears <- 30
women <- data.frame( isAlive = rep_len( TRUE, nWomen )
   , isPregnant = rep_len( FALSE, nWomen )
   , nChildren = rep_len( 0L, nWomen )
   , ageInt = rep_len( 0L, nWomen )
   , age = rep_len( 0, nWomen )
   , dateOfPregnancy = rep_len( 0, nWomen )
   , endDateLastPregnancy = rep_len( 0.0, nWomen )
   , minBirthAge = rep_len( 0, nWomen )
   , maxBirthAge = rep_len( 0, nWomen )
   )

# . . .

  calculatePopulation()

*===*===*===*===*===*===*

The above code (in its complete form) executes without errors. I don't 
understand at least two things:


In the updatePerson function, in the ifelse statement, how do I change 
the appropriate values in the women dataframe?


I don't understand most of your last paragraph at all.

Thanks so much for your help in learning R!

Alan

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-30 Thread Alan Feuerbacher

On 1/29/2019 11:50 PM, Jeff Newmiller wrote:

Thanks very much for providing these coding examples! I think this is a 
good way to learn some R.


Alan


On Tue, 29 Jan 2019, Alan Feuerbacher wrote:


On 1/28/2019 7:51 PM, Jeff Newmiller wrote:
If you forge on with your preconceptions of how such a simulation 
should be implemented then you will be able to reproduce your failure 
just as spectacularly using R as you did using Octave.


I think I've come to the same conclusion. :-)

It is crucial to employ vectorization of your algorithms if you want 
good performance with either Octave or R. That vectorization may 
either be over time or over separate simulations.


Please explain further, if you don't mind. My background is not in 
programming, but in analog microchip circuit design (I'm now retired). 
Thus I'm a user of circuit simulators, not a programmer of them. Also, 
I'm running this stuff on my home computers, either Linux or Windows 
machines.


I am running simulations of a million cases of power plant 
performance over 25 years in about a minute. I know someone who used 
R to simulate a CFD river flow problem in a class in a few minutes, 
while others using Fortran or Matlab were struggling to get 
comparable runs completed in many hours. I believe the difference was 
in how the data were structured and manipulated more than the 
language that was being used. I think the strong capabilities for 
presenting results using R makes using it advantageous over Octave, 
though.


After my failed attempt at using Octave, I realized that most likely 
the main contributing factor was that I was not able to figure out an 
efficient data structure to model one person. But C lent itself 
perfectly to my idea of how to go about programming my simulation. So 
here's a simplified pseudocode sort of example of what I did:


Don't model one person... model an array of people.


To model a single reproducing woman I used this C construct:

typedef struct woman {
 int isAlive;
 int isPregnant;
 double age;
 . . .
} WOMAN;


# e.g.
Nwomen <- 100
women <- data.frame( isAlive = rep( TRUE, Nwomen )
    , isPregnant = rep( FALSE, Nwomen )
    , age = rep( 20, Nwomen )
    )

Then I allocated memory for a big array of these things, using the C 
malloc() function, which gave me the equivalent of this statement:


WOMAN women[NWOMEN];  /* An array of NWOMEN woman-structs */

After some initialization I set up two loops:

for( j=0; j

for ( j in seq.int( numberOfYears ) ) {
   # let vectorized data storage automatically handle the other for loop
   women <- updateWomen( women )
}

The function updateWomen() figures out things like whether the woman 
becomes pregnant or gives birth on a given day, dies, etc.


You can use your "fixed size" allocation strategy with flags indicating 
whether specific rows are in use, or you can only work with valid rows 
and add rows as needed for children... best to compute a logical vector 
that identifies all of the birthing mothers as a subset of the data 
frame, and build a set of children rows using the birthing mothers data 
frame as input, and then rbind the new rows to the updated women 
dataframe as appropriate. The most clear approach for individual 
decision calculations is the use of the vectorized "ifelse" function, 
though under certain circumstances putting an indexed subset on the left 
side of an assignment can modify memory "in place" (the 
functional-programming restriction against this is probably a foreign 
idea to a dyed-in-the-wool C programmer, but R usually prevents you from 
modifying the variable that was input to a function, automatically 
making a local copy of the input as needed in order to prevent such 
backwash into the caller's context).


I added other refinements that are not relevant here, such as random 
variations of various parameters, using the GNU Scientific Library 
random number generator functions.


R has quite sophisticated random number generation by default.

If you can suggest a data construct in R or Octave that does something 
like this, and uses your idea of vectorization, I'd like to hear it. 
I'd like to implement it and compare results with my C implementation.


If your problems truly need a compiled language, the Rcpp package 
lets you mix C++ with R quite easily and then you get the best of 
both worlds. (C and Fortran are supported, but they are a bit more 
finicky to setup than C++).


I don't know the answer to that, but perhaps you can help decide.

Alan


On January 28, 2019 4:00:07 PM PST, Alan Feuerbacher 
 wrote:

On 1/28/2019 4:20 PM, Rolf Turner wrote:


On 1/29/19 10:05 AM, Alan Feuerbacher wrote:


Hi,

I recently learned of the existence of R through a physicist friend
who uses it in his research. I've used Octave for a decade, and C for
35 years, but would like to learn R. These all have advantages and
disadvantages for certain tasks, but as I'm new to R I hardly know


Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-29 Thread Jeff Newmiller

On Tue, 29 Jan 2019, Alan Feuerbacher wrote:


On 1/28/2019 7:51 PM, Jeff Newmiller wrote:
If you forge on with your preconceptions of how such a simulation should be 
implemented then you will be able to reproduce your failure just as 
spectacularly using R as you did using Octave.


I think I've come to the same conclusion. :-)

It is crucial to employ vectorization of your algorithms if you want good 
performance with either Octave or R. That vectorization may either be over 
time or over separate simulations.


Please explain further, if you don't mind. My background is not in 
programming, but in analog microchip circuit design (I'm now retired). Thus 
I'm a user of circuit simulators, not a programmer of them. Also, I'm running 
this stuff on my home computers, either Linux or Windows machines.


I am running simulations of a million cases of power plant performance over 
25 years in about a minute. I know someone who used R to simulate a CFD 
river flow problem in a class in a few minutes, while others using Fortran 
or Matlab were struggling to get comparable runs completed in many hours. I 
believe the difference was in how the data were structured and manipulated 
more than the language that was being used. I think the strong capabilities 
for presenting results using R makes using it advantageous over Octave, 
though.


After my failed attempt at using Octave, I realized that most likely the main 
contributing factor was that I was not able to figure out an efficient data 
structure to model one person. But C lent itself perfectly to my idea of how 
to go about programming my simulation. So here's a simplified pseudocode sort 
of example of what I did:


Don't model one person... model an array of people.


To model a single reproducing woman I used this C construct:

typedef struct woman {
 int isAlive;
 int isPregnant;
 double age;
 . . .
} WOMAN;


# e.g.
Nwomen <- 100
women <- data.frame( isAlive = rep( TRUE, Nwomen )
   , isPregnant = rep( FALSE, Nwomen )
   , age = rep( 20, Nwomen )
   )

Then I allocated memory for a big array of these things, using the C malloc() 
function, which gave me the equivalent of this statement:


WOMAN women[NWOMEN];  /* An array of NWOMEN woman-structs */

After some initialization I set up two loops:

for( j=0; j

for ( j in seq.int( numberOfYears ) ) {
  # let vectorized data storage automatically handle the other for loop
  women <- updateWomen( women )
}
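
To make that loop concrete, here is a deliberately tiny sketch. This 
updateWomen handles only aging and death, and its timeStep and maxAge 
arguments (and their default values) are assumptions for illustration, not 
part of the simulation described in this thread:

```r
# Sketch only: a stripped-down updateWomen covering aging and death.
# 'timeStep' and 'maxAge' are assumed parameters with made-up defaults.
updateWomen <- function( women, timeStep = 1, maxAge = 80 ) {
  # one vectorized statement replaces the inner loop over individuals
  women$age[ women$isAlive ] <- women$age[ women$isAlive ] + timeStep
  women$isAlive[ women$age >= maxAge ] <- FALSE
  women   # return the modified copy to the caller
}

Nwomen <- 100
women <- data.frame( isAlive = rep( TRUE, Nwomen )
                   , isPregnant = rep( FALSE, Nwomen )
                   , age = rep( 20, Nwomen )
                   )

numberOfYears <- 5
for ( j in seq.int( numberOfYears ) ) {
  women <- updateWomen( women )
}
unique( women$age )   # 25
```

Only the loop over time remains explicit; every statement inside the 
function operates on all rows at once.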

The function updateWomen() figures out things like whether the woman becomes 
pregnant or gives birth on a given day, dies, etc.


You can use your "fixed size" allocation strategy with flags indicating 
whether specific rows are in use, or you can work only with valid rows and 
add rows as needed for children. In the latter case it is best to compute 
a logical vector that identifies all of the birthing mothers as a subset 
of the data frame, build a set of children rows using the birthing mothers 
data frame as input, and then rbind the new rows to the updated women 
data frame as appropriate.

The clearest approach for individual decision calculations is the 
vectorized "ifelse" function, though under certain circumstances putting 
an indexed subset on the left side of an assignment can modify memory 
"in place". The functional-programming restriction against in-place 
modification is probably a foreign idea to a dyed-in-the-wool C 
programmer, but R usually prevents you from modifying the variable that 
was input to a function: it automatically makes a local copy of the input 
as needed in order to prevent such backwash into the caller's context.
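
A small sketch may make the "local copy" point concrete. The function 
killFirst and the three-row data frame are hypothetical, invented only for 
this illustration:

```r
# Illustrative data and names; 'killFirst' is a made-up helper.
women <- data.frame( isAlive = c( TRUE, TRUE, TRUE )
                   , age = c( 20, 30, 40 )
                   )

killFirst <- function( DF ) {
  DF$isAlive[ 1 ] <- FALSE   # modifies the function's own copy of DF
  DF                         # the change leaves only via the return value
}

updated <- killFirst( women )
women$isAlive     # TRUE TRUE TRUE: the caller's data frame is untouched
updated$isAlive   # FALSE TRUE TRUE

# Adding rows for children: build the new rows, then rbind them on.
birthIdx <- c( FALSE, TRUE, FALSE )          # which mothers give birth
children <- data.frame( isAlive = rep( TRUE, sum( birthIdx ) )
                      , age = rep( 0, sum( birthIdx ) )
                      )
grown <- rbind( women, children )
nrow( grown )     # 4
```

To keep a change made inside a function, assign the result back at the 
call site, as in women <- killFirst( women ).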


I added other refinements that are not relevant here, such as random 
variations of various parameters, using the GNU Scientific Library random 
number generator functions.


R has quite sophisticated random number generation by default.
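
For instance, the base distribution functions can draw a whole population's 
worth of random values in one call. In this sketch the seed, pregProb, 
nWomen and the lifespan parameters are all assumed values chosen only for 
illustration:

```r
set.seed( 42 )   # make the run reproducible; the seed value is arbitrary

pregProb <- 0.1  # assumed per-step conception probability
nWomen   <- 1000 # assumed population size

# one Bernoulli draw per woman, as a logical vector
conceive <- runif( nWomen ) < pregProb

# or draw the number of conceptions directly from a binomial distribution
nConceptions <- rbinom( 1, size = nWomen, prob = pregProb )

# random variation in a parameter, e.g. a lifespan for each woman
lifespan <- rnorm( nWomen, mean = 75, sd = 10 )

RNGkind()[ 1 ]   # name of the generator in use (Mersenne-Twister by default)
```

See ?RNGkind and ?Distributions for the available generators and 
distributions.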

If you can suggest a data construct in R or Octave that does something like 
this, and uses your idea of vectorization, I'd like to hear it. I'd like to 
implement it and compare results with my C implementation.


If your problems truly need a compiled language, the Rcpp package lets you 
mix C++ with R quite easily and then you get the best of both worlds. (C 
and Fortran are supported, but they are a bit more finicky to setup than 
C++).


I don't know the answer to that, but perhaps you can help decide.

Alan


On January 28, 2019 4:00:07 PM PST, Alan Feuerbacher  
wrote:

On 1/28/2019 4:20 PM, Rolf Turner wrote:


On 1/29/19 10:05 AM, Alan Feuerbacher wrote:


Hi,

I recently learned of the existence of R through a physicist friend
who uses it in his research. I've used Octave for a decade, and C for
35 years, but would like to learn R. These all have advantages and
disadvantages for certain tasks, but as I'm new to R I hardly know how
to evaluate them. Any suggestions?


* C is fast, but with a syntax that is (to my mind) virtually
    incomprehensible.  (You probably think differently about 

Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-29 Thread Alan Feuerbacher

On 1/28/2019 7:51 PM, Jeff Newmiller wrote:

If you forge on with your preconceptions of how such a simulation should be 
implemented then you will be able to reproduce your failure just as 
spectacularly using R as you did using Octave.


I think I've come to the same conclusion. :-)


It is crucial to employ vectorization of your algorithms if you want good 
performance with either Octave or R. That vectorization may either be over time 
or over separate simulations.


Please explain further, if you don't mind. My background is not in 
programming, but in analog microchip circuit design (I'm now retired). 
Thus I'm a user of circuit simulators, not a programmer of them. Also, 
I'm running this stuff on my home computers, either Linux or Windows 
machines.



I am running simulations of a million cases of power plant performance over 25 
years in about a minute. I know someone who used R to simulate a CFD river flow 
problem in a class in a few minutes, while others using Fortran or Matlab were 
struggling to get comparable runs completed in many hours. I believe the 
difference was in how the data were structured and manipulated more than the 
language that was being used. I think the strong capabilities for presenting 
results using R makes using it advantageous over Octave, though.


After my failed attempt at using Octave, I realized that most likely the 
main contributing factor was that I was not able to figure out an 
efficient data structure to model one person. But C lent itself 
perfectly to my idea of how to go about programming my simulation. So 
here's a simplified pseudocode sort of example of what I did:


To model a single reproducing woman I used this C construct:

typedef struct woman {
  int isAlive;
  int isPregnant;
  double age;
  . . .
} WOMAN;

Then I allocated memory for a big array of these things, using the C 
malloc() function, which gave me the equivalent of this statement:


WOMAN women[NWOMEN];  /* An array of NWOMEN woman-structs */

After some initialization I set up two loops:

for( j=0; j

The function updateWomen() figures out things like whether the woman 
becomes pregnant or gives birth on a given day, dies, etc.


I added other refinements that are not relevant here, such as random 
variations of various parameters, using the GNU Scientific Library 
random number generator functions.


If you can suggest a data construct in R or Octave that does something 
like this, and uses your idea of vectorization, I'd like to hear it. I'd 
like to implement it and compare results with my C implementation.



If your problems truly need a compiled language, the Rcpp package lets you mix 
C++ with R quite easily and then you get the best of both worlds. (C and 
Fortran are supported, but they are a bit more finicky to setup than C++).


I don't know the answer to that, but perhaps you can help decide.

Alan



On January 28, 2019 4:00:07 PM PST, Alan Feuerbacher  
wrote:

On 1/28/2019 4:20 PM, Rolf Turner wrote:


On 1/29/19 10:05 AM, Alan Feuerbacher wrote:


Hi,

I recently learned of the existence of R through a physicist friend
who uses it in his research. I've used Octave for a decade, and C for
35 years, but would like to learn R. These all have advantages and
disadvantages for certain tasks, but as I'm new to R I hardly know how
to evaluate them. Any suggestions?


* C is fast, but with a syntax that is (to my mind) virtually
    incomprehensible.  (You probably think differently about this.)


I've been doing it long enough that I have little problem with it,
except for pointers. :-)


* In C, you essentially have to roll your own for all tasks; in R,
    practically anything (well ...) that you want to do has already
    been programmed up.  CRAN is a wonderful resource, and there's more
    on github.

* The syntax of R meshes beautifully with *my* thought patterns; YMMV.


* Why not just bog in and try R out?  It's free, it's readily available,
    and there are a number of good online tutorials.


I just installed R on my Linux Fedora system, so I'll do that.

I wonder if you'd care to comment on my little project that prompted
this? As part of another project, I wanted to model population growth
starting from a handful of starting individuals. This is exponential in
the long run, of course, but I wanted to see how a few basic parameters
affected the outcome. Using Octave, I modeled a single person as a
"cell", which in Octave has a good deal of overhead. The program
basically looped over the entire population, and updated each person
according to the parameters, which included random statistical
variations. So when the total population reached, say 10,000, and an
update time of 1 day, the program had to execute 10,000 x 365 update
operations for each year of growth. For large populations, say 100,000,
the program did not return even after 24 hours of run time.

So I switched to C, and used its "struct" declaration and an array of
structs to model the 

Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-29 Thread Alan Feuerbacher

On 1/28/2019 6:07 PM, William Dunlap wrote:
S (R's predecessor) was designed by and for data analysts.  R generally 
follows that tradition.  I think that simulations such as yours are not 
its strength, although it can make analyzing (graphically and 
numerically) the results of the simulation fun.


At this point I think you're right on all counts.

Alan


Bill Dunlap
TIBCO Software
wdunlap tibco.com 


On Mon, Jan 28, 2019 at 4:00 PM Alan Feuerbacher wrote:


On 1/28/2019 4:20 PM, Rolf Turner wrote:
 >
 > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
 >
 >> Hi,
 >>
 >> I recently learned of the existence of R through a physicist friend
 >> who uses it in his research. I've used Octave for a decade, and C for
 >> 35 years, but would like to learn R. These all have advantages and
 >> disadvantages for certain tasks, but as I'm new to R I hardly know how
 >> to evaluate them. Any suggestions?
 >
 > * C is fast, but with a syntax that is (to my mind) virtually
 >    incomprehensible.  (You probably think differently about this.)

I've been doing it long enough that I have little problem with it,
except for pointers. :-)

 > * In C, you essentially have to roll your own for all tasks; in R,
 >    practically anything (well ...) that you want to do has already
 >    been programmed up.  CRAN is a wonderful resource, and there's more
 >    on github.
 >
 > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
 >
 > * Why not just bog in and try R out?  It's free, it's readily available,
 >    and there are a number of good online tutorials.

I just installed R on my Linux Fedora system, so I'll do that.

I wonder if you'd care to comment on my little project that prompted
this? As part of another project, I wanted to model population growth
starting from a handful of starting individuals. This is exponential in
the long run, of course, but I wanted to see how a few basic parameters
affected the outcome. Using Octave, I modeled a single person as a
"cell", which in Octave has a good deal of overhead. The program
basically looped over the entire population, and updated each person
according to the parameters, which included random statistical
variations. So when the total population reached, say 10,000, and an
update time of 1 day, the program had to execute 10,000 x 365 update
operations for each year of growth. For large populations, say 100,000,
the program did not return even after 24 hours of run time.

So I switched to C, and used its "struct" declaration and an array of
structs to model the population. This allowed the program to complete in
under a minute as opposed to 24 hours+. So in line with your comments, C
is far more efficient than Octave.

How do you think R would fare in this simulation?

Alan




Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-29 Thread Alan Feuerbacher

On 1/28/2019 5:17 PM, Bert Gunter wrote:
I would say your question is foolish -- you disagree no doubt! -- 
because the point of using R (or Octave or C++) is to take advantage of 
the packages (= "libraries" in some languages; a library is something 
different in R) it (or they) offers to simplify your task. Many of R's 
libraries are written in C (or Fortran) and thus **are** fast as well as 
having task-appropriate functionality and UIs.


Yes, I'm well aware of the libraries in Octave. But so far as I was able 
to see, none of them fit my needs. I used Octave at first because I'm 
familiar with it. But I'm far from an expert.


So I think instead of pursuing this discussion you would do well to 
search. I find rseek.org to be especially good for 
this sort of thing. Searching there on "demography" brought up what 
appeared to be many appropriate hits -- including the "demography" 
package! -- which you could then examine to see whether and to what 
extent they provide the functionality you seek.


I looked over the demography package, and it indeed appears to do what I 
want. But it seems to be far more complicated than my simple problem, 
and has a large learning curve.


Alan


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along 
and sticking things into it."

-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Jan 28, 2019 at 4:00 PM Alan Feuerbacher wrote:


On 1/28/2019 4:20 PM, Rolf Turner wrote:
 >
 > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
 >
 >> Hi,
 >>
 >> I recently learned of the existence of R through a physicist friend
 >> who uses it in his research. I've used Octave for a decade, and C for
 >> 35 years, but would like to learn R. These all have advantages and
 >> disadvantages for certain tasks, but as I'm new to R I hardly know how
 >> to evaluate them. Any suggestions?
 >
 > * C is fast, but with a syntax that is (to my mind) virtually
 >    incomprehensible.  (You probably think differently about this.)

I've been doing it long enough that I have little problem with it,
except for pointers. :-)

 > * In C, you essentially have to roll your own for all tasks; in R,
 >    practically anything (well ...) that you want to do has already
 >    been programmed up.  CRAN is a wonderful resource, and there's more
 >    on github.
 >
 > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
 >
 > * Why not just bog in and try R out?  It's free, it's readily available,
 >    and there are a number of good online tutorials.

I just installed R on my Linux Fedora system, so I'll do that.

I wonder if you'd care to comment on my little project that prompted
this? As part of another project, I wanted to model population growth
starting from a handful of starting individuals. This is exponential in
the long run, of course, but I wanted to see how a few basic parameters
affected the outcome. Using Octave, I modeled a single person as a
"cell", which in Octave has a good deal of overhead. The program
basically looped over the entire population, and updated each person
according to the parameters, which included random statistical
variations. So when the total population reached, say 10,000, and an
update time of 1 day, the program had to execute 10,000 x 365 update
operations for each year of growth. For large populations, say 100,000,
the program did not return even after 24 hours of run time.

So I switched to C, and used its "struct" declaration and an array of
structs to model the population. This allowed the program to complete in
under a minute as opposed to 24 hours+. So in line with your comments, C
is far more efficient than Octave.

How do you think R would fare in this simulation?

Alan




Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread David Winsemius



On 1/28/19 4:00 PM, Alan Feuerbacher wrote:
> On 1/28/2019 4:20 PM, Rolf Turner wrote:
>>
>> On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
>>
>>> Hi,
>>>
>>> I recently learned of the existence of R through a physicist friend
>>> who uses it in his research. I've used Octave for a decade, and C
>>> for 35 years, but would like to learn R. These all have advantages
>>> and disadvantages for certain tasks, but as I'm new to R I hardly
>>> know how to evaluate them. Any suggestions?

snipped

>> * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
>>
>> * Why not just bog in and try R out?  It's free, it's readily available,
>>    and there are a number of good online tutorials.
>
> I just installed R on my Linux Fedora system, so I'll do that.
>
> I wonder if you'd care to comment on my little project that prompted
> this? As part of another project, I wanted to model population growth
> starting from a handful of starting individuals. This is exponential
> in the long run, of course, but I wanted to see how a few basic
> parameters affected the outcome. Using Octave, I modeled a single
> person as a "cell", which in Octave has a good deal of overhead. The
> program basically looped over the entire population, and updated each
> person according to the parameters, which included random statistical
> variations. So when the total population reached, say 10,000, and an
> update time of 1 day, the program had to execute 10,000 x 365 update
> operations for each year of growth. For large populations, say
> 100,000, the program did not return even after 24 hours of run time.
>
> So I switched to C, and used its "struct" declaration and an array of
> structs to model the population. This allowed the program to complete
> in under a minute as opposed to 24 hours+. So in line with your
> comments, C is far more efficient than Octave.
>
> How do you think R would fare in this simulation?

This sounds like a problem that would fit into a stochastic differential 
equation.  There are at least three packages in CRAN (and I suspect a 
few more) that will handle simulations of stochastic differential 
equations. Bert's suggestion to use Rseek should serve you well.



--

David.



Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread Jeff Newmiller
If you forge on with your preconceptions of how such a simulation should be 
implemented then you will be able to reproduce your failure just as 
spectacularly using R as you did using Octave. It is crucial to employ 
vectorization of your algorithms if you want good performance with either 
Octave or R. That vectorization may either be over time or over separate 
simulations.
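
As a rough illustration of what vectorizing over the population can look like, here is a minimal sketch of a one-day update written both ways. The function names, the daily death probability, and the population size are assumptions made up for this example; they are not taken from the original Octave or C programs.

```r
# Illustrative sketch only: names and parameter values are assumptions.
set.seed(1)

# Per-person loop: one R-level iteration for every individual.
updateLoop <- function(age, alive, dailyDeathProb) {
  for (i in seq_along(age)) {
    if (alive[i]) {
      age[i] <- age[i] + 1 / 365
      if (runif(1) < dailyDeathProb) alive[i] <- FALSE
    }
  }
  list(age = age, alive = alive)
}

# Vectorized: logical indexing updates every living person in one step.
updateVectorized <- function(age, alive, dailyDeathProb) {
  age[alive] <- age[alive] + 1 / 365
  dies <- alive & (runif(length(alive)) < dailyDeathProb)
  alive[dies] <- FALSE
  list(age = age, alive = alive)
}

n <- 100000L
age <- runif(n, min = 0, max = 80)
alive <- rep_len(TRUE, n)

loopTime <- system.time(resLoop <- updateLoop(age, alive, 1e-4))
vecTime  <- system.time(resVec  <- updateVectorized(age, alive, 1e-4))

# Elapsed seconds for a single simulated day; actual values depend on the machine.
print(c(loop = loopTime[["elapsed"]], vectorized = vecTime[["elapsed"]]))
```

Both functions apply the same rule for one day; the vectorized one simply hands the whole population to R's internal compiled code at once instead of interpreting one iteration per person.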

I am running simulations of a million cases of power plant performance over 25 
years in about a minute. I know someone who used R to simulate a CFD river flow 
problem in a class in a few minutes, while others using Fortran or Matlab were 
struggling to get comparable runs completed in many hours. I believe the 
difference was in how the data were structured and manipulated more than the 
language that was being used. I think the strong capabilities for presenting 
results using R makes using it advantageous over Octave, though.

If your problems truly need a compiled language, the Rcpp package lets you mix 
C++ with R quite easily and then you get the best of both worlds. (C and 
Fortran are supported, but they are a bit more finicky to set up than C++.)

On January 28, 2019 4:00:07 PM PST, Alan Feuerbacher wrote:
>On 1/28/2019 4:20 PM, Rolf Turner wrote:
>> 
>> On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
>> 
>>> Hi,
>>>
>>> I recently learned of the existence of R through a physicist friend
>>> who uses it in his research. I've used Octave for a decade, and C for
>>> 35 years, but would like to learn R. These all have advantages and
>>> disadvantages for certain tasks, but as I'm new to R I hardly know how
>>> to evaluate them. Any suggestions?
>> 
>> * C is fast, but with a syntax that is (to my mind) virtually
>>    incomprehensible.  (You probably think differently about this.)
>
>I've been doing it long enough that I have little problem with it, 
>except for pointers. :-)
>
>> * In C, you essentially have to roll your own for all tasks; in R,
>>    practically anything (well ...) that you want to do has already
>>    been programmed up.  CRAN is a wonderful resource, and there's more
>>    on github.
>>
>> * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
>>
>> * Why not just bog in and try R out?  It's free, it's readily available,
>>    and there are a number of good online tutorials.
>
>I just installed R on my Linux Fedora system, so I'll do that.
>
>I wonder if you'd care to comment on my little project that prompted
>this? As part of another project, I wanted to model population growth
>starting from a handful of starting individuals. This is exponential in
>the long run, of course, but I wanted to see how a few basic parameters
>affected the outcome. Using Octave, I modeled a single person as a
>"cell", which in Octave has a good deal of overhead. The program
>basically looped over the entire population, and updated each person
>according to the parameters, which included random statistical
>variations. So when the total population reached, say 10,000, and an
>update time of 1 day, the program had to execute 10,000 x 365 update
>operations for each year of growth. For large populations, say 100,000,
>the program did not return even after 24 hours of run time.
>
>So I switched to C, and used its "struct" declaration and an array of
>structs to model the population. This allowed the program to complete in
>under a minute as opposed to 24 hours+. So in line with your comments, C
>is far more efficient than Octave.
>
>How do you think R would fare in this simulation?
>
>Alan

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread William Dunlap via R-help
S (R's predecessor) was designed by and for data analysts.  R generally
follows that tradition.  I think that simulations such as yours are not its
strength, although it can make analyzing (graphically and numerically) the
results of the simulation fun.

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Mon, Jan 28, 2019 at 4:00 PM Alan Feuerbacher wrote:

> On 1/28/2019 4:20 PM, Rolf Turner wrote:
> >
> > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
> >
> >> Hi,
> >>
> >> I recently learned of the existence of R through a physicist friend
> >> who uses it in his research. I've used Octave for a decade, and C for
> >> 35 years, but would like to learn R. These all have advantages and
> >> disadvantages for certain tasks, but as I'm new to R I hardly know how
> >> to evaluate them. Any suggestions?
> >
> > * C is fast, but with a syntax that is (to my mind) virtually
> >    incomprehensible.  (You probably think differently about this.)
>
> I've been doing it long enough that I have little problem with it,
> except for pointers. :-)
>
> > * In C, you essentially have to roll your own for all tasks; in R,
> >    practically anything (well ...) that you want to do has already
> >    been programmed up.  CRAN is a wonderful resource, and there's more
> >    on github.
> >
> > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
> >
> > * Why not just bog in and try R out?  It's free, it's readily available,
> >    and there are a number of good online tutorials.
>
> I just installed R on my Linux Fedora system, so I'll do that.
>
> I wonder if you'd care to comment on my little project that prompted
> this? As part of another project, I wanted to model population growth
> starting from a handful of starting individuals. This is exponential in
> the long run, of course, but I wanted to see how a few basic parameters
> affected the outcome. Using Octave, I modeled a single person as a
> "cell", which in Octave has a good deal of overhead. The program
> basically looped over the entire population, and updated each person
> according to the parameters, which included random statistical
> variations. So when the total population reached, say 10,000, and an
> update time of 1 day, the program had to execute 10,000 x 365 update
> operations for each year of growth. For large populations, say 100,000,
> the program did not return even after 24 hours of run time.
>
> So I switched to C, and used its "struct" declaration and an array of
> structs to model the population. This allowed the program to complete in
> under a minute as opposed to 24 hours+. So in line with your comments, C
> is far more efficient than Octave.
>
> How do you think R would fare in this simulation?
>
> Alan


Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread Gabor Grothendieck
This would be a suitable application for NetLogo.  The R package
RNetLogo provides an interface.  In a few lines of code you get a
simulation with graphics.

On Mon, Jan 28, 2019 at 7:00 PM Alan Feuerbacher wrote:
>
> On 1/28/2019 4:20 PM, Rolf Turner wrote:
> >
> > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
> >
> >> Hi,
> >>
> >> I recently learned of the existence of R through a physicist friend
> >> who uses it in his research. I've used Octave for a decade, and C for
> >> 35 years, but would like to learn R. These all have advantages and
> >> disadvantages for certain tasks, but as I'm new to R I hardly know how
> >> to evaluate them. Any suggestions?
> >
> > * C is fast, but with a syntax that is (to my mind) virtually
> >    incomprehensible.  (You probably think differently about this.)
>
> I've been doing it long enough that I have little problem with it,
> except for pointers. :-)
>
> > * In C, you essentially have to roll your own for all tasks; in R,
> >    practically anything (well ...) that you want to do has already
> >    been programmed up.  CRAN is a wonderful resource, and there's more
> >    on github.
> >
> > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
> >
> > * Why not just bog in and try R out?  It's free, it's readily available,
> >    and there are a number of good online tutorials.
>
> I just installed R on my Linux Fedora system, so I'll do that.
>
> I wonder if you'd care to comment on my little project that prompted
> this? As part of another project, I wanted to model population growth
> starting from a handful of starting individuals. This is exponential in
> the long run, of course, but I wanted to see how a few basic parameters
> affected the outcome. Using Octave, I modeled a single person as a
> "cell", which in Octave has a good deal of overhead. The program
> basically looped over the entire population, and updated each person
> according to the parameters, which included random statistical
> variations. So when the total population reached, say 10,000, and an
> update time of 1 day, the program had to execute 10,000 x 365 update
> operations for each year of growth. For large populations, say 100,000,
> the program did not return even after 24 hours of run time.
>
> So I switched to C, and used its "struct" declaration and an array of
> structs to model the population. This allowed the program to complete in
> under a minute as opposed to 24 hours+. So in line with your comments, C
> is far more efficient than Octave.
>
> How do you think R would fare in this simulation?
>
> Alan



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread Bert Gunter
I would say your question is foolish -- you disagree no doubt! -- because
the point of using R (or Octave or C++) is to take advantage of the
packages (= "libraries" in some languages; a library is something different
in R) it (or they) offers to simplify your task. Many of R's packages are
written in C (or Fortran) and thus **are** fast as well as having
task-appropriate functionality and UIs.

So I think instead of pursuing this discussion you would do well to search.
I find rseek.org to be especially good for this sort of thing. Searching
there on "demography" brought up what appeared to be many appropriate hits
-- including the "demography" package! -- which you could then examine to
see whether and to what extent they provide the functionality you seek.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Jan 28, 2019 at 4:00 PM Alan Feuerbacher wrote:

> On 1/28/2019 4:20 PM, Rolf Turner wrote:
> >
> > On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
> >
> >> Hi,
> >>
> >> I recently learned of the existence of R through a physicist friend
> >> who uses it in his research. I've used Octave for a decade, and C for
> >> 35 years, but would like to learn R. These all have advantages and
> >> disadvantages for certain tasks, but as I'm new to R I hardly know how
> >> to evaluate them. Any suggestions?
> >
> > * C is fast, but with a syntax that is (to my mind) virtually
> >    incomprehensible.  (You probably think differently about this.)
>
> I've been doing it long enough that I have little problem with it,
> except for pointers. :-)
>
> > * In C, you essentially have to roll your own for all tasks; in R,
> >    practically anything (well ...) that you want to do has already
> >    been programmed up.  CRAN is a wonderful resource, and there's more
> >    on github.
> >
> > * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
> >
> > * Why not just bog in and try R out?  It's free, it's readily available,
> >    and there are a number of good online tutorials.
>
> I just installed R on my Linux Fedora system, so I'll do that.
>
> I wonder if you'd care to comment on my little project that prompted
> this? As part of another project, I wanted to model population growth
> starting from a handful of starting individuals. This is exponential in
> the long run, of course, but I wanted to see how a few basic parameters
> affected the outcome. Using Octave, I modeled a single person as a
> "cell", which in Octave has a good deal of overhead. The program
> basically looped over the entire population, and updated each person
> according to the parameters, which included random statistical
> variations. So when the total population reached, say 10,000, and an
> update time of 1 day, the program had to execute 10,000 x 365 update
> operations for each year of growth. For large populations, say 100,000,
> the program did not return even after 24 hours of run time.
>
> So I switched to C, and used its "struct" declaration and an array of
> structs to model the population. This allowed the program to complete in
> under a minute as opposed to 24 hours+. So in line with your comments, C
> is far more efficient than Octave.
>
> How do you think R would fare in this simulation?
>
> Alan


Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread Alan Feuerbacher

On 1/28/2019 4:20 PM, Rolf Turner wrote:
>
> On 1/29/19 10:05 AM, Alan Feuerbacher wrote:
>
>> Hi,
>>
>> I recently learned of the existence of R through a physicist friend
>> who uses it in his research. I've used Octave for a decade, and C for
>> 35 years, but would like to learn R. These all have advantages and
>> disadvantages for certain tasks, but as I'm new to R I hardly know how
>> to evaluate them. Any suggestions?
>
> * C is fast, but with a syntax that is (to my mind) virtually
>    incomprehensible.  (You probably think differently about this.)

I've been doing it long enough that I have little problem with it,
except for pointers. :-)

> * In C, you essentially have to roll your own for all tasks; in R,
>    practically anything (well ...) that you want to do has already
>    been programmed up.  CRAN is a wonderful resource, and there's more
>    on github.
>
> * The syntax of R meshes beautifully with *my* thought patterns; YMMV.
>
> * Why not just bog in and try R out?  It's free, it's readily available,
>    and there are a number of good online tutorials.

I just installed R on my Linux Fedora system, so I'll do that.

I wonder if you'd care to comment on my little project that prompted
this? As part of another project, I wanted to model population growth
starting from a handful of starting individuals. This is exponential in
the long run, of course, but I wanted to see how a few basic parameters
affected the outcome. Using Octave, I modeled a single person as a
"cell", which in Octave has a good deal of overhead. The program
basically looped over the entire population, and updated each person
according to the parameters, which included random statistical
variations. So when the total population reached, say 10,000, and an
update time of 1 day, the program had to execute 10,000 x 365 update
operations for each year of growth. For large populations, say 100,000,
the program did not return even after 24 hours of run time.

So I switched to C, and used its "struct" declaration and an array of
structs to model the population. This allowed the program to complete in
under a minute as opposed to 24 hours+. So in line with your comments, C
is far more efficient than Octave.

How do you think R would fare in this simulation?

Alan




Re: [R] [FORGED] Newbie Question on R versus Matlab/Octave versus C

2019-01-28 Thread Rolf Turner



On 1/29/19 10:05 AM, Alan Feuerbacher wrote:

> Hi,
>
> I recently learned of the existence of R through a physicist friend who
> uses it in his research. I've used Octave for a decade, and C for 35
> years, but would like to learn R. These all have advantages and
> disadvantages for certain tasks, but as I'm new to R I hardly know how
> to evaluate them. Any suggestions?


* C is fast, but with a syntax that is (to my mind) virtually
  incomprehensible.  (You probably think differently about this.)

* In C, you essentially have to roll your own for all tasks; in R,
  practically anything (well ...) that you want to do has already
  been programmed up.  CRAN is a wonderful resource, and there's more
  on github.

* The syntax of R meshes beautifully with *my* thought patterns; YMMV.

* Why not just bog in and try R out?  It's free, it's readily available,
  and there are a number of good online tutorials.

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
