Re: [R] Using interactive plots to get information about data points

2008-08-23 Thread Michael Bibo
jcarmichael jcarmichael314 at gmail.com writes:

 
 
 I have been experimenting with interactive packages such as iplots and playwith. 
 Consider the following sample dataset:
 
 A  B  C  D
 1  5  5  9
 3  2  8  4
 1  7  3  0
 7  2  2  6
 
 Let's say I make a plot of variable A.  I would like to be able to click on
 a data point (e.g. 3) and have a pop-up window tell me the corresponding
 value for variable D (e.g. 4).  

See ?identify, in particular its labels argument.

Another approach you might like to consider is to use GGobi (www.ggobi.org) with
the rggobi package linking directly to it from R.  GGobi is built specifically
for this kind of interactive purpose.
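
A minimal sketch of the identify() suggestion, assuming the four-row sample data from the question is held in a data frame called dat (a name introduced here only for illustration). identify() needs an interactive graphics device, so that part is guarded:

```r
# The sample data from the question.
dat <- data.frame(A = c(1, 3, 1, 7), B = c(5, 2, 7, 2),
                  C = c(5, 8, 3, 2), D = c(9, 4, 0, 6))

# Plot A against its row index; identify() then waits for mouse clicks
# and writes the matching value of D next to each clicked point.
plot(seq_along(dat$A), dat$A, xlab = "row", ylab = "A")
if (interactive()) {
  picked <- identify(seq_along(dat$A), dat$A, labels = dat$D)
  dat[picked, ]   # the full rows of the points that were clicked
}
```

This labels the point on the plot rather than opening a pop-up window, which may or may not be close enough to what was asked.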

 I am also trying to produce multiple small
 plots.  For example, four side-by-side boxplots for each of the four
 variables A, B, C, D.

See ?par, e.g. par(mfrow=c(1,4)) (for base graphics).
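
As a rough sketch of the par(mfrow=...) approach, again assuming the sample data sits in a data frame named dat (a hypothetical name):

```r
dat <- data.frame(A = c(1, 3, 1, 7), B = c(5, 2, 7, 2),
                  C = c(5, 8, 3, 2), D = c(9, 4, 0, 6))

op <- par(mfrow = c(1, 4))        # 1 row, 4 columns of panels
for (v in names(dat)) {
  boxplot(dat[[v]], main = v)     # one boxplot per variable
}
par(op)                           # restore the previous layout
```

If the four variables share a scale, a single call boxplot(dat) draws them side by side in one panel instead.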


Hope this helps.

Michael Bibo
Queensland Health

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lost in the SNOW at 4 AM; parallelization confusion...

2008-08-23 Thread Eric Rupley



Apologies at what must be a very basic question, but I have not found  
any clear examples on how to design the following


I would like to run iterative analysis over several processors.  A toy  
example of the analysis is attached; for a resampling function run 1k  
times, with two different sets of conditioning variables i,j on some  
data vec...


What is the usual way to attack such a problem using snow?  My  
understanding up to this point is that one should:


(1) set the random seed to uncorrelate the processors' actions in  
select()


(2) make a function myfunc(vec,i,j) which returns the item of interest

(3) set up a wrapper which iterates through i,j, and makes the call to  
the cluster


(4) call the cluster using clusterApply(cl,vec, myfunc)

I must be terribly confused based on the results attached below... any  
advice will be appreciated...



Many thanks,
Best,
Eric

--
 Eric Rupley
 University of Michigan, Museum of Anthropology
 1109 Geddes Ave, Rm. 4013
 Ann Arbor, MI 48109-1079

 [EMAIL PROTECTED]
 +1.734.276.8572



# set up
#
# cl <- makeCluster(7)
#   8 slaves are spawned successfully. 0 failed.
#clusterSetupRNG(cl)
#[1] RNGstream


vec <- runif(1000,1,100)
d <- NULL; c.j <- NULL; c.i <- NULL

# the toy function

analysis.func <- function (vec,i,j) {
  b <- NULL
  for (k in c(1:1000)) {
    a <- sample(vec,1000,replace=TRUE) # requires randoms...
    b <- append(b, mean(a))
  }
  c <- (sum(b)*j)/i
  return(c)
}


# the analysis

system.time(for (i in c(2,4)) { # a series of nested iterations...

  for (j in c(5:6)) {

    d <- append( mean( as.numeric( clusterApply(cl,vec,analysis.func,i,j) ) ), d)

    # this is ugly and contorted; there has to be a better way?
    c.j <- append(j, c.j)
    c.i <- append(i, c.i)
  }
})

#   user  system elapsed
#  9.758   0.291  48.771
#

# but the old way is faster...

d <- NULL; c.j <- NULL; c.i <- NULL # set up again

system.time(for (i in c(2,4)) { # a series of nested iterations...

  for (j in c(5:6)) {

    d <- append( mean( as.numeric( analysis.func(vec,i,j) )), d)
    # keeping it ugly for timing comparison...
    c.j <- append(j, c.j)
    c.i <- append(i, c.i)
  }
})


#   user  system elapsed
#  0.299   0.002   0.299
#  # arrgrgrgrgrg!!!

stopCluster(cl)
#[1] 1
sessionInfo()
#R version 2.7.1 (2008-06-23)
#i386-apple-darwin8.10.1
#
#locale:
#en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
#
#attached base packages:
#[1] stats graphics  grDevices utils datasets  methods   base
#
#other attached packages:
#[1] rlecuyer_0.1 boot_1.2-33  snow_0.3-3   Rmpi_0.5-5
#
#loaded via a namespace (and not attached):
#[1] tools_2.7.1
date()
#[1] Sat Aug 23 04:25:50 2008
#
#Too late for a drink. Pity.



Re: [R] Coordinate systems for geostatistics in R (imicola)

2008-08-23 Thread Patrick Giraudoux
 If you use the spatial objects provided by the 
sp-package (http://cran.r-project.org/web/packages/sp/vignettes/sp.pdf) 
you transform your data to other projections using the spTransform package.


Thus you will need the rgdal package in complement (it actually includes 
spTransform). This function is extremely convenient: you can manage 
coordinate transformations extremely easily for common systems (WGS84, 
UTM) within the R environment.






Re: [R] Using lme, how to specify: (1) repeated measures, and (2) Toeplitz covariance structure?

2008-08-23 Thread David Hajage
1) I think that the repeated measure is not years, but, as you said, the
count of birds. If you are interested in the effect of the time variable
(years), perhaps you need to introduce it as a fixed effect?

2) See ?corARMA.
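
A hedged sketch of what ?corARMA offers here: a moving-average structure of order q gives a banded (Toeplitz-type) correlation matrix within each group. The data frame d, the column name year and the starting values 0.4 and 0.2 are all made up for illustration; only sampleunit comes from the question.

```r
library(nlme)  # ships with R as a recommended package

# Made-up design: 3 sample units, each observed in years 1 to 4.
d <- data.frame(sampleunit = rep(c("a", "b", "c"), each = 4),
                year       = rep(1:4, times = 3))

# MA(2) correlation within each sample unit; 0.4 and 0.2 are
# arbitrary starting values chosen only for illustration.
cs <- corARMA(value = c(0.4, 0.2), form = ~ year | sampleunit, p = 0, q = 2)
cs <- Initialize(cs, data = d)
m  <- corMatrix(cs)[[1]]   # correlation matrix of the first sample unit
round(m, 3)                # equal values along each diagonal, 0 beyond lag 2
```

In a model fit, such a structure would be passed through the correlation argument of lme, and unequal variances per year could be requested with something like weights = varIdent(form = ~ 1 | year). Whether that matches the intended "Toeplitz heterogeneous" structure exactly is an assumption that should be checked against the analysis plan.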

2008/8/22 mtb954 [EMAIL PROTECTED]

 We are attempting to use nlme to fit a linear mixed model to explain bird
 abundance as a function of habitat:

 lme(abundance~habitat-1,data=data,method="ML",random=~1|sampleunit)

 The data consist of repeated counts of birds in sample units across
 multiple
 years, and we have two questions:

 1) Is it necessary (and, if so, how) to specify the repeated measure
 (years)?

 2) How can we specify a Toeplitz heterogeneous covariance structure for
 this
 model? We have searched the help file for lme, and the R-help archives, but
 cannot find any pertinent information.

 Any help would be appreciated.

 Thanks, Mark



Re: [R] simple generation of artificial data with defined features

2008-08-23 Thread drflxms
Dear Mr. Christos Hatzis,

thank you so much for your answer which is in my eyes just brilliant! I
followed it step by step (great and detailed explanation) and nearly
everything is fine. - Except a problem in the very end, I haven't found
a solution for until now. (Despite playing around quite a lot...)
Please let me explain:

> election.2005 <- c(16194,13136,3494,3838,4648,4118) # cut off the last 3
> # digits, because my laptop can't handle millions of rows...
> attr(election.2005, "class") <- "table"
> attr(election.2005, "dim") <- c(1,6)
> attr(election.2005, "dimnames") <- list(c("votes"), c("spd", "cdu",
+ "csu", "gruene", "fdp", "pds"))
> head(election.2005)
        spd   cdu  csu gruene  fdp  pds
votes 16194 13136 3494   3838 4648 4118
> el.dt <- as.data.frame(election.2005)
> el.dt.exp <- el.dt[rep(1:nrow(el.dt), el.dt$Freq), -ncol(el.dt)]
> dim(el.dt.exp)
[1] 45428     2
> head(el.dt.exp)
     Var1 Var2
1   votes  spd
1.1 votes  spd
1.2 votes  spd
1.3 votes  spd
1.4 votes  spd
1.5 votes  spd

My problem now is that I would need either an autoincrementing
identifier instead of "votes" in Var1 or the possibility to access the
numbering by a column name (e.g. Var0). In addition I need a third
variable for the year of the election (2005, which is the same for all,
but needed later on). So this is what it should look like:

    voter.id party election.year
1          1   spd          2005
1.1        2   spd          2005
1.2        3   spd          2005
1.3        4   spd          2005
1.4        5   spd          2005
1.5        6   spd          2005

The reason for that is the input format of the kappam.fleiss function of
the irr package I use for calculation. It accepts a data.frame with the
categories as rows (here we would have only one category: the year of the
election) and the raters (here the voters) as columns. In the data.frame
there will be the chosen party for each combination of election year and
voter.

This format can be easily achieved using the reshape package. Assuming
voter.id would be an autoincrementing identifier, the command should be:

library(reshape)
el.dt.exp.molten <- melt(el.dt.exp, id=c("voter.id")) # which would
# probably not really change anything in this case, because the data is
# already in a molten form
kappa.frame <- cast(el.dt.exp.molten, election.year ~ voter.id,
subset=variable=="party")

I'd be extremely happy in case you might help me out again!
Have a nice weekend and many thanks so far!
Greetings from Munich,

Felix Mueller-Sarnowski


Christos Hatzis wrote:
 On the general question on how to create a dataset that matches the
 frequencies in a table, function as.data.frame can be useful.  It takes as
 argument an object of a class 'table' and returns a data frame of
 frequencies.

 Consider for example table 6.1 of Fleiss et al (3rd Ed):

   
 birth.weight <- c(10,15,40,135)
 attr(birth.weight, "class") <- "table"
 attr(birth.weight, "dim") <- c(2,2)
 attr(birth.weight, "dimnames") <- list(c("A", "Ab"), c("B", "Bb"))
 birth.weight
 
  B  Bb
 A   10  40
 Ab  15 135
   
 summary(birth.weight)
 
 Number of cases in table: 200 
 Number of factors: 2 
 Test for independence of all factors:
 Chisq = 3.429, df = 1, p-value = 0.06408
   
 bw.dt <- as.data.frame(birth.weight)
 

 Observations (rows) in this table can then be replicated according to their
 corresponding frequencies to yield the expanded dataset that conforms with
 the original table. 

   
 bw.dt.exp <- bw.dt[rep(1:nrow(bw.dt), bw.dt$Freq), -ncol(bw.dt)]
 dim(bw.dt.exp)
 
 [1] 200   2
   
 table(bw.dt.exp)
 
 Var2
 Var1   B  Bb
   A   10  40
   Ab  15 135 

 The above approach is not restricted to 2x2 tables, and it should be
 straightforward to generate datasets that conform to arbitrary nxm
 frequency tables.
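
 To illustrate the nxm case, here is a small sketch on a made-up 2 x 3 table (the counts and the level name "Bc" are invented for the example). It expands the table and then checks that tabulating the result gives the counts back:

```r
# A made-up 2 x 3 frequency table, used only for illustration.
tab <- as.table(matrix(c(3, 1, 4, 1, 5, 9), nrow = 2,
                       dimnames = list(Var1 = c("A", "Ab"),
                                       Var2 = c("B", "Bb", "Bc"))))

freq <- as.data.frame(tab)                    # columns Var1, Var2, Freq
rows <- rep(seq_len(nrow(freq)), freq$Freq)   # row k repeated Freq[k] times
expanded <- freq[rows, c("Var1", "Var2")]

nrow(expanded) == sum(tab)        # one row per counted case
all(table(expanded) == tab)       # tabulating again recovers the counts
```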

 -Christos Hatzis


   
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Greg Snow
 Sent: Friday, August 22, 2008 12:41 PM
 To: drflxms; r-help@r-project.org
 Subject: Re: [R] simple generation of artificial data with 
 defined features

 I don't think that the election data is the right data to 
 demonstrate Kappa, you need subjects that are classified by 2 
 or more different raters/methods.  The election data could be 
 considered classifying the voters into which party they voted 
 for, but you only have 1 rater.  Maybe if you had some survey 
 data that showed which party each voter voted for in 2 or 
 more elections, then that may be a good example dataset.  
 Otherwise you may want to stick with the sample datasets.

 There are other packages that compute Kappa values as well (I 
 don't know if others calculate this particular version), but 
 some of those take the summary data as input rather than the 
 raw data, which may be easier if you just have the summary tables.


 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 [EMAIL PROTECTED]
 (801) 408-8111



 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL 

Re: [R] Sending ... to a C external

2008-08-23 Thread Barry Rowlingson
2008/8/22 Emmanuel Charpentier [EMAIL PROTECTED]:
 Le vendredi 22 août 2008 à 15:16 -0400, John Tillinghast a écrit :
 I'm trying to figure this out with Writing R Extensions but there's not a
 lot of detail on this issue.
 I want to write a (very simple really) C external that will be able to take
 ... as an argument.
 (It's for optimizing a function that may have several parameters besides the
 ones being optimized.)

 !!! That's a hard one. I have never undertaken this kind of job, but I expect
 that your ... argument, if you can reach it from C (which I don't know) will
 be bound to a Lisp-like structure, notoriously hard to decode in C. Basically,
 you'll have to create very low level code (and duplicate a good chunk of the R
 parser-interpreter...).

 I'd rather treat the ... argument in a wrapper that could call the relevant
 C function with all arguments interpreted and bound... This wrapper would
 probably be an order of magnitude slower than C code, but two orders of 
 magnitude
 easier to write (and maintain !). Since ... argument parsing would be done 
 *once*
 before the grunt work is accomplished by C code, the slowdown would 
 (probably)
 be negligible...

 I think you're overstating the problem somewhat! Everything you need
to process ... in C is pretty much in the showArgs function in the
R-ext help. The problem is that John's not told us what his error was!

 It works for me:

 > showArgs(x=2,y=3,z=4)
 [1] 'x' 2.00
 [2] 'y' 3.00
 [3] 'z' 4.00
 NULL
 > showArgs(x=2,y=3,z="s")
 [1] 'x' 2.00
 [2] 'y' 3.00
 [3] 'z' s
 NULL

 But I can get an error if I don't name an argument:

 > showArgs(x=2,y=3,"s")
 [1] 'x' 2.00
 [2] 'y' 3.00
 Error in showArgs(x = 2, y = 3, "s") :
   CHAR() can only be applied to a 'CHARSXP', not a 'NULL'

But that's just because the C doesn't check for this.

 Is that what you're getting? What errors are you getting?

Barry

[this was on an R 2.7.0 I had kicking around, so maybe changed for
later versions...]



Re: [R] simple generation of artificial data with defined features

2008-08-23 Thread Christoph Meyer
Hi,

to add voter.id and election.year to your data frame you could try:

el.dt.exp$voter.id=seq_len(nrow(el.dt.exp))

el.dt.exp$election.year=2005
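
A self-contained sketch of the same idea on a tiny stand-in for el.dt.exp. The counts below are made up so that the example stays small, and the column name party is taken from the layout requested in the question:

```r
# Small stand-in for the expanded data (made-up counts, illustration only).
counts <- c(spd = 3, cdu = 2, csu = 1)
el.dt.exp <- data.frame(party = rep(names(counts), times = counts))

el.dt.exp$voter.id      <- seq_len(nrow(el.dt.exp))  # 1, 2, ..., n
el.dt.exp$election.year <- 2005                      # recycled to every row
rownames(el.dt.exp)     <- NULL                      # drop labels such as 1.1, 1.2

# Put the columns in the order asked for in the question.
el.dt.exp <- el.dt.exp[, c("voter.id", "party", "election.year")]
el.dt.exp
```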

Cheers,

Christoph Meyer


***
Dr. Christoph Meyer
Institute of Experimental Ecology
University of Ulm
Albert-Einstein-Allee 11
D-89069 Ulm
Germany
Phone:  ++49-(0)731-502-2675
Fax:++49-(0)731-502-2683
Mobile: ++49-(0)1577-156-7049
E-mail: [EMAIL PROTECTED]
http://www.uni-ulm.de/index.php?id=7885
***




Re: [R] simple generation of artificial data with defined features

2008-08-23 Thread drflxms
Hello Mr. Greg Snow!

Thank you very much for your prompt answer.
 I don't think that the election data is the right data to demonstrate Kappa, 
 you need subjects that are classified by 2 or more different raters/methods.  
 The election data could be considered classifying the voters into which party 
 they voted for, but you only have 1 rater.
I think it should be possible to calculate kappa if one takes a slightly
different point of view from the one you described above: take the
voters as raters who judge the category "election" with one party
out of the six mentioned in my previous e-mail (which are simply the top
six).
This makes sense to me, because an election is essentially nothing but
a survey with the question "who should lead our country?", given six
options in this example. As kappa is a measure of agreement, it should
be able to illustrate the agreement of the voters' answers to this question.
For me this is, in principle, no different from asking "Where is the
stenosis in the video of this endoscopy?", offering six options that
each represent an anatomic location.
 Otherwise you may want to stick with the sample datasets.
   
The example data sets are of excellent quality and very interesting. I
am sure there would be brilliant examples among them. But I have to
admit that I do not have a good overview of the available datasets at
the moment (as a newbie). I just wanted to give an example from everyday
life that everybody is familiar with. An election is something which
came to my mind spontaneously.
 There are other packages that compute Kappa values as well (I don't know if 
 others calculate this particular version), but some of those take the summary 
 data as input rather than the raw data, which may be easier if you just have 
 the summary tables.

   
I chose Fleiss' Kappa, because it is a more general form of Cohen's Kappa
allowing m raters and n categories (instead of only two raters and two
categories when using Cohen's kappa). Looking for another package
calculating it from summary tables might be the simplest solution to my
problem. Thank you very much for this hint!
On the other hand it would be nice to use the very same method for the
example as for the real data. The example will be part of the
methods section.
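
For what it is worth, the summary-table route can be sketched in a few lines of base R. The function fleiss.kappa below is a hypothetical helper written from the textbook formula, not the irr implementation, and it treats the election as a single rated subject with the voters as raters:

```r
# Vote counts (in thousands) for the single "subject", the 2005 election.
votes <- c(spd = 16194, cdu = 13136, csu = 3494,
           gruene = 3838, fdp = 4648, pds = 4118)

# fleiss.kappa() is a hypothetical helper, not the irr implementation.
# 'counts' is a subjects x categories matrix of rating counts; every
# row must sum to the same number of raters n.
fleiss.kappa <- function(counts) {
  n   <- sum(counts[1, ])                           # raters per subject
  p.j <- colSums(counts) / sum(counts)              # category proportions
  P.i <- (rowSums(counts^2) - n) / (n * (n - 1))    # agreement per subject
  P.e <- sum(p.j^2)                                 # chance agreement
  (mean(P.i) - P.e) / (1 - P.e)
}

k1 <- fleiss.kappa(matrix(votes, nrow = 1))
n  <- sum(votes)
c(kappa = k1, minus.one.over.n.minus.1 = -1 / (n - 1))
```

With only one subject the formula collapses to -1/(n - 1) whatever the vote shares are, so under this reading the election data cannot show more or less agreement. That seems to support the remark that several rated subjects are needed.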

Thank you again very much for your tips and the quick reply. Have a nice
weekend!
Greetings from Munich,

Felix Mueller-Sarnowski
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of drflxms
 Sent: Friday, August 22, 2008 6:12 AM
 To: r-help@r-project.org
 Subject: [R] simple generation of artificial data with
 defined features

 Dear R-colleagues,

 I am quite a newbie to R fighting my stupidity to solve a
 probably quite simple problem of generating artificial data
 with defined features.

 I am conducting a study of inter-observer-agreement in
 child-bronchoscopy. One of the most important measures is
 Kappa according to Fleiss, which is very comfortable
 available in R through the irr-package.
 Unfortunately medical doctors like me don't really understand
 much of statistics. Therefore I'd like to give the reader an
 easy understandable example of Fleiss-Kappa in the Methods
 part. To achieve this, I obtained a table with the results of
 the German election from 2005:

 party    number of votes   percent

 SPD      16194665          34,2
 CDU      13136740          27,8
 CSU       3494309           7,4
 Gruene    3838326           8,1
 FDP       4648144           9,8
 PDS       4118194           8,7

 I want to show the agreement of voters measured by
 Fleiss-Kappa. To calculate this with the
 kappam.fleiss-function of irr, I need a data.frame like this:

 (id of 1st voter) (id of 2nd voter)

 party spd cdu

 Of course I don't plan to calculate this with the million of
 cases mentioned in the table above (I am working on a small
 laptop). A division by 1000 would be more than perfect for
 this example. The exact format of the table is generally not
 so important, as I could reshape nearly every format with the
 help of the reshape-package.

 Unfortunately I could not figure out how to create such a
 fictive/artificial dataset as described above. Any data.frame
 would be nice, that keeps at least the percentage. String-IDs
 of parties could be substituted by numbers of course (would
 be even better for function kappam.fleiss in irr!).

 I would appreciate any kind of help very much indeed.
 Greetings from Munich,

 Felix Mueller-Sarnowski


[R] Error message in termplot

2008-08-23 Thread William Vincent
Hi

I am trying to plot the following gam with termplot but keep getting the
error message:

Error in order(xx) : unimplemented type 'list' in 'orderVector1'

Is there any way I can rectify this to get my parametric coefficients
plotted?

Thanks

Will

Family: binomial
Link function: logit

Formula:
fgha$pa ~ s(fgha$wspd) + fgha$depth + fgha$slha + fgha$hooks +
    wspd:hooks + wspd:hooks:depth

Parametric coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)      -2.188e+00  4.331e-01  -5.051 4.39e-07 ***
fgha$depth        1.878e-04  5.275e-05   3.559 0.000372 ***
fgha$slha        -1.968e-02  7.899e-03  -2.492 0.012698 *
fgha$hooks       -3.155e-06  1.383e-06  -2.282 0.022500 *
wspd:hooks        1.351e-06  2.650e-07   5.099 3.41e-07 ***
wspd:hooks:depth -1.959e-10  4.271e-11  -4.586 4.52e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
               edf Ref.df Chi.sq p-value
s(fgha$wspd) 8.732  9.232  23.87 0.00518 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.0991   Deviance explained = 8.85%
UBRE score = 0.18451  Scale est. = 1         n = 1167

 

William Vincent

07979745433

 




[R] ggplot facet: change layout of panels

2008-08-23 Thread Tom Boonen
Hi,

is there any way to adjust how ggplot(facet=) lays out its
panels? I have a dataset with 25 groups, and qplot(y,x,facet=
.~group) displays all 25 y~x plots next to each other, so overall the
plot is too wide. If I do the same plot in lattice, xyplot(y~x|group),
the y~x plots are arranged nicely, 5 in each row, so overall the plot
is a nice 5 by 5 rectangular grid.

 Is there any way to adjust this in qplot?

Thank you very much.

Best,
Tom



Re: [R] ggplot facet: change layout of panels

2008-08-23 Thread hadley wickham
Hi Tom,

Not yet, but I'm working on it for the next version.

Regards,

Hadley





-- 
http://had.co.nz/



Re: [R] Sending ... to a C external

2008-08-23 Thread Douglas Bates
On Fri, Aug 22, 2008 at 2:16 PM, John Tillinghast [EMAIL PROTECTED] wrote:
 I'm trying to figure this out with Writing R Extensions but there's not a
 lot of detail on this issue.
 I want to write a (very simple really) C external that will be able to take
 ... as an argument.
 (It's for optimizing a function that may have several parameters besides the
 ones being optimized.)

 I got the showArgs code (from R-exts) to compile and install, but it only
 gives me error messages when I run it. I think I'm supposed to pass it
 different arguments from what I'm doing, but I have no idea which ones.

 What exactly are CAR, CDR, and CADR anyway? Why did the R development team
 choose this very un-C-like set of commands? They are not explained much in
 R-exts.

Emmanuel has answered that - very engagingly too.  On the train from
Dortmund to Dusseldorf last week I was describing exactly that
etymology of the names CAR, CDR, CDDR, ... to my companions but I got
it wrong.  I had remembered CDR as "contents of the data register" but
Emmanuel is correct that it was "contents of the decrement register".

I don't think it is necessary to go through the agonies of dealing
with the argument list in a .External call, which is what I assume
you are doing. (You haven't given us very much information on how you
are trying to pass the ... argument and such information would be very
helpful.  The readers of this list are quite intelligent but none, as
far as I know, have claimed to be telepathic.)

The way that I would go about this is using .Call in something like

.Call("myCfunction", arg1, arg2, dots = list(...), PACKAGE = "myPackage")

Then in your C code you check the length and the names of the dots
argument and take appropriate action.
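
The R side of that idea might look roughly like the sketch below. describe.dots is a hypothetical wrapper name, and the .Call line is left as a comment because myCfunction and myPackage are placeholders rather than real compiled code. The wrapper gathers ... into a list and does the name check in R, before anything reaches C:

```r
# describe.dots() is a hypothetical stand-in for the R wrapper: it
# gathers ... into a list and reports what the C side would inspect.
describe.dots <- function(arg1, arg2, ...) {
  dots <- list(...)
  nms  <- names(dots)
  if (is.null(nms)) nms <- rep("", length(dots))
  if (any(nms == "")) stop("all arguments in ... must be named")
  # In real code this would be something like
  #   .Call("myCfunction", arg1, arg2, dots, PACKAGE = "myPackage")
  list(n = length(dots), names = nms)
}

describe.dots(1, 2, x = 2, y = 3, z = "s")
```

Checking for unnamed arguments in the wrapper avoids the kind of C-level error that an unnamed element would otherwise trigger.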

An alternative, if you want to use the ... arguments in an R
expression to be evaluated by your optimizer, is to create an
environment, assign the elements of list(...) to the appropriate names
in that environment and pass the environment through .Call to be used
as the evaluation environment for your R expression.  There is a
somewhat complicated example of this in the nlmer function in the lme4
package which you can find at http://lme4.r-forge.r-project.org/.
However, I don't feel embarrassed about the example being complicated.
 This is complex stuff and it is not surprising that it isn't
completely straightforward to accomplish. If you feel that this is
opaque in R I can hardly wait to see what you think about writing the
SPSS version.
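
The environment-based alternative can be sketched in plain R as follows. eval.with.dots is a hypothetical helper, not code taken from lme4, and eval() stands in for the evaluation that the C routine would perform with the environment it receives:

```r
# eval.with.dots() is a hypothetical helper, not code from lme4.
eval.with.dots <- function(expr, ...) {
  dots <- list(...)
  env  <- new.env()
  for (nm in names(dots)) assign(nm, dots[[nm]], envir = env)
  # A C routine called via .Call could receive 'expr' and 'env' and
  # evaluate the expression there; here eval() plays that role.
  eval(expr, envir = env)
}

eval.with.dots(quote(a * x^2 + b), a = 2, b = 1, x = 3)
```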



[R] graphs for pretest data

2008-08-23 Thread Juliet Hannah
Is there an easy way to make graphs for the following data? I have
pretest and posttest scores for men and
women. I would like to form a 'tilted segment' plot for the data.
That is, make segments joining the scores,
with different types of segments for men and women.

Example data:

menpre <- c(43,42,26,39,60,60,46)
menpost <- c(40,41,36,42,54,58,43)

womenpre <- c(46,56,81,56,70,70)
womenpost <- c(44,52,81,59,69,68)

Thanks!

Juliet



Re: [R] lost in the SNOW at 4 AM; parallelization confusion...

2008-08-23 Thread Martin Morgan
Hi Eric --

Eric Rupley [EMAIL PROTECTED] writes:

 Apologies at what must be a very basic question, but I have not found
 any clear examples on how to design the following

 I would like to run iterative analysis over several processors.  A toy
 example of the analysis is attached; for a resampling function run 1k
 times, with two different sets of conditioning variables i,j on some
 data vec...

 What is the usual way to attack such a problem using snow?  My
 understanding up to this point is that one should:

 (1) set the random seed to uncorrelate the processors' actions in
 select()

 (2) make a function myfunc(vec,i,j) which returns the item of interest

 (3) set up a wrapper which iterates through i,j, and makes the call to
 the cluster

 (4) call the cluster using clusterApply(cl,vec, myfunc)

I think you're on the right track. You say:

for (i in c(2,4)) { # a series of nested iterations...
for (j in c(5:6)) {
clusterApply(cl, vec, analysis.func, i, j)

The clusterApply says, for each element of vec, invoke
analysis.func. vec is of length 1000, so you invoke analysis.func 1000
times, and with the outer loops you're calling analysis.func 2 * 2 *
1000 times.

In your single processor code you have

for (i in c(2,4)) { # a series of nested iterations...
for (j in c(5:6)) {
res <- analysis.func(vec,i,j)

which invokes analysis.func 2 * 2 times. A strategy is to convert your
'for' loops into an appropriate *apply function, which I might do as
(approximately)

> its <- expand.grid(i=c(2, 4), j=c(5, 6))
> mapply(analysis.func, its$i, its$j,
+        MoreArgs=list(vec=vec))
[1] 120719.09  60403.20 144993.44  72468.66

(maybe you mean i=2:4, j=5:6 ?) and then to use the appropriate
cluster* function, e.g.,

> clusterMap(cl, analysis.func, its$i, its$j,
+            MoreArgs=list(vec=vec))
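
Putting the pieces together, a self-contained sketch might look like this. run.grid is a hypothetical helper introduced here, the toy function is rewritten with replicate() instead of growing a vector, and the parallel branch uses clusterMap from the parallel package that now ships with R (an assumption about the reader's setup; the snow function of the same name is called the same way):

```r
# Toy analysis, rewritten without growing a vector inside the loop.
analysis.func <- function(vec, i, j, n.rep = 1000) {
  b <- replicate(n.rep, mean(sample(vec, length(vec), replace = TRUE)))
  (sum(b) * j) / i
}

# One row per (i, j) combination: 2 * 2 = 4 tasks in total.
its <- expand.grid(i = c(2, 4), j = c(5, 6))

# run.grid() is a hypothetical helper: sequential when cl is NULL,
# otherwise one task per row of 'its' is sent to the cluster.
run.grid <- function(vec, its, cl = NULL) {
  if (is.null(cl)) {
    mapply(analysis.func, its$i, its$j, MoreArgs = list(vec = vec))
  } else {
    unlist(parallel::clusterMap(cl, analysis.func, its$i, its$j,
                                MoreArgs = list(vec = vec)))
  }
}

set.seed(1)
vec <- runif(1000, 1, 100)
res <- run.grid(vec, its)   # sequential run; pass cl = <cluster> to go parallel
cbind(its, res)
```

With only four tasks of well under a second each, the communication overhead may still outweigh the gain, so a speed-up is more likely once each task does substantially more work.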

Maybe it is now early enough (though not too early?) for that drink?

Martin


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793



Re: [R] Survey Design / Rake questions

2008-08-23 Thread Thomas Lumley

On Fri, 22 Aug 2008, Farley, Robert wrote:


I *think* I'm making progress, but I'm still failing at the same step.  My rake 
call fails with:
Error in postStratify.survey.design(design, strata[[i]], 
population.margins[[i]],  :
 Stratifying variables don't match

To my naïve eyes, it seems that my factors are in the wrong order.  If so, 
how do I assert an ordering in my survey dataframe, or copy an image from 
the survey dataframe to my marginals dataframes?  I'd prefer to pull the 
original marginals dataframe(s) from the survey dataframe so that I can 
automate that in production.


It looks like a problem with the NumStn factor. One copy has been converted to 
character and then factor, giving levels in alphabetical order; the other copy 
has been converted directly to factor, giving levels in numerical order.

If you use as.factor(1:12) rather than as.character(1:12) it should work.
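
A minimal sketch of the difference described above, using only base R. The object names `via_character` and `direct` are made up for illustration and do not appear in the original code; the point is only that the two routes give the same labels in a different level order.

```r
# Two ways of building a factor from the station counts 1..12.
# (Object names are hypothetical, chosen for this illustration.)
via_character <- factor(as.character(1:12))  # levels sorted as strings
direct        <- factor(1:12)                # levels sorted as numbers

# Print both level orders side by side for comparison.
print(levels(via_character))
print(levels(direct))

# Same set of labels, but not in the same order.
same_labels <- setequal(levels(via_character), levels(direct))
same_order  <- identical(levels(via_character), levels(direct))
print(c(same_labels = same_labels, same_order = same_order))
```

Checking `levels()` on both the survey variable and the margin table before calling rake() is a quick way to spot this kind of mismatch.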

 -thomas




If that's not my problem, where might I look for enlightenment?  Neither ?why 
nor ?whatamimissing return citations.  :-)


**
 How I fail Now **
SurveyData <- read.spss("C:/Data/R/orange_delivery.sav", use.value.labels=TRUE,
max.value.labels=Inf, to.data.frame=TRUE)

#===
temp <- sub(' +$', '', SurveyData$direction_)
SurveyData$direction_ <- temp
#===
SurveyData$NumStn=abs(as.numeric(SurveyData$lineon)-as.numeric(SurveyData$lineoff))
mean(SurveyData$NumStn)

[1] 6.785276

### Kludge
SurveyData$NumStn <- pmax(1,SurveyData$NumStn)
mean(SurveyData$NumStn)

[1] 6.789877

SurveyData$NumStn <- as.factor(SurveyData$NumStn)
###
EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND" )
XTTable <- xtabs(~direction_ , EBSurvey)
XTTable

direction_
EASTBOUND
 345

WBSurvey <- subset(SurveyData, direction_ == "WESTBOUND" )
XTTable <- xtabs(~direction_ , WBSurvey)
XTTable

direction_
WESTBOUND
 307

#
EBDesign <- svydesign(id=~sampn, weights=~expwgt, data=EBSurvey)
#   svytable(~lineon+lineoff, EBDesign)
StnName <- c("Warner Center", "De Soto", "Pierce College", "Tampa", "Reseda",
"Balboa", "Woodley", "Sepulveda", "Van Nuys", "Woodman", "Valley College",
"Laurel Canyon", "North Hollywood")
EBOnNewTots <- c(1000, 600, 1200, 500, 1000, 500, 200, 250, 1000, 300, 100,
123.65, 0)
StnTraveld <- c(as.character(1:12))
EBNumStn <- c(673.65, 800, 1000, 1000, 800, 700, 600, 500, 400, 200, 50, 50)
ByEBOn  <- data.frame(StnName,   Freq=EBOnNewTots)
ByEBNum <- data.frame(StnTraveld, Freq=EBNumStn)
RakedEBSurvey <- rake(EBDesign, list(~lineon, ~NumStn), list(ByEBOn, ByEBNum) )

Error in postStratify.survey.design(design, strata[[i]], 
population.margins[[i]],  :
 Stratifying variables don't match


str(EBSurvey$lineon)

Factor w/ 13 levels "Warner Center",..: 3 1 1 1 2 13 1 5 1 5 ...

EBSurvey$lineon[1:5]

[1] Pierce College Warner Center  Warner Center  Warner Center  De Soto
Levels: Warner Center De Soto Pierce College Tampa Reseda Balboa Woodley 
Sepulveda Van Nuys Woodman Valley College Laurel Canyon North Hollywood

str(ByEBOn$StnName)

Factor w/ 13 levels "Balboa","De Soto",..: 11 2 5 8 6 1 12 7 10 13 ...

ByEBOn$StnName[1:5]

[1] Warner Center  De Soto        Pierce College Tampa          Reseda
Levels: Balboa De Soto Laurel Canyon North Hollywood Pierce College Reseda 
Sepulveda Tampa Valley College Van Nuys Warner Center Woodley Woodman


str(EBSurvey$NumStn)

Factor w/ 12 levels "1","2","3","4",..: 10 12 4 12 8 1 8 8 12 4 ...

EBSurvey$NumStn[1:5]

[1] 10 12 4  12 8
Levels: 1 2 3 4 5 6 7 8 9 10 11 12

str(ByEBNum$StnTraveld)

Factor w/ 12 levels "1","10","11",..: 1 5 6 7 8 9 10 11 12 2 ...

ByEBNum$StnTraveld[1:5]

[1] 1 2 3 4 5
Levels: 1 10 11 12 2 3 4 5 6 7 8 9




**
**


Robert Farley
Metro
www.Metro.net


-Original Message-
From: Thomas Lumley [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 21, 2008 13:55
To: Farley, Robert
Cc: r-help@r-project.org
Subject: Re: [R] Survey Design / Rake questions

On Tue, 19 Aug 2008, Farley, Robert wrote:


While I'm trying to catch up on the statistical basis of my task, could
someone point me to how I should fix my R error?


The variables in the formula in rake() need to be the raw variables in the
design object, not summary tables.

  -thomas



Thanks





library(survey)
SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",

use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)



#===


temp <- sub(' +$', '', 

Re: [R] graphs for pretest data

2008-08-23 Thread John Kane
?plot ?lines

Something like this perhaps

plot(menpre, type="l", col="red")
lines(menpost, col="blue")
lines(womenpre, col="green")
lines(womenpost, col="orange")

also have a look at ?par for various options
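
For the 'tilted segment' plot itself, one possible sketch in base graphics uses segments() to join each subject's pretest score (x = 1) to the posttest score (x = 2). The axis layout, colours, line types and the helper name `yr` are my own choices for illustration, not something taken from the thread.

```r
# Example data from the original question.
menpre    <- c(43, 42, 26, 39, 60, 60, 46)
menpost   <- c(40, 41, 36, 42, 54, 58, 43)
womenpre  <- c(46, 56, 81, 56, 70, 70)
womenpost <- c(44, 52, 81, 59, 69, 68)

# Common y-range over all four vectors so that no subject is clipped.
yr <- range(menpre, menpost, womenpre, womenpost)

# Empty frame with two x positions: 1 = pretest, 2 = posttest.
plot(NA, xlim = c(0.8, 2.2), ylim = yr, xaxt = "n",
     xlab = "", ylab = "Score")
axis(1, at = 1:2, labels = c("pre", "post"))

# One segment per subject; line type and colour distinguish the groups.
segments(1, menpre,   2, menpost,   lty = 1, col = "blue")
segments(1, womenpre, 2, womenpost, lty = 2, col = "red")
legend("topleft", legend = c("men", "women"),
       lty = c(1, 2), col = c("blue", "red"))
```

Setting ylim from all four vectors matters here because the women's scores reach 81 while the men's stop at 60.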




--- On Sat, 8/23/08, Juliet Hannah [EMAIL PROTECTED] wrote:

 From: Juliet Hannah [EMAIL PROTECTED]
 Subject: [R] graphs for pretest data
 To: r-help@r-project.org
 Received: Saturday, August 23, 2008, 12:04 PM
 Is there an easy way to make graphs for the following data.
 I have
 pretest and posttest scores for men and
 women. I would like to form a 'tilted segment'
 plot for the data.
 That is, make segments joining the scores,
 with different types of segments for men and women.
 
 Example data:
 
 menpre <- c(43,42,26,39,60,60,46)
 menpost <- c(40,41,36,42,54,58,43)
 
 womenpre <- c(46,56,81,56,70,70)
 womenpost <- c(44,52,81,59,69,68)
 
 Thanks!
 
 Juliet
 


[R] Forthcoming R Conferences

2008-08-23 Thread Friedrich . Leisch


Dear useRs and developeRs,

I hope all attending useR! in Dortmund last week had as good a
time as I did and a safe trip home. This email is to announce our
plans for forthcoming conferences. In 2009 there will be a useR! in
Rennes, France (July 8-10), directly followed by a DSC in Copenhagen,
Denmark (July 13-14).

We would like to have a useR! 2010 in North America, and 2011 in
Europe. Locations for 2010 and 2011 have not been fixed yet, but there
are already some plans.  Proposals to host a conference are of course
more than welcome.

There have also been questions why 2009 is again in Europe. The main
reasons were:

a) we had a very good offer from Rennes, and (at least almost one
   year ago, when planning started) none from outside Europe

b) Having European useR!s in odd years will make it easier to set a
   date: in even years there are the biennial Compstat conferences,
   which together with all their satellite meetings block mid-August
   to the beginning of September. In odd years on the other hand,
   there is no regular big conference on computational statistics in
   Europe.

So the basic plan is to be in Europe in odd years, and North America
in even years from now on. Either continent could of course be
replaced by one of the other 5 continents if we get a good
offer. Antarctica may be hard to get to, though ;-)

On behalf of the R Foundation,
Fritz Leisch

-- 
---
Prof. Dr. Friedrich Leisch 

Institut für Statistik  Tel: (+49 89) 2180 3165
Ludwig-Maximilians-Universität  Fax: (+49 89) 2180 5308
Ludwigstraße 33
D-80539 München http://www.statistik.lmu.de/~leisch
---
   Journal Computational Statistics --- http://www.springer.com/180 
  Münchner R Kurse --- http://www.statistik.lmu.de/R

___
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce



Re: [R] graphs for pretest data

2008-08-23 Thread Michael Kubovy
Dear Juliet,

Perhaps start here:

require(lattice)
mwpp <- data.frame(y = c(43,42,26,39,60,60,46,40,41,36,42,54,
58,43,46,56,81,56,70,70,44,52,81,59,69,68),
sex = rep(c(rep('men', 14), rep('women', 12))),
pp = c(rep(c('pre', 'post'), each = 7), rep(c('pre', 'post'), each = 6)),
sub = c(1:7, 1:7, 8:13, 8:13))
xyplot(y ~ pp | sex, groups = sub, type = 'b', mwpp)

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O. Box 400400, Charlottesville, VA 22904-4400
Parcels: Room 102, Gilmer Hall
         McCormick Road, Charlottesville, VA 22903
Office: B011, +1-434-982-4729
Lab: B019, +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/

On Aug 23, 2008, at 12:04 PM, Juliet Hannah wrote:

 Is there an easy way to make graphs for the following data. I have
 pretest and posttest scores for men and
 women. I would like to form a 'tilted segment' plot for the data.
 That is, make segments joining the scores,
 with different types of segments for men and women.

 Example data:

 menpre <- c(43,42,26,39,60,60,46)
 menpost <- c(40,41,36,42,54,58,43)

 womenpre <- c(46,56,81,56,70,70)
 womenpost <- c(44,52,81,59,69,68)

 Thanks!

 Juliet



Re: [R] graphs for pretest data

2008-08-23 Thread hadley wickham
On Sat, Aug 23, 2008 at 1:10 PM, Michael Kubovy [EMAIL PROTECTED] wrote:
 Dear Juliet,

 Perhaps start here:

 require(lattice)
 mwpp <- data.frame(y = c(43,42,26,39,60,60,46,40,41,36,42,54,
58,43,46,56,81,56,70,70,44,52,81,59,69,68),
sex = rep(c(rep('men', 14), rep('women', 12))),
pp = c(rep(c('pre', 'post'), each = 7), rep(c('pre', 'post'), each = 6)),
sub = c(1:7, 1:7, 8:13, 8:13))

Or in ggplot2:

library(ggplot2)
qplot(pp, y, data=mwpp, geom=c("point","line"), group = sub, colour=sex)
qplot(pp, y, data=mwpp, geom=c("point","line"), group = sub, facets = . ~ sex)

The key is to get your data into a data frame with variables that
explicitly label the experimental units, as Michael did for you.
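
As a rough sketch of that reshaping step, the long data frame can be assembled from the four original vectors rather than typed out by hand. The names `long`, `n_men` and `n_women` are hypothetical, and fixing the factor levels of `pp` so that "pre" comes before "post" is my own assumption about the desired plotting order.

```r
# Example data from the original question.
menpre    <- c(43, 42, 26, 39, 60, 60, 46)
menpost   <- c(40, 41, 36, 42, 54, 58, 43)
womenpre  <- c(46, 56, 81, 56, 70, 70)
womenpost <- c(44, 52, 81, 59, 69, 68)

n_men   <- length(menpre)    # 7 men
n_women <- length(womenpre)  # 6 women

# One row per score, with explicit labels for sex, occasion and subject.
long <- data.frame(
  y   = c(menpre, menpost, womenpre, womenpost),
  sex = rep(c("men", "women"), times = c(2 * n_men, 2 * n_women)),
  pp  = factor(c(rep(c("pre", "post"), each = n_men),
                 rep(c("pre", "post"), each = n_women)),
               levels = c("pre", "post")),  # assumed order: pre first
  sub = c(rep(seq_len(n_men), 2),
          rep(n_men + seq_len(n_women), 2))
)

str(long)
```

Each subject id then appears exactly twice, once per occasion, which is what the `group = sub` argument in the plotting calls relies on.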

Hadley


-- 
http://had.co.nz/



[R] Tinn-R keyboard problem

2008-08-23 Thread Sermin Gungor
 The right-hand side of my keyboard (Enter, Shift, arrows, etc.) has
stopped working, but only when I am using Tinn-R. It works perfectly well in
any other application. To check whether the keyboard itself was at fault, I
connected an external keyboard, and the same keys did not work with that
either. Has anyone had the same problem before and found a
solution?

 Thanks,

 Sermin
