from:"Bryan Hanson"

Re: [R] ggplot2: proper use of facet_grid inside a function

2009-10-05 Thread Bryan Hanson

Thanks Thierry for the work-around.  I was out of ideas.

I had looked around for the facet_grid() analog of aes_string(), and
concluded there wasn't one.  The only thing I found was the notion of
facet_grid(...), but apparently it is intended for some other use, as it
doesn't work as I thought it would (like a hypothetical
facet_grid_string()).

Thanks so much.  Bryan
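
For later readers: one way to get the effect of a hypothetical facet_grid_string() is to build the faceting formula from the string yourself. The sketch below is not from the thread. It assumes a ggplot2 version that accepts a formula object in facet_grid() and supports the .data pronoun inside aes() (3.0.0 or later, as far as I know); the helper name facet_plot is made up for illustration.

```r
library(ggplot2)

# Hypothetical helper: every column is passed as a character string.
facet_plot <- function(data, xvar, yvar, facet_var) {
  # Turn the string into the formula ". ~ <facet_var>".
  facet_formula <- as.formula(paste(". ~", facet_var))
  ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_point() +
    facet_grid(facet_formula)
}

p <- facet_plot(mpg, "cty", "displ", "drv")
print(p)
```

This avoids the dummy column, at the cost of relying on features that did not exist in ggplot2 0.8.3.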


On 10/5/09 4:12 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:

 Dear Bryan,
 
 In the ggplot() function you can choose between aes() and aes_string().
 In the first you need to hardwire the variable names, in the latter you
 can use objects which contain the variable names. So in your case you
 need aes_string().
 
 Unfortunately, facet_grid() works like aes() and not like aes_string().
 That is why you are getting errors.
 
 A workaround would be to add a dummy column to your data.
 
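 library(ggplot2)
 data <- mpg
 # variable names held as character strings
 fac1 <- "cty"
 fac2 <- "drv"
 res <- "displ"
 # copy the faceting variable into a column with a fixed, known name
 data$dummy <- data[[fac2]]
 ggplot(data, aes_string(x = fac1, y = res)) + geom_point() +
 facet_grid(. ~ dummy)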
 
 HTH,
 
 Thierry
 
 
 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature
 and Forest
 Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
 methodology and quality assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 tel. + 32 54/436 185
 thierry.onkel...@inbo.be
 www.inbo.be
 
 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to
 say what the experiment died of.
 ~ Sir Ronald Aylmer Fisher
 
 The plural of anecdote is not data.
 ~ Roger Brinner
 
 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of
 data.
 ~ John Tukey
  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ggplot2: proper use of facet_grid inside a function

2009-10-02 Thread Bryan Hanson

Hello Again R Folk:

I have found items about this in the archives, but I’m still not getting
it right.  I want to use ggplot2 with facet_grid inside a function with
user specified variables, for instance:

p <- ggplot(data, aes_string(x = fac1, y = res)) + facet_grid(. ~ fac2)

Where data, fac1, fac2 and res are arguments to the function.  I have
tried

p <- ggplot(data, aes_string(x = fac1, y = res)) + facet_grid(. ~ as.name(fac2))

and

p <- ggplot(data, aes_string(x = fac1, y = res)) + facet_grid(". ~ fac2")

But all of these produce the same error:

Error in `[.data.frame`(plot$data, , setdiff(cond, names(df)), drop =
FALSE) : 
  undefined columns selected

If I hardwire the true identity of fac2 into the function, it works as
desired, so I know this is a problem of connecting the name with the
proper value.

I'm up to date on everything:

R version 2.9.2 (2009-08-24) 
i386-apple-darwin8.11.1 

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      datasets  tools     utils     stats     graphics  grDevices methods
[9] base

other attached packages:
 [1] Hmisc_3.6-0        ggplot2_0.8.3      reshape_0.8.3      proto_0.3-8
 [5] mvbutils_2.2.0     ChemoSpec_1.1      lattice_0.17-25    mvoutlier_1.4
 [9] plyr_0.1.8         RColorBrewer_1.0-2 chemometrics_0.4   som_0.3-4
[13] robustbase_0.4-5   rpart_3.1-45       pls_2.1-0          pcaPP_1.7
[17] mvtnorm_0.9-7      nnet_7.2-48        mclust_3.2         MASS_7.2-48
[21] lars_0.9-7         e1071_1.5-19       class_7.2-48

loaded via a namespace (and not attached):
[1] cluster_1.12.0

Thanks for any help!  Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


[R] Teasing out logrank differences between groups using survdiff or something else?

2009-09-15 Thread Bryan Hanson

R Folk:

Please forgive what I'm sure is a fairly naïve question; I hope it's clear.
A colleague and I have been doing a really simple one-off survival analysis,
but this is an area with which we are not very familiar, we just happen to
have gathered some data that needs this type of analysis.  We've done quite
a bit of reading, but answers escape us, even though the question below
seems simple. 

Considering the following example from ?survdiff:

> survdiff(Surv(time, status) ~ pat.karno, data=lung)
Call:
survdiff(formula = Surv(time, status) ~ pat.karno, data = lung)

n=225, 3 observations deleted due to missingness.

               N Observed Expected (O-E)^2/E (O-E)^2/V
pat.karno=30   2        1    0.658    0.1774     0.179
pat.karno=40   2        1    1.337    0.0847     0.086
pat.karno=50   4        4    1.079    7.9088     8.013
pat.karno=60  30       27   15.237    9.0808    10.148
pat.karno=70  41       31   26.264    0.8540     1.027
pat.karno=80  51       39   40.881    0.0865     0.117
pat.karno=90  60       38   49.411    2.6354     3.853
pat.karno=100 35       21   27.133    1.3863     1.684

 Chisq= 22.6  on 7 degrees of freedom, p= 0.00202

The p value here is for the entire group (right?).  How do we go about
determining the p value for the comparison of any four arbitrary groups in
all combinations, say pat.karno = 40, 60, 80, and 100?

We know (we think) that we can't just run the coxph analysis for only the
groups of interest, as the hazard ratio for any one group in an analysis
with several groups is computed by holding the other groups at their average
value, so the hazard ratio varies by the context.

Seems like we need some sort of t-test or chi-squared test, but being mere
chemists and molecular biologists, we don't quite see it and wouldn't trust
ourselves anyway, given the special nature of survival analysis.  Manual
instructions or a function suggestion would be great.

Thanks in Advance, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


Re: [R] Teasing out logrank differences between groups using survdiff or something else?

2009-09-15 Thread Bryan Hanson

Thomas, thanks for your comments.  We weren't entirely sure we were even
framing the question right, so your comments are encouraging.  Here are our
results:

Call:
coxph(formula = Surv(lifespan, status) ~ group, data = four)

  n= 573 

              coef exp(coef) se(coef)      z Pr(>|z|)
groupT1     4.1371   62.6224   0.2472 16.734  < 2e-16 ***
groupT1.U1  3.8921   49.0122   0.2367 16.442  < 2e-16 ***
groupU1    -0.5177    0.5959   0.1232 -4.201 2.65e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

           exp(coef) exp(-coef) lower .95 upper .95
groupT1      62.6224    0.01597    38.574  101.6637
groupT1.U1   49.0122    0.02040    30.819   77.9462
groupU1       0.5959    1.67824     0.468    0.7586

Rsquare= 0.697   (max possible= 1 )
Likelihood ratio test= 683.6  on 3 df,   p=0
Wald test= 348.9  on 3 df,   p=0
Score (logrank) test = 646.6  on 3 df,   p=0

Which shows that there are huge differences in our treatments.  Here's the
survdiff output on the object created by the call above:

              N Observed Expected (O-E)^2/E (O-E)^2/V
group=WISO  145      145    213.2      21.8      39.2
group=T1    152      152     52.9     185.5     248.1
group=T1.U1 144      144     52.1     162.0     209.3
group=U1    132      132    254.8      59.2     130.7

 Chisq= 618  on 3 degrees of freedom, p= 0

To make sure I understand, the null hypothesis here is that these all have
the same survival and censoring functions, and we have shown here that they
do not.

But, we are particularly interested in comparing the differential effect of
treatments (these are actually genes inserted into Drosophila that are
generally toxic to various degrees).  What's the proper way to show/prove
that:

T1.U1 compared to U1 is more hazardous than T1 vs WISO

If in fact it is true?  Maybe the answer is already in our output, in the
sense that the CI's don't overlap much?  Maybe we are wrong to seek a p
value as well?

Thanks again, Bryan
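
One possible way to attach a number to a comparison like this is a Wald test of a linear contrast of the Cox coefficients. The sketch below is not from the thread and rests on two assumptions: that "more hazardous" is read as a difference of log hazard ratios, (T1.U1 vs U1) minus (T1 vs WISO), and that WISO is the reference level. The data are simulated with made-up hazard rates, since the original data set is not available here.

```r
library(survival)

set.seed(1)
levs  <- c("WISO", "T1", "T1.U1", "U1")
group <- factor(rep(levs, each = 50), levels = levs)

# Made-up hazard rates, for illustration only (not the real fly data).
rate <- c(WISO = 1, T1 = 4, T1.U1 = 3, U1 = 0.6)[as.character(group)]
sim  <- data.frame(lifespan = rexp(length(group), rate = rate),
                   status   = 1,          # every event observed, no censoring
                   group    = group)

fit <- coxph(Surv(lifespan, status) ~ group, data = sim)

b <- coef(fit)   # log hazard ratios vs WISO, in the order T1, T1.U1, U1
V <- vcov(fit)   # their estimated covariance matrix

# Contrast (T1.U1 - U1) - (T1 - WISO); the WISO coefficient is 0 by construction.
k   <- c(-1, 1, -1)
est <- sum(k * b)
se  <- sqrt(drop(t(k) %*% V %*% k))
z   <- est / se
p_value <- 2 * pnorm(-abs(z))

c(estimate = est, se = se, z = z, p = p_value)
```

A positive estimate with a small p-value would suggest the T1.U1 vs U1 log hazard ratio exceeds the T1 vs WISO one, under the proportional-hazards assumption.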

*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

On 9/15/09 10:43 PM, Thomas Lumley tlum...@u.washington.edu wrote:

 
 I think you do in fact want to just run the analysis for the four groups you
 are interested in. The logrank chisquared test would then be of the hypothesis
 that these four groups have the same survival and censoring distributions,
 with the greatest power for detecting proportional-hazards differences between
 the groups.
 
 You are correct in noting that the results you get for comparing these four
 groups would change depending on what other groups are in the analysis. This
 is a seriously underappreciated property of rank-based analyses. However,
 because of this dependence I think you can make a good case that restricting
 the analysis to the groups of interest is the best way to run the test.
 
  -thomas
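
 The suggestion above, restricting the test to the groups of interest, can be sketched with the lung data from the original question. The overall test follows the advice in this message; the pairwise tests and the Holm correction are an addition of this note, not something proposed in the thread, and the object names (keep, four, p_pair) are made up for illustration.

```r
library(survival)

# Keep only the four Karnofsky groups of interest.
keep <- c(40, 60, 80, 100)
four <- subset(lung, pat.karno %in% keep)

# Log-rank test of equal survival across the four groups.
fit <- survdiff(Surv(time, status) ~ pat.karno, data = four)
print(fit)

# Overall p-value: chi-squared on (number of groups - 1) degrees of freedom.
p_overall <- pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)

# Pairwise log-rank tests, each restricted to two groups.
pairs  <- combn(keep, 2)
p_pair <- apply(pairs, 2, function(pr) {
  d <- subset(four, pat.karno %in% pr)
  f <- survdiff(Surv(time, status) ~ pat.karno, data = d)
  pchisq(f$chisq, df = 1, lower.tail = FALSE)
})
names(p_pair) <- apply(pairs, 2, paste, collapse = " vs ")

# Adjust for the six comparisons (Holm is one choice among several).
p.adjust(p_pair, method = "holm")
```

 Note that the pat.karno = 40 group has only two patients, so comparisons involving it carry very little information.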
 

[R] xyplot {lattice} are different types possible for each panel?

2009-09-07 Thread Bryan Hanson

Hello R Folks...

Using the example below, I'd like two of the panels to be plotted with type
= "p" but the third to be done with type = "h".  I can't use type = c("p",
"p", "h") because this syntax applies all given types to every panel.  I
don't think I can use groups and distribute.type because these are intended
for different styles of plotting within a single panel.  As you can see, I
tried to do a panel function following something I saw in the Lattice book,
but this has no effect at all.  Looks like it may have to be more elaborate,
but I'm stuck.  Any suggestions appreciated!

Thanks, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


y <- rnorm(100)
x <- rnorm(100)
names <- rep(c("Set 1", "Set 2", "Set 3"), 4)
df <- data.frame(y = y, x = x, names = as.factor(names))
p <- xyplot(y ~ x | names,
    layout = c(1, 3),
    panel = function(...) {
        panel.xyplot(...)
        if (panel.number() == 1) type = "h"
    })

plot(p)


Re: [R] xyplot {lattice} are different types possible for each panel?

2009-09-07 Thread Bryan Hanson

Thanks Baptiste, your suggestion works wonderfully.  Bryan

For anyone following along, the following line needs to replace the similar
one in my original example:
names <- rep(c("Set 1", "Set 2", "Set 3", "Set 4"), 25)
Or the data lengths will be wrong.



On 9/7/09 4:19 PM, baptiste auguie baptiste.aug...@googlemail.com wrote:

 Hi,
 
 Something like this perhaps,
 
 p <- xyplot(y ~ x | names,
     layout = c(1, 3),
     panel = function(..., type = "p") {
         if (panel.number() == 1) {
             panel.xyplot(..., type = "h")
         } else {
             panel.xyplot(..., type = type)
         }
     })
 
 plot(p)
 
 HTH,
 
 baptiste
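
 For anyone who wants to run the approach above end to end, here is a self-contained sketch. The data are simulated, with three sets of 33 points so that the lengths match and the three panels fill layout = c(1, 3); those numbers are my choice, not from the thread.

```r
library(lattice)

set.seed(42)
df <- data.frame(
  x     = rnorm(99),
  y     = rnorm(99),
  names = factor(rep(c("Set 1", "Set 2", "Set 3"), 33))
)

p <- xyplot(y ~ x | names, data = df,
            layout = c(1, 3),
            panel = function(..., type = "p") {
              # panel.number() identifies the panel currently being drawn.
              if (panel.number() == 1) {
                panel.xyplot(..., type = "h")   # vertical lines in panel 1
              } else {
                panel.xyplot(..., type = type)  # default points elsewhere
              }
            })

print(p)
```

 The key point is that type has to be passed on to panel.xyplot(); assigning it after the call, as in the original attempt, changes nothing.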
 

[R] Matrix as input to xyplot {lattice} - proper extended formula syntax

2009-09-05 Thread Bryan Hanson

Hello R Folks...

I have a list with the following structure:

> str(df)
List of 3
 $ y    : num [1:4, 1:1242] -0.005379 0.029874 -0.023274 0.000655 -0.004537 ...
 $ x    : num [1:1242] 501 503 505 507 509 ...
 $ names: Factor w/ 4 levels "PC Loading 1",..: 1 2 3 4

I want to plot each row of df$y against df$x, and have each plot in its own
panel according to the levels of df$names.  The following works in the sense
that the layout is right, but the y values have clearly been recycled or
skipped in some fashion (and an error is thrown for each panel that the
length of x and y aren't the same):

p <- xyplot(y ~ x | names, data = df, main = title,
    layout = c(1, dim(y)[1]))

In reviewing the extended formula interface in the Lattice Book, what I want
to happen is y1 + y2 + y3 + y4 ~ x | names, outer = TRUE

I see two options: figure out a way to create the extended formula on the
fly (and the actual number of rows in y may vary), which seems potentially
tricky, or create a data frame by stacking each row of y and repeating x and
names to match.  This seems like a waste of memory.

I've looked through the archives and haven't come across something quite
like this, or at least I don't recognize it if I have!  Is there a more
elegant way to tell xyplot I want to use each row of y repeatedly with the
same x, in a loop-like fashion?

TIA.  Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA



Re: [R] Matrix as input to xyplot {lattice} - proper extended formula syntax

2009-09-05 Thread Bryan Hanson

Thanks David, your way of constructing df is much more compact than what I
was using, so I've incorporated it.  I also had my rows and columns
transposed relative to how xyplot wanted them (though I had tested for that,
other problems interfered).

In my case, I may have varying numbers of y columns, from y.1 to y.n let's
say.  Is there an easy way of creating the phrase y.1+y.2+...y.n to pass to
xyplot, or even better, some sort of syntax that says take all y.n and
plot them against x?

Thanks, Bryan
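
One way to build the y.1 + y.2 + ... + y.n part on the fly is to paste the column names together and convert the result to a formula. This is a sketch using the same construction of df as in the quoted message below; the names ycols and fml are made up for illustration, and outer = TRUE is used so that each y column gets its own panel.

```r
library(lattice)

set.seed(1)
df <- data.frame(y = matrix(rnorm(24), nrow = 6), x = 1:6)

# All columns whose names look like y.1, y.2, ...
ycols <- grep("^y\\.", names(df), value = TRUE)

# "y.1 + y.2 + y.3 + y.4 ~ x", however many y columns there are.
fml <- as.formula(paste(paste(ycols, collapse = " + "), "~ x"))

p <- xyplot(fml, data = df, outer = TRUE,
            layout = c(1, length(ycols)), type = "l")
print(p)
```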


On 9/6/09 12:51 AM, David Winsemius dwinsem...@comcast.net wrote:

 I'm not exactly sure what structure df has. Here's my effort to
 duplicate it:
 
 df <- data.frame(y=matrix(rnorm(24), nrow=6), x=1:6)
 df
          y.1        y.2        y.3        y.4 x
 1  0.1734636  0.2348417 -1.2375648 -1.3246439 1
 2  1.9551669 -1.1027262 -0.7307332  0.3953752 2
 3 -0.7645778  1.6297861  0.4743805 -0.4476145 3
 4 -0.5308756 -0.5246534 -0.3854609 -1.609 4
 5  0.7406525 -0.8691720 -0.8194084  1.6122059 5
 6 -0.9625619 -1.0774165  1.0760829  0.3659436 6
 
 And this seems to accomplish the desired task. Presumably you have
 assigned off-stage the value of title to a meaningful character string?
 
 p <- xyplot(y.1+y.2+y.3+y.4 ~ x |1:4, data = df, main =
 title ,layout=c(1,4) )
 p
 
 
 
 
 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 


Re: [R] Google's R Style Guide (has become S3 vs S4, in part)

2009-09-01 Thread Bryan Hanson

Looks like the discussion is no longer about R Style, but S3 vs S4?

To that end, I asked more or less the same question a few weeks ago, arising
from much the same motivations.  The discussion was helpful, here's the
link:

http://www.nabble.com/Need-Advice%3A-Considering-Converting-a-Package-from-S3-to-S4-tc24901482.html#a24904049

For what it's worth, I decided, but with some ambivalence, to stay with S3
for now and possibly move to S4 later.  In the spirit of S4, I did write a
function that is nearly the equivalent of validObject for my S3 object of
interest.

Overall, it looked like I would have to spend a lot of time moving to S4,
while staying with S3 would allow me to get the project done and get results
going much faster (see Frank Harrell's comment in the thread above).

As a concrete example (concrete for us non-programmers, non-statisticians),
I recently decided that I wanted to add a descriptive piece of text to a
number of my plots, and it made sense to include the text with the object.
So I just added a list element to the existing S3 object, e.g.
Myobject$descrip.  No further work was necessary, I could use it right away.
If instead I had made Myobject an S4 object, then I would have to go
back, redefine the object, update validObject, and possibly write some new
accessor and definitely constructor functions.  At least, that's how I
understand the way one uses S4 classes.
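
To make the contrast concrete, here is a small sketch. The class names MySpec and MySpecS4, the slots, and the checker chk_MySpec are all hypothetical stand-ins, not the actual objects from my package.

```r
library(methods)

## S3: an object is a list with a class attribute.
new_MySpec <- function(x, y) structure(list(x = x, y = y), class = "MySpec")

# Hand-written check, roughly playing the role of validObject().
chk_MySpec <- function(obj) {
  stopifnot(inherits(obj, "MySpec"),
            is.numeric(obj$x), is.numeric(obj$y),
            length(obj$x) == length(obj$y))
  invisible(TRUE)
}

s3 <- new_MySpec(c(1, 2, 3), c(0.1, 0.2, 0.3))
s3$descrip <- "Test sample"   # new element added with no other changes
chk_MySpec(s3)

## S4: the extra slot must be declared in the class definition up front.
setClass("MySpecS4",
         slots = c(x = "numeric", y = "numeric", descrip = "character"),
         validity = function(object) {
           if (length(object@x) != length(object@y)) {
             "x and y must have the same length"
           } else {
             TRUE
           }
         })

s4 <- new("MySpecS4", x = c(1, 2, 3), y = c(0.1, 0.2, 0.3),
          descrip = "Test sample")

# An inconsistent object is rejected when it is created.
bad <- tryCatch(new("MySpecS4", x = c(1, 2), y = 1, descrip = "oops"),
                error = function(e) "rejected")
```

The S3 version is quicker to extend, while the S4 version refuses an object that does not match its definition.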

Back to trying to get something done!  Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA





On 9/1/09 6:16 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 Corrado wrote:
 Thanks Duncan, Spencer,
 
 To clarify, the situation is:
 
 1) I have no reasons to choose S3 on S4 or vice versa, or any other coding
 convention
 2) Our group has not done any OO developing in R and I would be the first, so
 I can set up the standards
 3) I am starting from scratch with a new package, so I do not have any code I
 need to re-use.
 4) I am an R OO newbie, so whatever I can learn from the beginning about what
 is better is good for me.
 
 So the questions would be two:
 
 1) What coding style guide should we / I follow? Is the google style guide
 good, or is there something better / more prescriptive which makes our
 research group life easier?
   
 
 I don't think I can answer that.  I'd recommend planning to spend some
 serious time on the decision, and then go by your personal impression.
 S4 is definitely harder to learn but richer, so don't make the decision
 too quickly.  Take a look at John Chambers's new book, try small projects
 in each style, etc.
 
 2) What class type should I use? From what you two say, I should use S3
 because it is easier to use. What are the disadvantages? Is there an
 advantages / disadvantages table for S3 and S4 classes?
   
 
 S3 is much more limited than S4.  It dispatches on just one argument, S4
 can dispatch on several.  S3 allows you to declare things to be of a
 certain class with no checks that anything will actually work; S4 makes
 it easier to be sure that if you say something is of a certain class, it
 really is.  S4 hides more under the hood: if you understand how regular
 R functions work, learning S3 is easy, but there's still a lot to learn
 before you'll be able to use S4 properly.
 
 Duncan Murdoch
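
 The dispatch difference described above can be illustrated with a short sketch. The generic names describe and combine are made up for this note: the S3 generic picks a method from the class of its first argument only, while the S4 generic picks one from the classes of both arguments.

```r
library(methods)

## S3: dispatch on the class of the first argument only.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "something else"
describe.numeric <- function(x, ...) "a number"

describe(1.5)
describe("a")

## S4: dispatch on the classes of several arguments at once.
setGeneric("combine", function(a, b) standardGeneric("combine"))
setMethod("combine", signature("numeric", "character"),
          function(a, b) "number then text")
setMethod("combine", signature("character", "numeric"),
          function(a, b) "text then number")

combine(1, "a")
combine("a", 1)
```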
 

Re: [R] Lattice in a loop does not produce output

2009-08-18 Thread Bryan Hanson

Lattice objects must be assigned and deliberately printed:

 png("test.png")
 p <- xyplot(y~x|z)
 plot(p)
 dev.off()

Should fix both problems.  Bryan
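
Applied to the loop in the question, that looks roughly like the sketch below. Writing to tempdir() and collecting the file names in files are my additions so the example cleans up after itself; it also assumes the R build has a working png() device.

```r
library(lattice)

xyz <- data.frame(x = 1:16,
                  y = 2 * (1:16) - 1,
                  z = rep(c("A", "B", "C", "D"), 4))

out_dir <- tempdir()
files <- character(0)

for (i in 1:3) {
  f <- file.path(out_dir, paste0("Test", i, ".png"))
  png(f)
  # Inside a loop (or a sourced script) nothing is auto-printed,
  # so the trellis object has to be printed explicitly.
  print(xyplot(y ~ x | z, data = xyz))
  dev.off()
  files <- c(files, f)
}

file.exists(files)
```

At the console, a bare xyplot() call is printed automatically, which is why the problem only shows up inside loops, functions, and source().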
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA




On 8/18/09 8:13 AM, Alex van der Spek am...@xs4all.nl wrote:

 I cannot understand why xyplot does not work within a simple for loop.
 
 This works up to the for loop; inside the for loop the png files are
 opened and closed, but nothing is plotted. No error messages are written
 to the console either. This is the case on both Windows and Linux.
 
 By the way, running the script below on Linux using source() does not
 even produce the first xyplot. This is less of an issue for me though.
 
 #! usr/bin/env R
 # Test lattice loop
 
 rm(list=ls())
 
 x <- 1:16
 y <- 2*x - 1
 z <- rep(c('A','B','C','D'), 4)
 
 xyz <- data.frame(x=x, y=y, z=z)
 
 require(lattice)
 
 png('Test.png')
 xyplot(y~x|z)
 dev.off()
 
 for (i in 1:5) {
 f <- paste('Test', i, '.png', sep='')
 png(f)
 xyplot(y~x|z)
 dev.off()
 }
 

[R] Selecting/Accessing the last vector in a list of a list of data.frames

2009-08-11 Thread Bryan Hanson

Hello Again R Folks:

I'm trying to clean up some code.  Suppose I have an object like this:

 str(test)
List of 2
 $ G:List of 2
  ..$ cls:'data.frame':101 obs. of  2 variables:
  .. ..$ V1: num [1:101] -0.0019 -0.0019 -0.00189 -0.00188 -0.00186 ...
  .. ..$ V2: num [1:101] 0.000206 0.000247 0.000288 0.000329 0.000371 ...
  ..$ rob:'data.frame':101 obs. of  2 variables:
  .. ..$ V1: num [1:101] -0.00142 -0.00141 -0.0014 -0.00139 -0.00137 ...
  .. ..$ V2: num [1:101] 0.000424 0.000456 0.000487 0.000517 0.000546 ...
 $ T:List of 2
  ..$ cls:'data.frame':101 obs. of  2 variables:
  .. ..$ V1: num [1:101] -0.00222 -0.00222 -0.00221 -0.00219 -0.00216 ...
  .. ..$ V2: num [1:101] -0.00077 -0.000742 -0.000712 -0.000681 -0.000648
..
  ..$ rob:'data.frame':101 obs. of  2 variables:
  .. ..$ V1: num [1:101] -0.000981 -0.000979 -0.000972 -0.000961 -0.000946
..
  .. ..$ V2: num [1:101] -0.000332 -0.000303 -0.000274 -0.000245 -0.000216
..

I need to perform some operations on each value of V1 in turn, then each
value of V2 in turn (so for instance I want test$G$cls$V1).  The structure
of this object is nearly constant except the first elements of the list (G,
T in the example) may vary in number and name, so I need something that
accommodates this.

I can do this with loops, but it seems like a job for lapply or rapply, but
these don't quite work.  I've played with quite a few variations, searched
the help archives and found a number of useful ideas, but not quite what I
need.  The only thing that nearly works is do.call(cbind, object) enough
times to bring V1 and V2 to the surface but then I've lost my carefully
constructed naming.

Any suggestions appreciated.  It seems like there might be a simple
approach, but I may be too tired right now to see it!

Thanks, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


Re: [R] Selecting/Accessing the last vector in a list of a list of data.frames

2009-08-11 Thread Bryan Hanson

Thanks Henrique.  I would have not thought of the syntax you suggest, though
it embodies the sort of multilevel (not quite recursive) application of
lapply I was thinking of.  However, it returns "test" with V2 missing,
everything else intact.  Strange; I can't really state in words what I think
it should do, much less what it does do!

I think an easier approach for me will be to re-write the function that
generates test so it is simpler to extract what I need.  I will think on
it.

Thanks, Bryan
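
For the record, here is a sketch of the nested lapply idea applied to each column in turn while keeping the names. The object test is rebuilt with random numbers in the same shape as the real one, and mean() stands in for whatever operation is actually needed; both are assumptions made for illustration.

```r
set.seed(1)
make_df <- function() data.frame(V1 = rnorm(5), V2 = rnorm(5))

# Same shape as the real object: list (G, T) of lists (cls, rob) of data frames.
test <- list(G = list(cls = make_df(), rob = make_df()),
             T = list(cls = make_df(), rob = make_df()))

# Apply a function to one named column of every data frame,
# keeping the outer and inner names.
apply_to_column <- function(obj, column, fun) {
  lapply(obj, lapply, function(d) fun(d[[column]]))
}

res <- list()
for (column in c("V1", "V2")) {
  res[[column]] <- apply_to_column(test, column, mean)
}

str(res)
res$V1$G$cls   # result for test$G$cls$V1
```

The outer list can have any number of elements with any names, since nothing in the code refers to G or T directly.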

> temp <- lapply(test, lapply, '[', 'V1')
 str(temp)
List of 2
 $ G:List of 4
  ..$ cls:'data.frame':101 obs. of  1 variable:
  .. ..$ V1: num [1:101] -0.0019 -0.0019 -0.00189 -0.00188 -0.00186 ...
  ..$ rob:'data.frame':101 obs. of  1 variable:
  .. ..$ V1: num [1:101] -0.00142 -0.00141 -0.0014 -0.00139 -0.00137 ...
  ..$ c  : num NA
  ..$ r  : num NA
 $ T:List of 4
  ..$ cls:'data.frame':101 obs. of  1 variable:
  .. ..$ V1: num [1:101] -0.00222 -0.00222 -0.00221 -0.00219 -0.00216 ...
  ..$ rob:'data.frame':101 obs. of  1 variable:
  .. ..$ V1: num [1:101] -0.000981 -0.000979 -0.000972 -0.000961 -0.000946
..
  ..$ c  : num NA
  ..$ r  : num NA


On 8/11/09 7:28 PM, Henrique Dallazuanna www...@gmail.com wrote:

 If I understand your question correctly, you can try something like
 this:
 
 # Access all elements named 'V1' in your list
 lapply(test, lapply, '[', 'V1')
 
 
 On Tue, Aug 11, 2009 at 3:49 PM, Bryan Hanson han...@depauw.edu wrote:
 Hello Again R Folks:
 
 I'm trying to clean up some code.  Suppose I have an object like this:
 
 > str(test)
 List of 2
  $ G:List of 2
   ..$ cls:'data.frame':    101 obs. of  2 variables:
   .. ..$ V1: num [1:101] -0.0019 -0.0019 -0.00189 -0.00188 -0.00186 ...
   .. ..$ V2: num [1:101] 0.000206 0.000247 0.000288 0.000329 0.000371 ...
   ..$ rob:'data.frame':    101 obs. of  2 variables:
   .. ..$ V1: num [1:101] -0.00142 -0.00141 -0.0014 -0.00139 -0.00137 ...
   .. ..$ V2: num [1:101] 0.000424 0.000456 0.000487 0.000517 0.000546 ...
  $ T:List of 2
   ..$ cls:'data.frame':    101 obs. of  2 variables:
   .. ..$ V1: num [1:101] -0.00222 -0.00222 -0.00221 -0.00219 -0.00216 ...
   .. ..$ V2: num [1:101] -0.00077 -0.000742 -0.000712 -0.000681 -0.000648 ...
   ..$ rob:'data.frame':    101 obs. of  2 variables:
   .. ..$ V1: num [1:101] -0.000981 -0.000979 -0.000972 -0.000961 -0.000946 ...
   .. ..$ V2: num [1:101] -0.000332 -0.000303 -0.000274 -0.000245 -0.000216 ...
 
 I need to perform some operations on each value of V1 in turn, then each
 value of V2 in turn (so for instance I want test$G$cls$V1).  The structure
 of this object is nearly constant except the first elements of the list (G,
 T in the example) may vary in number and name, so I need something that
 accommodates this.
 
 I can do this with loops, but it seems like a job for lapply or rapply, but
 these don't quite work.  I've played with quite a few variations, searched
 the help archives and found a number of useful ideas, but not quite what I
 need.  The only thing that nearly works is do.call(cbind, object) enough
 times to bring V1 and V2 to the surface but then I've lost my carefully
 constructed naming.
 
 Any suggestions appreciated.  It seems like there might be a simple
 approach, but I may be too tired right now to see it!
 
 Thanks, Bryan
 *
 Bryan Hanson
 Professor of Chemistry & Biochemistry
 DePauw University, Greencastle IN USA
 

[R] Need Advice: Considering Converting a Package from S3 to S4

2009-08-10 Thread Bryan Hanson

Hello R Folks...  

Not a technical question, but I need some advice and perspective.

I've got a set of functions I'm planning to put together into a package.
The main hunk of data that gets used by different functions is currently an
S3 list.  I've been reading about S4 objects, and I see the (numerous)
advantages of them.  I have seen the recommendation that all new packages be
done with S4.  Before I get much farther, I need to decide if I will go to
S4 for this central hunk of data.

My questions are about making the conversion, whether it is worth the
trouble and what pitfalls I might encounter.  I can easily (re)define my key
list as an S4 object.  But after that...

1.  It seems that the simplest/minimalist approach is to update all the
functions so that where I use "data$element" I replace it with "data@slot".
Is it really this easy, or have I missed something?  Easy or not, this by
itself doesn't take advantage of much, except the ability to define
subclasses at a later date (maybe that is sufficient reason though).

2.  I also see in my reading that I should consider writing accessor
functions for my object.  What I can't quite see is why I would want to do
this, if I can get the contents with data@slot?  What am I missing here?

3. At this point, I'm not sure that I would write specific methods for this
proposed S4 object.  It would not be necessary in the short run.  Making it
S4 would mainly allow for future expansion as they say.  If methods are
not critical, does it make sense to spend the time making the change?

Any perspective and advice would be welcomed.  Thanks in advance, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA
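
To make questions 1 and 2 concrete, here is a minimal sketch of an S4 class with one accessor. The class name `SpecData`, its slots, and the accessor `freq` are all hypothetical stand-ins, not anything from the actual package. The usual argument for an accessor is that callers stop depending on the slot layout, so the internals can change later without touching them; the validity function shows one thing S4 offers that a bare S3 list does not.

```r
library(methods)

# Hypothetical class standing in for the package's central data object.
setClass("SpecData",
         representation(freq = "numeric", label = "character"),
         validity = function(object) {
           if (length(object@freq) != length(object@label))
             "freq and label must have the same length"
           else
             TRUE
         })

# Accessor: callers use freq(obj) instead of reaching into obj@freq,
# so the slot layout can change later without breaking them.
setGeneric("freq", function(object) standardGeneric("freq"))
setMethod("freq", "SpecData", function(object) object@freq)

s <- new("SpecData", freq = c(1.5, 2.5), label = c("a", "b"))
freq(s)   # via the accessor
s@freq    # direct slot access: works, but ties callers to the layout
```

Calling new() with slots of mismatched length is rejected by the validity check, which is the kind of safeguard the list-based version would have to code by hand.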


[R] str(data.frame) after subsetting reflects original structure, not subsetted structure?

2009-07-24 Thread Bryan Hanson

I find that after subsetting (you may prefer conditional selection) a data
frame and assigning it to a new object, the str(new object) reflects the
original data frame, not the new one:

A <- rnorm(20)
B <- factor(rep(c("t", "g"), 10))
C <- factor(rep(c("h", "l"), 10))
D <- data.frame(A, B, C)

str(D) # reports correctly

E <- D[D$C == "h",]

str(E) # reports that D$C still has 2 levels, but
E # or E$C shows that subsetting worked properly
summary(E) # shows the original structure and that subsetting worked

Is this the expected behavior, and if so, is there a particular rationale?
I would be pretty certain that the information about E was inherited from D,
but why wasn't it updated to reflect the revised object?  Is there an
argument that I can use to force the updating?

For better or worse, I use str() a lot to check my work, and in this case,
it seems to have misled me.

Thanks as always, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


Re: [R] str(data.frame) after subsetting reflects original structure, not subsetted structure?

2009-07-24 Thread Bryan Hanson

Thanks Marc and Ben...

Your answers were most helpful.

I suspected something had been written about it,  but was having trouble
formulating a reasonable search query.  I was looking in the help page for
str(), which was sort of a dead end.

Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA



On 7/24/09 9:46 AM, Marc Schwartz marc_schwa...@me.com wrote:

 On Jul 24, 2009, at 8:17 AM, Bryan Hanson wrote:
 
 I find that after subsetting (you may prefer conditional
 selection) a data
 frame and assigning it to a new object, the str(new object) reflects
 the
 original data frame, not the new one:
 
 A <- rnorm(20)
 B <- factor(rep(c("t", "g"), 10))
 C <- factor(rep(c("h", "l"), 10))
 D <- data.frame(A, B, C)
 
 str(D) # reports correctly
 
 E <- D[D$C == "h",]
 
 str(E) # reports that D$C still has 2 levels, but
 E # or E$C shows that subsetting worked properly
 summary(E) # shows the original structure and that subsetting worked
 
 Is this the expected behavior, and if so, is there a particular
 rationale?
 I would be pretty certain that the information about E was inherited
 from D,
 but why wasn't it updated to reflect the revised object?  Is there an
 argument that I can use to force the updating?
 
 For better or worse, I use str() a lot to check my work, and in this
 case,
 it seems to have misled me.
 
 Thanks as always, Bryan
 
 See ?"[.factor" which is the extract (subset) method for factors. Note
 that the 'drop' argument is FALSE by default. It is this argument that
 controls the retention of unused factor levels.
 
 The reason that it is FALSE by default is to ensure that if you are
 comparing factors from more than one data source, the comparisons of
 or the use of the factor levels are consistent.
 
 For one approach to dropping unused factor levels from a data frame,
 see:
 

 http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels
 
 HTH,
 
 Marc Schwartz
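
A small sketch of what the retained levels look like and two ways to drop them, using the same toy data as the question. Note that droplevels() is an addition here: it is part of base R in versions released after this thread, so treat it as an assumption about your R version; rebuilding the factor with factor() works either way.

```r
set.seed(42)
D <- data.frame(A = rnorm(20),
                B = factor(rep(c("t", "g"), 10)),
                C = factor(rep(c("h", "l"), 10)))

E <- D[D$C == "h", ]
levels(E$C)   # "h" "l": the unused level "l" is retained
table(E$C)    # its count is zero, which is what str() was hinting at

# Option 1: rebuild the factor, which keeps only the levels in use.
E1 <- E
E1$C <- factor(E1$C)

# Option 2: droplevels() handles every factor column of the data frame.
E2 <- droplevels(E)
```

In this toy data the rows with C == "h" all have B == "t", so Option 2 also trims B to a single level, while Option 1 only touches the column you name.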



[R] panel.lmline - are m, b, and r^2 accessible somehow?

2009-07-22 Thread Bryan Hanson

Hi R Folks...

Are the results of a fit carried out by panel.lmline readily available for
use in a lattice plot?  I'd like to put r^2, m, and b on each panel.  I can
certainly write something that does this manually and then use it with
panel.text, but if it's already available, that would be preferable,
especially as lattice permits conditioning and subsetting so readily.

Looking at panel.abline, I see that it returns invisibly (which makes sense
as one doesn't assign panel.abline to a variable) so my guess is the answer
is no.

The only thing I could find in the archives had the calculations done
manually within panel.groups
(http://www.nabble.com/add-trend-line-to-each-group-of-data-in%3A-xyplot(y1%2By2-~-x-|-grp...-td3344023.html#a3382909)
but that was a few versions back.

Other suggestions?

Thanks, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA
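
The manual route mentioned above can be sketched as a custom panel function that refits the line per panel and writes the coefficients with panel.text. This is an illustration only: the function name `panel_lm_label`, the toy data, and the label placement are assumptions, and it does not recover anything from panel.lmline itself.

```r
library(lattice)

# Hypothetical panel function: points, per-panel lm fit, and a text label.
panel_lm_label <- function(x, y, ...) {
  panel.xyplot(x, y, ...)
  fit <- lm(y ~ x)                      # refit within this panel's data
  panel.abline(reg = fit)
  co  <- coef(fit)
  r2  <- summary(fit)$r.squared
  lab <- sprintf("m = %.2f, b = %.2f, r^2 = %.2f", co[2], co[1], r2)
  lim <- current.panel.limits()
  # Anchor the label just inside the top-left corner of the panel.
  panel.text(lim$xlim[1], lim$ylim[2], labels = lab,
             adj = c(-0.05, 1.5), cex = 0.8)
}

# Toy data with two conditioning groups (made up for the example).
set.seed(1)
d <- data.frame(x = rep(1:10, 2), g = rep(c("a", "b"), each = 10))
d$y <- ifelse(d$g == "a", 2, -1) * d$x + rnorm(20)

p <- xyplot(y ~ x | g, data = d, panel = panel_lm_label)
# print(p) draws the plot
```

Since the fit happens inside the panel function, conditioning and subsetting are handled by lattice and each panel gets its own m, b, and r^2.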


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Managing Packages: Which functions call other functions in package?

2009-07-01 Thread Bryan Hanson

R Colleagues:

I'm moving toward building my own package, and it occurs to me that it might
be useful to have some method of listing or better, graphically displaying,
which functions call other functions within the package.  In other words,
I'm seeking some means of seeing how the functions relate to each other.

I've looked around a bit, and one way to do this might be via one of the
network graphing approaches used inter alia in Bioconductor.  But I suspect
someone has created such a tool already and I'm just lacking the proper key
words.  It seems like something a version control system might provide;
maybe that's where I should be looking?

Does such a thing exist?

Thanks, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA
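
One possible starting point, offered as a sketch rather than a pointer to an existing tool, is findGlobals() from the codetools package that ships with R. The helper name `call_map` and the toy functions below are made up for the example; the result is a plain named list that could be fed to a graph-drawing package.

```r
library(codetools)

# For every function in `env`, list which other functions in `env` it calls.
call_map <- function(env) {
  fun_names <- Filter(function(n) is.function(get(n, envir = env)), ls(env))
  map <- lapply(fun_names, function(n) {
    called <- findGlobals(get(n, envir = env), merge = FALSE)$functions
    sort(intersect(called, fun_names))   # keep only calls within `env`
  })
  names(map) <- fun_names
  map
}

# Toy "package": three functions in an environment (hypothetical).
toy <- new.env()
assign("helper", function(x) x + 1, envir = toy)
assign("middle", function(x) helper(x) * 2, envir = toy)
assign("top",    function(x) middle(x) + helper(x), envir = toy)

cm <- call_map(toy)
```

For an installed package the same helper could presumably be pointed at its namespace, e.g. call_map(asNamespace("somePackage")), where somePackage is a placeholder. Being based on static code analysis, it will miss functions called indirectly, for instance via do.call() with a character name.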


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How do I define the method for gcheckboxgroup in gWidgets?

2009-06-27 Thread Bryan Hanson

Thanks Michael...

I was working by analogy to the gbuttons, so I was trying to "add" the
gcheckboxgroup, which is apparently not necessary (due to, I guess, the
intrinsic differences between the widgets).  The index thing I was just
screwed up on!  I have it working now.

Nice package.  Bryan


On 6/27/09 8:25 AM, Michael Lawrence mflaw...@fhcrc.org wrote:

 
 
 On Thu, Jun 25, 2009 at 8:29 AM, Bryan Hanson han...@depauw.edu wrote:
 Hi All...
 
 I'm trying to build a small demo using gWidgets which permits interactive
 scaling and selection among different things to plot.  I can get the widgets
 for scaling to work just fine.  I am using gcheckboxgroup to make the
 (possibly multiple) selections.  However, I can't seem to figure out how to
 properly define the gcheckboxgroup; I can draw the widget properly, I think
 my handler would use the svalue right if it actually received it.  Part of
 the problem is using the index of the possible values rather than the values
 themselves, but I'm pretty sure this is not all of the problem.  I've been
 unable to find an example like this in any of the various resources I've
 come across.
 
 BTW,  report.which is really only there for troubleshooting.  It works to
 return the values, I can't get it to return the indices, which are probably
 what I need in this case.
 
 A demo script is at the bottom and the error is just below.
 
 > tmp <- gcheckboxgroup(stuff, handler = report.which, index = TRUE,
 +     checked = c(TRUE, FALSE, FALSE, FALSE, FALSE), container = leftPanel)
 
 The above code should define the gcheckboxgroup.
 
 > add(tmp, value = 1, expand = TRUE)
 
 I'm not sure what you are trying to add here.
  
 
 Error in function (classes, fdef, mtable)  :
   unable to find an inherited method for function ".add", for signature
 "gCheckboxgroupRGtk", "guiWidgetsToolkitRGtk2", "numeric"
 
 This error suggests that I don't have a method - I agree, but I don't know
 what goes into the method for gcheckboxgroup.
 
 For the sliders, it's clear to me how the actions and drawing of the widgets
 differ, but not so for gcheckboxgroup.
 
 A big TIA, Bryan
 *
 Bryan Hanson
 Professor of Chemistry & Biochemistry
 DePauw University, Greencastle IN USA
 
 Full Script:
 
 x <- 1:10
 y1 <- x
 y2 <- x^2
 y3 <- x^0.5
 y4 <- x^3
 df <- as.data.frame(cbind(x, y1, y2, y3, y4))
 stuff <- c("y = x", "y = x^2", "y = x^0.5", "y = x^3")
 which.y <- 2 # initial value, to be changed later by the widget
 
 # Define a function for the widget handlers
 
 update.Plot <- function(h,...) {
     plot(df[,1], df[,svalue(which.y)], type = "l",
     ylim = c(0, svalue(yrange)), main = "Interactive Selection & Scaling",
     xlab = "x values", ylab = "y values")
     }
 
 report.which <- function(h, ...) { print(svalue(h$obj), index = TRUE) }
 
 
 In the above handler, do you mean to pass the 'index' parameter to the
 svalue() function?
  
 
 
 # Define the actions & type of widget, along with returned values.
 # Must be done before packing widgets.
 
 yrange <- gslider(from = 0, to = max(y), by = 1.0,
     value = max(y), handler = update.Plot)
 which.y <- gcheckboxgroup(stuff, handler = report.which, index = TRUE,
     checked = c(TRUE, FALSE, FALSE, FALSE, FALSE))
 
 # Assemble the graphics window & groups of containers
 
 mainWin <- gwindow("Interactive Plotting")
 bigGroup <- ggroup(cont = mainWin)
 leftPanel <- ggroup(horizontal = FALSE, container = bigGroup)
 
 # Format and pack the widgets, & link to their actions/type
 
 tmp <- gframe("y range", container = leftPanel)
 add(tmp, yrange, expand = TRUE)
 tmp <- gcheckboxgroup(stuff, handler = report.which, index = TRUE,
     checked = c(TRUE, FALSE, FALSE, FALSE, FALSE), container = leftPanel)
 add(tmp, value = 1, expand = TRUE)
 
 
 # Put it all together
 
 add(mainWin, ggraphics()) # puts the active graphic window w/i mainWin
 

[R] How do I define the method for gcheckboxgroup in gWidgets?

2009-06-25 Thread Bryan Hanson

Hi All...

I'm trying to build a small demo using gWidgets which permits interactive
scaling and selection among different things to plot.  I can get the widgets
for scaling to work just fine.  I am using gcheckboxgroup to make the
(possibly multiple) selections.  However, I can't seem to figure out how to
properly define the gcheckboxgroup; I can draw the widget properly, I think
my handler would use the svalue right if it actually received it.  Part of
the problem is using the index of the possible values rather than the values
themselves, but I'm pretty sure this is not all of the problem.  I've been
unable to find an example like this in any of the various resources I've
come across.

BTW,  report.which is really only there for troubleshooting.  It works to
return the values, I can't get it to return the indices, which are probably
what I need in this case.

A demo script is at the bottom and the error is just below.

> tmp <- gcheckboxgroup(stuff, handler = report.which, index = TRUE,
+ checked = c(TRUE, FALSE, FALSE, FALSE, FALSE), container = leftPanel)
> add(tmp, value = 1, expand = TRUE)
Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function ".add", for signature
"gCheckboxgroupRGtk", "guiWidgetsToolkitRGtk2", "numeric"

This error suggests that I don't have a method - I agree, but I don't know
what goes into the method for gcheckboxgroup.

For the sliders, it's clear to me how the actions and drawing of the widgets
differ, but not so for gcheckboxgroup.

A big TIA, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

Full Script:

x <- 1:10
y1 <- x
y2 <- x^2
y3 <- x^0.5
y4 <- x^3
df <- as.data.frame(cbind(x, y1, y2, y3, y4))
stuff <- c("y = x", "y = x^2", "y = x^0.5", "y = x^3")
which.y <- 2 # initial value, to be changed later by the widget

# Define a function for the widget handlers

update.Plot <- function(h,...) {
    plot(df[,1], df[,svalue(which.y)], type = "l",
    ylim = c(0, svalue(yrange)), main = "Interactive Selection & Scaling",
    xlab = "x values", ylab = "y values")
    }

report.which <- function(h, ...) { print(svalue(h$obj), index = TRUE) }

# Define the actions & type of widget, along with returned values.
# Must be done before packing widgets.

yrange <- gslider(from = 0, to = max(y), by = 1.0,
    value = max(y), handler = update.Plot)
which.y <- gcheckboxgroup(stuff, handler = report.which, index = TRUE,
    checked = c(TRUE, FALSE, FALSE, FALSE, FALSE))

# Assemble the graphics window & groups of containers

mainWin <- gwindow("Interactive Plotting")
bigGroup <- ggroup(cont = mainWin)
leftPanel <- ggroup(horizontal = FALSE, container = bigGroup)

# Format and pack the widgets, & link to their actions/type

tmp <- gframe("y range", container = leftPanel)
add(tmp, yrange, expand = TRUE)
tmp <- gcheckboxgroup(stuff, handler = report.which, index = TRUE,
    checked = c(TRUE, FALSE, FALSE, FALSE, FALSE), container = leftPanel)
add(tmp, value = 1, expand = TRUE)


# Put it all together

add(mainWin, ggraphics()) # puts the active graphic window w/i mainWin


Re: [R] Questíon regarding the use of write.csv2 , write.table ...

2009-06-18 Thread Bryan Hanson

write.table() will give you all the control you need to get exactly what you
want.  Bryan
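
As a minimal sketch of both routes, with a small data frame rebuilt from the values quoted below and a temporary file standing in for the real output path (both are assumptions for the example):

```r
# Toy version of the data frame from the question.
exampleDataframe <- data.frame(a = c(1, 5, 9), b = c(2, 6, 0),
                               c = c(3, 7, 1), d = c(4, 8, 2))

# Route 1: write.csv2 without the row-name column.
out1 <- tempfile(fileext = ".csv")
write.csv2(exampleDataframe, file = out1, row.names = FALSE)
csv2_lines <- readLines(out1)

# Route 2: write.table, spelling out separator, decimal mark, and quoting.
out2 <- tempfile(fileext = ".csv")
write.table(exampleDataframe, file = out2, sep = ";", dec = ",",
            row.names = FALSE, quote = FALSE)
table_lines <- readLines(out2)
```

With row.names = FALSE neither file starts with the extra column of consecutive numbers; the write.table() call additionally drops the quotes around the header names.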


On 6/18/09 7:50 AM, xavier.char...@free.fr xavier.char...@free.fr wrote:

 Hi,

 It sounds like the first column that is added is actually the row
 names. That is why a previous answer pointed to this argument. The default
 for write.csv is to write the row names along with the data. So, this
 should work:

 write.csv2(exampleDataframe, file = "exampleDataframe.csv",
            row.names = FALSE)

 Xavier


 ----- Original Message -----
 From: Lavri Labi lavri.l...@tu-dortmund.de
 To: Jorge Ivan Velez jorgeivanve...@gmail.com
 Cc: r-help@r-project.org
 Sent: Thursday, 18 June 2009 12:35:31 GMT +01:00 Amsterdam / Berlin /
 Bern / Rome / Stockholm / Vienna
 Subject: Re: [R] Questíon regarding the use of write.csv2, write.table ...

 Dear Jorge,

 thank you for the quick answer. But I am afraid you didn't understand my
 problem. I want to write the following data frame exampleDataframe to a
 csv2 file.

 a;b;c;d
 1 ; 2 ; 3 ; 4
 5 ; 6 ; 7 ; 8
 9 ; 0 ; 1 ; 2

 After sending the command:

 write.csv2(exampleDataframe, file = "exampleDataframe.csv")

 I get the following file:

 ;a;b;c;d
 1 ; 1 ; 2 ; 3 ; 4
 2 ; 5 ; 6 ; 7 ; 8
 3 ; 9 ; 0 ; 1 ; 2

 How can I delete the first column that was added, which I do not need?

 The row.names you suggested is not really helpful in this case.

 Cheers,

 Lavri


  Dear Lavri,
  Take a look at the row.names argument in ?write.table.

  HTH,

  Jorge


  On Thu, Jun 18, 2009 at 4:09 AM, Lavri Labi lavri.l...@tu-dortmund.de wrote:

   Hi all,

   I use write.csv and write.table to write a data frame to a file as
   follows:

   write.csv2(allRandomTestCase_XDroped, "allRandomTestCase.csv")

   But in the created file allRandomTestCase.csv an additional column with
   consecutive numbers is automatically added to the columns of the data
   frame allRandomTestCase_XDroped.

   Hence my question: how can I write data to a file without this added
   column?

   Cheers,

   Lavri


Re: [R] R in the NY Times

2009-01-07 Thread Bryan Hanson

I believe the SAS person shot themselves in the foot in more ways than
one.  In my mind, the reason you would pay, as Frank said, for
 
 non-peer-reviewed software with hidden implementations of analytic
 methods that cannot be reproduced by others

would be so that you can sue them later when a software problem in the
design of the engine makes your plane fall out of the sky!

Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


 "I think it addresses a niche market for high-end data analysts that
 want free, readily available code," said Anne H. Milley, director of
 technology product marketing at SAS. She adds, "We have customers who
 build engines for aircraft. I am happy they are not using freeware
 when I get on a jet."
 
 
 Thanks for posting.  Does anyone else find the statement by SAS to be
 humorous yet arrogant and short-sighted?
 
 Kevin


Re: [R] Are there any guis out there, which will allow editing of the graph?

2008-08-04 Thread Bryan Hanson

A colleague of mine, quite by accident, discovered that Adobe Illustrator
can manipulate plots made by base graphics, and when you do, many pieces of
the plot are separate items that can be manipulated with Illustrator.  He
cuts and pastes from a Quartz window on his Mac, into Illustrator.
Apparently Illustrator has two kinds of selection arrows, one of which
selects groups of things, the other selects individual things.  He confirms
that text from R may be changed, colors in polygonal areas may be changed,
objects may be moved etc once they are selected.

Apparently he saw a notation that this was possible on some R code that went
with a Wikipedia entry, and he tried it, and it works.

YMMV. Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA



On 8/4/08 1:09 PM, Bert Gunter [EMAIL PROTECTED] wrote:

 No. Can't be. Editable graphs require that the graph be produced via code
 that produces changeable components. All R graphs are essentially static.
 
 That said, caveats: graphs drawn via the grid package functionality -- for
 example lattice graphs -- **are** produced via changeable code. If you read
 the lattice docs carefully, you'll see that there are a few features there
 that allow some graph editing. There may be other packages that also have
 some editing capabilities. R's base graphics also allow a little interaction
 via identify() and locator(), which can be useful (e.g. for positioning
 legends).
 
 One can also simulate interactivity by recording various components of
 graph construction and then modifying and redrawing them. But this is just
 manually doing what you're looking for, so probably a dumb suggestion.
 
 While graph editing certainly can be a nice feature, it is very difficult to
 implement without severely constraining graphing flexibility (IMO, of
 course). Graphs are very complex beasties, so it's hard to write clean code
 that allows flexible editing capabilities. Look at S-Plus's graph editing,
 which I always found harder to use (and more buggy) than just issuing the
 commands. (To be fair, it's been some years since I tried).
 
 Again, just my 2 bits. Others may well disagree (and perhaps point you to
 what you seek).
 
 Cheers,
 Bert
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
 Behalf Of Arthur Roberts
 Sent: Monday, August 04, 2008 9:51 AM
 To: [EMAIL PROTECTED]
 Subject: [R] Are there any guis out there,which will allow editing of the
 graph?
 
 Hi, all,
 
 I would like to know if there is any gui interface out there
 (academic or commercial) that allows one to edit R-language generated
 graphs (e.g positioning x axis labels.)  It would be nice to have
 something like the user interface of Igor or Origin.  I have already
 used JGR and R-gui.  These are good, but they don't allow one to
 easily edit graphs.  I have also tried locator() and the package
 iplots.  Your input is greatly appreciated.
 
 Best wishes,
 Art Roberts
 University of Washington
 

Re: [R] Properly Parsing Pre-Superscripts Displaying Them With grid.text

2008-08-02 Thread Bryan Hanson

Thanks Gavin, that nicely solved one problem.  On a fresh look at the
archives, I see my other problem was trying to paste expressions, a bad
idea.  So, I'm writing each line separately.  All problems are fixed!

By the way, I discovered from the archives that to get a % in the final
output, you have to quote it in the expression: "%" which I suppose is a
general feature.  I may have missed it, but that behavior doesn't seem to be
mentioned in the plotmath help page - perhaps it's too obvious?

Thanks, Bryan 
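
For the archive, a minimal sketch of the line-by-line approach described above. It combines the phantom() trick from the reply below with a quoted string for the part containing %. The names `leg` and `draw_legend` and the positions are assumptions made for the example, not the code actually used.

```r
library(grid)

# Each legend line is its own label; phantom() supplies the (empty) base
# that the leading superscript attaches to, and the quoted string keeps
# the % sign out of the parser's way.
leg <- list(
  "Based upon:",
  expression(phantom()^{35} * Cl * ": 75%"),
  expression(phantom()^{37} * Cl * ": 25%")
)

# Draw the lines one below the other (coordinates are in npc units).
draw_legend <- function(lines, x = 0.5, top = 0.6, step = 0.05) {
  for (i in seq_along(lines)) {
    grid.text(lines[[i]], x = x, y = top - (i - 1) * step)
  }
  invisible(lines)
}

# Usage, on an open graphics device:
# grid.newpage(); draw_legend(leg)
```

Writing one grid.text() call per line sidesteps the problem of pasting expressions together, since paste() would turn them back into plain character strings.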


On 8/2/08 2:19 PM, Gavin Simpson [EMAIL PROTECTED] wrote:

 On Fri, 2008-08-01 at 17:23 -0400, Bryan Hanson wrote:
 Hi all... I'm making a chart dealing with frequencies of isotopes of various
 elements.  For instance, I'd like the following text to appear on a chart
 with the 35 and 37 as superscripts:
 
 Based upon:
 35Cl: 75%
 37Cl: 25%
 
 I am having problems properly parsing the superscript that precedes the
 "Cl", since there is no character ahead of the superscript (I saw examples
 in the archives where there was a preceding character).  Also, the
 construction of the string seems to not be working as I expect either.  So,
 I think there are two problems here.  Here is a sample of what doesn't quite
 work:
 
 expression(phantom()^{35}*Cl[1])
 
 works if I understand what you want. phantom() is documented
 on ?plotmath (?phantom is an alias for this help page also) and allows
 you to leave space as though argument was there, but I use it here with
 no object so no space left but this has the side effect of allowing the
 superscript for this space.
 
 Note that you need to wrap multiple character superscripts in {} ([] for
 subscripts). Also, you need to produce a valid expression so the *
 achieves this between the two components (the phantom()^{35} and the
 Cl[1] bits). You could also achieve the same result by pasting the bits
 together:
 
 expression(paste(phantom()^{35}, Cl[1]))
 
 but the former seems more familiar and intuitive to me now after
 grappling with plotmath for a while.
 
 G
 
 
 Cl1 <- rbinom(1000, size = 1, prob = 0.25)
 pCl1 <- histogram(Cl1, main = expression(Cl[1]), xlab = "", ylab = "",
 scales = list(draw = FALSE), ylim = c(0:80))
 plot(pCl1)
 # This works fine but doesn't have everything I want:
 leg.txt1 <- paste("Based upon:\n", ": 75%\n", ": 25%", sep = "")
 grid.text(leg.txt1, 0.5, 0.5)
 # This paste doesn't work due to the expression statements:
 leg.txt2 <- paste("Based upon:\n", expression(^35*Cl), ": 75%\n",
 expression(^37*Cl), ": 25%", sep = "")
 # This doesn't produce an error, but doesn't produce what is wanted either,
 # as the expression is taken (almost) literally:
 leg.txt3 <- paste("Based upon:\n", "expression(^35*Cl)", ": 75%\n",
 "expression(^37*Cl)", ": 25%", sep = "")
 grid.text(leg.txt3, 0.5, 0.3)
 
 From watching the help list, I know parsing things can be tricky.
 
 TIA, Bryan
 
 sessionInfo()
 R version 2.7.1 (2008-06-23)
 i386-apple-darwin8.10.1
 
 locale:
 en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
 
 attached base packages:
 [1] datasets  grid      grDevices graphics  stats     utils     methods
 [8] base
 
 other attached packages:
  [1] fastICA_1.1-9         DescribeDisplay_0.1.3 ggplot_0.4.2
  [4] RColorBrewer_1.0-2    reshape_0.8.0         MASS_7.2-42
  [7] pcaPP_1.5             mvtnorm_0.9-0         hints_1.0.1-1
 [10] mvoutlier_1.3         robustbase_0.2-8      lattice_0.17-8
 [13] rggobi_2.1.9          RGtk2_2.12.5-3
 

[R] Properly Parsing Pre-Superscripts Displaying Them With grid.text

2008-08-01 Thread Bryan Hanson

Hi all... I'm making a chart dealing with frequencies of isotopes of various
elements.  For instance, I'd like the following text to appear on a chart
with the 35 and 37 as superscripts:

Based upon:
35Cl: 75%
37Cl: 25%

I am having problems properly parsing the superscript that precedes the
"Cl", since there is no character ahead of the superscript (I saw examples
in the archives where there was a preceding character).  Also, the
construction of the string seems to not be working as I expect either.  So,
I think there are two problems here.  Here is a sample of what doesn't quite
work:

Cl1 <- rbinom(1000, size = 1, prob = 0.25)
pCl1 <- histogram(Cl1, main = expression(Cl[1]), xlab = "", ylab = "",
scales = list(draw = FALSE), ylim = c(0:80))
plot(pCl1)
# This works fine but doesn't have everything I want:
leg.txt1 <- paste("Based upon:\n", ": 75%\n", ": 25%", sep = "")
grid.text(leg.txt1, 0.5, 0.5)
# This paste doesn't work due to the expression statements:
leg.txt2 <- paste("Based upon:\n", expression(^35*Cl), ": 75%\n",
expression(^37*Cl), ": 25%", sep = "")
# This doesn't produce an error, but doesn't produce what is wanted either,
# as the expression is taken (almost) literally:
leg.txt3 <- paste("Based upon:\n", "expression(^35*Cl)", ": 75%\n",
"expression(^37*Cl)", ": 25%", sep = "")
grid.text(leg.txt3, 0.5, 0.3)

From watching the help list, I know parsing things can be tricky.

TIA, Bryan

> sessionInfo()
R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] datasets  grid  grDevices graphics  stats utils methods
[8] base 

other attached packages:
 [1] fastICA_1.1-9         DescribeDisplay_0.1.3 ggplot_0.4.2
 [4] RColorBrewer_1.0-2    reshape_0.8.0         MASS_7.2-42
 [7] pcaPP_1.5             mvtnorm_0.9-0         hints_1.0.1-1
[10] mvoutlier_1.3         robustbase_0.2-8      lattice_0.17-8
[13] rggobi_2.1.9          RGtk2_2.12.5-3


[R] Conditionally Updating Lattice Plots

2008-07-20 Thread Bryan Hanson

Hi All...

I can't seem to find an answer to this in the help pages, archives, or
Deepayan's Lattice Book.

I want to do a Lattice plot, and then update it, possibly more than once,
depending upon some logical options.  Code below; it produces a second plot
page when the second update is called, from which I would infer that you
can't update the update or I'm not calling it correctly.  I have a nagging
sense too that the real way to do this is with a non-standard use of
panel.superpose but I don't quite see how to do that from available
examples.

TIA for any suggestions, Bryan


 Example: a function then, the call to the function

fancy.lm <- function(x, y, fit = TRUE, resid = TRUE){

    model <- lm(y ~ x)

    y.pred <- predict(model) # Compute residuals for plotting
    # NAs induce breaks in line, after Fig 5.1 of DAAG (clever!)
    res.x <- as.vector(rbind(x, x, rep(NA, length(x))))
    res.y <- as.vector(rbind(y, y.pred, rep(NA, length(x))))

    p <- xyplot(y ~ x, pch = 20,
        panel = function(...) {
            panel.xyplot(...) # not strictly necessary if I understand correctly
        })

    plot(p, more = TRUE)

    if (fit) {
        plot(update(p, more = TRUE,
            panel = function(...){
                panel.xyplot(...)
                panel.abline(model, col = "red")
            }))}

    if (resid) {
        plot(update(p, more = TRUE,
            panel = function(...){
                panel.xyplot(res.x, res.y, col = "lightblue", type = "l")
            }))}

}

x <- jitter(c(1:10), factor = 5)
y <- jitter(c(1:10), factor = 10)
fancy.lm(x, y, fit = TRUE, resid = TRUE)
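
One way to sidestep the repeated plot()/update() calls is to make the decisions inside a single panel function, so the plot is drawn once. This is a minimal sketch under stated assumptions, not a reply from the thread: it assumes a single panel (no conditioning variables, so the fitted values line up with the panel data), uses `panel.segments()` in place of the NA-separated vectors, and the name `fancy_lm` is hypothetical.

```r
library(lattice)

fancy_lm <- function(x, y, fit = TRUE, resid = TRUE) {
  model  <- lm(y ~ x)
  y_pred <- predict(model)
  xyplot(y ~ x, pch = 20,
    panel = function(x, y, ...) {
      # Optional layers are chosen by the logical flags.
      if (resid) panel.segments(x, y, x, y_pred, col = "lightblue")
      if (fit) panel.abline(reg = model, col = "red")
      panel.xyplot(x, y, ...)   # points drawn last, on top
    })
}

set.seed(1)
x <- jitter(1:10, factor = 5)
y <- jitter(1:10, factor = 10)
p <- fancy_lm(x, y, fit = TRUE, resid = TRUE)
print(p)
```

Returning the trellis object instead of plotting inside the function leaves the caller free to print it or modify it further.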


 Session Info
 sessionInfo()
R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] datasets  grid  grDevices graphics  stats utils methods
[8] base 

other attached packages:
 [1] fastICA_1.1-9         DescribeDisplay_0.1.3 ggplot_0.4.2
 [4] RColorBrewer_1.0-2    reshape_0.8.0         MASS_7.2-42
 [7] pcaPP_1.5             mvtnorm_0.9-0         hints_1.0.1-1
[10] mvoutlier_1.3         robustbase_0.2-8      lattice_0.17-8
[13] rggobi_2.1.9          RGtk2_2.12.5-3

loaded via a namespace (and not attached):
[1] tools_2.7.1


 


[R] Lattice Version of grconvertX or variant on panel.text?

2008-07-20 Thread Bryan Hanson

Still playing with Lattice...

I want to use panel.text(x, y etc) but with x and y in plot coordinates
(0,1), not user coordinates.

I think if I had this problem with traditional graphics, I could use
grconvertX to make the change.  I did come across convertX {grid} but this
doesn't seem to be what I need.

Is there a function like grconvertX in Lattice, or is there a flag or some
other method of making panel.text use plot coordinates?

Thanks, Bryan


Re: [R] Lattice Version of grconvertX or variant on panel.text?

2008-07-20 Thread Bryan Hanson

Never mind, I just hard-coded it using ratios.  Simpler than I thought.
Thanks, Bryan
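
As an alternative to hard-coded ratios, a sketch that is not part of the thread: lattice panels are grid viewports, so grid functions called inside a panel function can take positions in "npc" units, which run from 0 to 1 across the panel. The data set (`cars`) and label text are arbitrary choices for illustration.

```r
library(lattice)
library(grid)

p <- xyplot(dist ~ speed, data = cars,
  panel = function(...) {
    panel.xyplot(...)
    # "npc" units run from 0 to 1 across the current panel viewport.
    grid.text("upper left", x = unit(0.05, "npc"), y = unit(0.95, "npc"),
              just = c("left", "top"))
  })
print(p)
```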


On 7/20/08 9:03 PM, Bryan Hanson [EMAIL PROTECTED] wrote:

 Still playing with Lattice...
 
 I want to use panel.text(x, y etc) but with x and y in plot coordinates
 (0,1), not user coordinates.
 
 I think if I had this problem with traditional graphics, I could use
 grconvertX to make the change.  I did come across convertX {grid} but this
 doesn't seem to be what I need.
 
 Is there a function like grconvertX in Lattice, or is there a flag or some
 other method of making panel.text use plot coordinates?
 
 Thanks, Bryan
 

Re: [R] .First and .Rprofile won't run on startup

2008-07-14 Thread Bryan Hanson

I accomplish this a little differently.

On the Mac, in your home directory (e.g. /Users/susanamrose) there is, or could
be, a hidden file called .Rprofile.  You can edit it with vi, for instance, by
opening a terminal window and typing vi .Rprofile; the file will be created if
it doesn't exist.

I keep all my local functions in a particular directory, then load them via
the .Rprofile by putting the following lines in .Rprofile

funcdir <- "/Users/susanamrose/Functions"
z <- paste(funcdir, "/LoadFunctions.R", sep = "")
source(z, chdir = TRUE)

This will source/execute whatever you put in the file LoadFunctions.R in the
specified directory when R starts up.  So, for instance, LoadFunctions.R
could be a bunch of source("func.R") statements.

This also gives you a short cut to get to your functions directory by
setwd(funcdir).  I actually have a number of commonly used directories
defined this way for convenience.

HTH Bryan
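
The original question also asked about sourcing every file in a directory at startup. A minimal sketch, assuming the files end in .R or .r; `source_dir` is a hypothetical helper name, and the temporary directory below only stands in for a real functions directory.

```r
# Source every .R/.r file found in `dir`; returns the file paths invisibly.
source_dir <- function(dir) {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in files) source(f, chdir = TRUE)
  invisible(files)
}

# Demonstration in a throwaway directory (stands in for ~/Functions).
demo_dir <- file.path(tempdir(), "Functions")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("hello_fn <- function() 'loaded'", file.path(demo_dir, "hello.R"))
loaded <- source_dir(demo_dir)
```

In a .Rprofile, the helper definition followed by a call such as source_dir(funcdir) would replace the list of individual source() statements.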

On 7/14/08 2:34 PM, Susan Amrose [EMAIL PROTECTED] wrote:

 I'm trying to source a file automatically every time I start R. I tried
 adding the following .First function in a file Rprofile.site in my
 $R_HOME/etc/ directory (verified $R_HOME by Sys.getenv()) as well as in a
 file .Rprofile in my $HOME directory and .Rprofile in the working directory:
 .First <- function(){
 source(file.path(Sys.getenv("HOME"), "R", "functions", "standard.r"))
 cat("Actually read your file")
 }
 
 - but no luck. I'm using a Mac (OS 10.4). It never runs (the file is not
 sourced and the text does not appear). Does anyone have any suggestions?
 
 Ideally, I would like to have a directory and source all the files in the
 directory in startup, but this is just a first step (is that possible?).
 
 Thanks in advance!!
  -Susan
 

Re: [R] Coloring Stripchart Points, or Better, Lattice Equivalent

2008-06-24 Thread Bryan Hanson

If anyone remains interested, the solution in base graphics is to modify
stripchart.default, the last couple of lines where the coloring of points
defaults in a way that depends on groups.  In my example, the groups are
being handled collectively with the coloring.  Code is below.

Deepayan has noted that stacking of this type is not possible in Lattice
graphics, and would have to be coded directly (probably not too much of a
modification of what I give here, but I'm a novice!).

Thanks, Bryan

stripchart.colsym <-
function(x, method="overplot", jitter=0.1, offset=1/3, vertical=FALSE,
         group.names, add = FALSE, at = NULL,
         xlim=NULL, ylim=NULL, ylab=NULL, xlab=NULL, dlab="", glab="",
         log="", pch=0, col=par("fg"), cex=par("cex"), axes=TRUE,
         frame.plot=axes, ...)
{
    method <- pmatch(method, c("overplot", "jitter", "stack"))[1]
    if(is.na(method) || method==0)
        stop("invalid plotting method")
    groups <-
        if(is.list(x)) x
        else if(is.numeric(x)) list(x)
    if(0 == (n <- length(groups)))
        stop("invalid first argument")
    if(!missing(group.names))
        attr(groups, "names") <- group.names
    else if(is.null(attr(groups, "names")))
        attr(groups, "names") <- 1:n
    if(is.null(at))
        at <- 1:n
    else if(length(at) != n)
        stop(gettextf("'at' must have length equal to the number %d of groups",
                      n), domain = NA)
    if (is.null(dlab)) dlab <- deparse(substitute(x))

    if(!add) {
        dlim <- c(NA, NA)
        for(i in groups)
            dlim <- range(dlim, i[is.finite(i)], na.rm = TRUE)
        glim <- c(1,n) # in any case, not range(at)
        if(method == 2) { # jitter
            glim <- glim + jitter * if(n == 1) c(-5, 5) else c(-2, 2)
        } else if(method == 3) { # stack
            glim <- glim + if(n == 1) c(-1,1) else c(0, 0.5)
        }
        if(is.null(xlim))
            xlim <- if(vertical) glim else dlim
        if(is.null(ylim))
            ylim <- if(vertical) dlim else glim
        plot(xlim, ylim, type="n", ann=FALSE, axes=FALSE, log=log, ...)
        if (frame.plot) box()
        if(vertical) {
            if (axes) {
                if(n > 1) axis(1, at=at, labels=names(groups), ...)
                Axis(x, side = 2, ...)
            }
            if (is.null(ylab)) ylab <- dlab
            if (is.null(xlab)) xlab <- glab
        }
        else {
            if (axes) {
                Axis(x, side = 1, ...)
                if(n > 1) axis(2, at=at, labels=names(groups), ...)
            }
            if (is.null(xlab)) xlab <- dlab
            if (is.null(ylab)) ylab <- glab
        }
        title(xlab=xlab, ylab=ylab)
    }
    csize <- cex*
        if(vertical) xinch(par("cin")[1]) else yinch(par("cin")[2])
    for(i in 1:n) {
        x <- groups[[i]]
        y <- rep.int(at[i], length(x))
        if(method == 2) ## jitter
            y <- y + stats::runif(length(y), -jitter, jitter)
        else if(method == 3) { ## stack
            xg <- split(x, factor(x))
            xo <- lapply(xg, seq_along)
            x <- unlist(xg, use.names=FALSE)
            y <- rep.int(at[i], length(x)) +
                (unlist(xo, use.names=FALSE) - 1) * offset * csize
        }
        if(vertical) points(y, x, col=col,
                            pch=pch, cex=cex)
        else points(x, y, col=col,
                    pch=pch, cex=cex)
    }
}

samples <- 100 # must be even
index <- round(runif(samples, 1, 100)) # set up data
resp <- rbinom(samples, 1, 0.5)
yr <- rep(c(2005, 2006), samples/2)
all <- data.frame(index, resp, yr)
all$sym <- ifelse(all$resp == 1, 3, 1)
all$col <- ifelse(all$yr == 2005, "red", "blue")
all$count <- rep(1, length(all$index))
all <- all[order(all$index, all$yr, all$resp),] # for easier inspection
row.names(all) <- c(1:samples) # for easier inspection

one <- all[(all$yr == 2005 & all$resp == 0),] # First 2005/0 at bottom
two <- all[(all$yr == 2005 & all$resp == 1),] # Then 2005/1
three <- all[(all$yr == 2006 & all$resp == 0),] # Now 2006/0
four <- all[(all$yr == 2006 & all$resp == 1),] # Finally 2006/1

par(mfrow = c(5, 1))
par(plt = c(0.1, 0.9, 0.25, 0.75))
stripchart(one$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = one$col, pch = one$sym)
mtext("2005/0 only", side = 3)
stripchart(two$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = two$col, pch = two$sym)
mtext("2005/1 only", side = 3)
stripchart(three$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = three$col, pch = three$sym)
mtext("2006/0 only", side = 3)
stripchart(four$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = four$col, pch = four$sym)
mtext("2006/1 only", side = 3)
stripchart.colsym(all$index, method = "stack", ylim = c(0,10), xlim =
c(1,100), col = all$col, pch = all$sym)
mtext("all data, colored and symbolized as above", side = 3)


Re: [R] Coloring Stripchart Points, or Better, Lattice Equivalent

2008-06-23 Thread Bryan Hanson

Thanks Deepayan. That's the conclusion I have been gradually reaching!  Bryan


On 6/23/08 5:57 PM, Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 6/22/08, Bryan Hanson [EMAIL PROTECTED] wrote:
 Thanks Gabor, I'm getting closer.
 
  Is there a way to spread out resp values vertically for a given value of
  index?  In base graphics, stripchart does this with method = "stack".  But
  in lattice, stack = TRUE does something rather different, and I don't see a
  combination of lattice arguments that does it like base graphics.
 
 Right, the default lattice panel function doesn't support stacking. I
 think your best bet, if you want to retain vectorization of col and
 pch, is to compute the y-coordinates yourself and use xyplot() to
 plot.
 
 -Deepayan
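
The suggestion above (compute the y-coordinates yourself, then call xyplot()) might be sketched as follows. This is an illustration under assumptions, not code from the thread: `stack_y` is a hypothetical helper, the data frame is named `dat` rather than `all`, and only a single panel is used so that the per-point col and pch vectors stay aligned with the data.

```r
library(lattice)

# Position of each point within its stack: 1 for the first occurrence of a
# value of x, 2 for the second, and so on (original order is preserved).
stack_y <- function(x) ave(seq_along(x), x, FUN = seq_along)

set.seed(42)
dat <- data.frame(index = round(runif(100, 1, 100)),
                  resp  = rbinom(100, 1, 0.5),
                  yr    = rep(c(2005, 2006), 50))
dat$col  <- ifelse(dat$yr == 2005, "red", "blue")
dat$sym  <- ifelse(dat$resp == 1, 3, 1)
dat$ypos <- stack_y(dat$index)

p <- xyplot(ypos ~ index, data = dat, col = dat$col, pch = dat$sym,
            ylab = "count", xlim = c(0, 101))
print(p)
```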


Re: [R] Coloring Stripchart Points, or Better, Lattice Equivalent

2008-06-22 Thread Bryan Hanson

Below is a revised set of code that demonstrates my question a little more
clearly, I hope.

When plotting all the data (5th panel), col & sym don't seem to be passed
correctly, as the (random) first value for col & sym are used for all points
(run the code, then run it again, you'll see how the 5th panel changes
depending upon col & sym for the first data point).  The 5th panel should
ideally be the sum of the 4 panels above, keeping col & sym intact.

Also, I would rather have this in lattice or ggplot2, if anyone sees how to
convert it.

Thanks once again, several of you have made very useful suggestions off
list.  Bryan

samples <- 100 # must be even
index <- round(runif(samples, 1, 100)) # set up data
resp <- rbinom(samples, 1, 0.5)
yr <- rep(c(2005, 2006), samples/2)
all <- data.frame(index, resp, yr)
all$sym <- ifelse(all$resp == 1, 1, 3)
all$col <- ifelse(all$yr == 2005, "red", "blue")
all$count <- rep(1, length(all$index))
all <- all[order(all$index, all$yr, all$resp),] # for easier inspection
row.names(all) <- c(1:samples) # for easier inspection

one <- all[(all$yr == 2005 & all$resp == 0),] # First 2005/0 at top
two <- all[(all$yr == 2005 & all$resp == 1),] # Then 2005/1
three <- all[(all$yr == 2006 & all$resp == 0),] # Now 2006/0
four <- all[(all$yr == 2006 & all$resp == 1),] # Finally 2006/1

par(mfrow = c(5, 1))
par(plt = c(0.1, 0.9, 0.25, 0.75))
stripchart(one$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = one$col, pch = one$sym)
mtext("2005/0", side = 3)
stripchart(two$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = two$col, pch = two$sym)
mtext("2005/1", side = 3)
stripchart(three$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = three$col, pch = three$sym)
mtext("2006/0", side = 3)
stripchart(four$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = four$col, pch = four$sym)
mtext("2006/1", side = 3)
stripchart(all$index, method = "stack", ylim = c(0,10), xlim = c(1,100),
col = all$col, pch = all$sym)
mtext("col & sym always taken from 1st data point when all data is plotted!",
side = 3)


Re: [R] Coloring Stripchart Points, or Better, Lattice Equivalent

2008-06-22 Thread Bryan Hanson

Thanks Gabor, I'm getting closer.

Is there a way to spread out resp values vertically for a given value of
index?  In base graphics, stripchart does this with method = "stack".  But
in lattice, stack = TRUE does something rather different, and I don't see a
combination of lattice arguments that does it like base graphics.

Thanks, Bryan


On 6/22/08 12:48 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Actually I am not sure if my prior answer was correct.  I think it's ok
 with one panel, but you might have to use a panel function if there are
 several.  With one panel it seems ok:
 
 stripplot(~ index, all, col = all$col, pch = all$sym)
 
 On Sun, Jun 22, 2008 at 12:28 PM, Gabor Grothendieck
 [EMAIL PROTECTED] wrote:
 Try this:
 
 library(lattice)
 all$resp <- as.factor(all$resp)
 stripplot(~ index | resp * yr, all, col = all$col, pch = all$sym,
 layout = c(1, 4))
 
 

[R] Coloring Stripchart Points, or Better, Lattice Equivalent

2008-06-21 Thread Bryan Hanson

Hi All.

I have the commands below to create a stripchart/plot.  I was hoping to
color the points in the plot by yr, and use a symbol that varied with resp.
However, the outcome makes it appear as though the point by point col and
pch data is not being passed properly.  Any suggestions?

And truthfully, I'd rather be doing this with Lattice, but I've tried
several variations of stripplot and can't even get something with the
general layout of the stripchart version.

Thanks, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

index <- round(runif(100, 1, 100))
resp <- rbinom(100, 1, 0.5)
yr <- rep(c(2005, 2006), 50)
all <- data.frame(index, resp, yr)
for (n in 1:length(all$index)) {
    if (all$yr[n] == 2005) {all$col[n] <- "red"}
    else {all$col[n] <- "blue"}
}
for (n in 1:length(all$index)) {
    if (all$resp[n] == 1) {all$sym[n] <- 1}
    else {all$sym[n] <- 3}
}
stripchart(all$index, method = "stack", ylim = c(0,10), col = all$col, pch =
all$sym)


Re: [R] Pointwise Confidence Bounds on Logistic Regression

2008-06-19 Thread Bryan Hanson

[I've omitted some of the conversation so far...]

 E.g. in a logistic model, with (say) eta = beta_0 + beta_1*x one may
 find, on the
 linear predictor scale, A and B (say) such that P(A <= eta <= B) = 0.95.
 
 Then P(expit(A) <= expit(eta) <= expit(B)) = 0.95, which is exactly
 what is wanted.

I think I follow the above conceptually, but I don't know how to implement
it, though I fooled around (unsuccessfully) with some of the variations on
predict().

I'm trying to learn this in response to a biology colleague who did
something similar in SigmaPlot. I can already tell that SigmaPlot did a lot
of stuff for him in the background.  The responses are 0/1 of a particular
observation by date.  The following code simulates what's going on (note
that I didn't use 0/1 since this leads to a recognized condition/warning of
fitting 1's and 0's. I've requested Brian's Pattern Recognition book so I
know what the problem is and how to solve it).  My colleague is looking at
two populations in which the LD50 would differ.  I'd like to be able to
put the pointwise confidence bounds on each curve so he can tell if the
populations are really different.

By the way, this code does give a (minor?) error from glm (which you will
see).

Can you make a suggestion about how to get those confidence bounds on
there?

Also, is a probit link more appropriate here?

Thanks,  Bryan

x <- c(1:40)
y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20))
y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15))
data <- as.data.frame(cbind(x,y1,y2))
plot(x, y1, pch = 1, ylim = c(0,1), col = "red")
points(x, y2, pch = 3, col = "blue")
abline(h = 0.5, col = "gray")
fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action =
na.omit)
fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action =
na.omit)
lines(fit1$model$x, fit1$fitted.values, col = "red")
lines(fit2$model$x, fit2$fitted.values, col = "blue")


Re: [R] Pointwise Confidence Bounds on Logistic Regression

2008-06-19 Thread Bryan Hanson

Thanks so much to all who offered assistance.  I have to say it would have
taken me a long time to figure this out, so I am most grateful.  Plus,
studying your examples greatly improves my understanding.

As a follow up, the fit process gives the following error:

Warning messages:
1: In eval(expr, envir, enclos) :
  non-integer #successes in a binomial glm!

Is this something I should worry about?  It doesn't arise from glm or
glm.fit 

Thanks, Bryan

For the record, a final complete working code is appended below.

x <- c(1:40)
y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20))
y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15))
data <- as.data.frame(cbind(x,y1,y2))
plot(x, y1, pch = 1, ylim = c(0,1), col = "red", main = "Logistic Regression
w/Confidence Bounds", ylab = "y values",  xlab = "x values")
points(x, y2, pch = 3, col = "blue")
abline(h = 0.5, col = "gray")
fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action =
na.omit)
fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action =
na.omit)
lines(fit1$model$x, fit1$fitted.values, col = "red")
lines(fit2$model$x, fit2$fitted.values, col = "blue")

## predictions on scale of link function
pred1 <- predict(fit1, se.fit = TRUE)
pred2 <- predict(fit2, se.fit = TRUE)

## constant for 95% confidence bands
## getting two t values is redundant here as fit1 and fit2
## have same residual df, but the real world may be different
res.df <- c(fit1$df.residual, fit2$df.residual)
## 0.975 because we want 2.5% in upper and lower tail
const <- qt(0.975, df = res.df)

## confidence bands on scale of link function
upper1 <- pred1$fit + (const[1] * pred1$se.fit)
lower1 <- pred1$fit - (const[1] * pred1$se.fit)
upper2 <- pred2$fit + (const[2] * pred2$se.fit)
lower2 <- pred2$fit - (const[2] * pred2$se.fit)

## bind together into a data frame
bands <- data.frame(upper1, lower1, upper2, lower2)

## transform on to scale of response
bands <- data.frame(lapply(bands, binomial(link = "logit")$linkinv))

## plot confidence bands
lines(fit1$model$x, bands$upper1, col = "pink")
lines(fit1$model$x, bands$lower1, col = "pink")
lines(fit2$model$x, bands$upper2, col = "lightblue")
lines(fit2$model$x, bands$lower2, col = "lightblue")
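
A compact variant of the same recipe, offered as a sketch under stated assumptions rather than as the thread's method. It simulates true 0/1 responses (R issues the "non-integer #successes" warning when a binomial response is neither 0/1 nor an integer count, as with the 0.1/0.9 values above), predicts on a fine grid via `newdata`, and uses a normal quantile from `qnorm()` in place of the t quantile; that substitution is this sketch's choice, on the grounds that the binomial family has a fixed dispersion.

```r
set.seed(1)
x <- 1:40
y <- rbinom(length(x), 1, plogis((x - 20) / 4))   # simulated 0/1 responses
fit <- glm(y ~ x, family = binomial(link = "logit"))

# Predict on the link (linear predictor) scale, with standard errors.
grid_x <- data.frame(x = seq(1, 40, length.out = 200))
pred <- predict(fit, newdata = grid_x, type = "link", se.fit = TRUE)

z   <- qnorm(0.975)          # assumption: normal quantile, not qt()
inv <- family(fit)$linkinv   # inverse link taken from the fitted model
bands <- data.frame(x     = grid_x$x,
                    fit   = inv(pred$fit),
                    lower = inv(pred$fit - z * pred$se.fit),
                    upper = inv(pred$fit + z * pred$se.fit))

plot(x, y, ylim = c(0, 1))
lines(bands$x, bands$fit, col = "red")
lines(bands$x, bands$lower, col = "pink")
lines(bands$x, bands$upper, col = "pink")
```

Taking the inverse link from `family(fit)` means the same code would apply unchanged to a probit fit.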



On 6/19/08 12:28 PM, Gavin Simpson [EMAIL PROTECTED] wrote:

 On Thu, 2008-06-19 at 10:42 -0400, Bryan Hanson wrote:
 [I've omitted some of the conversation so far...]
 
 E.g. in a logistic model, with (say) eta = beta_0 + beta_1*x one may
 find, on the
 linear predictor scale, A and B (say) such that P(A <= eta <= B) = 0.95.
 
 Then P(expit(A) <= expit(eta) <= expit(B)) = 0.95, which is exactly
 what is wanted.
 
 I think I follow the above conceptually, but I don't know how to implement
 it, though I fooled around (unsuccessfully) with some of the variations on
 predict().
 
 I'm trying to learn this in response to a biology colleague who did
 something similar in SigmaPlot. I can already tell that SigmaPlot did a lot
 of stuff for him in the background.  The responses are 0/1 of a particular
 observation by date.  The following code simulates what's going on (note
 that I didn't use 0/1 since this leads to a recognized condition/warning of
 fitting 1's and 0's. I've requested Brian's Pattern Recognition book so I
 know what the problem is and how to solve it).  My colleague is looking at
 two populations in which the LD50 would differ.  I'd like to be able to
 put the pointwise confidence bounds on each curve so he can tell if the
 populations are really different.
 
 By the way, this code does give a (minor?) error from glm (which you will
 see).
 
 Can you make a suggestion about how to get those confidence bounds on
 there?
 
 Also, is a probit link more appropriate here?
 
 Thanks,  Bryan
 
 x <- c(1:40)
 y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20))
 y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15))
 data <- as.data.frame(cbind(x,y1,y2))
 plot(x, y1, pch = 1, ylim = c(0,1), col = "red")
 points(x, y2, pch = 3, col = "blue")
 abline(h = 0.5, col = "gray")
 fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action =
 na.omit)
 fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action =
 na.omit)
 lines(fit1$model$x, fit1$fitted.values, col = "red")
 lines(fit2$model$x, fit2$fitted.values, col = "blue")
 
 The point is to get predictions on the scale of the link function,
 generate 95% confidence bands in the normal way and then transform the
 confidence bands onto the scale of the response using the inverse of the
 link function used to fit the model.
 
 [note, am doing this from memory, so best to check this is right -- I'm
 sure someone will tell me very quickly if I have gone wrong anywhere!]
 
 ## predictions on scale of link function
 pred1 <- predict(fit1, se.fit = TRUE)
 pred2 <- predict(fit2, se.fit = TRUE)
 
 ## constant for 95% confidence bands
 ## getting two t values is redundant here as fit1 and fit2
 ## have same residual df, but the real world may be different
 res.df <- c(fit1$df.residual, fit2$df.residual)
 ## 0.975 because we want 2.5% in upper and lower tail

[R] Pointwise Confidence Bounds on Logistic Regression

2008-06-18 Thread Bryan Hanson

Hi all.  I hope I have my terminology right here...

For a simple lm, one can add "pointwise confidence bounds" to a fitted line
using something like

predict(results.lm, newdata = something, interval = "confidence")

(I'm following DAAG page 154-155 for this)
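
For reference, the lm case described above can be sketched with the built-in `cars` data; the data set, the grid of new x values, and the plotting choices are assumptions made only for illustration.

```r
fit <- lm(dist ~ speed, data = cars)
new <- data.frame(speed = seq(min(cars$speed), max(cars$speed),
                              length.out = 50))

# Matrix with columns fit, lwr, upr: fitted line and pointwise 95% bounds.
ci <- predict(fit, newdata = new, interval = "confidence", level = 0.95)

plot(dist ~ speed, data = cars)
matlines(new$speed, ci, lty = c(1, 2, 2),
         col = c("black", "gray40", "gray40"))
```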

I would like to do the same thing for a glm of the logistic regression type,
for instance, the example in MASS pg 190-192 (available in the help page for
predict.glm).

However, predict.glm does not have the same kind of features as plain old
predict, i.e., one cannot specify interval = "confidence".

From what I've read, pointwise confidence bounds are computed from the
SEs for each point.  However, I don't see quite where to extract this
information with a glm.

So, is there an existing function that does what I am describing for a glm,
or can someone point me in the right direction to start writing my own?

TIA as always, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


[R] Suggestions: Terminology & Pkgs for following spectra over time

2008-04-16 Thread Bryan Hanson

Hi Folks... No code to troubleshoot here.  I need some suggestions about the
right terminology to use in further searching, and any suggestions about R
pkgs that might be appropriate.

I am in the planning stages of a project in which IR, NMR and other spectra
(I'm a chemist) would be collected on various samples, and individual
samples would be followed over time.  The spectra will be feature
rich/complex, so one can't see the changes by visual inspection.  The
spectra are basically 2D matrices: peaks as a function of frequencies.  So
the data set is in the form of spectra of a single sample over time, for
multiple samples.

I am wondering about methods & R pkgs that can be used to analyze changes in
the spectra over time.  For instance, I would like to find specific peaks
that are changing over time, sets of peaks that are changing in a correlated
way over time etc.  I'd like to do this in an efficient and statistically
valid way.  What I am thinking of is somewhat like a time series, somewhat
like image analysis (but only 2D), but it's not quite either of those and I
need to know what it's really called to investigate further.

Any suggestions as to R pkgs and key words/phrases will be appreciated.

TIA, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle Indiana USA


[R] Definition of wrapper?

2008-03-30 Thread Bryan Hanson

I think I more or less understand what a "wrapper" is, but I'd like to hear
how more experienced R users define it, and especially I'd like to know if
there is a formal definition.  In my reading, it seems like there are a
fairly wide range of meanings, but they are all conceptually similar.

I've looked in a couple of the classic R texts, the extensions and
developers' manuals, and R help archives, and didn't find a definition.  Of
course, I may have missed it.

Thanks in advance.  Bryan

**
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University
602 S. College Avenue
Greencastle, IN 46135
PHONE 765-658-4602
FAX 765-658-6084
[EMAIL PROTECTED]
http://academic.depauw.edu/~hanson/deadpezsociety.html
http://www.depauw.edu/acad/chemistry/
http://academic.depauw.edu/~hanson/UMP/index.html


Re: [R] Problem with graphics device in Mac OS X

2007-12-10 Thread Bryan Hanson

For whatever reason, on the Mac, you have to open a new Quartz device window
before making the graphics call.  So, from the menu, pull down under Window
to New Quartz Device Window.  Then all graphics calls go to that (initially
empty) window, and any further calls replace the previous contents of the
window.  This window doesn't print so well, but your students can divert
their output to a pdf easily for really nice plots.

BTW, people were reporting problems with OS 10.5 and R.  These may have been
fixed, but if you have trouble, it's discussed in the archives.

Bryan
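
Diverting output to a PDF, as mentioned above, can be sketched like this; the file name and page size are arbitrary choices for illustration.

```r
out <- file.path(tempdir(), "myplot.pdf")   # hypothetical output path
pdf(out, width = 6, height = 4)             # open a PDF graphics device
plot(rnorm(100, 0, 1))
dev.off()                                   # close the device to write the file
```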


On 12/10/07 3:37 PM, WAYNE KING [EMAIL PROTECTED] wrote:

 Hello List,
I am teaching a basic course where students are encouraged to use R. There
 are a few students using Mac OS X. As a test we downloaded and installed the
 latest .dmg file (R-2.6.1.dmg) onto an Intel Mac running 10.5.1. A device query
 yields
 
 getOption("device")
 quartz
 
 But any plot command does not bring up a plot (e.g. plot(), boxplot(),
 hist()).
 
 I found a thread concerning X11 windows under Mac OS X but I feel these users
 will most likely be just using the native quartz device.
 
 Invoking a call to quartz() first does not seem to help, e.g.
 
 quartz()
 plot(rnorm(100,0,1))
 
 produces no output and no error message (Nothing happens). A call to dev.cur()
 seems to indicate a device is active.
 quartz()
 dev.cur()
 quartz
 2
 
 but again a plot command produces no figure. Sorry, I am not a Mac OS user and
 I did check the archives but found mostly discussions on X11() under Mac OS X.
 
 Wayne
 

[R] Graphical Manova: Fails When There Are Three Factors

2007-11-16 Thread Bryan Hanson

Hi R Gurus & Lurkers...  Thanks in advance to anyone who is willing to
tackle this!  Bryan

I have been implementing the graphical manova method described in An
Introduction to Ggobi (from the Ggobi web site).  A stand alone working
code is appended below.  The code is almost the same as described in the
Introduction document, with one bug fix.  A quick summary of the code is
that it takes one's data, and fits an ellipsoid to it at a requested
confidence level, then color codes everything for display.  If you don't
have ggobi installed, remove the ggobi from the last line and just use
graphic_manova(response,class)
You will probably have to comment out the last 4 lines of the graphic_manova
function as well to avoid trivial errors.

Here's the  R question: If the variable class has more than two levels
(factors) in it, the code executes but runs into an error because

cis = lapply(sub.groups, combined, cl=cl)

creates cis with a bunch of NA's, which then cause havoc when one tries to
do any matrix operations on it (not surprisingly).  The NA's follow an
interesting pattern: the ellipsoid points are generated for the first two
dimensions (pc1 and pc2), but NA's are generated for the third dimension
(pc3).  So cis contains the 3 original data dimensions, 1000 added ellipsoid
points to go with pc1, and 1000 added ellipsoid points to go with pc2, and
1000 NA's to go with pc3.  I don't see why the third set of data is any
different than the first two, and the first two execute correctly.
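
One way to narrow this down is to count the NA's per column in each element
of the list.  The sketch below uses a toy list standing in for cis (the
values are made up purely to show the shape of the output); it does not
reproduce or explain the actual failure:

```r
# Toy stand-in for `cis`: a named list of data frames, one per group.
# The numbers are invented for illustration only.
toy_cis <- list(
  "group 1" = data.frame(pc1 = c(0.1, 0.2), pc2 = c(9.8, 10.4), pc3 = c(NA, NA)),
  "group 2" = data.frame(pc1 = c(-0.3, 0.5), pc2 = c(11.0, 9.1), pc3 = c(1.2, -2.0))
)

# Rows of the result are the data-frame columns, columns are the groups.
na_counts <- sapply(toy_cis, function(d) colSums(is.na(d)))
print(na_counts)
```

Running the same sapply() call on the real cis shows at a glance which
group and which dimension the NA's come from.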


# generate sample data

pc1=rnorm(20, sd=1)
pc2=rnorm(20, mean = 10, sd=2)
pc3=rnorm(20, sd=3)
class=factor(c("group 1","group 1","group 1","group 2","group 2","group 2",
"group 2","group 2","group 2","group 2","group 1","group 1","group 1",
"group 1","group 2","group 2","group 2","group 2","group 1","group 2"),
ordered=TRUE)
response=cbind(pc1, pc2, pc3)

# Now generate confidence ellipsoids using the method described
# in An Introduction to RGGOBI with minor modifications

# Define 3 functions to do the heavy lifting

# First: a function that generates a random set of points on a sphere
# centered on the mean of the passed data, skewed to match the variance
# of the passed data (which turns the sphere into an ellipsoid),
# and adjusted in size to match the desired confidence level.

ellipse = function(data, npoints=1000, cl=0.95, mean=colMeans(data),
cov=var(data),n=nrow(data))
{
norm.vec = function(x) x/sqrt(sum(x^2))
p = length(mean)
ev = eigen(cov)

sphere = matrix(rnorm(npoints*p), ncol=p)
cntr = t(apply(sphere, 1, norm.vec))  # normalized sphere

cntr = cntr %*% diag(sqrt(ev$values)) %*% t(ev$vectors) # ellipsoid of correct shape
Fcrit = qf(cl, p, n-p)
scalefactor = sqrt((p*(n-1))/(n*(n-p)))*Fcrit
cntr = cntr*scalefactor # ellipsoid of correct size
if (!missing(data)) # only relevant when no data passed
colnames(cntr) = colnames(data)
cntr+rep(mean, each=npoints)
}

# Next a function that combines the original data with the generated
# ellipsoid

combined = function(data, cl=0.95)
{
dm = data.matrix(data)
ellipse = as.data.frame(ellipse(dm, npoints=1000, cl=cl))
both = rbind(data, ellipse)
both$SIM = factor(rep(c(FALSE,TRUE),c(nrow(data),1000)))
both
}

# Now a function to separate the dataset into categories

graphic_manova = function(data, catvar, cl=0.68)
{
sub.groups = data.frame(cbind(data,catvar))
sub.groups = split(sub.groups,catvar)
cis = lapply(sub.groups, combined, cl=cl)
df = as.data.frame(do.call(rbind, cis))
df$var = factor(rep(names(cis), sapply(cis, nrow)))
g = ggobi(df)
glyph_type(g[1]) = c(6,1)[df$SIM] # makes dots of ellipsoids tiny
glyph_color(g[1]) = df$var # properly colors the two groups
invisible(g)
}

# Now actually do the computations & plot the data!

# ggobi(combined(response))  # This is a debugging check point

ggobi(graphic_manova(response,class))


Re: [R] Reading a file with read.csv: two character rows not interpreted as I hope

2007-10-31 Thread Bryan Hanson

OK, I fixed it myself!  Here's the code.  Of course, it mostly seems simple
once one gets it working... Thanks Jim.  Bryan

# get the first three lines with sample info in character format
sample.info = read.table(input.file.name, sep=",", as.is=TRUE, nrows=3)
sample.names = sample.info[1,]
sample.colors = sample.info[2,]; sample.colors = as.character(sample.colors[-1])
sample.class = sample.info[3,]; sample.class = as.character(sample.class[-1])
# read the numeric data, skipping the three character lines
data = read.table(input.file.name, sep=",", skip=3)
colnames(data) = sample.names
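
The same approach can be tried end to end on a small temporary file.  This is
a self-contained sketch, not the code above verbatim: the variable names are
new, the file holds only the first two data rows of the sample from the
original post, and unlist() is used to flatten the one-row data frames:

```r
# Build a small comma-separated file with the layout described in this thread
# (values copied from the sample in the original post).
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  "wavelength,SampleA,SampleB,SampleC,SampleD",
  "color,green,black,black,green",
  "class,Class 1,Class 2,Class 2,Class 1",
  "403,1.94E-01,2.14E-01,2.11E-01,1.83E-01",
  "409,1.92E-01,1.89E-01,2.00E-01,1.82E-01"
), csv_file)

# Rows 1-3 as character; as.is = TRUE prevents conversion to factors.
info <- read.table(csv_file, sep = ",", as.is = TRUE, nrows = 3)
sample_names  <- unlist(info[1, ], use.names = FALSE)
sample_colors <- unlist(info[2, -1], use.names = FALSE)
sample_class  <- unlist(info[3, -1], use.names = FALSE)

# Remaining rows are numeric once the three text rows are skipped.
spectra <- read.table(csv_file, sep = ",", skip = 3, col.names = sample_names)
str(spectra)
```

Because the text rows never reach the second read.table() call, every column
of the data part is read as a number.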



[R] Reading a file with read.csv: two character rows not interpreted as I hope

2007-10-30 Thread Bryan Hanson

Hi Folks... been playing with this for a while, with no luck, so I'm hoping
someone knows it off the top of their head...  Difficult to find this nuance
in the archives, as so many msgs deal with read.csv!

I'm trying to read a data file with the following structure (a little piece
of the actual data, they are actually csv just didn't paste with the
commas):

 wavelength SampleA SampleB SampleC SampleD
 color green black black green
 class Class 1 Class 2 Class 2 Class 1
 403 1.94E-01 2.14E-01 2.11E-01 1.83E-01
 409 1.92E-01 1.89E-01 2.00E-01 1.82E-01
 415 1.70E-01 1.99E-01 1.94E-01 1.86E-01
 420 1.59E-01 1.91E-01 2.16E-01 1.74E-01
 426 1.50E-01 1.66E-01 1.72E-01 1.58E-01
 432 1.42E-01 1.50E-01 1.62E-01 1.48E-01

Columns after the first one are sample names.  2nd row is the list of colors
to use in later plotting.  3rd row is the class for later manova.  The rest
of it is x data in the first column with y1, y2...following for plotting.

I can read the file w/o the color or class rows with read.csv just fine,
makes a nice data frame with proper data types.  The problem comes when
parsing the 2nd and 3rd rows.  Here's the code:

data = read.csv(filename, header=TRUE) # read in data
color = data[1,]; color = color[-1] # capture color info & throw out 1st value
class = data[2,]; class = class[-1] # capture category info & throw out 1st value

cleaned.data = data[-1,] # remove color & category info for matrix operations
cleaned.data = cleaned.data[-1,]
freq = data[,1] # capture frequency info

What happens is that freq is parsed as factors, and the color and class are
parsed as data frames of factors.
I need color and class to be characters which I can pass to functions in the
typical way one uses colors and levels.
I need the freq & the cleaned.data info as numeric for plotting.

I don't feel I'm far off from things working, but that's where you all come
in!  Seems like an argument of as.something is needed, but the ones I've
tried don't work.  Would it help to put color and class above the x,y data
in the file, then clean it off?

Btw, I'm on a Mac using R 2.6.0.

Thanks in advance, Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry





Re: [R] Reading a file with read.csv: two character rows not interpreted as I hope

2007-10-30 Thread Bryan Hanson

Jim, thanks for the suggestion.  There is still something subtle &
non-intuitive going on here.  I adapted your code with minor changes as
follows (I had to add the sep argument) but get different behavior:

c.names <- scan(file.csv, what='', nlines=1, sep=",")  # read column names
c.options <- read.table(file.csv, as.is=TRUE, nrows=2, sep=",") # get lines 2-3
c.data <- read.table(file.csv, sep=",")  # rest of the data
colnames(c.data) <- c.names

Your code works perfectly (you knew that!).  My adaptation runs, but
c.options contains the first two lines, not lines 2 & 3, and c.data contains
the contents of the entire file as *factors* (data type of c.names &
c.options is correct - character). How strange!

Also, and this is an observation rather than a question: in your code, you
call scan and get the first line as characters, then you do read.table which
gets lines 2 & 3 presumably because the first line, from read.table's
perspective, is a hidden label (?), then the second time you use read.table
the hidden first line is ignored, as are the two lines with character data.
I really don't understand these behaviors, which is probably why I'm having
trouble parsing the file!
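
The difference between the two versions comes down to file names versus open
connections.  Given a file name, scan() and read.table() open the file afresh
on every call and so start at line 1 each time; given a connection that is
already open, each call continues from where the previous one stopped.  And
when the whole file is read in one go, the text rows force every column to be
text, which R versions of that era turned into factors by default.  A minimal
sketch, assuming a small temporary file in place of the real data (only the
first three columns and two data rows of the sample are used):

```r
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  "wavelength,SampleA,SampleB",
  "color,green,black",
  "class,Class 1,Class 2",
  "403,1.94E-01,2.14E-01",
  "409,1.92E-01,1.89E-01"
), csv_file)

# Given a file NAME, every call starts again at line 1.
by_name <- read.table(csv_file, sep = ",", as.is = TRUE, nrows = 2)
by_name[1, 1]   # first field of line 1, not of line 2

# Given an OPEN connection, each call continues where the last one stopped.
con <- file(csv_file, open = "r")
c.names   <- scan(con, what = "", nlines = 1, sep = ",", quiet = TRUE)  # line 1
c.options <- read.table(con, sep = ",", as.is = TRUE, nrows = 2)        # lines 2-3
c.data    <- read.table(con, sep = ",", col.names = c.names)            # the rest
close(con)
str(c.data)
```

The connection has to be opened explicitly (open = "r"); an unopened
connection is opened and closed again by each call, which brings back the
start-at-line-1 behavior.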

Thanks, Bryan 

On 10/30/07 8:40 PM, jim holtman [EMAIL PROTECTED] wrote:

 Here is one way.  You will probably use 'file' instead of textConnection
 
 x.in <- textConnection('wavelength SampleA SampleB SampleC SampleD
 +  color green black black green
 +  class Class 1 Class 2 Class 2 Class 1
 +  403 1.94E-01 2.14E-01 2.11E-01 1.83E-01
 +  409 1.92E-01 1.89E-01 2.00E-01 1.82E-01
 +  415 1.70E-01 1.99E-01 1.94E-01 1.86E-01
 +  420 1.59E-01 1.91E-01 2.16E-01 1.74E-01
 +  426 1.50E-01 1.66E-01 1.72E-01 1.58E-01
 +  432 1.42E-01 1.50E-01 1.62E-01 1.48E-01')
 
 c.names <- scan(x.in, what='', nlines=1)  # read column names
 Read 5 items
 c.options <- read.table(x.in, as.is=TRUE, nrows=2) # get lines 2-3
 c.data <- read.table(x.in)  # rest of the data
 colnames(c.data) <- c.names
 close(x.in)
 c.options  # here are lines 2-3
      V1      V2      V3      V4      V5
 1 color   green   black   black   green
 2 class Class 1 Class 2 Class 2 Class 1
 c.data  # your data
   wavelength SampleA SampleB SampleC SampleD
 1        403   0.194   0.214   0.211   0.183
 2        409   0.192   0.189   0.200   0.182
 3        415   0.170   0.199   0.194   0.186
 4        420   0.159   0.191   0.216   0.174
 5        426   0.150   0.166   0.172   0.158
 6        432   0.142   0.150   0.162   0.148
 
 

[R] Accessing scripts in a different directory on a Mac

2007-10-25 Thread Bryan Hanson

Hi all.  A question for knowledgeable folks using R on an Intel Mac running
OS X 10.4.10

For ease of maintenance, I have broken a large R script into a main script
which "oversees" things by calling other scripts, using "source".  Let's
call the secondary scripts "sub-scripts."

I'd like for the sub-scripts to reside in a different directory (again, for
ease of maintenance, and so I can access them from many other directories).
I've looked all over the documentation about paths and filenames, but I'm
having trouble deciding which of the many functions is the one I need.

As a more specific example, my main script currently contains
source("test.R") and I need to do something like source(pathtest.R).

Ideally, I'd like to specify path early in the file one time, and have it
apply automatically later.
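
One way this could look is to store the directory in a variable once and
build each full name with file.path().  A minimal sketch, assuming a
temporary directory as a stand-in for the real location; `script_dir` and
`sub_result` are names invented for illustration:

```r
# Hypothetical layout: sub-scripts live in one directory, set once near the top.
script_dir <- file.path(tempdir(), "subscripts")   # stand-in for the real path
dir.create(script_dir, showWarnings = FALSE)

# Create a tiny sub-script so the example is self-contained.
writeLines("sub_result <- 6 * 7", file.path(script_dir, "test.R"))

# file.path() joins the directory and file name with the right separator.
source(file.path(script_dir, "test.R"))
sub_result
```

Only the line that sets `script_dir` needs to change if the sub-scripts move.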

Stuff in the documentation only seems to tease!

Thanks in advance, Bryan
