Re: [R] Rggobi compilation error: display.c

2007-09-06 Thread Prof Brian Ripley

On Wed, 5 Sep 2007, Yuelin Li wrote:


On a ubuntu linux computer (Feisty, i386), I compile R and additional
packages from source.  The compiler is gcc 4.1.2.

The problem is, I can run sudo R and successfully compile all
packages (e.g., MASS, lattice) except rggobi.  The error seems to be
in display.c.  My ggobi is in /usr/local/, which R can find.  I don't
think this is a dependency issue because I used install.packages(...,
dependencies=TRUE).  The script complains about not finding
/usr/local/lib/R/library/rggobi/libs/*, but that directory is there,
with one file rggobi.so (possibly from an earlier successful
compilation).


I don't think the file was there at the time, before the previous 
installation was restored.



Any suggestions on what I am doing wrong?  Many thanks in advance.


Here is the error:


display.c:37: error: too many arguments to function 'klass->createWithVars'


That is a symptom of installing rggobi_2.1.6 against ggobi 2.1.4: 
unfortunately rggobi's configure did not check the ggobi version.
It looks like you have an earlier rggobi installed, probably the one 
appropriate to your ggobi version.



Yuelin.


- R output ---

install.packages("rggobi", repos = "http://lib.stat.cmu.edu/R/CRAN", 
dependencies = TRUE, clean = TRUE)

trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/rggobi_2.1.6.tar.gz'
Content type 'application/x-gzip' length 424483 bytes
opened URL
==
downloaded 414Kb

* Installing *source* package 'rggobi' ...
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for GGOBI... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/lib/R/include -g 
-DUSE_EXT_PTR=1 -D_R_=1 -I/usr/local/include/ggobi -I/usr/include/gtk-2.0 
-I/usr/include/libxml2 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 
-I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 
-I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12   
-I/usr/local/include-fpic  -g -O2 -c brush.c -o brush.o

[... snipped ...]

gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/lib/R/include -g 
-DUSE_EXT_PTR=1 -D_R_=1 -I/usr/local/include/ggobi -I/usr/include/gtk-2.0 
-I/usr/include/libxml2 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 
-I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 
-I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12   
-I/usr/local/include-fpic  -g -O2 -c dataset.c -o dataset.o
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/lib/R/include -g 
-DUSE_EXT_PTR=1 -D_R_=1 -I/usr/local/include/ggobi -I/usr/include/gtk-2.0 
-I/usr/include/libxml2 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 
-I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 
-I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12   
-I/usr/local/include-fpic  -g -O2 -c display.c -o display.o
display.c: In function 'RS_GGOBI_createDisplay':
display.c:37: warning: passing argument 3 of 'klass->createWithVars' makes 
pointer from integer without a cast
display.c:37: warning: passing argument 4 of 'klass->createWithVars' from 
incompatible pointer type
display.c:37: warning: passing argument 5 of 'klass->createWithVars' from 
incompatible pointer type
display.c:37: error: too many arguments to function 'klass->createWithVars'
display.c:39: warning: passing argument 4 of 'klass->create' from 
incompatible pointer type
display.c:39: error: too many arguments to function 'klass->create'
make: *** [display.o] Error 1
chmod: cannot access `/usr/local/lib/R/library/rggobi/libs/*': No such file or 
directory
ERROR: compilation failed for package 'rggobi'
** Removing '/usr/local/lib/R/library/rggobi'
** Restoring previous '/usr/local/lib/R/library/rggobi'

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] How to do ANOVA with fractional values and overcome the error: Error in `storage.mode<-`(`*tmp*`, value = "double") : invalid to change the storage mode of a factor

2007-09-06 Thread Prof Brian Ripley
Your data file has commas as the decimal point. Use read.csv2 for such 
files.

What happened was that PercentError was read as a factor, and you can't do 
ANOVA on factors.  The warning

 In addition: Warning message:
 using type="numeric" with a factor response will be ignored in:
 model.response(mf, "numeric")

told you and us this quite explicitly.  If you get an error, also look at 
the warnings which may well (as here) tell you what precipitated the 
error.
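
As a minimal sketch of the difference, the excerpt posted below can be read 
from a string instead of a file.  The object names are made up for the 
illustration, and the 'text' argument assumes a reasonably recent R.

```r
# Stand-in for the exported file: ';' separates fields and ',' is the
# decimal mark (rows copied from the excerpt in the question).
csv_text <- "Subject;Group;Side;Difference;PercentError
M3;1;1;;
M5;1;1;375;18,75
M8;1;1;250;14,58
M10;1;1;500;12,50
M12;1;1;375;25,00"

# read.csv() with sep = ';' still expects '.' as the decimal mark, so
# '18,75' cannot be parsed as a number and the column is not numeric.
wrong <- read.csv(text = csv_text, sep = ";")
class(wrong$PercentError)

# read.csv2() defaults to sep = ';' and dec = ','.
right <- read.csv2(text = csv_text)
class(right$PercentError)
right$PercentError
```

With a real file the call would be read.csv2("yourfile.csv"), where the file 
name is a placeholder.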

On Thu, 6 Sep 2007, Emre Sevinc wrote:

 I have exported a CSV file from my EXCEL worksheet and its last column 
 contained decimal values:

 Subject;Group;Side;Difference;PercentError
 M3;1;1;;
 M5;1;1;375;18,75
 M8;1;1;250;14,58
 M10;1;1;500;12,50
 M12;1;1;375;25,00
 .
 .
 .


 When I tried to run an ANOVA on it, R gave this error:

 Anova3LongAuditoryFemaleError.data <- read.csv("C:\\Documents\ and\ 
 Settings\\Administrator\\My 
 Documents\\CogSci\\tez\\Anova3LongAuditoryFemaleError.csv", header = TRUE, 
 sep = ";")

 Anova3LongAuditoryFemaleError.aov = aov(PercentError ~ (Group * Side), data 
 = Anova3LongAuditoryFemaleError.data)

 Error in `storage.mode<-`(`*tmp*`, value = "double") :
    invalid to change the storage mode of a factor
 In addition: Warning message:
 using type="numeric" with a factor response will be ignored in:
 model.response(mf, "numeric")

 What must I do in order to make the ANOVA test on these fractional data?

 Regards.

 --
 Emre Sevinc



Re: [R] capture.out(system())?

2007-09-05 Thread Prof Brian Ripley
On Wed, 5 Sep 2007, Gustaf Rydevik wrote:

 On 9/4/07, Werner Wernersen [EMAIL PROTECTED] wrote:
 Hi,

 I am trying to capture the console output of a program I
 call via system() but that always returns only
 character(0).

 For example:
 capture.output(system("pdflatex out.tex"))

 will yield:
 character(0)

 and the output is still written to the R console.

 Is there a command for intercepting this output?

 Thank you!
   Werner


 ?sink()

That is used by capture.output() to capture R output, but this question is 
about output that never goes near R.

The answer is in ?system, but might depend on the unstated OS.  Arguments 
'intern' and 'show.output.on.console' are relevant.
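
A minimal sketch of the 'intern' route, assuming a Unix-alike where 'echo' 
stands in for the external program (the command and the variable name are 
illustrative, not from the thread):

```r
# capture.output() only sees output that R itself writes, so the
# external program's output passes it by.
# Assumption: 'echo' is available through the shell (Unix-alike).
captured <- system("echo hello from the shell", intern = TRUE)

# 'captured' is a character vector with one element per output line.
print(captured)

# On Windows, ?system also documents 'show.output.on.console', which
# controls whether the program's output is shown on the R console.
```

For the original example the command string would be "pdflatex out.tex".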



Re: [R] Choosing the optimum lag order of ARIMA model

2007-09-05 Thread Prof Brian Ripley
On Wed, 5 Sep 2007, Megh Dal wrote:

 Hi Leeds, thanks for this reply. Actually I did not want to know whether 
 any differencing is needed or not. My question was: what is the 
 difference between these two models:

  arima(data, c(2,1,2))

  and

  arima(diff(data), c(2,0,2))

  If I am correct, those two models are the same, so I should get the 
 same results in both cases. Am I doing something wrong?

They are not the same.  Please do study the help page, and in particular 
the 'include.mean' argument.  One is a model for n observations and 
the other for n-1 observations, and how that affects the issue is 
discussed on the help page.  With the right options you will get similar 
but not identical results.

> arima(x, c(2,1,2), method="ML")

Coefficients:
         ar1      ar2      ma1     ma2
      0.0786  -0.3561  -0.0869  0.1272
s.e.  0.6135   0.4296   0.6564  0.4549

sigma^2 estimated as 0.01368:  log likelihood = 46.46,  aic = -82.92

> arima(diff(x), c(2,0,2), method="ML", include.mean=FALSE)

Coefficients:
         ar1      ar2      ma1     ma2
      0.0786  -0.3561  -0.0869  0.1272
s.e.  0.6135   0.4296   0.6564  0.4549

sigma^2 estimated as 0.01329:  log likelihood = 47.38,  aic = -82.76
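
The same comparison can be sketched on simulated data.  The series, the seed 
and the simpler ARIMA(1,1,0) order are assumptions chosen only to keep the 
example short; they are not the poster's data.

```r
set.seed(42)
# Simulated integrated AR(1) series; purely illustrative data.
x <- cumsum(arima.sim(list(ar = 0.5), n = 200))

# ARIMA(1,1,0) fitted to the original series (differencing done internally).
fit_int <- arima(x, order = c(1, 1, 0), method = "ML")

# AR(1) fitted to the differenced series; include.mean = FALSE is needed
# because the d = 1 model above has no mean term.
fit_diff <- arima(diff(x), order = c(1, 0, 0), method = "ML",
                  include.mean = FALSE)

# Compare coefficient estimates, innovation variances and log-likelihoods.
rbind(integrated  = c(coef(fit_int),  sigma2 = fit_int$sigma2,
                      loglik = fit_int$loglik),
      differenced = c(coef(fit_diff), sigma2 = fit_diff$sigma2,
                      loglik = fit_diff$loglik))
```

Dropping include.mean = FALSE from the second call adds an intercept, and the 
two fits then no longer describe the same model.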


And did you have permission to copy private (and impolite) messages from 
Mr Leeds to this list?  If you did, please say so in your own posting for 
the record.  Since I don't have such permission I have deleted them from 
this reply.

Professor Ripley



Re: [R] bootstrap confidence intervals with previously existing bootstrap sample

2007-09-04 Thread Prof Brian Ripley
On Tue, 4 Sep 2007, [EMAIL PROTECTED] wrote:

 Dear R users,

 I am new to R. I would like to calculate bootstrap confidence intervals
 using the BCa method for a parameter of interest. My situation is this: I
 already have a set of 1000 bootstrap replicates created from my original
 data set. I have already calculated the statistic of interest for each
 bootstrap replicate, and have also calculated the mean for this statistic
 across all the replicates. Now I would like to calculate BCa confidence
 intervals for this statistic. Is there a way to import my
 previously-calculated set of 1000 statistics into R, and then calculate
 bootstrap confidence intervals around the mean from this imported data?

 I have found the code for boot.ci in the manual for the boot package, but
 it looks like it requires that I first use the boot function, and then
 apply the output to boot.ci. Because my bootstrap samples already exist,
 I don't want to use boot, but just want to import the 1000 values I have
 already calculated, and then get R to calculate the mean and BCa confidence
 intervals based on these values. Is this possible?

Yes, it is possible but you will have to study the internal structure of 
an object of class boot (which is documented on the help page) and mimic 
it.  You haven't told us which type of bootstrap you used, which is one of 
the details you need to supply.

It might be slightly easier to work with function bcanon in package 
bootstrap, which you would need to edit to suit your purposes.

I don't know why you have picked on the BCa method: my experience is that 
if you need to correct the basic method you often need far more than 1000 
samples to get reliable results.
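
For orientation only, here is a rough sketch of the textbook BCa calculation 
from pre-computed replicates.  It is not code from package boot or bootstrap.  
It assumes an ordinary nonparametric bootstrap and that the original data are 
still available (the acceleration needs leave-one-out values).  The data, the 
statistic and the function name bca_interval are all made up for the 
illustration.

```r
set.seed(1)
# Illustrative data and statistic (assumptions, not from the post).
y <- rexp(50)
stat <- mean
theta_hat <- stat(y)

# Stand-in for the 1000 imported replicates; in practice these would
# be read from a file, e.g. with scan().
theta_star <- replicate(1000, stat(sample(y, replace = TRUE)))

bca_interval <- function(theta_star, theta_hat, jack, conf = 0.95) {
  alpha <- c((1 - conf) / 2, 1 - (1 - conf) / 2)
  # Bias correction: proportion of replicates below the original estimate.
  z0 <- qnorm(mean(theta_star < theta_hat))
  # Acceleration from the leave-one-out (jackknife) values.
  d <- mean(jack) - jack
  a <- sum(d^3) / (6 * sum(d^2)^1.5)
  # Adjusted percentile levels, then the matching quantiles.
  z <- qnorm(alpha)
  adj <- pnorm(z0 + (z0 + z) / (1 - a * (z0 + z)))
  unname(quantile(theta_star, adj))
}

# Leave-one-out values of the statistic on the original data.
jack <- sapply(seq_along(y), function(i) stat(y[-i]))
ci <- bca_interval(theta_star, theta_hat, jack)
ci
```

Note that the interval is built around the estimate from the original data, 
not around the mean of the replicates.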

 Hopefully this makes sense. Thanks so much for any help or advice,

 Christy Dolph

 Graduate Student
 Water Resources Science
 University of Minnesota-Twin Cities



Re: [R] sin(pi)?

2007-09-03 Thread Prof Brian Ripley
On Mon, 3 Sep 2007, Nguyen Dinh Nguyen wrote:

 Dear all,
 I found something strange when calculating sin of pi value

What exactly?  Comments below on two guesses as to what.

 sin(pi)
 [1] 1.224606e-16

That is non-zero due to using finite-precision arithmetic.  The number 
stored as pi is not exactly the mathematical quantity, and so 
sin(representation of pi) should be non-zero (although there is also 
rounding error in calculating what it is).

Note that sin() is computed by your C runtime, so the exact result will 
depend on your OS, compiler and possibly CPU.

 pi
 [1] 3.141593

That is the printout of pi to the default 7 significant digits.  R knows 
pi to higher accuracy:

> print(pi, digits=16)
[1] 3.141592653589793
> sin(3.141592653589793)
[1] 1.224606e-16

but note that printing to 16 digits and reading back in might not have 
given the same number, but it happens to for pi, at least on my system:

> 3.141592653589793 == pi
[1] TRUE
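
If the aim is to test whether such a result is 'zero', a comparison up to a 
tolerance is the usual approach.  A small sketch (the exact value of sin(pi) 
depends on the platform, so no outputs are shown):

```r
# Exact comparison is affected by the rounding in the stored value of pi.
sin(pi) == 0

# all.equal() compares up to a tolerance (about 1.5e-8 by default).
isTRUE(all.equal(sin(pi), 0))

# The 7-digit printout of pi is a different number from pi itself, and
# sin(pi + d) is close to -d for small d, which accounts for the value
# of sin(3.141593) in the question.
pi - 3.141593
sin(3.141593)
```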


 sin(3.141593)
 [1] -3.464102e-07

 Any help and comment should be appreciated.
 Regards
 Nguyen

 
 Nguyen Dinh Nguyen
 Garvan Institute of Medical Research
 Sydney, Australia





Re: [R] Different behavior of mtext

2007-09-03 Thread Prof Brian Ripley

On Sun, 2 Sep 2007, Sébastien wrote:


Dear R Users,

I am quite surprised to see that mtext gives different results when it
is used with 'pairs' and with 'plot'. In the two code examples below, it
seems that the 'at' argument in mtext doesn't use the same unit system.


It is stated to be in 'user coordinates'.  Your code does not work because 
unit() is missing.  If you mean the one from package grid, npc is not 
user coordinates (and refers to a grid viewport which you have not set up 
and coincidentally is the same as the initial user coordinate system to 
which pairs() has reverted).


Try par("usr") after your pairs() and plot() calls to see the difference.
Plotting a 2x2 array of plots _is_ different from plotting one, so this 
should be as expected.
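
A small sketch of that check, using the data frame from the question.  The 
null PDF device and the variable names are assumptions so that it runs 
without opening a window.

```r
# Open a null PDF device so the sketch runs without a display.
pdf(NULL)
mydata <- data.frame(x = 1:10, y = 1:10)

plot(mydata)
usr_plot <- par("usr")    # user coordinates of the single scatterplot

pairs(mydata)
usr_pairs <- par("usr")   # user coordinates left behind by pairs()

invisible(dev.off())

usr_plot
usr_pairs
```

Whatever 'at' value is passed to mtext() is interpreted in these coordinates, 
so the same number lands in different places after the two calls.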


Since centring is the default for 'adj', it is unclear what you are trying 
to achieve here.



I would appreciate your comments on this issue.

Sebastien

# Pairs

mydata<-data.frame(x=1:10,y=1:10)

par(cex.main=1, cex.axis=1, cex.lab=1, lwd=1,
   mar=c(5 + 5,4,4,2)+0.1)

pairs(mydata,oma=c(5 + 5,4,4,2))

mylegend<-c("mylegend A","mylegend B","mylegend C","mylegend test")
mylegend.width = strwidth(mylegend[which.max(nchar(mylegend))], "figure")

for (i in 1:4) {
mtext(text=mylegend[i],
   side = 1,
   line = 3+i,
   at = unit((1-mylegend.width)/2,"npc"),# centers the legend at the bottom
   adj=0,
   padj=0)}

# plot

mydata<-data.frame(x=1:10,y=1:10)

par(cex.main=1, cex.axis=1, cex.lab=1, lwd=1,
   mar=c(5 + 5,4,4,2)+0.1)

plot(mydata,oma=c(5 + 5,4,4,2))

mylegend<-c("mylegend A","mylegend B","mylegend C","mylegend test")
mylegend.width = strwidth(mylegend[which.max(nchar(mylegend))], "figure")

for (i in 1:4) {
mtext(text=mylegend[i],
   side = 1,
   line = 3+i,
   at = unit((1-mylegend.width)/2,"npc"),# should center the legend at the bottom but doesn't do it !
   adj=0,
   padj=0)}




Re: [R] Different behavior of mtext

2007-09-03 Thread Prof Brian Ripley

On Mon, 3 Sep 2007, Sébastien wrote:

Ok, the problem is clear now. When I read the help for mtext, I did not 
realise that 'user coordinates' referred to par("usr"). If I may ask you some 
additional questions:
- you mentioned a missing unit() call; at which point should it be made in 
my code examples?


Before it is used.  The problem is that I believe more than one package 
has a unit() function.


- could you give me some advice or helpful links about how to set up a grid 
viewport?
- and finally, probably a stupid question: is a grid viewport automatically 
set up when a plotting function is called?


If you want to mix grid and base graphics, you need package gridBase, but 
really I would not advise a beginner to be using grid directly (that is, 
not via lattice or ggplot*).




Sebastien

PS: To answer to your final question, my goal is to center a block of legend 
text on the device but to align the text to the left of this block.




Re: [R] Efficient sampling from a discrete distribution in R

2007-09-03 Thread Prof Brian Ripley
On Mon, 3 Sep 2007, Issac Trotts wrote:

 Hello r-help,

 As far as I've seen, there is no function in R dedicated to sampling
 from a discrete distribution with a specified mass function.  The
 standard library doesn't come with anything called rdiscrete or rpmf,
 and I can't find any such thing on the cheat sheet or in the
 Probability Distributions chapter of _An Introduction to R_.  Googling
 also didn't bring back anything.  So, here's my first attempt at a
 solution.  I'm hoping someone here knows of a more efficient way.

It's called sample().
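
A minimal sketch of such a call.  The mass function, the seed and the sample 
size are arbitrary choices for the illustration.

```r
set.seed(123)
pmf <- c(0.2, 0.5, 0.3)   # illustrative probability mass function

# Draw 10000 values from {1, 2, 3} with the given probabilities.
draws <- sample(seq_along(pmf), size = 10000, replace = TRUE, prob = pmf)

# Empirical frequencies, which should be close to pmf.
freq <- as.vector(table(factor(draws, levels = seq_along(pmf)))) /
  length(draws)
round(freq, 3)
```

The first argument can be any vector of values to sample from, not just the 
indices used here.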

There are much more efficient algorithms than the one you used, and 
sample() sometimes uses one of them (Walker's alias method): see any good 
book on simulation (including my 'Stochastic Simulation', 1987).

 # Sample from a discrete distribution with given probability mass function
 rdiscrete = function(size, pmf) {
  stopifnot(length(pmf) > 1)
  cmf = cumsum(pmf)
  icmf = function(p) {
    min(which(p < cmf))
  }
  ps = runif(size)
  sapply(ps, icmf)
 }

 test.rdiscrete = function(N = 1) {
  err.tol = 6.0 / sqrt(N)
  xs = rdiscrete(N, c(0.5, 0.5))
  err = abs(sum(xs == 1) / N - 0.5)
  stopifnot(err < err.tol)
  list(e = err, xs = xs)
 }

 Thanks,
 Issac



Re: [R] Choosing the optimum lag order of ARIMA model

2007-08-31 Thread Prof Brian Ripley
On Fri, 31 Aug 2007, Megh Dal wrote:

 Dear all R users,

  I am really struggling to determine the most appropriate lag order of 
 an ARIMA model. My understanding is that, since for an MA[q] model the 
 autocorrelation coefficients vanish after lag q, the ACF indicates the MA 
 order of an ARIMA model, and since for an AR[p] model the partial 
 autocorrelations vanish after lag p, the PACF helps to determine the AR 
 order. And the most appropriate model chosen by this argument gives the 
 minimum AIC.

The last part is fallacious.  Also, you are applying your rules to 
selecting the orders in ARMA models, and they apply only to pure MA or AR 
models.

The R test file src/library/stats/tests/ts-tests.R has an example of model 
selection by AIC.
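
A minimal sketch of selection by AIC over a small grid of orders, using the 
series posted in the question below.  This is not the code from ts-tests.R.  
The 0:2 ranges, method = "ML" and the tryCatch() wrapper are assumptions made 
for the illustration.

```r
# Series as posted in the question (65 observations).
data1 <- c(
  2.1948, 2.2275, 2.2669, 2.2839, 1.9481, 2.1319, 2.0238, 2.3109, 2.5727,
  2.5176, 2.5728, 2.6828, 2.8221, 2.879, 2.8828, 2.9955, 2.9906, 2.9861,
  3.0452, 3.068, 2.9569, 3.0256, 3.0977, 2.985, 2.9572, 3.0877, 3.1009,
  3.1149, 2.8886, 2.9631, 3.0325, 2.9175, 2.7231, 2.7905, 2.8493, 2.8208,
  2.8156, 2.9115, 2.701, 2.6928, 2.7881, 2.723, 2.7266, 2.9494, 3.113,
  3.0566, 3.0358, 3.05, 3.0724, 3.1365, 3.1083, 3.0257, 3.2211, 3.4269,
  3.327, 3.1205, 2.9997, 3.0201, 3.0803, 3.2059, 3.1997, 3.038, 3.1613,
  3.2802, 3.2194)

# Candidate ARIMA(p, 1, q) orders; the 0:2 ranges are an arbitrary choice.
orders <- expand.grid(p = 0:2, q = 0:2)

# Fit each candidate by maximum likelihood; a failed fit yields NA.
orders$aic <- mapply(function(p, q) {
  tryCatch(arima(data1, order = c(p, 1, q), method = "ML")$aic,
           error = function(e) NA_real_)
}, orders$p, orders$q)

# Smallest AIC first.
orders[order(orders$aic), ]
```

Differences of a unit or so in AIC between candidates, as in the values 
quoted below, are small, so the ranking should not be over-interpreted.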


  Now I considered following data :

  2.1948 2.2275 2.2669 2.2839 1.9481 2.1319 2.0238 2.3109 2.5727 2.5176
 2.5728 2.6828 2.8221 2.879 2.8828 2.9955 2.9906 2.9861 3.0452 3.068
 2.9569 3.0256 3.0977 2.985 2.9572 3.0877 3.1009 3.1149 2.8886 2.9631
 3.0325 2.9175 2.7231 2.7905 2.8493 2.8208 2.8156 2.9115 2.701 2.6928
 2.7881 2.723 2.7266 2.9494 3.113 3.0566 3.0358 3.05 3.0724 3.1365
 3.1083 3.0257 3.2211 3.4269 3.327 3.1205 2.9997 3.0201 3.0803 3.2059
 3.1997 3.038 3.1613 3.2802 3.2194

  ACF for 1st diff series:
  Autocorrelations of series 'diff(data1)', by lag
      0      1      2      3      4      5      6      7      8      9     10
  1.000 -0.022 -0.258 -0.016  0.066  0.034  0.035 -0.001 -0.089  0.028  0.222
     11     12     13     14     15     16     17     18
 -0.132 -0.184 -0.038  0.048 -0.026 -0.041 -0.067  0.059

  PACF for 1st diff series:
  Partial autocorrelations of series 'diff(data1)', by lag
      1      2      3      4      5      6      7      8      9     10     11
 -0.022 -0.258 -0.031 -0.002  0.026  0.057  0.021 -0.069  0.029  0.194 -0.124
     12     13     14     15     16     17     18
 -0.100 -0.111 -0.043 -0.078 -0.056 -0.085  0.086

  On basis of that I choose ARIMA[2,1,2] for the original data

  But I got error while doing that :

   arima(data1, c(2,1,2))
 Error in arima(data1, c(2, 1, 2)) : non-stationary AR part from CSS

  And AIC for other combination of lags are:
   arima(data1, c(2,1,1))$aic
 [1] -84.83648
 arima(data1, c(1,1,2))$aic
 [1] -84.35737
 arima(data1, c(1,1,1))$aic
 [1] -83.79392

  Hence, if I choose the ARIMA[2,1,1] model on the basis of AIC, the rule 
 I described earlier does not support that choice.

  Am I doing anything wrong? Can anyone give me any suggestion on what 
 is the universal rule for choosing the best lag?

  Regards,


Re: [R] R and Windows Vista

2007-08-31 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, Jan Budczies wrote:


 Hello group,

 it is reported (R for Windows FAQ) that R runs under Windows Vista.
 However, does someone here have experience with R under Vista 64
 and large (3 or 4 GB) memory?

Yes, the person who wrote the FAQ entry does.

Note that the distributed Windows binary of R is a 32-bit executable, so 
the maximum memory it can address is 4GB (and it can do that in Vista 64 
on a 4GB RAM machine, unlike any 32-bit version of Windows).

If you want to use R on Vista 64, I suggest you use a current R-devel 
snapshot, as some changes have been made based on this experience.


 Greeting - Jan Budczies





Re: [R] Incomplete Gamma function

2007-08-31 Thread Prof Brian Ripley
On Fri, 31 Aug 2007, Robin Hankin wrote:

 Hi Kris


 lgamma() gives the log of the gamma function.

Yes, but he used Igamma.  According to ?pgamma,

  'pgamma' is closely related to the incomplete gamma function.  As
  defined by Abramowitz and Stegun 6.5.1

  P(a,x) = 1/Gamma(a) integral_0^x t^(a-1) exp(-t) dt

  P(a, x) is 'pgamma(x, a)'.  Other authors (for example Karl
  Pearson in his 1922 tables) omit the normalizing factor, defining
  the incomplete gamma function as 'pgamma(x, a) * gamma(a)'.

and that seems to be what Igamma is following.  GSL on the other hand has 
the other tail, so

> a <- 9
> x <- 11.1
> pgamma(x, a, lower=FALSE)*gamma(a)
[1] 9000.501


 You need gamma_inc() of the gsl package, a wrapper for the
 GSL library:

  gamma_inc(9,11.1)
 [1] 9000.501
 

As the above shows, you don't *need* this, but you do need the GSL 
documentation to find out what R package gsl does.  Why it differs from 
the usual references is something for you to explain.  Wikipedia
http://en.wikipedia.org/wiki/Incomplete_gamma_function
distinguishes them, as does MathWorld.

I suggest you add a clarification to the gsl package as to what the 
'incomplete gamma function' means there.
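
The two conventions can be put side by side with pgamma() alone.  A small 
sketch with the values from the question (the variable names are mine):

```r
a <- 9
x <- 11.1

# Upper incomplete gamma: integral from x to Inf of t^(a-1) exp(-t) dt,
# the tail returned by Mathematica's Gamma[a, x] in the question.
upper <- pgamma(x, a, lower.tail = FALSE) * gamma(a)

# Lower incomplete gamma: the same integrand from 0 to x, unnormalised.
lower <- pgamma(x, a) * gamma(a)

# The two tails add up to the complete gamma function, gamma(9) = 8!.
c(upper = upper, lower = lower, total = upper + lower, gamma_a = gamma(a))
```

The lower tail reproduces the 31319.5 reported for Igamma(9, 11.1), and the 
upper tail the 9000.5 from Mathematica.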


 On 31 Aug 2007, at 00:29, [EMAIL PROTECTED] wrote:

 Hello

 I am trying to evaluate an Incomplete gamma function
 in R. Library Zipfr gives the Igamma function. From
 Mathematica, I have:

 Gamma[a, z] is the incomplete gamma function.

 In[16]: Gamma[9,11.1]
 Out[16]: 9000.5

 Trying the same in R, I get

 Igamma(9,11.1)
 [1] 31319.5
 OR
 Igamma(11.1,9)
 [1] 1300998

 I know I have to understand the theory and the math
 behind it rather than just ask for help, but while I
 am trying to do that (and only taking baby steps, I
 must admit), I was hoping someone could help me out.

 Regard

 Kris.





Re: [R] About = in command line in windows.

2007-08-31 Thread Prof Brian Ripley
On Fri, 31 Aug 2007, Vladimir Eremeev wrote:

 It seems, I don't understand something, or there is a bug in R.

A limitation in command-line parsing which is Windows-specific.

Don't use -e for complex expressions, as the quoting is getting removed by 
your shell.  In Windows both the shell (and it matters which shell) and 
the (compiled C) executable parse the command line, and that leads to 
surprises as in your (1).  At that point the rule about NAME=VALUE on the 
command line meaning 'set environment variable NAME' comes into play.

We could try harder (and maybe one day we will), but this really is 
'quoting hell' and there is no hardship in using alternatives such as

% cat > test.R
mean(x=1:3)
^D
% rscript test.R

and

% echo mean(x=1:3) | rscript -
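
The script-file route can also be driven from within R, which avoids shell 
quoting altogether.  A sketch, assuming a current R in which system2() is 
available and that Rscript sits in R.home("bin"); the temporary file name is 
generated, not taken from the thread.

```r
# Write the expression to a temporary script file ...
script <- tempfile(fileext = ".R")
writeLines("print(mean(x = 1:3))", script)

# ... and run it with the Rscript that belongs to the running R.
rscript <- file.path(R.home("bin"), "Rscript")
out <- system2(rscript, shQuote(script), stdout = TRUE)
out

unlink(script)
```

Because the expression lives in a file, the '=' never reaches the 
command-line parser.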



 I have made some experiments after my yesterday post about using = with -e
 switch to the Rscript.

 Now, I've found:

 (1)
 C:\users\wl\trainings\r>rscript --verbose -e mean(x=1:3)
 running
  'C:\Program Files\R\bin\Rterm.exe --slave --no-restore -e mean(x=1:3)'

 Error in -args : invalid argument to unary operator
 Execution halted

 (2)
 C:\users\wl\trainings\r>Rterm --slave --no-restore -e mean(x=1:3)

 Nothing is printed on the console, but the window appears, saying R for
 Windows terminal front-end has encountered a problem and needs to close. We
 are sorry for the inconvenience.

 (3)
 C:\users\wl\trainings\r>rscript --verbose -e mean(1:3)
 running
  'C:\Program Files\R\bin\Rterm.exe --slave --no-restore -e mean(1:3)'

 [1] 2

 (4)
 C:\users\wl\trainings\r>Rterm.exe --slave --no-restore -e mean(1:3)
 [1] 2

 (5)
 C:\users\wl\trainings\r>Rterm.exe --slave --no-restore -e 'mean(1:3)'
 [1] "mean(1:3)"

 Points (1) and (2) don't seem normal to me, however, I don't see, what I am
 doing wrong.
 I use windowsXP Pro, my colleague uses windows 2000 and reports the same
 problems.
 My sessionInfo():

 sessionInfo()
 R version 2.5.1 Patched (2007-08-19 r42614)
 i386-pc-mingw32

 locale:
 LC_COLLATE=Russian_Russia.1251;LC_CTYPE=Russian_Russia.1251;LC_MONETARY=Russian_Russia.1251;LC_NUMERIC=C;LC_TIME=Russian_Russia.1251

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 [7] base




-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Another issue with the Matrix package *under R-devel*

2007-08-30 Thread Prof Brian Ripley
I suspect you are not using a Matrix that was re-installed after re-building R.
I can reproduce the problem using a version of Matrix I installed under 
2.5.1, but not with one installed under R-devel this week.

Since R-devel is 'Under development' you may need to reinstall packages 
when it changes.  This is particularly prevalent with S4-using packages, 
as the methods code tends to capture the value of objects as they existed 
when a package was installed.  In this case log() was changed quite a few 
weeks ago, but Matrix needs to be reinstalled after other changes last 
weekend.

For the record, the packages I know need to be reinstalled under a recent 
R-devel are Matrix, distr, kernlab, kinship and matlab (but there may be 
others).
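
One possible way to spot candidates for reinstallation, offered only as a sketch: it compares the "Built" field of each installed package with the running R version. This catches a package installed under 2.5.1 and loaded under R-devel, but it cannot detect changes between two revisions of R-devel that share a version number, so it is no substitute for reinstalling after a rebuild.

```r
# Compare the R version each installed package was built under with the
# version of the running R.
pkgs <- installed.packages()
built <- pkgs[, "Built"]
current <- as.character(getRversion())

# Packages whose recorded build version differs from the running R.
stale <- unique(rownames(pkgs)[built != current])
print(stale)

# Reinstalling is left commented out because it needs a configured repository:
# install.packages(stale)
```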

And please use the appropriately named R-devel list for questions about 
R-devel.


On Thu, 30 Aug 2007, Tony Chiang wrote:

 Hi all,

 I am encountering a strange issue with the Matrix package. I have just built
 R-devel from source on my macbook pro, and I wonder if others can reproduce
 this problem. I will give example code to go along:

 Starting a fresh R session:

 R version 2.6.0 Under development (unstable) (2007-08-30 r42697)
 Copyright (C) 2007 The R Foundation for Statistical Computing
 ISBN 3-900051-07-0

 ...

 > log(2)
 [1] 0.6931472
 > library(Matrix)
 Loading required package: lattice
 > log(2)
 Error in log(2) :
   could not find symbol "base" in environment of the generic function

 There seems to be something wrong here and I cannot figure out what it is.
 Am I doing something wrong or is it an issue with Matrix (which is what I
 have narrowed it down to). I think that it might be a namespace collision or
 something, but I am certainly not sure.

 Here is my sessionInfo() output:
 sessionInfo()
 R version 2.6.0 Under development (unstable) (2007-08-30 r42697)
 i386-apple-darwin8.10.1

 locale:
 C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] Matrix_0.99875-2 lattice_0.16-3

 loaded via a namespace (and not attached):
 [1] grid_2.6.0

 Thanks.

 Tony



Re: [R] Behaviour of very large numbers

2007-08-30 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, willem vervoort wrote:

 Dear all,
 I am struggling to understand this.

 What happens when you raise a negative value to a power and the result
 is a very large number?

Where are the 'very large numbers' here?  R can cope with much larger 
numbers (over 10^300).

 > B
 [1] 47.73092

 > -51^B
 [1] -3.190824e+81

Yes, that is -(51^B).

 # seems fine
 # now this:
 > x <- seq(-51,-49,length=100)

 > x^B
   [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 
 <snip>
 > is.numeric(x^B)
 [1] TRUE
 > is.real(x^B)
 [1] TRUE
 > is.infinite(x^B)
   [1] FALSE FALSE FALSE FALSE FALSE

 I am lost, I checked the R mailing help, but could not find anything
 directly. I loaded package Brobdingnag and tried:
 > as.brob(x^B)
   [1] NAexp(NaN) NAexp(NaN) NAexp(NaN) NAexp(NaN) NAexp(NaN)
 > as.brob(x)^B

 I guess I must be misunderstanding something fundamental.

You are.  A negative number to a non-integer power is undefined in the 
real number system.

Look at (x+0i)^B.
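
A small illustration of the three cases discussed above, using the value of B printed in the question. The variable names are made up for this sketch.

```r
B <- 47.73092

# Unary minus binds less tightly than ^, so this is -(51^B).
neg_of_power <- -51^B

# A negative base with a non-integer exponent has no real result.
real_power <- (-51)^B

# Promoting the base to complex gives a well-defined complex power.
complex_power <- (-51 + 0i)^B

print(neg_of_power)
print(real_power)
print(complex_power)
```

The modulus of the complex result equals 51^B; only the argument (angle) differs from the real case.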



Re: [R] Q: how to interrupt long calculation?

2007-08-30 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, D. R. Evans wrote:

 Paul Smith said the following at 08/29/2007 04:32 PM :

 The instance of R running will be immediately killed and then you can
 start R again.

 But then I would lose all the work. There must be some way to merely
 interrupt the current calculation. Mustn't there?

Only if it is long-running in R code, when ctrl-C or equivalent (Esc in 
Rgui) works. If it is long-running in C or Fortran code, there is not.

Assuming a Unix-alike, sending SIGUSR1 will save the current workspace and 
quit.  Even that is a little dangerous as the workspace need not be in a 
consistent state.

People who have been bitten will learn safer programming practices, for 
example to call save.image() at suitable checkpoint times.
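
A minimal sketch of the checkpoint idea, assuming the long job can be split into steps. The file location and the stand-in computation are hypothetical.

```r
# Hypothetical checkpoint file; any writable path would do.
checkpoint <- file.path(tempdir(), "checkpoint.RData")
results <- numeric(0)

for (i in 1:5) {
  # Stand-in for one expensive step of a long calculation.
  results[i] <- mean(seq_len(i * 1000))
  # Save the global workspace after each completed step.
  save.image(file = checkpoint)
}
```

After a forced restart, load(checkpoint) would bring back whatever had been completed at the last save.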



Re: [R] Q: how to interrupt long calculation?

2007-08-30 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, D. R. Evans wrote:

 Prof Brian Ripley said the following at 08/30/2007 11:00 AM :
 On Thu, 30 Aug 2007, D. R. Evans wrote:

 Paul Smith said the following at 08/29/2007 04:32 PM :

 The instance of R running will be immediately killed and then you can
 start R again.
 But then I would lose all the work. There must be some way to merely
 interrupt the current calculation. Mustn't there?

 Only if it is long-running in R code, when ctrl-C or equivalent (Esc in
 Rgui) works. If it is long-running in C or Fortran code, there is not.


 It's inside loess()... so isn't that R code?

No, it is mainly Fortran, called from C called from R.

 I can sit hitting ctrl-C all day (well, it seems like it), but the code
 does not get interrupted :-(

 Assuming a Unix-alike, sending SIGUSR1 will save the current workspace and
 quit.  Even that is a little dangerous as the workspace need not be in a
 consistent state.


 That's helpful, thank you; at least it means I stand a chance of being able
 to interrupt the code and recover.




Re: [R] sql query over local tables

2007-08-29 Thread Prof Brian Ripley
On Tue, 28 Aug 2007, Jorge Cornejo Donoso wrote:

 Hi, I have two tables with IDs in each one.

And what is a 'table'?  If these are data frames, see ?merge.  If they are 
tables (which are arrays in R), then still use merge() if they can be 
converted to data frames.

 I want to make a join (as in SQL) by the ID. Is there any way to use the RODBC
 package (or another) on local tables (not Access, MySQL, SQL, etc.) and
 make the join?

No, they use the DBMS to do the hard work.
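
For the data-frame case, a short sketch of what merge() does. The two data frames and their column names are invented for illustration.

```r
# Two hypothetical local tables sharing an ID column.
left  <- data.frame(ID = c(1, 2, 3), height = c(1.7, 1.8, 1.6))
right <- data.frame(ID = c(2, 3, 4), weight = c(80, 55, 70))

# Equivalent of an SQL inner join on ID.
inner <- merge(left, right, by = "ID")

# Equivalent of a full outer join: unmatched rows are kept with NA.
outer <- merge(left, right, by = "ID", all = TRUE)

print(inner)
print(outer)
```

The arguments all.x = TRUE and all.y = TRUE give left and right joins respectively.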



Re: [R] xeon processor and ATLAS

2007-08-29 Thread Prof Brian Ripley
On Tue, 28 Aug 2007, hui xie wrote:

 hi everyone:

 I have a Dell Server that has a Xeon processor, and I would like to use 
 the best ATLAS posted in the R website. I find that R has ATLAS for 
 core2duo and P4. I am not sure which one of these two is best suited for 
 Xeon processor, or is that neither of these two is good and I should 
 stick with the default one that was installed originally?

And your OS is?

There are many different 'Xeon' processors with very different 
capabilities.  You really ought to build ATLAS for yourself if numerical 
linear algebra performance matters to you (and it makes little difference 
to most people: I think Uwe Ligges quoted 10% for testing all CRAN 
packages).


 Your advice is very much appreciated!

 Best,

 Hui



 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

Please do!




Re: [R] Limiting size of pairs plots

2007-08-28 Thread Prof Brian Ripley

From ?pairs


 The graphical parameter 'oma' will be set by 'pairs.default'
 unless supplied as an argument.

so try

pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species", pch = 21,
  bg = c("red", "green3", "blue")[unclass(iris$Species)],
  oma = c(8,3,5,3))
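
One way the space reserved by 'oma' might then be used for a legend, as a sketch only. The overlay trick with par(new = TRUE) and base legend() is an assumption of this example (the question used grid functions), and the output file name is hypothetical.

```r
out <- file.path(tempdir(), "pairs-legend.pdf")
cols <- c("red", "green3", "blue")

pdf(file = out, width = 6, height = 6 + 0.2 * 6)

# oma reserves 8 lines of outer margin at the bottom of the device.
pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species", pch = 21,
      bg = cols[unclass(iris$Species)], oma = c(8, 3, 5, 3))

# Overlay an empty plot covering the whole device, then place the legend
# at its bottom edge, inside the space left free by oma.
par(fig = c(0, 1, 0, 1), oma = c(0, 0, 0, 0), mar = c(0, 0, 0, 0), new = TRUE)
plot(0, 0, type = "n", axes = FALSE, xlab = "", ylab = "")
legend("bottom", legend = levels(iris$Species), pt.bg = cols, pch = 21,
       horiz = TRUE, bty = "n")

dev.off()
```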



On Tue, 28 Aug 2007, Sébastien wrote:


Dear R-users,

I would like to add a legend at the bottom of pairs plots (it's my first
use of this function). With the plot function, I usually add some
additional space at the bottom when I define the size of the graphical
device (using mar); grid functions then allow me to draw my legend as I
want.
Unfortunately, this technique does not seem to work with the pairs
function as the generated plots use all the available space on the
device (see below). I guess I am missing a key argument... my attempts
to modify the oma, mar, usr arguments were unsuccessful, and I could not
find any helpful threads in the archives.

As usual, any advice would be greatly appreciated

Sebastien


pdf(file="C:/test.pdf", width=6, height= 6 + 0.2*6)

par(mar=c(5 + 6,4,4,2)+0.1)

pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species", pch = 21,
bg = c("red", "green3", "blue")[unclass(iris$Species)])

dev.off()



Re: [R] Forcing coefficients in lm object

2007-08-28 Thread Prof Brian Ripley

It is fit$coefficients, not fit$coef .


From the help page:


name: A literal character string or a name (possibly backtick
  quoted).  For extraction, this is normally (see under
  Environments) partially matched to the 'names' of the object.

Note the qualifier 'for extraction', so you assigned a new element with 
name 'coef', and predict.lm used fit$coefficients.
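
The difference between extraction and assignment can be seen in a short sketch built on the data from the question below. The names fit2, has_extra, unchanged and pred are introduced here for illustration.

```r
dat <- data.frame(y = c(0, 25, 32, 15), x = as.factor(c(1, 1, 2, 2)))
dat.new <- data.frame(x = as.factor(c(1, 2, 1, 2)))

# Assigning through the abbreviated name adds a separate 'coef' element;
# the 'coefficients' element used by predict() is left alone.
fit <- lm(y ~ x, data = dat)
fit$coef[[2]] <- 100
has_extra <- "coef" %in% names(fit)
unchanged <- unname(fit$coefficients[2])

# Assigning through the full name changes what predict() uses.
fit2 <- lm(y ~ x, data = dat)
fit2$coefficients[[2]] <- 100
pred <- unname(predict(fit2, dat.new))

print(has_extra)
print(unchanged)
print(pred)
```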



On Tue, 28 Aug 2007, [EMAIL PROTECTED] wrote:


Dear all,

I would like to use predict.lm() with an existing lm object but with new arbitrary 
coefficients. I modify 'fit$coef' (see example below) by hand but the actual 
model in 'fit' used for prediction does not seem to be altered (although fit$coef is!).

Can anyone please help me do this properly?

Thanks in advance,

Jérémie




dat <- data.frame(y=c(0,25,32,15), x=as.factor(c(1,1,2,2)))
fit <- lm(y ~ x, data=dat)
fit


Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)           x2
       12.5         11.0


fit$coef[[2]] <- 100
dat.new <- data.frame(x=as.factor(c(1,2,1,2)))
predict.lm(fit, dat.new)

   1    2    3    4
12.5 23.5 12.5 23.5

fit


Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)           x2
       12.5         11.0


fit$coef

(Intercept)          x2
       12.5       100.0






Jérémie Lebrec
Dept. of Medical Statistics and Bioinformatics
Leiden University Medical Center
Postzone S-05-P
P.O. Box 9600
2300 RC Leiden
The Netherlands
[EMAIL PROTECTED]



Re: [R] Rmpi and x86

2007-08-28 Thread Prof Brian Ripley
On Tue, 28 Aug 2007, Martin Morgan wrote:

 Edna --

 I'll keep this on the list, so that others will learn, and others will
 correct me when I give bad advice!

 relocation R_X86_64_32 against `lam_mpi_comm_world' can not be used
 when making a shared object; recompile with -fPIC

 This likely means that your lam was not built with the --enable-shared
 configure option, as documented in the installation guide. (It might
 also mean that R was not configured with --enable-R-shlib; the message
 is opaque to me).

The line above is

/usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a(infoset.o):

so it means lam needs to be rebuilt either with --enable-shared or just 
with -fPIC added to CFLAGS.  (Users migrating from i386 Linux to x86_64 
Linux got quite used to this quirk.)

 I believe others on the list have only had success with specific
 versions of LAMMPI, so that would be the next place to look (after
 sorting out the shared library issue)

 Martin

 Edna Bell [EMAIL PROTECTED] writes:

 Here is what happens:
 Note: lam-7.1.4

 linux-tw9c:/home/bell/Desktop/R-2.5.1/bin # ./R CMD INSTALL --clean
 Rmpi_0.5-3.tar.gz
 * Installing to library '/home/bell/Desktop/R-2.5.1/library'
 * Installing *source* package 'Rmpi' ...
 checking for gcc... gcc
 checking for C compiler default output... a.out
 checking whether the C compiler works... yes
 checking whether we are cross compiling... no
 checking for suffix of executables...
 checking for suffix of object files... o
 checking whether we are using the GNU C compiler... yes
 checking whether gcc accepts -g... yes
 checking for gcc option to accept ANSI C... none needed
 checking how to run the C preprocessor... gcc -E
 checking for egrep... grep -E
 checking for ANSI C header files... yes
 checking for sys/types.h... yes
 checking for sys/stat.h... yes
 checking for stdlib.h... yes
 checking for string.h... yes
 checking for memory.h... yes
 checking for strings.h... yes
 checking for inttypes.h... yes
 checking for stdint.h... yes
 checking for unistd.h... yes
 checking mpi.h usability... yes
 checking mpi.h presence... yes
 checking for mpi.h... yes
 Try to find libmpi or libmpich ...
 checking for main in -lmpi... yes
 Try to find liblam ...
 checking for main in -llam... yes
 checking for openpty in -lutil... yes
 checking for main in -lpthread... yes
 configure: creating ./config.status
 config.status: creating src/Makevars
 ** libs
 gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
 -I/home/hodgesse/Desktop/R-2.5.1/include -DPACKAGE_NAME=\\
 -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\
 -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
 -fpic  -g -O2 -c conversion.c -o conversion.o
 gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
 -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\\
 -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\
 -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
 -fpic  -g -O2 -c internal.c -o internal.o
 gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
 -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\\
 -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\
 -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
 -fpic  -g -O2 -c RegQuery.c -o RegQuery.o
 gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
 -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\\
 -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\
 -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
 -fpic  -g -O2 -c Rmpi.c -o Rmpi.o
 gcc -std=gnu99 -shared -L/usr/local/lib64 -o Rmpi.so conversion.o
 internal.o RegQuery.o Rmpi.o -lmpi -llam -lutil -lpthread
 /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../x86_64-suse-linux/bin/ld:
 /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a(infoset.o):
 relocation R_X86_64_32 against `lam_mpi_comm_world' can not be used
 when making a shared object; recompile with -fPIC
 /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a:
 could not read symbols: Bad value
 collect2: ld returned 1 exit status
 make: *** [Rmpi.so] Error 1
 chmod: cannot access `/home/bell/Desktop/R-2.5.1/library/Rmpi/libs/*':
 No such file or directory
 ERROR: compilation failed for package 'Rmpi'
 ** 

Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Prof Brian Ripley

On Mon, 27 Aug 2007, Ptit_Bleu wrote:



Hi,

I recently discovered the R program and I thought it could be useful to me.
I have to analyse data saved as .Px file (x between 0 and 8 - .P0 files have
18 lines at the beginning that I have to skip). New files are generated
everyday.


relrfichiers <- dir(chemin, pattern=".P")

does not do that, though.  Better to use

dir(chemin, pattern="\\.P[0-8]$", full.names=TRUE)

or

Sys.glob(file.path(chemin, "*.P[0-8]"))
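
A sketch comparing the three patterns on a throwaway directory. The file names are invented, and the regular expression assumes the extensions are .P0 to .P8 as described in the question.

```r
# Build a scratch directory with a few hypothetical files.
chemin <- tempfile("Mydata")
dir.create(chemin)
file.create(file.path(chemin, c("a.P0", "b.P3", "c.P9", "notes.Ptxt")))

# ".P" is a regular expression: any character followed by P, anywhere.
loose <- dir(chemin, pattern = ".P")

# Anchored regular expression: a literal dot, P, one digit 0-8, end of name.
by_regex <- dir(chemin, pattern = "\\.P[0-8]$", full.names = TRUE)

# The same selection written as a wildcard (glob) pattern.
by_glob <- Sys.glob(file.path(chemin, "*.P[0-8]"))

print(loose)
print(basename(by_regex))
print(basename(by_glob))
```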



This is my strategy :

In order to analyse the data, I first want to copy the new data in a
database in MySQL (which already contains the previous data).
So the first task is to compare the list of the files in the directory
(object : rfichiers) to the list of the files already saved (object :
tfichiers). The list containing the new files is then given by
nfichiers <- setdiff(rfichiers, tfichiers).

It sounds easy ...
... but it doesn't work !!!

Up to now, I am able to connect to MySQL and, if the file tfichiers.r
doesn't exist, I can copy data files to the MySQL database.
But if tfichiers.r already exists and there is no new file to save, it
ignores the condition if (nfichiers!=0) and saves all the files of the
directory to the database.


What did you intend there?  It is not a test of no difference, but a test 
that each element of the difference is not 0, and furthermore if() 
expects a test of length one, not the length of nfichiers.  I suspect you 
intended to test length(nfichiers) > 0.


It often helps to print (or use str on) the objects you create.  Try this 
on


nfichiers
nfichiers!=0


Is it a problem with the way I save tfichiers or is it a problem with the
condition if (nfichiers!=0) ?


Saving in R save format with extension .r is going to confuse others. 
Extension .rda is conventional for save format (and I doubt you need an 
ascii save).



Could you please give me some advice on correcting my script (written with
Tinn-R)?

I thank you in advance for your help.
Have a nice week,
Ptit Bleu.

PS : Ptit Bleu means something like Full Newbye in french. So thanks to be
patient :-)
PPS : I hope you understand my french english

--


# Connect to the MySQL database 'database'

library(DBI)
library(RMySQL)
drv <- dbDriver("MySQL")
con <- dbConnect(drv, username="user", password="password", dbname="database",
                 host="localhost")


# Create the objects holding the file lists (rel = relative path)
# - in the directory: object rfichiers
# - already processed: object tfichiers
# - new since the last connection: object nfichiers
# chemin is the directory where the data are stored
# RWork is R's working directory
# sep='' avoids adding a space after Mydata/

setwd("D:/RWork")
chemin <- "d:/Mydata/"
relrfichiers <- dir(chemin, pattern=".P")
rfichiers <- paste(chemin, relrfichiers, sep='')
if (file.exists("tfichiers.r"))
 {
   tfichiers <- load("tfichiers.r")
   nfichiers <- setdiff(rfichiers, tfichiers)
 } else {
   nfichiers <- rfichiers
 }


# p0fichiers: files with extension .P0 (they contain info lines that must
# not be loaded)
# pxfichiers: files with extensions P1, ..., P8 (no info lines at the start)

if (nfichiers!=0)
{
 p0fichiers <- nfichiers[grep(".P0", nfichiers)]
 pxfichiers <- setdiff(nfichiers, p0fichiers)


# Merge the day and time columns so that variations can be plotted against
# time
# Each file listed in p0fichiers is loaded, skipping the first 18 lines,
# and jourheure receives the day column (V1) pasted to the time column (V2)
# jourheure is copied into the first column of donnees
# The second column (the times), now redundant, is then dropped
# donnees is written to the MySQL database Mydata
# Note: R understands day/month/year, MySQL year/month/day, so the value
# is stored as CHAR in MySQL

 for (i in 1:length(p0fichiers))
   {
     donnees <- read.table(p0fichiers[i], quote="\"", sep=";", dec=",",
                           skip=18)
     jourheure <- paste(donnees$V1, donnees$V2, sep=" ")
     donnees[1] <- jourheure
     donnees <- donnees[,-2]
#    assignTable(con, "Datatable", donnees, append=TRUE) - does not work
     dbWriteTable(con, "Datatable", donnees, append=TRUE)
     rm(donnees, jourheure)
   }


# Same for the .Px files, loading every line (skip=0)
# Possible improvement: write a function taking p0fichiers or pxfichiers
# as its argument

 for (i in 1:length(pxfichiers))
   {
     donnees <- read.table(pxfichiers[i], quote="\"", sep=";", dec=",",
                           skip=0)
     jourheure <- paste(donnees$V1, donnees$V2, sep=" ")
     donnees[1] <- jourheure
     donnees <- donnees[,-2]
#    assignTable(con, "Datatable", donnees, append=TRUE) - does not work
     dbWriteTable(con, "Datatable", donnees, append=TRUE)
     rm(donnees, jourheure)
   }
}

tfichiers <- rfichiers
save(rfichiers, file="tfichiers.r", ascii=TRUE)
rm(list=ls())

# Disconnect from MySQL

dbDisconnect(con)




Re: [R] How to provide argument when opening RGui from an external application

2007-08-26 Thread Prof Brian Ripley

On Sun, 26 Aug 2007, Duncan Murdoch wrote:


On 26/08/2007 7:14 AM, Sébastien wrote:

Thanks for your reply.
When you say look into Rscript.exe, do you have a specific document in 
mind ? I tried to google it but could not find much... I forgot to mention 
in my first email that I am working under the Windows XP environment.


You could try ?Rscript within R, or Rscript --help from the command line 
(assuming you have R's bin directory on your path).


Or read 'An Introduction to R'.



Duncan Murdoch



Prof Brian Ripley a écrit :
Look into Rscript.exe (on Windows), which is a flexible way to run 
scripts.  Neither using a GUI nor using source() are recommended.


On Fri, 24 Aug 2007, Sébastien wrote:


Dear R-users,

I have written a small application (in Visual Basic) that automatically
generates some R scripts. I would like to execute these scripts when my
application is being closed.
My problem is that I don't know how to pass the
'source("c:/.../myscript.r")' instruction when I programmatically start
RGui. Tinn-R is capable of doing such things, so I guess there must be a
way to pass arguments to RGui.

Any advice or link to relevant references would be greatly appreciated.

Sebastien




Re: [R] R-2.5.1 RedHat EL5 compilation failed

2007-08-26 Thread Prof Brian Ripley
Well, the INSTALL file said

   The main source of information on installation is the `R Installation
   and Administration Manual', an HTML copy of which is available as file
   `doc/html/R-admin.html'.  Please read that before installing R.  But
   if you are impatient, read on but please refer to the manual to
   resolve any problems.  (If you obtained R using Subversion, the manual
   is at doc/manual/R-admin.texi.)

and this _is_ discussed there.  Hint: is readline-devel installed?

On Sun, 26 Aug 2007, Wang Chengbin wrote:

 I can't get R-2.5.1 compiled under RedHat EL5 with gcc 4.1.1. Configure
 failed at the following:

 checking readline/history.h usability... no
 checking readline/history.h presence... no
 checking for readline/history.h... no
 checking readline/readline.h usability... no
 checking readline/readline.h presence... no
 checking for readline/readline.h... no
 checking for rl_callback_read_char in -lreadline... no
 checking for main in -lncurses... no
 checking for main in -ltermcap... no
 checking for main in -ltermlib... no
 checking for rl_callback_read_char in -lreadline... no
 checking for history_truncate_file... no
 configure: error: --with-readline=yes (default) and headers/libs are not
 available

 Thanks.



Re: [R] How can i inhibit this work Please select a CRAN mirror for use in this session ?

2007-08-25 Thread Prof Brian Ripley
On Sat, 25 Aug 2007, zhijie zhang wrote:

 Dear Rusers,
  When I start R, the following prompt always appears first; how can I
 suppress it?
 *--- Please select a CRAN mirror for use in this session ---*
  I don't know why it does so; maybe I have done something unintentionally.

You certainly have.  Try starting R with --vanilla and it should go away.
If so, see ?Startup and look at the various files it mentions.  Do you 
have a .Rprofile?  Have you changed Rprofile.site ... ?

The message comes from contrib.url(): most likely you have a call to 
update.packages() in a startup file.
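
A rough sketch for locating the startup files mentioned in ?Startup and searching them for package-update calls. It checks only the three default locations and ignores any overrides set through environment variables, which is a simplification made for this example.

```r
# Default locations of the site and user profile files.
candidates <- c(
  site = file.path(R.home("etc"), "Rprofile.site"),
  home = path.expand("~/.Rprofile"),
  cwd  = file.path(getwd(), ".Rprofile")
)
found <- file.exists(candidates)
print(candidates[found])

# Show any lines in those files that call the package-management functions.
for (f in candidates[found]) {
  hits <- grep("update\\.packages|install\\.packages",
               readLines(f, warn = FALSE), value = TRUE)
  if (length(hits) > 0) {
    cat(f, hits, sep = "\n")
  }
}
```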



Re: [R] How to provide argument when opening RGui from an external application

2007-08-25 Thread Prof Brian Ripley
Look into Rscript.exe (on Windows), which is a flexible way to run 
scripts.  Neither using a GUI nor using source() are recommended.
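
As an illustration of passing arguments to a generated script through Rscript, here is a sketch launched from R. An external application would issue the equivalent command line itself. The script contents and the arguments 3 and 4 are invented, and --vanilla is used only to keep startup files out of the output.

```r
# A hypothetical generated script that reads its own command-line arguments.
script <- tempfile(fileext = ".R")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "writeLines(format(sum(as.numeric(args))))"
), script)

# Run it as: Rscript --vanilla <script> 3 4
rscript <- file.path(R.home("bin"), "Rscript")
out <- system2(rscript, c("--vanilla", shQuote(script), "3", "4"),
               stdout = TRUE)
print(out)
```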


On Fri, 24 Aug 2007, Sébastien wrote:


Dear R-users,

I have written a small application (in Visual Basic) that automatically
generates some R scripts. I would like to execute these scripts when my
application is being closed.
My problem is that I don't know how to pass the
'source("c:/.../myscript.r")' instruction when I programmatically start
RGui. Tinn-R is capable of doing such things, so I guess there must be a
way to pass arguments to RGui.

Any advice or link to relevant references would be greatly appreciated.

Sebastien




Re: [R] Character position command

2007-08-25 Thread Prof Brian Ripley
On Sat, 25 Aug 2007, Mitchell Hoffman wrote:

 This is a very simple question, so I apologize I couldn't find it online:

 I want to shorten the string 'apples.pears' to 'apples'.

 string='apples.pears'
 string1=substr(string,0,x)

 For x above, I would like to have a command like charAt(string,"."), i.e.
 the position of the period in the word, but I can't seem to find a charAt
 command in R.

See ?regexpr

But a simpler solution is sub("\\..*", "", string).
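
Both routes can be sketched on the string from the question.  The variable
names pos, via_substr and via_sub are only illustrative.

```r
string <- "apples.pears"

# Position of the first literal period (fixed = TRUE turns off regex matching).
pos <- as.integer(regexpr(".", string, fixed = TRUE))
via_substr <- substr(string, 1, pos - 1)

# Delete the first period and everything after it.
via_sub <- sub("\\..*", "", string)

via_substr
via_sub
```

Note that substr() counts characters from 1, and that regexpr() returns -1
when there is no match, so the sub() form is the safer one when some strings
may contain no period.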




Re: [R] Estimate Intercept in ARIMA model

2007-08-23 Thread Prof Brian Ripley
This is described on the help page!

include.mean: Should the ARIMA model include a mean term? The default
   is 'TRUE' for undifferenced series, 'FALSE' for differenced
   ones (where a mean would not affect the fit nor predictions).

  Further, if 'include.mean' is true, this formula applies to X-m
  rather than X.  For ARIMA models with differencing, the
  differenced series follows a zero-mean ARMA model.

You can add an intercept to your own xreg: you don't need a package to 
help you, but you do need to study the documentation.
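
The documented default can be checked on simulated data.  This is only a
sketch: the series, the seed and the added mean of 10 are made up, and the
names fit0 and fit1 are illustrative.

```r
set.seed(1)
x <- arima.sim(list(ar = 0.7), n = 200) + 10   # AR(1) around a mean of 10

fit0 <- arima(x, order = c(1, 0, 0))   # undifferenced: include.mean defaults to TRUE
fit1 <- arima(x, order = c(1, 1, 0))   # differenced: no mean term is fitted

names(coef(fit0))
names(coef(fit1))
```

The coefficient called 'intercept' appears only in the undifferenced fit,
which is the difference the question asks about.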

On Thu, 23 Aug 2007, doublelin15 wrote:

 Hi, All,
   This is my program

 ts1.sim <- arima.sim(list(order = c(1,1,0), ar = c(0.7)), n = 200)
 ts2.sim <- arima.sim(list(order = c(1,1,0), ar = c(0.5)), n = 200)
 tdata <- ts(c(ts1.sim[-1], ts2.sim[-1]))
 tre <- c(rep(0,200), rep(1,200))
 gender <- rbinom(400, 1, .5)
 x <- matrix(0, 2, 400)
 x[1,] <- tre
 x[2,] <- gender
 fit <- arima(tdata, c(1, 1, 0), method = "CSS", xreg = t(x))

Use

arima(tdata, c(1, 1, 0), method = "CSS", xreg = cbind(intercept = 1, t(x)))



   I try to fit an ARIMA model and include some other independent
 variables in this model, but why does the outcome not have the
 intercept estimate? And if the model is arima(tdata, c(p, 0, q))
 this value is there, so why the difference?

   And if I want to analyse an interrupted time series, what can I do?
 I find urca can help you to find the interruption point, but if I
 already know this point and want to compare the mean, level and
 slope, can any package help me to do this?


 Thanks for your attention!






Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-23 Thread Prof Brian Ripley
On Thu, 23 Aug 2007, John Kane wrote:

 The FAQ Section 7 is a very useful place for new users
 to find out any number of R idiosyncrasies.  However
 there is no numbering on the FAQ Table of Content or
 on the Sections Tables of Contents.

Hmm, doc/FAQ does have a numbered table of contents and numbered sections 
and doc/manual/R-FAQ.html does have numbered sections and my browser's 
search finds 7.10 straight away.


 An R-help list reply of "Read FAQ 7.10" in response to
 a question about converting a factor to numeric is a
 bit cryptic. The only time 7.10 appears is after the
 searcher has found the entry.

It would help if you told us what you are searching that did not contain 
'7.10'.

 Would it be a good idea to actually number the entries
 for the FAQ Table of Contents and the Table of
 Contents for the Sections?

I think we do.



Re: [R] Error building R 2-5.2.1 on Sun Solaris 8

2007-08-23 Thread Prof Brian Ripley
What is 'R 2-5.2.1'?  AFAIK there is no such version.

I can tell you the most likely issue: is your Perl pre-5.6.1 (very old 
indeed)?

The current R-patched (2.5.1 patched) requires Perl 5.6.1, and we do 
suggest that you install that rather than 2.5.1.

(Interestingly, all versions of R since 2.4.0 have needed Perl >= 5.6.1 
but this was not reported until after R 2.5.1 was released, and this is 
the third instance in a month or so.  Perl 5.8, already over 5 years 
old, will be required for future versions of R.)


On Thu, 23 Aug 2007, Mike Box wrote:


 As shown below, the build process fails with only vague messages,
 leaving me clueless as to how to resolve.

 Thanks, in advance, for any help that you may offer.

 Mike

 --

 # ./configure --prefix=/SOURCES/R-2.5.1 --with-iconv=no
 ...
 ...
 ...
 R is now configured for sparc-sun-solaris2.8

 Source directory: .
 Installation directory: /SOURCES/R-2.5.1

 C compiler: gcc -std=gnu99 -g -O2
 Fortran 77 compiler: g77 -g -O2

 C++ compiler: g++ -g -O2
 Fortran 90/95 compiler: f95 -g
 Obj-C compiler: -g -O2

 Interfaces supported: X11
 External libraries: readline
 Additional capabilities: NLS
 Options enabled: shared BLAS, R profiling, Java

 Recommended packages: yes

 configure: WARNING: you cannot build info or HTML versions of the R manuals

 # make
 ...
 ...
 ...
 * Installing *source* package 'MASS' ...
 ** libs
 gcc -std=gnu99 -I/reserve/R-2.5.1/include -I/reserve/R-2.5.1/include
 -I/usr/local/include -fPIC -g -O2 -c MASS.c -o MASS.o
 gcc -std=gnu99 -I/reserve/R-2.5.1/include -I/reserve/R-2.5.1/include
 -I/usr/local/include -fPIC -g -O2 -c lqs.c -o lqs.o
 gcc -std=gnu99 -G -L/usr/local/lib -o MASS.so MASS.o lqs.o
 ** R
 ** data
 ** moving datasets to lazyload DB
 ** inst
 ** preparing package for lazy loading
 ** help
 Can't use an undefined value as filehandle reference at
 /reserve/R-2.5.1/share/perl/R/Rdconv.pm line 78.
 Building/Updating help pages for package 'MASS'
 Formats: text html latex example
 ERROR: building help failed for package 'MASS'
 ** Removing '/reserve/R-2.5.1/library/MASS'
 ** Removing '/reserve/R-2.5.1/library/class'
 ** Removing '/reserve/R-2.5.1/library/nnet'
 ** Removing '/reserve/R-2.5.1/library/spatial'
 *** Error code 1
 make: Fatal error: Command failed for target `VR.ts'
 Current working directory /reserve/R-2.5.1/src/library/Recommended
 *** Error code 1
 make: Fatal error: Command failed for target `recommended-packages'
 Current working directory /reserve/R-2.5.1/src/library/Recommended
 *** Error code 1
 make: Fatal error: Command failed for target `stamp-recommended'
 #



Re: [R] How do i print a main title on a win.graph with several plots?

2007-08-22 Thread Prof Brian Ripley
?title, look at the 'outer' argument.

You can see further discussion of the outer margins in 'An Introduction to 
R'.
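
A minimal sketch of the 'outer' approach, using the built-in InsectSprays
data as a stand-in for the poster's dataset (the panel titles are
illustrative too):

```r
# oma reserves two lines of outer margin at the top for the shared title.
op <- par(mfrow = c(1, 2), oma = c(0, 0, 2, 0))
boxplot(count ~ spray, data = InsectSprays, main = "Test 1")
boxplot(sqrt(count) ~ spray, data = InsectSprays, main = "Test 2")
title("At day 1", outer = TRUE)   # drawn once, in the outer margin
par(op)
```

Without a non-zero 'oma' there is no outer margin for the title to be drawn
in, so it would not be visible.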

I don't know why you are using win.graph(): it is a deprecated form of 
windows() with many of the arguments taking unchangeable defaults.

On Wed, 22 Aug 2007, Tom Willems wrote:

 Good Morning All,

 How R you today? ;o)

 I have lots of questions, but I'll start with the simplest one,
 to which, I am shy to say, I did not find the answer.

 It is the following:

 When I make a summary plot, for example plot(summary(glm)),
 I get one window, one main title, and 4 graphs in that window.

 Now I do know how to get several graphs in one window,
 but I don't manage to get one main title in the top middle of the
 window.
 I can give every plot a different main title and subtitle, but I can't
 put a title in the win.graph() box.

 This is what I do:

 win.graph();   op <- par(mfrow = c(1,2))
 boxplot(lg_value~labo, main="Test 1 at day 1", ylab="log(x)",
 xlab="different lab's", data=dataset, ylim=c(-0.05,5))
 boxplot(lg_value~labo, main="Test 2 at day 1", ylab="log(x)",
 xlab="different lab's", data=dataset, ylim=c(-0.05,5))
 par(op);

 Then I get one graph window with two plots, each with its own main title.
 What I'd like to have is one main title saying "At day 1", and then two
 plots with the different tests.
 How can I do this, please?

 Kind regards,
 Tom W.




Re: [R] plotting lda results

2007-08-22 Thread Prof Brian Ripley
Read ?plot.lda, which tells you the ... arguments are (for dimen=1, the 
only option for two groups) passed to ldahist, so then read its help page.

I don't know what you want (and your example is not reproducible): I would 
expect you to get a single plot with two panels (figures), but there are 
options to have a single panel.  (Reading 'An Introduction to R' may help 
you to use standard terminology that others will be able to follow.)
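
As a sketch of one such option (assuming, as I read ?ldahist, that 'sep'
defaults to FALSE when type = "density"), here is a two-group example on a
subset of the built-in iris data, which stands in for the poster's data:

```r
library(MASS)

# Two groups only, so there is a single linear discriminant.
two <- droplevels(subset(iris, Species != "setosa"))
fit <- lda(Species ~ ., data = two)

plot(fit)                     # default: one histogram panel per group
plot(fit, type = "density")   # passed on to ldahist(): groups share one panel
```

The extra argument is not used by plot.lda itself; it travels through '...'
to ldahist(), so that help page is where the choices are listed.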

On Wed, 22 Aug 2007, Silvia Lomascolo wrote:


 Hi all,
 I am trying to plot the results of a discriminant analysis done with
 lda(MASS) but my groups appear in two different plots (in the same graphics
 device) and I want to combine them in one plot. My code looks like:

 BirdTrain.lda <- lda(Bdisperser~., data=BirdTrain.mx)
 predict(BirdTrain.lda)
 plot(BirdTrain.lda)

 I have two types of Bdisperser, so I only get one linear discriminant
 function. Can anyone please tell me how to combine the data in one plot?

 I work with R 2.4.1 using Windows.

But the version of MASS is what is relevant, and it would have been in 
the sessionInfo() output the R posting guide asked you for.



Re: [R] prediction interval for multiple future observations

2007-08-21 Thread Prof Brian Ripley
On Mon, 20 Aug 2007, Vlad Skvortsov wrote:

 Hi!

 '?predict.lm' says that the prediction intervals returned by predict()
 are for single observation only. Is there a way to specify the desired
 number of observations to construct the interval for?

What it says in full is

  The prediction intervals are for a single observation at each case
  in 'newdata' (or by default, the data used for the fit) with error
  variance(s) 'pred.var'.

I think you misunderstand: predict.lm returns a prediction interval for 
each row of 'newdata'.  The comment in part means that those intervals are 
to be considered individually, and not as a joint prediction region for 
all the future observations.  If you want, say, a prediction interval for 
the average of 10 independent observations at a case, use 'pred.var' to 
specify the error variance.
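
A small sketch of that use of 'pred.var', with the built-in cars data and a
made-up new case (speed = 15).  The mean of 10 independent observations has
error variance sigma^2/10, so that is what is passed.

```r
fit <- lm(dist ~ speed, data = cars)
new <- data.frame(speed = 15)
s2  <- summary(fit)$sigma^2          # estimated error variance

# Default: interval for one future observation (pred.var = s2).
p1  <- predict(fit, new, interval = "prediction")
# Mean of 10 independent future observations: error variance s2 / 10.
p10 <- predict(fit, new, interval = "prediction", pred.var = s2 / 10)

rbind(single = p1[1, ], mean_of_10 = p10[1, ])
```

The fitted value is the same in both rows; only the width of the interval
changes, and it stays wider than the confidence interval because the
uncertainty in the fitted mean is still included.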



Re: [R] runing .r file from C#

2007-08-21 Thread Prof Brian Ripley
On Tue, 21 Aug 2007, Alex MD wrote:

 Hi,

 I know that the general subject of calling R from C has been discussed, but I
 have been reading the manuals and also scouting the lists and I cannot seem
 to find a working solution for my problem.

It's a C# issue.

  I want to call a R script ( let's call it test.r ) from within C# code.
  After reading about this topic I am trying to do this :

 System.Diagnostics.Process proc = new System.Diagnostics.Process();
 proc.StartInfo.FileName = "E:/R/R-2.5.1 /bin/Rterm.exe";
 proc.StartInfo.Arguments = "< 'test.r' --no-save";
 proc.StartInfo.UseShellExecute = false;
 proc.StartInfo.RedirectStandardOutput = false;
 proc.Start();


 but when Rterm starts it shows: parameter test.r ignored

 When I try to do the same from a command line shell it DOES work just fine:
  "Rterm.exe < test.r --no-save" runs the file without any problems.

 Do you have any idea how to make it not to ignore the input file? Or is
 there other way to just execute a .r file from C# code?

You need a shell for redirection (< > |) to work, and 'system' commands in 
Windows do not usually use one (as in C, C++, R, Perl): you seem to have 
turned off using a shell in C#.  However, I think you should be using 
Rscript.exe, where this is not an issue.



Re: [R] problems installing updated version of vars package

2007-08-21 Thread Prof Brian Ripley
On Tue, 21 Aug 2007, sj wrote:

 All,

 I was looking online and noticed that the vars package (by Bernhard Pfaff)
 was recently updated (update date listed as Aug 6, 2007). The updated package
 has some features that I would find very useful. I have used the update
 packages function and vars was one of the packages identified as needing an
 update. I was able to update and it appeared to work, however when I load
 the package it does not seem to be the most recent version. Has anyone else
 had similar problems? Or does anyone have any suggestions?

 System Info:

 R 2.4.1

You need to update your R (as the posting guide asked you to before 
posting) or install the package from sources.

The binary builds are not updated for obsolete versions of R: the builds 
for 2.4.x stopped on 28 June.
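
To see which copy of a package is actually installed, something along these
lines can help.  It is only a sketch: MASS is used as a stand-in so that the
code runs anywhere, and the install.packages() call is left as a comment
because it needs network access.

```r
pkg <- "MASS"   # stand-in for illustration; substitute "vars"
info <- packageDescription(pkg, fields = c("Version", "Built"))

info$Version       # version of the installed copy
info$Built         # R version and date that copy was built under
R.version.string   # the R you are running

# Installing from sources rather than from a binary build (not run here):
# install.packages("vars", type = "source")
```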

 Windows XP
 install mirror: USA 3 (UCLA I think)

 thanks,

 Spencer

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

PLEASE do!




Re: [R] R on a flash drive

2007-08-21 Thread Prof Brian Ripley
On Wed, 22 Aug 2007, Williams Scott wrote:

 I often run R via a Ceedo virtualisation on a USB drive
 (http://www.ceedo.com/) with XP. It costs a few dollars to do it this way,
 but is a very low stress installation and has worked flawlessly, albeit

It is not necessary though, as R does not need 'virtualisation'.  For 
Windows this is covered in the rw-FAQ Q2.6.

 a little slower (barely noticeable).

Perhaps the overhead of Ceedo?  Once R starts up (which does take longer 
on a slow drive) I found no time-able difference in 2.1.x (all the files 
frequently used from disc are cached on startup).

It would be nice to give the R developers the credit for writing R in such 
a way that it works well from slow media, instead of it being credited to 
an unnecessary commercial product.

 Very handy if you are often working
 on various machines without administrator rights (as I do in clinic) -
 just plug in your USB and go directly back to your project. It then
 removes any trace of you (so they say) when you log out. And you can use
 it for other software (within limits though) you might want to carry
 around.

Many sites would not allow programs to be run from a USB drive or make it 
a breach of usage conditions to do so.


 Hope that helps.

 Scott

 
 Scott Williams MD
 Peter MacCallum Cancer Centre
 Melbourne
 Australia

 -Original Message-
 From: John Kane [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, 21 August 2007 12:28 AM
 To: John Kane; Erin Hodgess; r-help@stat.math.ethz.ch
 Subject: Re: [R] R on a flash drive

 Oops meant to send this to the list.
 --- John Kane [EMAIL PROTECTED] wrote:


 --- Erin Hodgess [EMAIL PROTECTED] wrote:

 Dear R People:

 Has anyone run R from a flash drive, please?

 If so, how did it work, please?

 Yes I run R, occasionally, on a USB with no problem
 on
 WindowsXP. It works well, albeit a bit more slowly
 than from the hard drive which is as you would
 expect.

 The last time I upgraded the USB (to 2.5.0 ?) I
 simply
 downloaded R and installed it on the USB drive
 rather
 than the C: drive and then installed all my usual
 optional packages using the normal Rgui interface.

 I usually have R, Tinn-R and portable versions of
 OpenOffice.org, and Firefox installed on the USB.




Re: [R] open/execute/call/run an external file

2007-08-21 Thread Prof Brian Ripley
On Tue, 21 Aug 2007, STEPHEN M POWERS wrote:

 I'm trying to figure out how to trigger a process from within R. I have 
 an executable file that runs a Fortran model, but ideally, would like to 
 run it from R. Note that I'm not talking about importing the function at 
 all, passing variables, or anything complicated like that. I basically 
 just want a script that double-clicks on a particular file and 
 opens/runs it for me.

 The idea here is that the executable Fortran file, when double clicked, 
 simply draws all necessary inputs from text files within the same 
 directory and I have no need to change this. So I've used R to summarize 
 some raw data and format these required text input files in the way the 
 Fortran executable requires, and also have scripts to interpret the 
 Fortran text file outputs and summarize/plot them in R. The problem is I 
 must run the first part of the R script to send data from R to the 
 model, then double click the Fortran executable, then run the second 
 part of the R script to get the model outputs into R, in three separate 
 steps. Given that I may be doing this hundreds of times, I'd prefer to 
 do it all in one step.

 Any thoughts?---steve

It is described in the relevant manual: Writing R Extensions.
?system, and if this is Windows also ?shell and ?shell.exec.
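
The three steps can be sketched in one script.  Everything named here is
hypothetical: Rscript stands in for the Fortran executable so that the
example runs anywhere, and input.txt, output.txt, model.R and run_model()
are made up for illustration.  system2() is used as a close relative of
system().

```r
# Run an external program from inside its own directory and check its status.
run_model <- function(dir, exe, args = character()) {
  old <- setwd(dir)
  on.exit(setwd(old))
  status <- system2(exe, args)
  if (status != 0) stop("model run failed with status ", status)
  invisible(status)
}

work <- tempfile("model")
dir.create(work)

# Step 1: R writes the input file the model expects.
writeLines(c("1", "2", "3"), file.path(work, "input.txt"))

# Stand-in "model": sums the numbers in input.txt and writes output.txt.
writeLines(
  'writeLines(as.character(sum(scan("input.txt", quiet = TRUE))), "output.txt")',
  file.path(work, "model.R")
)

# Step 2: run the executable from its own directory.
run_model(work, file.path(R.home("bin"), "Rscript"), "model.R")

# Step 3: read the model output back into R.
result <- scan(file.path(work, "output.txt"), quiet = TRUE)
result
```

With a real executable, the 'exe' argument would be its path and 'args'
would usually be empty, since the model reads its inputs from the directory
it is started in.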

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

When you do, you will see that we asked for your OS which is relevant 
here.



Re: [R] can't find as.family function

2007-08-19 Thread Prof Brian Ripley

On Sun, 19 Aug 2007, Mario Alfonso Morales Rivera wrote:


Hi R users,

I want to use dglm Package.
I run the examples and it give me an error:

Error in dglm(lot1 ~ log(u), ~1, data = clotting, family = Gamma) :
   could not find function "as.family"

dglm can't find as.family function

why ?


Because it does not exist in R (nor in the current version of package 
dglm).


PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


Please do as we ask.  We asked for the output of sessionInfo(), and one 
guess is that your versions are not current and this is a problem that has 
already been solved: another is that you have attached a package that 
conflicts with dglm -- the information we asked for would have helped in 
both cases.
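
For reference, the information asked for, plus two quick checks for the
guesses above, can be gathered roughly like this (a sketch using base R
only; the variable name si is illustrative):

```r
si <- sessionInfo()          # R version, platform, locale, attached packages
print(si)

exists("as.family")          # is any object of that name visible?
search()                     # attached packages, in lookup order
conflicts(detail = TRUE)     # names defined in more than one attached place
```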




Re: [R] Prerequisite for running RWeka

2007-08-19 Thread Prof Brian Ripley
On Sun, 19 Aug 2007, [EMAIL PROTECTED] wrote:

 Hi -

 I have a question on RWeka. I installed the package and try to run using 
 some examples available in the package. However, it stalls my machine 
 for a while. I'm wondering if I need weka (which is java implementation) 
 installed before using RWeka? Thank you.

No, and if it did it should be listed in the DESCRIPTION file (and given 
that the maintainer of RWeka wrote the rules, it would be).

 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

PLEASE do!  We don't know your OS, version of R, version of RWeka, version 
of Java, what 'stalls' means, what the maintainer said when you asked him 
(as required by the posting guide) ... in short anything we need to even 
guess at the problem.



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Prof Brian Ripley
Some additional comments on the DBMS front.

(a) SPSS is not a DBMS, so it is not clear that you need this. But if you 
do and are storing valuable data in a DBMS a lot of further questions come 
into play, like how you are going to do backups.  I'd say PostgreSQL was 
really only for professional-level administrators.  My sysadmins recommend 
MySQL for most people.  We do also run PostgreSQL and they find it a lot 
trickier to maintain.

'dozens of columns and thousands of rows' is not big.  A data frame with 
50 columns and 5000 rows would only take 2MB to store, and R will easily 
handle 100x with 4GB of RAM (and if you have less, get 4GB).  So storing 
data in .rda (R's save() format) is most likely viable.  R's indexing etc 
operations make it good at data manipulation, and using a DBMS will 
involve learning SQL, a non-trivial cost.

(b) You have a choice of interfaces to a DBMS, RODBC and the DBI+ family, 
e.g. DBI+RMySQL and DBI+RSQLite.  I'm biased, but I find RODBC more 
intuitive, and many people have reported it to be faster.  If all you want 
is non-permanent storage for manipulation of large data sets, consider 
also SQLiteDF.
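
The size estimate in (a) is just 50 x 5000 values at 8 bytes each.  A quick
check on made-up numeric data, together with a round trip through save()
and load() (the object and file names are illustrative):

```r
# 50 numeric columns x 5000 rows, 8 bytes per value: about 2 MB.
df <- as.data.frame(matrix(rnorm(5000 * 50), nrow = 5000, ncol = 50))
bytes <- as.numeric(object.size(df))
bytes / 1e6

# Round trip through R's save() format.
f <- tempfile(fileext = ".rda")
save(df, file = f)
rm(df)
load(f)
dim(df)
```

Character or factor columns will change the arithmetic somewhat, so treat
the figure as an order of magnitude rather than an exact size.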

On Sat, 18 Aug 2007, Duncan Murdoch wrote:

 Martin Brown wrote:
 [i sent this message earlier but apparently should have sent it plain
 text, as follows..]

 Hi there,

 I would like some advice, not so much about how to use R, but about
 software that I need to complement R.  I've rooted around in the FAQ's
 and done a few searches on this mailing list but haven't quite found
 the perspective I need.

 I am an experienced data analyst in my field (forest ecology and
 ecological monitoring) but new to R. I am a long time user of SPSS and
 have gotten pretty handy with it.  However, I am frustrated with SPSS
 for several reasons:  There's the cost (I'm a freelancer; I pay for my
 software myself);  the Windows dependence (I use Kubuntu as my usual
 OS now, and switching back and forth is a pain); the horrible
 inefficiency when I do certain types of file manipulations; and the
 inability to do the kind of publication-quality graphs I want... I've
 usually ended up using a commercial graphing program (another source
 of expense and limitation).

 I'd like to switch to using R on Kubuntu, for all those reasons.  In
 addition I think the mathematical formality that R encourages might be
 good for me.

 However, reviewing the FAQ's on the R project web site makes me
 realize that I've been using SPSS as three kinds of software really:
 a DBMS; a statistical analysis package; and a graphing package.  It
 looks like moving to R might involve learning three kinds of software,
 not just one.  I wonder:

 1) What open-source DBMS works most seamlessly with R?  I have seen
 MySQL recommended but wonder if there are alternatives.  I sometimes
 need to handle big data files.  In fact a lot of my work involves
 exploratory and descriptive analyses of rather large and messy
 databases from ecological monitoring, rather than statistical tests
 per se.  In SPSS the data files I have been generating have dozens of
 columns and thousands of rows, often with value and variable labels
 helpful for documenting my work.

See above.


 I think you won't find much difference in the R interface between MySQL,
 PostgreSQL, or SQLite.  The choice should be made based on the qualities
 of the database (and I don't know enough about the differences to give a
 recommendation.)

 2) For the purpose of creating publication-quality graphs, do R users
 typically need to go outside of the R system? If so, what open-source
 programs would you all recommend?

 R is great for this, but you might need to go outside for some
 specialized stuff (e.g. medical imaging).

 3) Any other software I need to learn that would make my work in R
 more productive? (for example, a code editor).

 A lot of people are happy with ESS mode in Emacs.

 Duncan Murdoch



Re: [R] Question about sm.options sm.survival

2007-08-17 Thread Prof Brian Ripley
On Thu, 16 Aug 2007, Rachel Jia wrote:

 Hi, there:

 It's my first time posting a question in this forum, so thanks for your
 tolerance if my question is too naive. I am using a nonparametric smoothing
 procedure in the sm package to generate smoothed survival curves for a continuous
 covariate. I want to truncate the survival curve and only display the part
 with covariate values between 0 and 7. The following is the code I wrote:

 sm.options(list(xlab="log_BSI_min3_to_base", xlim=c(0,7),
                 ylab="Median Progression Prob"))
 sm.survival(min3.base.prog.cen[,3], min3.base.prog.cen[,2],
             min3.base.prog.cen[,1], h=sd(min3.base.prog.cen[,3]),
             status.code=1)

 But the xlim option does not work. Can anyone help me with this problem?

The help page suggests that you need to use xlim as an inline option (part 
of ...). Following the help page example

 sm.survival(x, y, status, h=2, xlim=c(0,4))

works.  So I think you need to follow the help page exactly.




Re: [R] Date format on x-axis

2007-08-17 Thread Prof Brian Ripley
If you want to use English, you need to set your session to use 
English.


PLEASE do read the posting guide 

http://www.R-project.org/posting-guide.html

We need to know your OS and locale, and you did not follow the guide.

On a Unix-alike probably Sys.setlocale("LC_TIME", "en_US") or 
Sys.setlocale("LC_TIME", "en_US.utf8") is needed.  On Windows 
Sys.setlocale("LC_TIME", "en").
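
A sketch of the whole sequence on the data from the question.  The "C"
locale is my substitution here, not one of the names above: it gives
English month abbreviations and is available on every platform, whereas
locale names such as "en_US" depend on the OS.  The explicit axis() call is
just one way to control the labels.

```r
old <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")        # English month names from here on

Date <- as.Date(c("2006-08-17", "2006-08-18", "2006-08-19", "2006-08-20"))
response <- c(4, 4, 8, 12)

labels <- format(Date, "%b %d")      # abbreviated month name and day
plot(Date, response, xaxt = "n")     # suppress the default x axis
axis(1, at = as.numeric(Date), labels = labels)

Sys.setlocale("LC_TIME", old)        # restore the previous setting
```

format() only changes how the dates are printed; the plot picks its labels
from the session's LC_TIME setting, which is why the locale has to be
changed before plotting.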



On Fri, 17 Aug 2007, [EMAIL PROTECTED] wrote:


Dear R users,

Plotting question from a R beginner...

When I try to plot a response through time, for example:

Date <- c("2006-08-17", "2006-08-18", "2006-08-19", "2006-08-20")
response <- c(4,4,8,12)
as.Date(Date)


I presume that was Date <- as.Date(Date)


plot(Date,response)


The dates on the graphic appear in Spanish. This I guess is the default
way of plotting because my Windows is in Spanish, but I need "aug 17"
instead of "ago 17" (agosto is the Spanish for August)...
I've tried,

format(Date, "%m %d")

And although it does change the way Date is listed, well it's still
plotted in spanish...
I've also searched through par() settings, but xaxp,xaxs, xaxt, xpd and
xlog do not solve my problem...

Could anyone help me solve this format question?

Thanks a million in advance,

Greetings,
Iñaki Etxebeste Larrañaga
M.Sc. Biologist
Producción Vegetal y Recursos Forestales
ETSIIAA Universidad de Valladolid
Avda. Madrid,57
34071 Palencia (Spain)




--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] installation of the gsl package on Suse 10.1

2007-08-17 Thread Prof Brian Ripley
On Fri, 17 Aug 2007, luca laghi wrote:

 I am trying to install the gsl  package.
 I had gsl installed with YaSt in /usr/lib.
 when I launch R as superuser and launch install.packages, it says it
 cannot find the GNU Scientific Library. How can I make it find it?

Please tell us the exact messages you get.

At a guess, you need gsl-devel (or gsl-dev or some such) as well as gsl.

 Thank you,
 Luca Laghi

 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

PLEASE do!

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] problem using rank

2007-08-17 Thread Prof Brian Ripley
On Fri, 17 Aug 2007, Jiong Zhang, PhD wrote:

 Hi All,

 I had 12766 elements in a column, 12566 are values and 200 are NAs. I 
 used the following line to get the ranks:

 total_list$MB.rank <- rank(-total_list$MB, ties.method="min", na.last=NA)

 but I got an error message:

 Error in `$<-.data.frame`(`*tmp*`, "BCRP_PW_F.rank", value = c(3949, 6182,  :
 replacement has 12199 rows, data has 12766

 What shall I do to keep the NAs as NAs?  thanks a lot.

If all else fails try reading the help!  You have to select the right 
option for na.last, and yours is not it.  I suspect you want 
na.last="keep", but only you know what you mean.
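
A small sketch of the difference between the two na.last settings, using a made-up five-element vector rather than the poster's data. With na.last=NA the missing values are dropped, so the result is shorter than the input and cannot be assigned back into the data frame column; with na.last="keep" the result keeps the input length and has NA in the same positions.

```r
# Hypothetical data: one NA among five values.
x <- c(10, NA, 30, 20, 30)

# NAs removed: result has length 4, which no longer matches length(x).
r_drop <- rank(-x, ties.method = "min", na.last = NA)

# NAs kept in place: result has length 5 and can be stored alongside x.
r_keep <- rank(-x, ties.method = "min", na.last = "keep")

print(length(r_drop))
print(r_keep)
```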

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] binomial simulation

2007-08-16 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Moshe Olshansky wrote:

 Thank you - I wasn't aware of this function.
 One can even use lchoose which allows really huge
 arguments (more than 2^1000)!

Using dbinom() for binomial probabilities would be even better, 
and that has a log=TRUE argument to return results on natural log scale.

 dbinom(k,N,p,log=TRUE) + dbinom(m,k,q,log=TRUE)
[1] -92.52584
 log(choose(N,k)*p^k*(1-p)^(N-k)*choose(k,m)*q^m*(1-q)^(k-m))
[1] -92.52584
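
The comparison above can be reproduced with the values of N, k, m, p and q quoted further down in the thread. The second half of the sketch uses a larger, made-up problem size (2000 trials) to illustrate why working on the log scale helps: choose() overflows there, while dbinom(..., log=TRUE) stays finite.

```r
# Values taken from the quoted message below.
N <- 200; k <- 100; m <- 50; p <- 0.6; q <- 0.95

# Direct product of binomial coefficients and powers, then log.
log_direct <- log(choose(N, k) * p^k * (1 - p)^(N - k) *
                  choose(k, m) * q^m * (1 - q)^(k - m))

# Same quantity as a sum of two binomial log-probabilities.
log_dbinom <- dbinom(k, N, p, log = TRUE) + dbinom(m, k, q, log = TRUE)

print(c(log_direct, log_dbinom))

# Hypothetical larger case: the coefficient itself overflows to Inf ...
big_choose <- choose(2000, 1000)
# ... but the log-probability is still representable.
big_log <- dbinom(1000, 2000, 0.5, log = TRUE)
```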


 --- Lucke, Joseph F [EMAIL PROTECTED]
 wrote:

 C is an R function for setting contrasts in a
 factor.  Hence the funky
 error message.
 ?C

 Use choose() for your C(N,k)
 ?choose

 choose(200,2)
 19900

 choose(200,100)
  9.054851e+58

 N=200; k=100; m=50; p=.6; q=.95

 choose(N,k)*p^k*(1-p)^(N-k)*choose(k,m)*q^m*(1-q)^(k-m)
 6.554505e-41

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf
 Of Moshe Olshansky
 Sent: Wednesday, August 15, 2007 2:06 AM
 To: sigalit mangut-leiba; r-help
 Subject: Re: [R] binomial simulation

 No wonder that you are getting overflow, since
 gamma(N+1) = N! and 200! > (200/e)^200 > 10^370.
 There exists another way to compute C(N,k). Let me know if you need this
 and I will explain to you how this can be done.
 But do you really need to compute the individual probabilities? Maybe
 you need something else and there is no need to compute the individual
 probabilities?

 Regards,

 Moshe.

 --- sigalit mangut-leiba [EMAIL PROTECTED] wrote:

 Thank you,
 I'm trying to run the joint probability:

 C(N,k)*p^k*(1-p)^(N-k)*C(k,m)*q^m*(1-q)^(k-m)

 and get the error: Error in C(N, k) : object not interpretable as a factor

 so I tried the long way:

 gamma(N+1)/(gamma(k+1)*(gamma(N-k)))

 and the same with k, and got the error:

 1: value out of range in 'gammafn' in: gamma(N + 1)
 2: value out of range in 'gammafn' in: gamma(N - k)

 Do you know why it's not working?

 Thanks again,

 Sigalit.



 On 8/14/07, Moshe Olshansky
 [EMAIL PROTECTED]
 wrote:

 As I understand this,
 P(T+ | D-)=1-P(T+ | D+)=0.05
 is the probability not to detect the disease for a person at ICU who
 has the disease. Correct?

 What I asked was whether it is possible to mistakenly detect the
 disease for a person who does not have it?

 Assuming that this is impossible the formula is below:

 If there are N patients, each has a probability p to have the disease
 (p=0.6 in your case) and q is the probability to detect the disease
 for a person who has it (q = 0.95 for ICU and q = 0.8 for a regular
 unit), then

 P(k have the disease AND m are detected)
   = P(k have the disease)*P(m are detected / k have the disease)
   = C(N,k)*p^k*(1-p)^(N-k)*C(k,m)*q^m*(1-q)^(k-m)

 where C(a,b) is the Binomial coefficient a above b - the number of
 ways to choose b items out of a (when the order does not matter). You
 of course must assume that N >= k >= m >= 0 (otherwise the probability
 is 0).

 To generate such pairs (k infected and m detected) you can do the
 following:

 k <- rbinom(N,1,p)
 m <- rbinom(k,1,q)
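
A hedged sketch of one way to simulate the pair (k, m) described above; it is not the code from the message. As far as I can tell, when the first argument of rbinom() is a vector its length is used as the number of draws, so the quoted second line would draw a detection result for every patient regardless of disease status. The version below draws counts directly, and also shows a per-patient variant in which detection is only possible when the disease is present. The seed and variable names are assumptions made for illustration.

```r
set.seed(42)                  # arbitrary seed, for reproducibility only
N <- 200; p <- 0.6; q <- 0.95 # ICU values from the thread

# Count version: k patients with the disease, m of those detected.
k <- rbinom(1, N, p)
m <- rbinom(1, k, q)

# Per-patient version: detection probability is q if diseased, else 0
# (this encodes the "no false detection" assumption stated above).
disease  <- rbinom(N, 1, p)
detected <- rbinom(N, 1, q * disease)

# Log of the joint probability of the simulated (k, m) pair.
log_prob <- dbinom(k, N, p, log = TRUE) + dbinom(m, k, q, log = TRUE)
print(c(k = k, m = m, log_prob = log_prob))
```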

 Regards,

 Moshe.

 --- sigalit mangut-leiba [EMAIL PROTECTED]
 wrote:

 Hi,
 The probability of false detection is: P(T+ |
 D-)=1-P(T+ |
 D+)=0.05.
 and I want to find the joint probability
 P(T+,D+)=P(T+|D+)*P(D+)
 Thank you for your reply,
 Sigalit.


 On 8/13/07, Moshe Olshansky
 [EMAIL PROTECTED]
 wrote:

 Hi Sigalit,

 Do you want to find the probability P(T+ = t AND D+ = d) for all the
 combinations of t and d (for ICU and Reg.)?
 Is the probability of false detection (when there is no disease)
 always 0?

 Regards,

 Moshe.

 --- sigalit mangut-leiba [EMAIL PROTECTED]
 wrote:

 hello,
 I asked about this simulation a few days ago, but still I can't get
 what I need.
 I have 2 units: icu and regular. from icu I want to take 200
 observations from binomial distribution, when probability for disease
 is: p=0.6.
 from regular I want to take 300 observations with the same
 probability: p=0.6.
 the distribution to detect disease when disease occurred - *for
 someone from icu* - is: p(T+ | D+)=0.95.
 the distribution to detect disease when disease occurred - *for
 someone from reg.unit* - is: p(T+ | D+)=0.8.
 I want to compute the joint distribution for each

 === message truncated ===



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] an easy way to construct this special matirx

2007-08-16 Thread Prof Brian Ripley
?toeplitz
?lower.tri

since it is the lower triangle of a Toeplitz matrix (or drop the top row)

r <- 0.95
R <- toeplitz(r^(0:4))
R[upper.tri(R)] <- 0
R[-1,]
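
The same steps wrapped in a small function, as a sketch; the function name lower_toeplitz and its argument n are made up for illustration. It builds the Toeplitz matrix of powers of r, zeroes the upper triangle, and drops the top row, giving an n x (n+1) matrix.

```r
# Hypothetical helper: rows are (r^i, r^(i-1), ..., r, 1, 0, ..., 0).
lower_toeplitz <- function(r, n = 4) {
  R <- toeplitz(r^(0:n))      # symmetric (n+1) x (n+1) matrix of powers
  R[upper.tri(R)] <- 0        # keep the diagonal and the lower triangle
  R[-1, , drop = FALSE]       # drop the first row (1, 0, ..., 0)
}

M <- lower_toeplitz(0.95)
print(M)
```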


On Thu, 16 Aug 2007, [EMAIL PROTECTED] wrote:

 Hi,
 Sorry if this is a repost. I searched but found no results.
 I am wondering if there is an easy way to construct the following matrix:

 r     1     0     0    0
 r^2   r     1     0    0
 r^3   r^2   r     1    0
 r^4   r^3   r^2   r    1

 where r could be any number. Thanks.
 Wen

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Polynomial fitting

2007-08-16 Thread Prof Brian Ripley
It is easier to use poly(raw=TRUE), and better to use poly() with 
orthogonal polynomials.

The original poster shows signs of having read neither the help for 
predict.lm nor the posting guide, and so almost certainly misused the 
predict method.
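
A sketch of the poly() approach together with predict(), on made-up noiseless cubic data so that the fitted coefficients are easy to check. The key assumption about the misuse mentioned above is that newdata was passed as a bare vector; predict.lm expects a data frame whose column names match the variables in the formula.

```r
# Hypothetical data from an exact cubic (no noise).
x <- seq(-2, 2, length.out = 25)
y <- 1 + 2 * x - 0.5 * x^2 + 0.1 * x^3
dat <- data.frame(x = x, y = y)

# Raw polynomial: coefficients are those of 1, x, x^2, x^3.
fit_raw <- lm(y ~ poly(x, degree = 3, raw = TRUE), data = dat)

# Orthogonal polynomial: different coefficients, same fitted curve.
fit_orth <- lm(y ~ poly(x, degree = 3), data = dat)

# newdata must be a data frame with a column named like the predictor.
new_x <- data.frame(x = c(-1.5, 0.25, 1.75))
pred_raw  <- predict(fit_raw,  newdata = new_x)
pred_orth <- predict(fit_orth, newdata = new_x)

print(coef(fit_raw))
print(cbind(pred_raw, pred_orth))
```

The raw fit corresponds most closely to polyfit (with coefficients in ascending rather than descending powers), and predict() plays the role of polyval.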


On Thu, 16 Aug 2007, Jon Minton wrote:

 Remember that polynomials of the form

 y = b1*x + b2*x^2 + ... + bm*x^m

 fit the linear regression equation form

 Y = beta_1*x_1 + beta_2*x_2 + ... + beta_m*x_m

 If one sets (from the 1st to the 2nd equation)
 x -> x_1
 x^2 -> x_2
 x^3 -> x_3
 etc.

 In R this is easy, just use the identity operator I() when specifying the
 equation.
 e.g. for a 3rd order polynomial:

 model <- lm(Y ~ x + I(x^2) + I(x^3))

 hth, Jon

 ***

 I'm looking some way to do in R a polynomial fit, say like polyfit
 function of Octave/MATLAB.

 For those who don't know, c = polyfit(x,y,m) finds the coefficients of a
 polynomial p(x) of degree m that fits the data, p(x[i]) to y[i], in a
 least squares sense. The result c is a vector of length m+1 containing
 the polynomial coefficients in descending powers:
 p(x) = c[1]*x^m + c[2]*x^(m-1) + ... + c[m]*x + c[m+1]

 For prediction, one can then use function polyval like the following:

 y0 = polyval( polyfit( x, y, degree ), x0 )

 y0 are the prediction values at points x0 using the given polynomial.

 In R, we know there is lm for 1-degree polynomial:
 lm( y ~ x ) == polyfit( x, y, 1 )

 and for prediction I can just create a function like:
 lsqfit <- function( model, xx ) return( xx * coefficients(model)[2] +
 coefficients(model)[1] );
 and then: y0 <- lsqfit(model, x0)
 (I've tried with predict.lm( model, newdata=x0 ) but obtain a bad result)

 For a degree greater than 1, say m,  what can I use.??
 I've tried with
   lm( y ~ poly(x, degree=m) )
 I've also looked at glm, nlm, approx, ... but with these I can't
 specify the polynomial degree.

 Thank you so much!

 Sincerely,

 -- Marco

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Trim trailng space from data.frame factor variables

2007-08-16 Thread Prof Brian Ripley
On Thu, 16 Aug 2007, Marc Schwartz wrote:

 The easiest way might be to modify the lapply() call as follows:

 d[] <- lapply(d, function(x) if (is.factor(x)) factor(sub(" +$", "", x)) else x)

 str(d)
 'data.frame':   60 obs. of  3 variables:
 $ x: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ y: num  7.01 8.33 5.48 6.51 5.61 ...
 $ f: Factor w/ 3 levels "lev1","lev2",..: 1 1 1 1 1 1 1 1 1 1 ...


 This way the coercion back to a factor takes place within the loop as
 needed.

 Note that I also meant to type sub() and not grep() below. The default
 behavior for both is to return a character vector (if 'value = TRUE' in
 grep()). There is not an argument to override that behavior.

I would have thought the thing to do was to apply sub() to the levels:

chfactor <- function(x) { levels(x) <- sub(" +$", "", levels(x)); x }

d[] <- lapply(d, function(x) if (is.factor(x)) chfactor(x) else x)

This has the advantage of not losing the order of the levels.  It will 
merge levels if they only differ in the number of trailing spaces, which 
is probably what you want.
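
A short sketch of both points, on a made-up factor whose levels are deliberately not in alphabetical order. chfactor() is repeated here only so the example is self-contained; the comparison with factor(sub(...)) shows the level order being lost when the factor is rebuilt from character values.

```r
chfactor <- function(x) { levels(x) <- sub(" +$", "", levels(x)); x }

# Hypothetical factor: "b " and "b" differ only by a trailing space.
f <- factor(c("b ", "a", "b", "c  "),
            levels = c("c  ", "b ", "b", "a"))

# Trimming the levels keeps their order and merges the two "b" levels.
g <- chfactor(f)

# Rebuilding from character values re-sorts the levels alphabetically.
h <- factor(sub(" +$", "", f))

print(levels(g))
print(levels(h))
```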


 HTH,

 Marc


 On Thu, 2007-08-16 at 19:19 +0300, Lauri Nikkinen wrote:
 Thanks Marc! What would be the easiest way to coerce char-variables
 back to factor-variables? Is there a way to prevent the coercion in
 d[] <- lapply(d, function(x) if ( is.factor(x)) sub(" +$", "", x) else
 x) ?



 -Lauri



 2007/8/16, Marc Schwartz [EMAIL PROTECTED]:
 On Thu, 2007-08-16 at 17:54 +0300, Lauri Nikkinen wrote:
 Hi folks,

 I would like to trim the trailing spaces in my factor
 variables using lapply
 (described in this post by Marc Schwartz:
 http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22826.html)
 but the code is
 not functioning (in this example there is only one factor
 with trailing
 spaces):

 Aye...as I noted in that post, it was untested...my error.

 The problem is that by using ifelse() as I did, the test for
 the column
 being a factor returns a single result, not one result per
 element.
 Hence, the appropriate conditional code is only performed on
 the first
 element in each column, rather than being vectorized on the
 entire
 column.

 y1 <- rnorm(20) + 6.8
 y2 <- rnorm(20) + (1:20* 1.7 + 1)
 y3 <- rnorm(20) + (1:20*6.7 + 3.7)
 y <- c(y1,y2,y3)
 x <- gl(5,12)
 f <- gl(3,20, labels=paste("lev", 1:3, " ", sep=""))
 d <- data.frame (x=x,y=y, f=f)
 str(d)

 d[] <- lapply(d, function(x) ifelse(is.factor(x), sub(" +$", "", x), x))
 str(d)
 str(d)

 How should I modify this?

 Try this instead:

 d[] <- lapply(d, function(x) if (is.factor(x)) sub(" +$", "",
 x) else x)

 str(d)
 'data.frame':   60 obs. of  3 variables:
 $ x: chr  "1" "1" "1" "1" ...
 $ y: num  6.70 4.42 8.03 4.90 6.98 ...
 $ f: chr  "lev1" "lev1" "lev1" "lev1" ...

 Note that by using grep(), the factors are coerced to
 character vectors
 as expected. You would need to coerce back to factors if you
 need them
 as such.

 HTH,

 Marc Schwartz

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Error in building R

2007-08-15 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Giovanni Petris wrote:


 Hello,

 I am upgrading to the current R 2.5.1 under Sun Solaris 8.

Actually, 2.5.1 is not current: '2.5.1 patched' aka R-patched is and this 
has already been addressed there.

  I call the configure script with the --without-readline flag, and it 
 works fine. Then, when I invoke make, I get this kind of error messages:


 make[2]: Entering directory `/usr/local/R/R-2.5.1-inst/src/library'
  Building/Updating help pages for package 'base'
 Formats: text html latex example
 Can't use an undefined value as filehandle reference at 
 /usr/local/R/R-2.5.1-inst/share/perl/R/Rdconv.pm line 78.
  Building/Updating help pages for package 'tools'
 Formats: text html latex example
 Can't use an undefined value as filehandle reference at 
 /usr/local/R/R-2.5.1-inst/share/perl/R/Rdconv.pm line 78.
  Building/Updating help pages for package 'utils'
 Formats: text html latex example
 Can't use an undefined value as filehandle reference at 
 /usr/local/R/R-2.5.1-inst/share/perl/R/Rdconv.pm line 78.


 (I don't know if this has to do with perl, but I have version 5.005_03)

It does.  My memory is that version of Perl predates Solaris 8 (it comes 
from the 1990s).  You need Perl >= 5.6.1, and I would suggest installing 
Perl 5.8.x (which is already 6 years old) as the next version of R will 
require it.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] getting lapply() to work for a new class

2007-08-15 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Pijus Virketis wrote:

 I would like to get lapply() to work in the natural way on a class I've
 defined.

What you have not said is that this is an S4 class.

 As far as I can tell, lapply() needs the class to be coercible
 to a list. Even after I define as.list() and as.vector(x, mode="list")
 methods, though, I still get an 'Error in as.vector(x, "list") : cannot
 coerce to vector'. What am I doing wrong?

Not considering namespaces.  Setting an S4 method for as.list() creates an 
object called as.list in your workspace, but the lapply function uses the 
as.list in the base namespace.  That's the whole point of namespaces: to 
protect code against redefining functions.

This works as documented for S3 methods (since as.list is S3 generic): it 
is a 'feature' of S4 methods that deserves to be much more widely 
understood.


 # dummy class
 setClass("test", representation(test="list"))

 # set up as.list()
 test.as.list <- function(x) x@test
 setMethod("as.list", signature(x="test"), test.as.list)

 # set up as.vector(x, mode="list")
 test.as.vector <- function(x, mode) x@test
 setMethod("as.vector", signature(x="test", mode="character"),
 test.as.vector)

 obj <- new("test", test=list(1, 2, 3))

 # this produces 'Error in as.vector(x, "list") : cannot coerce to
 # vector' on R 2.4.1
 lapply(obj, print)

 # these work
 lapply(as.list(obj), print)
 lapply(as.vector(obj, "list"), print)

 Thank you,

 Pijus

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] getting lapply() to work for a new class

2007-08-15 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Pijus Virketis wrote:

 Thank you.

 When I tried to set as.list() in baseenv(), I learned that its bindings
 are locked.

Of course.  Did you not see my comment about 'to protect code against 
redefining functions'?

 Does this mean that the thing to do is just to write my own
 lapply, which does the coercion using my private as.list(), and then
 invokes the base lapply()?

I believe 'the thing to do' is to call your as.list explicitly.  After 
all, the first 'l' in lapply means 'list', so it is 'natural' to call 
it on a list.
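
A minimal sketch of the explicit-coercion approach, reusing the class from the earlier message in this thread. The accessor name toList is hypothetical; any ordinary function that returns the list slot would do, and it sidesteps the question of which as.list the base lapply() sees.

```r
library(methods)

# Same dummy S4 class as in the original question.
setClass("test", representation(test = "list"))

# Hypothetical plain accessor returning the underlying list.
toList <- function(x) x@test

obj <- new("test", test = list(1, 2, 3))

# Coerce explicitly, then hand an ordinary list to lapply().
doubled <- lapply(toList(obj), function(v) v * 2)
print(doubled)
```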

And please do NOT edit other people's messages without indication: the R 
posting guide covers that and it is a copyright violation.


 -P

 -Original Message-

Not so: an EDITED version of my message.

 From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, August 15, 2007 5:18 PM

 As far as I can tell, lapply() needs the class to be coercible to a
 list. Even after I define as.list() and as.vector(x, mode="list")
 methods, though, I still get an 'Error in as.vector(x, "list") :
 cannot coerce to vector'. What am I doing wrong?

 Not considering namespaces.  Setting an S4 method for as.list() creates
 an object called as.list in your workspace, but the lapply function uses
 the as.list in the base namespace.  That's the whole point of
 namespaces: to protect code against redefining functions.



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] invert 160000x160000 matrix

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Patnaik, Tirthankar  wrote:

 A variety of tricks would need to be used to invert a matrix of this 
 size. If there are any other properties of the matrix that you know 
 (symmetric, positive definite, etc, sparse) then they could be useful 
 too. You could partition the matrix first, then use an in-place inverse 
 technique for each block to individually calculate the blocks-inverses, 
 then combine to get the inverse of the initial matrix. Again, if the 
 implementation is actually solving an Ax-B = 0 system of equations, then 
 there are specific methods for these too, like an LU decomp, for 
 instance. You might also want to check out some texts for this, like the 
 Numerical Recipes.

 How's the matrix stored right now?

Well, not in R as a matrix: see ?Memory-limits.  It is about 12x larger 
than the largest possible matrix in R.
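
The arithmetic behind that figure, as a sketch. It assumes the limit in question is the 2^31 - 1 element cap on vector length that applied to R at the time (available as .Machine$integer.max), and 8 bytes per double for the memory estimate.

```r
n <- 160000

# Number of elements in an n x n matrix.
n_elements <- n^2

# Assumed cap on the number of elements in a single R vector (2^31 - 1).
max_elements <- .Machine$integer.max

# How many times too large the matrix is.
ratio <- n_elements / max_elements

# Storage needed for doubles, in GiB, ignoring any working copies.
gib <- n_elements * 8 / 2^30

print(c(ratio = ratio, GiB = gib))
```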


 Best,
 -Tir

 Tirthankar Patnaik
 India Strategy
 Citigroup Investment Research
 +91-22-6631 9887

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Moshe Olshansky
 Sent: Tuesday, August 14, 2007 6:40 AM
 To: Paul Gilbert; Jiao Yang
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] invert 160000x160000 matrix

 While inverting the matrix may be a problem, if you need to
 solve an equation A*x = b you do not need to invert A; there
 exist iterative methods which do not need A or inv(A) - all you
 need to provide is a function that computes A*x for an
 arbitrary vector x.
 For such a large matrix this may be slow but possible.
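
As an illustration of a matrix-free iterative solver, here is a sketch of the conjugate gradient method. It is one possible choice, not necessarily the method the message had in mind, and it assumes A is symmetric positive definite, which the thread does not state for the matrix in question. The function name cg_solve, the tolerance, and the tridiagonal test operator are all made up for the example; only a function computing A*x is needed, never A itself.

```r
# Conjugate gradient for A x = b, given only matvec(v) = A %*% v.
# Assumes A is symmetric positive definite.
cg_solve <- function(matvec, b, tol = 1e-8, maxit = 1000) {
  x <- numeric(length(b))      # start from the zero vector
  r <- b - matvec(x)           # initial residual
  p <- r                       # initial search direction
  rs <- sum(r * r)
  for (i in seq_len(maxit)) {
    Ap <- matvec(p)
    alpha <- rs / sum(p * Ap)  # step length along p
    x <- x + alpha * p
    r <- r - alpha * Ap        # updated residual
    rs_new <- sum(r * r)
    if (sqrt(rs_new) < tol) break
    p <- r + (rs_new / rs) * p # next conjugate direction
    rs <- rs_new
  }
  x
}

# Hypothetical operator: tridiagonal matrix with 2 on the diagonal and
# -1 on the off-diagonals, applied without ever storing the matrix.
matvec <- function(v) {
  n <- length(v)
  2 * v - c(v[-1], 0) - c(0, v[-n])
}

b <- rep(1, 50)
x <- cg_solve(matvec, b)
print(max(abs(matvec(x) - b)))
```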

 --- Paul Gilbert [EMAIL PROTECTED]
 wrote:

 I don't think you can define a matrix this large in R, even if you
 have the memory. Then, of course, inverting it there may be other
 programs that have limitations.

 Paul

 Jiao Yang wrote:

 Can R invert a 160000x160000 matrix with all
 positive numbers?  Thanks a lot!



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] cov.unscaled in gls object

2007-08-14 Thread Prof Brian Ripley
This is what the vcov() generic is for.  You are asking for internal 
details from a different class (summary.lm).

On Tue, 14 Aug 2007, Sven Garbade wrote:

 Hi list,

 can I extract the cov.unscaled (the unscaled covariance matrix) from a
 gls fit (package nlme), like with summary.lm? Background: In a fixed
 effect meta analysis regression the standard errors of the coefficients
 can be computed as sqrt(diag(cov.unscaled)) where cov.unscaled is
 (X'WX)^-1. I try to do this with a gls-fit.

I don't think so: the 'unscaled' is a clue.  The vcov method is

 stats:::vcov.lm
function (object, ...)
{
     so <- summary.lm(object, corr = FALSE)
 so$sigma^2 * so$cov.unscaled
}
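
A sketch of that relationship for an ordinary lm fit, using the built-in cars data purely as an example. It shows vcov() returning the scaled matrix, and the usual standard errors falling out as the square roots of its diagonal; for other model classes the suggestion above is to call vcov() on the fitted object rather than reach into summary internals.

```r
# Example fit on a built-in dataset (chosen only for illustration).
fit <- lm(dist ~ speed, data = cars)
s <- summary(fit)

# vcov() is the scaled covariance matrix of the coefficients ...
V <- vcov(fit)

# ... which for lm equals sigma^2 times the unscaled matrix.
V_manual <- s$sigma^2 * s$cov.unscaled

# Standard errors of the coefficients.
se <- sqrt(diag(V))
print(se)
```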

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Import of Access data via RODBC changes column name (NO to Expr1014) and the content of the column

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:


 Dear all,

 I have some problems with importing data from an Access data base via
 RODBC to R. The data base contains several tables, which all are
 imported consecutively. One table has a column with column name NO. If
 I run the code attached on the bottom of the mail I get no complain, but
 the column name (name of the respective vector of the data.frame) is
 Expr1014 instead of NO. Additionally the original column (type
 text) contains 0s and missings, but the imported column contains
 0s only (type int). If I change the column name in the Access data
 base to NOx, the import works fine with the right name and the same
 data.

 Previously I generated a tiny Access data base which reproduced the
 problem. To be on the safe side I installed the latest version (2.5.1)
 and now the example works fine, but within my production process the
 error still remains. An import into excel via ODBC works fine.

 So there is no way to figure it out whether this is a bug or a
 feature.-)

It's most likely an ODBC issue, but you have not provided a reproducible 
example.

 The second problem I have is that when I rerun rm(list = ls(all = T));
 gc() and the import several times I get the following error:

 Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
 memory
 In addition: Warning messages:
 1: Reached total allocation of 447Mb: see help(memory.size) in:
 odbcQuery(channel, query, rows_at_time)
 2: Reached total allocation of 447Mb: see help(memory.size) in:
 odbcQuery(channel, query, rows_at_time)
 3: Reached total allocation of 447Mb: see help(memory.size) in:
 odbcTables(channel)
 4: Reached total allocation of 447Mb: see help(memory.size) in:
 odbcTables(channel)

 which is surprising to me, as the first two statements should delete all

How do you _know_ what they 'should' do?  That only deletes all objects in 
the workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?Memory-limits for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, 
buy some more.


 objects and recover the memory. Is this only a matter of memory? Is
 there any logging that reduces the memory? Or is this issue connected to
 the upper problem?

 I added the code on the bottom - maybe there is some kind of misuse I
 lost sight of. Any hints are appreciated.

 Kind regards,
 Maciej

 version
   _
 platform   i386-pc-mingw32
 arch   i386
 os mingw32
 system i386, mingw32
 status
 major  2
 minor  5.1
 year   2007
 month  06
 day27
 svn rev42083
 language   R
 version.string R version 2.5.1 (2007-06-27)


 ## code

 get.table <- function(name, db, drop = NULL){
  .con <- try(odbcConnectAccess(db), silent = T)
  if(!inherits(.con, "RODBC")) return(.con)
  ## exclude memo columns
  .t <- try(sqlColumns(.con, name))
  if(inherits(.t, "try-error")){close(.con); return(.t)}
  .t <- .t[.t$COLUMN_SIZE < 255, "COLUMN_NAME"]
  .t <- paste(.t, collapse = ",")
  ## get table
  .t <- paste("select", .t, "from", name)
  .d <- try(sqlQuery(.con, .t), silent = T)
  if(inherits(.d, "try-error")){close(.con); return(.d)}
  .con <- try(close(.con), silent = T)
  if(inherits(.con, "try-error")) return(.con)
  .d <- .d[!names(.d) %in% drop]
  return(.d)
 }

 get.alltables <- function(db){
  .con <- try(odbcConnectAccess(db), silent = T)
  if(!inherits(.con, "RODBC")) return(.con)
  .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
  if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
  .con <- try(close(.con), silent = T)
  if(inherits(.con, "try-error")) return(.con)
  .tbls <- .tbls[-grep("^MSys", .tbls)]
  .d <- lapply(seq(along = .tbls), function(.i){
    .d <- try(get.table(.tbls[.i], db = db))
    return(invisible(.d))
  })
  names(.d) <- .tbls
  .ok <- !sapply(.d, inherits, "try-error")
  return(list(notdone = .d[!.ok], data = .d[.ok]))
 }

 library(RODBC)

 alldata <- get.alltables(db = "./myaccessdb.MDB")

 ## code end



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] graph dimensions default

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Simon Pickett wrote:

 Hi,

 I would like to (if possible) set the default width and height for graphs
 at the start of each session and have each new graphic device overwrite
 the previous one.

Hmm.  It is graphics devices that have dimensions, and plots that 
overwrite other plots on a device, so your intentions are pretty unclear. 
(If you resize a device window the plot dimensions change so they are not 
intrinsic to the plot.)

If you want the default behaviour to be like normal but with, say, a wider 
onscreen device window you can have (on Windows, which you didn't say)

mywindows <- function(...) windows(width=10, height=6, ...)
options(device="mywindows")

in your ~/.Rprofile .  Otherwise, please try again to tell us what you 
do want.



 I only know how to do this using windows(width=,height=...) which opens up
 a new plotting device every time, so I end up with lots of graphs all over
 the place until I get the one I want!

 Thanks in advance,

 Simon


 Simon Pickett
 PhD student
 Centre For Ecology and Conservation
 Tremough Campus
 University of Exeter in Cornwall
 TR109EZ
 Tel 01326371852

 http://www.uec.ac.uk/biology/research/phd-students/simon_pickett.shtml



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Linear Regression with slope equals 0

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, [EMAIL PROTECTED] wrote:


 Hi there, am trying to run a linear regression with a slope of 0.

 I have a dataset as follows

 t d
 1 303
 2 302
 3 304
 4 306
 5 307
 6 303

 I would like to test the significance that these points would lie on a
 horizontal straight line.

 The standard regression lm(d~t) doesn't seem to allow the slope to be set.

lm(d ~ 1) does, though, to zero.

More generally you can use offset(), e.g. lm(d ~ offset(7*t)) forces a 
slope of 7.
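
A sketch of both suggestions on the six data points from the question. The comparison of the intercept-only model against the fitted-slope model via anova() is one way to test for a zero slope and is an addition here, not part of the reply; for a single slope it gives the same p-value as the t-test in summary().

```r
# Data from the original question.
dat <- data.frame(t = 1:6, d = c(303, 302, 304, 306, 307, 303))

# Slope fixed at zero: intercept-only model.
fit0 <- lm(d ~ 1, data = dat)

# Slope estimated freely.
fit1 <- lm(d ~ t, data = dat)

# F-test of the zero-slope model against the free-slope model.
cmp <- anova(fit0, fit1)
print(cmp)

# Slope fixed at 7 via an offset; only the intercept is estimated.
fit7 <- lm(d ~ offset(7 * t), data = dat)
print(coef(fit7))
```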



Re: [R] graph dimensions default

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Simon Pickett wrote:

 Yes,

 Thankyou, that does the trick nicely. I thought that kind of thing could
 be specified using par() but I guess not.

As I said, size is not a property of the plot.
And par() applies to the current device, not future ones.


 Thanks again.



 On Tue, 14 Aug 2007, Simon Pickett wrote:

 Hi,

 I would like to (if possible) set the default width and height for
 graphs
 at the start of each session and have each new graphic device overwrite
 the previous one.

 Hmm.  It is graphics devices that have dimensions, and plots that
 overwrite other plots on a device, so your intentions are pretty unclear.
 (If you resize a device window the plot dimensions change so they are not
 intrinsic to the plot.)

 If you want the default behaviour to be like normal but with, say, a wider
 onscreen device window you can have (on Windows, which you didn't say)

 mywindows <- function(...) windows(width=10, height=6, ...)
 options(device=mywindows)

 in your ~/.Rprofile .  Otherwise, please try again to tell us what you
 do want.



 I only know how to do this using windows(width=,height=...) which opens
 up
 a new plotting device every time, so I end up with lots of graphs all
 over
 the place until I get the one I want!

 Thanks in advance,

 Simon




Re: [R] {grid} plain units with non NULL data arguments

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Wolfram Fischer wrote:

 In help(unit) I read:

 The 'data' argument must be a list when the 'unit.length()'
 is greater than 1.  For example, 'unit(rep(1, 3), c("npc",
 "strwidth", "inches"), data=list(NULL, "my string", NULL))'.

 In the newest R-versions it is not anymore allowed to let strings
 in the data-argument for plain units, otherwise one gets the
 following error:
Non-NULL value supplied for plain unit

 I have some labels. Between them I wanted to set a distance of 1.5 lines.
 (I wanted to use that for a grid.layout for a legend:
 The space is for the symbols.)

labels <- c( ':', 'a', 'bb', 'ccc', '', 'e' )
n <- length( labels )
s <- as.list( c( labels[1], rep( labels[-1], each=2 ) ) )
u <- unit( data=s, x=c( 1, rep( c( 1.5, 1 ), n-1 ) ),
    units=c( 'strwidth', rep( c( 'lines', 'strwidth' ), n-1 ) ) )

 How can I insert the NULL values into the list ``s''?

 To fill every second element of s with NULL, I tried:
s[ 2 * ( 1 : length( labels[-1] ) ) ] <- NULL
 But this deletes every second element.

A value of list(NULL) is correct for inserting NULLs into lists.
(More generally to substitute in a list you need a list value.)

 The following would work:
s[ 2 * ( 1 : length( labels[-1] ) ) ] <- NA
 But unit() does not accept NAs.

More to the point, it does not accept logical vectors as NULL values.
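
A small sketch of the difference between assigning NULL and assigning list(NULL) to list elements (the list contents and object names here are made up for illustration):

```r
s <- as.list(c(":", "a", "a", "bb", "bb"))

# Assigning NULL deletes the selected elements ...
s_deleted <- s
s_deleted[c(2, 4)] <- NULL
length(s_deleted)          # 3

# ... whereas assigning list(NULL) keeps them and sets each to NULL
s_null <- s
s_null[c(2, 4)] <- list(NULL)
length(s_null)             # 5
is.null(s_null[[2]])       # TRUE
```

The same indexing expression as in the question should work with list(NULL) on the right-hand side.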



Re: [R] glm(family=binomial) and lmer

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Chris O'Brien wrote:

 Dear R users,

 I've notice that there are two ways to conduct a binomial GLM with binomial
 counts using R.  The first way is outlined by Michael Crawley in his
 Statistical Computing book (p 520-521):

and in the places he got it from (it is not his original work).

These are not the only two ways, and they are not the same analyses as the 
saturated models differ.  The usual way to use weights is

y <- dead/batch
model3 <- glm(y ~ log(dose), binomial, weights=batch)
summary(model3)

and internally glm converts models with a two-column response to this 
form, for it is in this form the binomial fits into the GLM framework.

See the White Book or MASS (even the 1994 edition).
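
To illustrate the point about the two-column response and the weighted-proportion form, here is a sketch using the data from the question (the object names `fit_cbind` and `fit_prop` are mine). Under the assumption that glm() treats the two specifications identically, the coefficients and the deviance should agree.

```r
dose  <- c(1, 3, 10, 30, 100)
dead  <- c(2, 10, 40, 96, 98)
batch <- c(100, 90, 98, 100, 100)

# Two-column response: (successes, failures)
fit_cbind <- glm(cbind(dead, batch - dead) ~ log(dose), family = binomial)

# Proportion response with the batch sizes as prior weights
y <- dead / batch
fit_prop <- glm(y ~ log(dose), family = binomial, weights = batch)

all.equal(coef(fit_cbind), coef(fit_prop))
all.equal(deviance(fit_cbind), deviance(fit_prop))
```

Both fits have five observations (one per dose group), unlike the ten-row 0/1 formulation quoted below.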


 dose=c(1,3,10,30,100)
 dead = c(2,10,40,96,98)
 batch=c(100,90,98,100,100)
 response = cbind(dead,batch-dead)
 model1=glm(response~log(dose),binomial)
 summary(model1)

 Which returns (in part):
 Coefficients:
              Estimate Std. Error z value Pr(>|z|)
 (Intercept)   -4.5318     0.4381  -10.35   <2e-16 ***
 log(dose)      1.9644     0.1750   11.22   <2e-16 ***
 Null deviance: 408.353  on 4  degrees of freedom
 Residual deviance:  10.828  on 3  degrees of freedom
 AIC: 32.287

 Another way to do the same analysis is to reformulate the data, and use GLM
 with weights:

 y1=c(rep(0,5),rep(1,5))
 dose1=rep(dose,2)
 number = c(batch-dead,dead)
 data1=as.data.frame(cbind (y1,dose,number))
 model2=glm(y1~log(dose1),binomial,weights=number,data=data1)
 summary(model2)

 Which returns:

 Coefficients:
              Estimate Std. Error z value Pr(>|z|)
 (Intercept)   -4.5318     0.4381  -10.35   <2e-16 ***
 log(dose1)     1.9644     0.1750   11.22   <2e-16 ***
 (Dispersion parameter for binomial family taken to be 1)
 Null deviance: 676.48  on 9  degrees of freedom
 Residual deviance: 278.95  on 8  degrees of freedom
 AIC: 282.95

 Number of Fisher Scoring iterations: 6

 These two methods are similar in the parameter estimates and standard
 errors, however the deviances, their d.f., and AIC differ.  I take the
 first method to be the correct one.

This form has ten observations of groups with weights 2,98,10,80

 However, I'm really interested in conducting a GLM binomial mixed model,
 and I am unable to figure out how to use the first method with the lmer
 function from the lme4 library, e.g.

 model3=lmer(y~log(dose)+time|ID)# the above example data doesn't have
 the random effect, but my own data set does.

   Does anyone have any suggestions?

 thanks,
 chris

 Thanks,
 Chris O'Brien



Re: [R] Import of Access data via RODBC changes column name (NO to Expr1014) and the content of the column

2007-08-14 Thread Prof Brian Ripley

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:


Dear Professor Ripley,

Thank you very much for your response. I sent the problem, as I didn't 
have any more ideas where to search for the reason. I didn't say this is 
an R bug, knowing the responses on such mails.-)


But I succeeded in developing a tiny example, that reproduces the bug 
(wherever it is).


Thank you, that was helpful: much easier to follow than the previous code.

...


library(RODBC)
.con <- odbcConnectAccess("./test2.mdb")
(.d <- try(sqlQuery(.con, "select * from Tab1")))

 F1 NO F2
1  1  1  1
2  2  2  2
3  0 NA  1
4  1  0  0

(.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))

  F1 Expr1001 F2
1  1        0  1
2  2        0  2
3  0        0  1
4  1        0  0

close(.con)


So the problem occurs if the column names are specified within the query.
Is the query select F1 , NO , F2 from Tab1 invalid?


I believe so. 'NO' is an SQL92 and ODBC reserved word, at least according 
to http://www.bairdgroup.com/reservedwords.cfm


See also http://support.microsoft.com/default.aspx?scid=kb;en-us;286335
which says

  For existing objects with names that contain reserved words, you can
  avoid errors by surrounding the object name with brackets ([ ]).

and lists 'NO' as a reserved word.  RODBC quotes all column names it uses 
to be sure (and knows about most non-standard quoting mechanisms from the 
ODBC driver in use).  But this was a query you generated and so you need 
to do the quoting.
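
As a sketch of doing that quoting by hand, the helper below (a hypothetical function, not part of RODBC) wraps each column name in brackets, the style the Microsoft article describes, before the query string is built:

```r
# Hypothetical helper: wrap identifiers in brackets so that reserved
# words such as NO are read as column names
bracket_quote <- function(names) paste0("[", names, "]")

cols  <- c("F1", "NO", "F2")
query <- paste("select", paste(bracket_quote(cols), collapse = ", "), "from Tab1")
query
# With an open RODBC channel this string would be passed as sqlQuery(.con, query)
```

Whether brackets are the right quoting characters depends on the ODBC driver in use; they are assumed here because the database is Access.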


Regarding the memory issue, I _knew_ that there must be a reason for the 
running out of memory space. Sorry for not being more specific. My 
question then is:


Is there a way to 'reset' the environment without quitting R and 
restarting it?


Sorry, no.  You cannot move objects in memory.

But why '447Mb' is coming up is still unexplained, and suggests that the 
machine has a peculiar amount of memory or some flag has been used.





Thank you for your help.

Kind regards,
Maciej


-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 14 August 2007 11:51
To: Maciej Hoffman-Wecker
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Import of Access data via RODBC changes column name (NO to 
Expr1014) and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:



Dear all,

I have some problems with importing data from an Access data base via
RODBC to R. The data base contains several tables, which all are
imported consecutively. One table has a column with column name NO.
If I run the code attached at the bottom of the mail I get no
complaint, but the column name (name of the respective vector of the
data.frame) is Expr1014 instead of NO. Additionally the original
column (type text) contains 0s and missings, but the imported column
contains 0s only (type int). If I change the column name in the Access data
base to NOx, the import works fine with the right name and the same
data.

Previously I generated a tiny Access data base which reproduced the
problem. To be on the safe side I installed the latest version (2.5.1)
and now the example works fine, but within my production process the
error still remains. An import into excel via ODBC works fine.

So there is no way to figure it out whether this is a bug or a
feature.-)


It's most likely an ODBC issue, but you have not provided a reproducible 
example.


The second problem I have is that when I rerun rm(list = ls(all =
T)); gc() and the import several times I get the following error:

Error in odbcTables(channel) : Calloc could not allocate (263168 of 1) memory
In addition: Warning messages:
1: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time)
2: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time)
3: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel)
4: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel)

which is surprising to me, as the first two statements should delete
all


How do you _know_ what they 'should' do?  That only deletes all objects in the 
workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?Memory-limits for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, buy 
some more.



objects and recover the memory. Is this only a matter of memory? Is
there any logging that reduces the memory? Or is this issue connected to
the upper problem?

I added the code on the bottom - maybe there is some kind of misuse I
lost sight of. Any hints are appreciated.

Kind regards,
Maciej


version

  _
platform   i386-pc-mingw32
arch   i386
os mingw32
system i386, mingw32
status
major  2
minor  5.1
year   2007
month  06
day27
svn rev42083
language   R
version.string R version 2.5.1 (2007-06-27)


## code

Re: [R] weights in GAMs (package mgcv)

2007-08-14 Thread Prof Brian Ripley
Let's simplify to a linear model.  If your covariates have uncertainties, 
most likely a linear regression is not appropriate.  This sounds like an 
'errors in measurements' model, as covered in

@Book{Fuller.87,
   author    = "Fuller, Wayne A.",
   title     = "Measurement Error Models",
   publisher = "John Wiley and Sons",
   address   = "New York",
   year      = 1987,
   ISBN      = "0-471-86187-1",
}

in which there is a true covariate that enters the model, but it is only 
observed with measurement error (or similar scenarios).

This is hard enough for linear models, without thinking about non-normal 
models or extensions beyond linear predictors.  The GLM (including GAM) 
estimation process assumes various things, including that the covariates 
that enter into the model are fixed (possibly by conditioning on them) and 
known.
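
A small simulation may make the problem concrete. It is only a sketch under made-up assumptions (a true slope of 2, standard normal covariate, measurement error with standard deviation 1; all names are illustrative): regressing on the error-prone covariate pulls the estimated slope towards zero, which no choice of case weights would correct.

```r
set.seed(1)
n      <- 5000
x_true <- rnorm(n)                       # true covariate (unobserved in practice)
y      <- 1 + 2 * x_true + rnorm(n, sd = 0.5)
x_obs  <- x_true + rnorm(n, sd = 1)      # covariate observed with error

slope_true <- coef(lm(y ~ x_true))["x_true"]   # close to the true slope of 2
slope_obs  <- coef(lm(y ~ x_obs))["x_obs"]     # attenuated, roughly half as large here
c(slope_true, slope_obs)
```

With equal variances for the covariate and its measurement error, the expected attenuation factor in this setup is one half.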

On Tue, 14 Aug 2007, Julian Burgos wrote:

 Dear list,

 I'm using the 'mgcv' package to fit some GAMs. Some of my covariates are
 derived quantities and have an associated standard error, so I would
 like to incorporate this uncertainty into the GAM estimation process.
 Ideally, during the estimation process less importance would be given to
 observations whose covariates have high standard errors.

 The gam() function in the 'mgcv' package has a 'weights' argument.
 According to the package documentation, this can be used to provide
 prior weights to the data. This argument (as far as I understand) takes
 a vector of the same length of the data with numeric values higher than
 zero. So it seems that I should combine the standard errors of all
 covariates into a single vector and use it as weights. But it is not
 obvious to me how to do this, given that the covariates have different
 units and ranges of values.

Actually this is just taken from glm(), and case weights are part of the 
definition of a GLM.  In so far as I understand your scenario, you do not 
have a GLM.

 Is there any way to provide weights to the covariates directly (for
 example providing a matrix of n x m values, where n=number of covariates
 and m=number of observations)?

 Thanks,

 Julian

 Julian M. Burgos

 Fisheries Acoustics Research Lab
 School of Aquatic and Fishery Science
 University of Washington

 1122 NE Boat Street
 Seattle, WA  98105

 Phone: 206-221-6864



Re: [R] Mann-Whitney U

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Natalie O'Toole wrote:

 Hi,

 Could someone please tell me how to perform a Mann-Whitney U test on a
 dataset with 2 groups where one group has more data values than another?

 I have split up my 2 groups into 2 columns in my .txt file i'm using with
 R. Here is the code i have so far...

 group1 <- c(LeafArea2)
 group2 <- c(LeafArea1)
 wilcox.test(group1, group2)

 This code works for datasets with the same number of data values in each
 column, but not when there is a different number of data values in one
 column than another column of data.

There is an example of that scenario on the help page for wilcox.test, so 
it does 'work'.  What exactly went wrong for you?

 Is the solution that i have to have a null value in the data column with
 the fewer data values?

 I'm testing for significant differences between the 2 groups, and the
 result i'm getting in R with the uneven values is different from what i'm
 getting in SPSS.

We need a worked example.  As the help page says, definitions do differ. 
If you can provide a reproducible example in R and the output from SPSS we 
may be able to tell you how to relate that to what you see in R.
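
For reference, a minimal sketch with made-up leaf areas (the numbers are invented for illustration) showing that wilcox.test() takes two vectors of different lengths without any padding:

```r
# Made-up leaf areas; the two groups deliberately differ in size
group1 <- c(12.1, 14.3, 11.8, 15.2, 13.9, 16.4, 12.7)
group2 <- c(10.2, 11.5, 9.8, 12.9)

res <- wilcox.test(group1, group2)
res$statistic   # W: number of (group1, group2) pairs with group1 > group2
res$p.value
```

Other programs may report a differently defined statistic for the same test, so it is the p-values rather than the statistics that are most directly comparable.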

[...]

 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

As it says, we really need such code (and the output you get) to be able 
to help you.



Re: [R] installation of packages

2007-08-14 Thread Prof Brian Ripley
Please see the discussion in the rw-FAQ.

On Wed, 15 Aug 2007, [EMAIL PROTECTED] wrote:

 Dear All,

 Have just installed v2.5.1 on Windows XP. Works fine but I had quite a few
 packages loaded for 2.5.0 (from contributed) and was wondering how I can
 get 2.5.1 to recognise them without having to reinstall them all.

 Is this possible or do I have to reinstall all the packages again?

 I required 2.5.1 for lme4 and matrix.

 Many thanks in advance.

 

 Regards

 Robin Dobos,
 Livestock Research Officer (Livestock Production Systems),
 Beef Industry Centre of Excellence,
 NSW Department of Primary Industries,
 Armidale, NSW, Australia, 2351

 ph:  +61 2 6770 1824
 fax:  +61 2 6770 1830
 mobile: 0431 391 885
 email: [EMAIL PROTECTED]

 If we knew what it was we were doing,
 it would not be called research, would it?

 Albert Einstein




Re: [R] A clean way to initialize class slot of type numeric vector

2007-08-13 Thread Prof Brian Ripley
Well, c() is NULL, so R did as you asked it to.  See ?integer: an integer 
vector of length 0 can be gotten by integer(0) (and other ways).

If you want integers, why have a slot which is numeric?

> setClass("foo", representation(members="integer"))
[1] "foo"
> new("foo")
An object of class "foo"
Slot "members":
integer(0)

is the natural and simpler way to do this.
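
A sketch of both options (the class names `fooInt` and `fooNum` are illustrative; only the slot name `members` comes from the question): declare the slot as "integer" and rely on the default prototype, or keep "numeric" and supply a zero-length numeric vector instead of c().

```r
library(methods)

# Option 1: integer slot, default prototype is a zero-length integer vector
setClass("fooInt", representation(members = "integer"))
obj <- new("fooInt")
obj@members                       # integer(0)

# Option 2: numeric slot with an explicit zero-length prototype
setClass("fooNum", representation(members = "numeric"),
         prototype(members = numeric(0)))
length(new("fooNum")@members)     # 0
```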


On Mon, 13 Aug 2007, [EMAIL PROTECTED] wrote:

 Hi,

 I have a class definition like this:

 setClass("foo", representation(members="numeric"),
   prototype(members=c()))

 I intend my class to have members, a slot whose value should be a vector 
 of integer. When I initialize this class, I don't have any member yet. 
 So my member is blank. But if I run the above definition into R, it will 
 complain that my slot members is assigned to NULL which does not extend 
 class numeric. So how can I fix this? Is there any clean way to do 
 this? This is quite a common situation but I can't seem to find a way 
 out. Any help would be really appreciated. Thank you.

 - adschai



Re: [R] question regarding is.factor()

2007-08-13 Thread Prof Brian Ripley
typeof() for 'types'.

However, factor is not a type but a class, so class() is probably what 
you want.
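
As a quick sketch on a made-up data frame (column names are illustrative), sapply() applies either function to every column at once:

```r
df <- data.frame(a = 1:3,
                 b = c(1.5, 2.5, 3.5),
                 c = factor(c("x", "y", "x")))

col_classes <- sapply(df, class)    # what the column "is": integer, numeric, factor
col_types   <- sapply(df, typeof)   # how it is stored: integer, double, integer
col_classes
col_types
```

The factor column shows the difference: its class is "factor" but its type is "integer".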

On Mon, 13 Aug 2007, Jabez Wilson wrote:

 Dear all, please help with what must be a straightforward question which 
 I can't answer.

But 'An Introduction to R' could.

  I add a column of my dataframe as factor of an existing column e.g.

  df[,5] <- factor(df[,2])

  and can test that it is by is.factor()

  but if I did not know in advance what types the columns were, is 
 there a function to tell me what they are.

  i.e. instead of is.factor(), is.matrix(), is.list(), a function more 
 like what.is()




Re: [R] Convert factor to numeric vector of labels

2007-08-12 Thread Prof Brian Ripley

See the FAQ Q7.10 (and please study the posting guide)
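
For readers without the FAQ to hand, a short sketch of the conversion it describes, using a made-up factor:

```r
f <- factor(c("0.71", "1.34", "2.61", "1.34"))

codes    <- as.numeric(f)                 # internal codes: 1 2 3 2
via_char <- as.numeric(as.character(f))   # the labels as numbers
via_lev  <- as.numeric(levels(f))[f]      # same result, converting each level once
via_char
```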

On Sun, 12 Aug 2007, Falk Lieder wrote:


Hi,

I have imported a data file to R. Unfortunately R has interpreted some
numeric variables as factors. Therefore I want to reconvert these to numeric
vectors whose values are the factor levels' labels. I tried
as.numeric(factor),
but it returns a vector of factor levels (i.e. 1,2,3,...) instead of labels
(i.e. 0.71, 1.34, 2.61,…).
What can I do instead?

Best wishes, Falk



Re: [R] R-excel

2007-08-11 Thread Prof Brian Ripley

On Fri, 10 Aug 2007, Peter Wickham wrote:


I am running R 2.5.1 using Mac OSX 10.4.10. xlsReadWrite is a Windows
binary. Instead, install and load packages: (1) gtools:(2) gdata. These
are both Windows and Mac binaries. gdata depends on gtools, so be sure
to load gtools first or set the installation depends parameters. Then you


The R default *is* to install dependencies in R >= 2.5.0.


can use read.xls.  Thus, in Mac:
data<-read.xls("/Users/your name/Documents/data.xls",sheet=1).
For Windows, substitute the appropriate
filepath and file name in the first argument of read.xls: e.g.,
data<-read.xls("A:/filename.xls",sheet-1). Thanks to correspondents for


You mean sheet=1 

There are other platforms, and the usage of gdata::read.xls is common to 
all platforms.



their advice; but I hope that this may alleviate some of the frustration
(referred to in the R Import/Export Manual) associated with dealing with


That is described in the 'R Data Import/Export Manual' (sic).

It *increases* the frustration of those who WTFM to see it and its 
contents misdescribed in this way.  Further, people who search the list 
archives are liable to make use of buggy posts like this one, so it seems 
necessary to put the corrections and frustration on the record.


Please just point people to the appropriate manual



EXCEL files in R.

Erika Frigo wrote:



Good morning to everybody,
I have a problem : how can I import excel files in R???

thank you very much


Dr.sa. Erika Frigo
Università degli Studi di Milano
Facoltà di Medicina Veterinaria
Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza
Alimentare (VSA)

Via Grasselli, 7
20137 Milano
Tel. 02/50318515
Fax 02/50318501


Re: [R] Connecting to database on statup

2007-08-11 Thread Prof Brian Ripley
On Sat, 11 Aug 2007, Ruddy M wrote:

 Hello,
 Q/ Is it possible to create a DBMS connection automatically on startup of R? 
 (Making sure of course that the db server has been started...)
 I am running MySQL on Mac OS X 10.4.2 with R2.4.1.

 I have tried to write a function using the RMySQL commands (below) and place 
 them in .First of .RProfile:

 drv <- dbDriver("MySQL")
 dbcon <- dbConnect(drv, {other parameters present in my.cnf file} 
 dbname="mydbName")

 DOES create a connection when entered into my R console individually but NOT 
 when I place them in a function, i.e.,

 condb <- function() {
   drv <- dbDriver("MySQL")
   dbcon <- dbConnect(drv, dbname="mydbName")
   dbGetInfo(db)
   }

 When the function is called, the dbGetInfo(dbcon) does return connection 
 info but no connection object is present.

What do you think the return value of this function is?

You need to return dbcon, not the value of dbGetInfo(some argument other 
than db).  Perhaps you meant to print the latter?: if so you need an 
explicit print() statement.
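
The shape of such a function can be sketched without a database. Here `fake_connect()` is a hypothetical stand-in for the dbDriver()/dbConnect() calls, used only so the example runs anywhere; the point is the explicit print() and the connection as the last expression.

```r
# Hypothetical stand-in for dbConnect(); a real version would use RMySQL
fake_connect <- function(dbname) {
  structure(list(dbname = dbname), class = "fakeConnection")
}

condb <- function() {
  dbcon <- fake_connect("mydbName")
  print(unclass(dbcon))   # explicit print(): nothing auto-prints inside a function
  dbcon                   # the last expression is the function's return value
}

dbcon <- condb()          # assign the result so the connection object persists
```

In a startup file the assignment would have to be made where the object is wanted, since variables created inside the function vanish when it returns.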




Re: [R] Cleaning up the memory

2007-08-10 Thread Prof Brian Ripley

On Fri, 10 Aug 2007, Monica Pisica wrote:



Thanks! I will look into ...

I have 4 GB RAM, and i was monitoring the memory with Windows task 
manager so i was looking how R gets more and more memory allocation 
from less than 100Mb to  1500Mb .


Then you are almost certainly fragmenting the address space.

We still don't know your OS and whether you have enabled the /3GB switch 
(if relevant to that version of Windows).   Most versions of Windows have 
a 2Gb address space, but some can be as high as 4Gb (Vista 64 which I use 
is one: the details are in the rw-FAQ for the latest versions of R, e.g. 
R-patched and R-devel).  That factor of 2 can make a big difference.


My initial tables are between 30 to 80 Mb and the resulting tables that 
incorporate the initial tables plus PCA and kmeans results are inbetween 
50 to 200MB or thereabouts!


And yes, i don't really care about memory allocation in detail - what i 
want is to free that memory after every cycle ;-)


Although, after i didn't do anything in R and it was idle for more than 
30 min. the memory allocation according to Task manager dropped to 15 Mb 
. which is good - but i cannot wait inbetween cycles half an hour 
though .


Calling gc() will reduce the memory allocation, but that is not the point.
You can have 15Mb allocated and still not a 50Mb hole in the address 
space (although that would be extremely unlucky, not having several 200Mb 
holes is quite likely).
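
One pattern that at least keeps the large objects short-lived is to do each cycle inside a function and keep only a small result. This is a sketch with simulated data; `process_table()` is a hypothetical stand-in for reading and analysing one table, and it does not cure address-space fragmentation.

```r
set.seed(1)

# Hypothetical stand-in for reading and analysing one large table
process_table <- function(i) {
  x  <- matrix(rnorm(2e5), ncol = 20)
  pc <- prcomp(x)
  km <- kmeans(pc$x[, 1:2], centers = 3)
  table(km$cluster)                  # keep only a small summary
}

results <- vector("list", 4)
for (i in seq_along(results)) {
  results[[i]] <- process_table(i)   # big objects are local and released on return
  invisible(gc())                    # collect before the next cycle
}
```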




Again thanks,

Monica

Date: Fri, 10 Aug 2007 18:28:07 +0100
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
CC: r-help@stat.math.ethz.ch
Subject: Re: [R] Cleaning up the memory

On Fri, 10 Aug 2007, Monica Pisica wrote:

 Hi,

 I have 4 huge tables on which i want to do a PCA analysis and a kmeans
 clustering. If i run each table individually i have no problems, but if
 i want to run it in a for loop i exceed the memory allocation after the
 second table, even if i save the results as a csv table and i clean up
 all the big objects with rm command. To me it seems that even if i
 don't have the objects anymore, the memory these objects used to
 occupy is not cleared. Is there any way to clear up the memory as
 well? I don't want to close R and start it up again. Also i am
 running R under Windows.

See ?gc, which does the clearing.

However, unless you study the memory allocation in detail (which you
cannot do from R code), you don't actually know that this is the
problem.  More likely is that you have fragmentation of your 32-bit
address space: see ?Memory-limits.

Without any idea what memory you have and what 'huge' means, we can only
make wild guesses. It might be worth raising the memory limit (the
--max-mem-size flag).

 thanks,

 Monica


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kde2d error message

2007-08-10 Thread Prof Brian Ripley
If X or Y contains missing values, _you_ supplied missing values as the 
'lims' argument and it will be those missing values that are reported.

I do not see how you expect to be able to do density estimation with 
missing values: they are unknown and so no part of the answer is known. If 
you are prepared to omit them, you can do so but my software (if this is 
indeed kde2d from package MASS, uncredited) does not make such arbitrary 
choices for you.
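
If dropping the incomplete pairs is acceptable, something along these lines
would do it before calling kde2d().  This is only a sketch: the X and Y below
are made-up stand-ins for the poster's data, and complete.cases() is just one
of several ways to pick the pairs to keep.

```r
library(MASS)                      # kde2d() is in package MASS

set.seed(1)
X <- c(rnorm(50), NA)              # hypothetical data with a missing value
Y <- c(NA, rnorm(50))

ok  <- complete.cases(X, Y)        # TRUE where both X and Y are observed
fit <- kde2d(X[ok], Y[ok], n = 100,
             lims = c(range(X[ok]), range(Y[ok])))
```

Computing the limits from the retained values keeps NA out of 'lims', which
is where the reported error came from.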

On Fri, 10 Aug 2007, Jennifer Dillon wrote:

 Hello!

 I am trying to do a smooth with the kde2d function,

That is not what the only kde2d function I know of does.

 and I'm getting an error message about NAs.  Does anyone have any 
 suggestions?  Does this function not do well with NAs in general?

 fit <- kde2d(X, Y, n=100,lims=c(range(X),range(Y)))

 Error in if (from == to || length.out < 2) by <- 1 :
   missing value where TRUE/FALSE needed


 Thanks in advance!!

 Jen

   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

PLEASE do as we ask.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-10 Thread Prof Brian Ripley
I don't understand why one would run a 64-bit version of R on a 2GB 
server, especially if one were worried about object size.  You can run 
32-bit versions of R on x86_64 Linux (see the R-admin manual for a 
comprehensive discussion), and most other 64-bit OSes default to 32-bit 
executables.

Since most OSes limit 32-bit executables to around 3GB of address space, 
there starts to become a case for 64-bit executables at 4GB RAM but not 
much case at 2GB.

It was my intention when providing the infrastructure for it that Linux 
binary distributions on x86_64 would provide both 32-bit and 64-bit 
executables, but that has not happened.  It would be possible to install 
ix86 builds on x86_64 if -m32 was part of the ix86 compiler specification 
and the dependency checks would notice they needed 32-bit libraries. 
(I've had trouble with the latter on FC5: an X11 update removed all my 
32-bit X11 RPMs.)

On Fri, 10 Aug 2007, Michael Cassin wrote:

 Thanks for all the comments,

 The artificial dataset is as representative of my 440MB file as I could 
 design.

 I did my best to reduce the complexity of my problem to minimal
 reproducible code as suggested in the posting guidelines.  Having
 searched the archives, I was happy to find that the topic had been
 covered, where Prof Ripley suggested that the I/O manuals gave some
 advice.  However, I was unable to get anywhere with the I/O manuals
 advice.

 I spent 6 hours preparing my post to R-help. Sorry not to have read
 the 'R-Internals' manual.  I just wanted to know if I could use scan()
 more efficiently.

 My hurdle seems nothing to do with efficiently calling scan() .  I
 suspect the same is true for the originator of this memory experiment
 thread. It is the overhead of storing short strings, as Charles
 identified and Brian explained.  I appreciate the investigation and
 clarification you both have made.

 56B overhead for a 2 character string seems extreme to me, but I'm not
 complaining. I really like R, and being free, accept that
 it-is-what-it-is.

Well, there are only about 5 2-char strings in an 8-bit locale, so 
this does seem a case for using factors (as has been pointed out several 
times).

And BTW, it is not 56B overhead, but 56B total for up to 7 chars.
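
The factor suggestion can be tried on made-up data like this.  The vector x
is an invented example, not the poster's file, and the sizes reported depend
on the R version and on whether the build is 32- or 64-bit, so none are
quoted here.

```r
# 100,000 strings, but only 26 distinct two-character values
x <- paste(rep(letters, length.out = 1e5),
           rep(LETTERS, length.out = 1e5), sep = "")
f <- factor(x)

nlevels(f)          # distinct strings, each stored once as a level
object.size(x)      # size of the character vector
object.size(f)      # size of the integer codes plus the levels
```

as.character(f) gives back the original strings when they are needed.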

 In my case pre-processing is not an option, it is not a one off
 problem with a particular file. In my application, R is run in batch
 mode as part of a tool chain for arbitrary csv files.  Having found
 cases where memory usage was as high as 20x file size, and allowing
 for a copy of the loaded dataset, I'll just need to document that
 it is possible that files as small as 1/40th of system memory may
 consume it all.  That rules out some important datasets (US Census, UK
 Office of National Statistics files, etc) for 2GB servers.

 Regards, Mike


 On 8/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Thu, 9 Aug 2007, Charles C. Berry wrote:

 On Thu, 9 Aug 2007, Michael Cassin wrote:

 I really appreciate the advice and this database solution will be useful to
 me for other problems, but in this case I  need to address the specific
 problem of scan and read.* using so much memory.

 Is this expected behaviour?

 Yes, and documented in the 'R Internals' manual.  That is basic reading
 for people wishing to comment on efficiency issues in R.

 Can the memory usage be explained, and can it be
 made more efficient?  For what it's worth, I'd be glad to try to help if 
 the
 code for scan is considered to be worth reviewing.

 Mike,

 This does not seem to be an issue with scan() per se.

 Notice the difference in size of big2, big3, and bigThree here:

 big2 <- rep(letters,length=1e6)
 object.size(big2)/1e6
 [1] 4.000856
 big3 <- paste(big2,big2,sep='')
 object.size(big3)/1e6
 [1] 36.2

 On a 32-bit computer every R object has an overhead of 24 or 28 bytes.
 Character strings are R objects, but in some functions such as rep (and
 scan for up to 10,000 distinct strings) the objects can be shared.  More
 string objects will be shared in 2.6.0 (but factors are designed to be
 efficient at storing character vectors with few values).

 On a 64-bit computer the overhead is usually double.  So I would expect
 just over 56 bytes/string for distinct short strings (and that is what
 big3 gives).

 But 56Mb is really not very much (tiny on a 64-bit computer), and 1
 million items is a lot.

 [...]


 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax

Re: [R] Cleaning up the memory

2007-08-10 Thread Prof Brian Ripley
On Fri, 10 Aug 2007, Monica Pisica wrote:


 Hi,

 I have 4 huge tables on which I want to do a PCA analysis and a k-means 
 clustering. If I run each table individually I have no problems, but if 
 I want to run it in a for loop I exceed the memory allocation after the 
 second table, even if I save the results as a csv table and I clean up 
 all the big objects with the rm command. To me it seems that even if I 
 don't have the objects anymore, the memory these objects used to occupy 
 is not cleared. Is there any way to clear up the memory as well? I don't 
 want to close R and start it up again. Also I am running R under Windows.

See ?gc, which does the clearing.

However, unless you study the memory allocation in detail (which you 
cannot do from R code), you don't actually know that this is the problem. 
More likely is that you have fragmentation of your 32-bit address space: 
see ?Memory-limits.

Without any idea what memory you have and what 'huge' means, we can only 
make wild guesses.  It might be worth raising the memory limit (the 
--max-mem-size flag).
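
A sketch of such a loop with explicit clean-up, assuming simulated matrices
in place of the four tables (the dimensions, the number of clusters and the
summary that is kept are all invented).  It does not help if the real problem
is fragmentation of the address space.

```r
set.seed(42)
results <- vector("list", 4)

for (i in seq_along(results)) {
  tab <- matrix(rnorm(2000 * 5), ncol = 5)    # stand-in for one 'huge' table
  pca <- prcomp(tab, scale. = TRUE)
  km  <- kmeans(pca$x[, 1:2], centers = 3)

  # keep only a small summary, then drop the big objects and collect
  results[[i]] <- list(sdev = pca$sdev, size = km$size)
  rm(tab, pca, km)
  gc()
}
```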


 thanks,

 Monica
 _
 [[trailing spam removed]]

   [[alternative HTML version deleted]]



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] reading xcms files

2007-08-10 Thread Prof Brian Ripley
On Fri, 10 Aug 2007, Roberto Olivares Hernandez wrote:

 Hi,

 I am using xcms library to read mass spectrum data. I generate objects 
 from CDF files using the command line

  SME10 <- xcmsRaw("SME_10.CDF")

 I have 50 CDF files with different name and I don't want to repeat the 
 command for each one. Is there any option to read all the files and 
 generate a corresponding object name?

Something like

for(f in Sys.glob("*.CDF")) assign(sub("\\.CDF$", "", f), xcmsRaw(f))

(untested, of course).
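
A variant that keeps the objects in a named list rather than creating 50
variables with assign().  It has not been run against real CDF files:
'reader' below is a placeholder standing in for xcmsRaw, and the files are
empty dummies created in a temporary directory.

```r
# set up two dummy files so the sketch is self-contained
d <- tempfile("cdf")
dir.create(d)
invisible(file.create(file.path(d, c("SME_10.CDF", "SME_11.CDF"))))

reader <- function(f) file.info(f)$size   # placeholder; use xcmsRaw for real data

files <- Sys.glob(file.path(d, "*.CDF"))
objs  <- lapply(files, reader)
names(objs) <- sub("\\.CDF$", "", basename(files))

names(objs)                               # one list element per file
```

Individual results are then available as objs[["SME_10"]] and so on.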

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Reading time/date string

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Matthew Walker wrote:

 Thanks Mark, that was very helpful.  I'm now so close!

 Can anyone tell me how to extract the value from an instance of a
 difftime class?  I can see the value, but how can I place it in a
 dataframe?

as.numeric(time_delta)

Hint: you want the number, not the value (which is a classed object).
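
Put together with the strings from the question, that looks roughly like
this.  The tz = "UTC" argument and the LC_TIME setting are additions made
here so that the sketch behaves the same everywhere ("%b" is
locale-dependent); they are not part of the original question.

```r
invisible(Sys.setlocale("LC_TIME", "C"))     # so that "Aug" is recognised

fmt   <- "%H:%M:%S %d %b %Y"
time1 <- strptime("10:17:07 02 Aug 2007", format = fmt, tz = "UTC")
time2 <- strptime("13:17:40 02 Aug 2007", format = fmt, tz = "UTC")

time_delta <- difftime(time2, time1, units = "secs")

# as.numeric() drops the difftime class and leaves the plain number
df <- data.frame(time1 = as.POSIXct(time1),
                 time2 = as.POSIXct(time2),
                 secs  = as.numeric(time_delta))
df$secs
```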


  time_string1 <- "10:17:07 02 Aug 2007"
  time_string2 <- "13:17:40 02 Aug 2007"
 
  time1 <- strptime(time_string1, format="%H:%M:%S %d %b %Y")
  time2 <- strptime(time_string2, format="%H:%M:%S %d %b %Y")
 
  time_delta <- difftime(time2,time1, unit="sec")
  time_delta
 Time difference of 10833 secs # <--- I'd like this value just here!
 
  data.frame(time1, time2, time_delta)
 Error in as.data.frame.default(x[[i]], optional = TRUE) :
   cannot coerce class "difftime" into a data.frame



 Thanks again,

 Matthew


 Mark W Kimpel wrote:
 Look at some of these functions...

 DateTimeClasses(base)   Date-Time Classes
 as.POSIXct(base)Date-time Conversion Functions
 cut.POSIXt(base)Convert a Date or Date-Time Object to a Factor
 format.Date(base)   Date Conversion Functions to and from Character

 Mark
 ---

 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 663-0513 Home (no voice mail please)

 **

 Matthew Walker wrote:
 Hello everyone,

 Can anyone tell me what function I should use to read time/date
 strings and turn them into a form such that I can easily calculate
 the difference of two?  The strings I've got look like 10:17:07 02
 Aug 2007.  If I could calculate the number of seconds between them
 I'd be very happy!

 Cheers,

 Matthew



 .




-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] R memory usage

2007-08-09 Thread Prof Brian Ripley
See

?gc
?Memory-limits

On Wed, 8 Aug 2007, Jun Ding wrote:

 Hi All,

 I have two questions in terms of the memory usage in R
 (sorry if the questions are naive, I am not familiar
 with this at all).

 1) I am running R in a linux cluster. By reading the R
 helps, it seems there are no default upper limits for
 vsize or nsize. Is this right? Is there an upper limit
 for whole memory usage? How can I know the default in
 my specific linux environment? And can I increase the
 default?

See ?Memory-limits, but that is principally a Linux question.


 2) I use R to read in several big files (~200Mb each),
 and then I run:

 gc()

 I get:

             used  (Mb) gc trigger   (Mb)  max used   (Mb)
 Ncells  23083130 616.4   51411332 1372.9  51411332 1372.9
 Vcells 106644603 813.7  240815267 1837.3 227550003 1736.1

 What do columns of used, gc trigger and max used
 mean? It seems to me I have used 616Mb of Ncells and
 813.7Mb of Vcells. Comparing with the numbers of max
 used, I still should have enough memory. But when I
 try

 object.size(area.results)   ## area.results is a big
 data.frame

 I get an error message:

 Error: cannot allocate vector of size 32768 Kb

 Why is that? Looks like I am running out of memory. Is
 there a way to solve this problem?

 Thank you very much!

 Jun
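
For the column meanings ?gc is the reference.  Roughly, 'used' is what is
currently in use, 'gc trigger' is the level at which the next garbage
collection will be triggered, and 'max used' is the maximum since the last
gc(reset = TRUE) or since R started; none of them is an upper limit.  The
return value is a matrix, so individual entries can be picked out as in this
small sketch (the object name g is arbitrary):

```r
g <- gc()                      # run a collection and capture the summary matrix

g["Vcells", "used"]            # vector cells currently in use
g[, "gc trigger"]              # thresholds for the next collection
g[, "max used"]                # maxima since the last reset

invisible(gc(reset = TRUE))    # reset the 'max used' columns
```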


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] tcltk error on Linux

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Mark W Kimpel wrote:

 I am having trouble getting tcltk package to load on openSuse 10.2
 running R-devel. I have specifically put my /usr/share/tcl directory in
 my PATH, but R doesn't seem to see it. I also have installed tk on my
 system. Any ideas on what the problem is?

Whether Tcl/Tk would be available was determined when you installed R.  The 
relevant information was in the configure output and log, which we don't 
have.

You are not running a released version of R: please don't use the 
development version unless you are familiar with the build process and 
know how to debug such things yourself.  The rule is that questions about 
development versions of R should not be asked here but on R-devel (and not 
to R-core which I have deleted from the recipients).

I suggest reinstalling R (preferably R-patched) and if tcltk still is not 
available sending the relevant configure information to the R-devel list.

 Also, note that I have some warning messages on starting up R, not sure
 what they mean or if they are pertinent.

Those are coming from a Bioconductor package: again you must be using 
development versions with R-devel and those are not stable (last time I 
looked even Biobase would not install, and the packages change daily).

If you have all those packages in your startup, please don't -- there will 
be a considerable performance hit so only load them when you need them.


 Thanks, Mark

 Warning messages:
 1: In .updateMethodsInTable(fdef, where, attach) :
   Couldn't find methods table for conditional, package Category may
 be out of date
 2: In .updateMethodsInTable(fdef, where, attach) :
   Methods list for generic conditional not found
  require(tcltk)
 Loading required package: tcltk
 Error in firstlib(which.lib.loc, package) :
   Tcl/Tk support is not available on this system
  sessionInfo()
 R version 2.6.0 Under development (unstable) (2007-08-01 r42387)
 i686-pc-linux-gnu

 locale:
 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] splines   tools stats graphics  grDevices utils datasets
 [8] methods   base

 other attached packages:
  [1] affycoretools_1.9.3annaffy_1.9.1  xtable_1.5-0
  [4] gcrma_2.9.1matchprobes_1.9.10 biomaRt_1.11.4
  [7] RCurl_0.8-1XML_1.9-0  GOstats_2.3.8
 [10] Category_2.3.19genefilter_1.15.9  survival_2.32
 [13] KEGG_1.17.0RBGL_1.13.3annotate_1.15.3
 [16] AnnotationDbi_0.0.88   RSQLite_0.6-0  DBI_0.2-3
 [19] GO_1.17.0  limma_2.11.9   affy_1.15.7
 [22] preprocessCore_0.99.12 affyio_1.5.6   Biobase_1.15.23
 [25] graph_1.15.10

 loaded via a namespace (and not attached):
 [1] cluster_1.11.7  rcompgen_0.1-15
 



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Help on R performance using aov function

2007-08-09 Thread Prof Brian Ripley
aov() will handle multiple responses and that would be considerably more 
efficient than running separate fits as you seem to be doing.


Your code is nigh unreadable: please use your spacebar and remove the 
redundant semicolons: `Writing R Extensions' shows you how to tidy up 
your code to make it presentable.  But I think anova_[[1]] is really

coef(summary(aov_)) which is a lot more intelligible.
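
A sketch of the multiple-response fit on invented data: the data frame A,
its three parameter columns and the 'wafer' factor below are made up to
mirror the names in the question, not taken from it.

```r
set.seed(1)
A <- data.frame(wafer = gl(4, 10),
                p1 = rnorm(40), p2 = rnorm(40), p3 = rnorm(40))
params <- c("p1", "p2", "p3")

# one call fits all responses: the left-hand side is a matrix
fit  <- aov(as.matrix(A[params]) ~ wafer, data = A)
tabs <- summary(fit)           # a list with one ANOVA table per response

# F statistic and p-value for 'wafer' (row 1, columns 4 and 5 of each table)
res <- t(sapply(tabs, function(tab) c(F = tab[1, 4], Prob = tab[1, 5])))
rownames(res) <- params
res
```

The whole matrix can then be written out in one go, for example with
write.table(res, sep = "|"), instead of calling cat() once per parameter.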

On Thu, 9 Aug 2007, Francoise PFIFFELMANN wrote:


Hi,
I’m trying to replace some SAS statistical functions by R (batch calling).
But I’ve seen that calling R in a batch mode (under Unix) takes about 2 or 3
times longer than SAS software. So it’s a great problem of performance for me.
Here is an extract of the calculation:

stoutput<-file("res_oneWayAnova.dat","w");
cat("Param|F|Prob",file=stoutput,"\n");
for (i in 1:n) {
p<-list_param[[i]]
aov_<-aov(A[,p]~ A[,"wafer"],data=A);
anova_<-summary(aov_);
if (!is.na(anova_[[1]][1,5]) & anova_[[1]][1,5]<=0.0001)
res_aov<-cbind(p,anova_[[1]][1,4],0.0001) else
res_aov<-cbind(p,anova_[[1]][1,4],anova_[[1]][1,5]);
cat(res_aov, file=stoutput, append = TRUE,sep = "|","\n");
};
close(stoutput);


A is a data.frame of about (400 lines and 1800 parameters).
I’m a new user of R and I don’t know if it’s a problem in my code or if
there are some tips that I can use to optimise my treatment.

Thanks a lot for your help.

Françoise Pfiffelmann
Engineering Data Analysis Group
--
Crolles2 Alliance
860 rue Jean Monnet
38920 Crolles, France
Tel: +33 438 92 29 84
Email: [EMAIL PROTECTED]




--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] ARIMA fitting

2007-08-09 Thread Prof Brian Ripley

On Tue, 7 Aug 2007, [EMAIL PROTECTED] wrote:


Hello,
I'm trying to fit an ARIMA process, using STATS package, arima function.
Can I expect, that fitted model with any parameters is stationary, causal
and invertible?


Please read ?arima: it answers all your questions, and points out that the 
answer depends on the arguments passed to arima().


The posting guide did ask you to do this *before* posting: please study it 
more carefully.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Charles C. Berry wrote:

 On Thu, 9 Aug 2007, Michael Cassin wrote:

 I really appreciate the advice and this database solution will be useful to
 me for other problems, but in this case I  need to address the specific
 problem of scan and read.* using so much memory.

 Is this expected behaviour?

Yes, and documented in the 'R Internals' manual.  That is basic reading 
for people wishing to comment on efficiency issues in R.

 Can the memory usage be explained, and can it be
 made more efficient?  For what it's worth, I'd be glad to try to help if the
 code for scan is considered to be worth reviewing.

 Mike,

 This does not seem to be an issue with scan() per se.

 Notice the difference in size of big2, big3, and bigThree here:

 big2 <- rep(letters,length=1e6)
 object.size(big2)/1e6
 [1] 4.000856
 big3 <- paste(big2,big2,sep='')
 object.size(big3)/1e6
 [1] 36.2

On a 32-bit computer every R object has an overhead of 24 or 28 bytes. 
Character strings are R objects, but in some functions such as rep (and 
scan for up to 10,000 distinct strings) the objects can be shared.  More 
string objects will be shared in 2.6.0 (but factors are designed to be 
efficient at storing character vectors with few values).

On a 64-bit computer the overhead is usually double.  So I would expect 
just over 56 bytes/string for distinct short strings (and that is what 
big3 gives).

But 56Mb is really not very much (tiny on a 64-bit computer), and 1 
million items is a lot.

[...]


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] RMySQL loading error

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Clara Anton wrote:

 Hi,

 I am having problems loading RMySQL.

 I am using MySQL 5.0,  R version 2.5.1, and RMySQL with Windows XP.

More exact versions would be helpful.

 When I try to load rMySQL I get the following error:

  require(RMySQL)
 Loading required package: RMySQL
 Error in dyn.load(x, as.logical(local), as.logical(now)) :
unable to load shared library
 'C:/PROGRA~1/R/R-25~1.1/library/RMySQL/libs/RMySQL.dll':
  LoadLibrary failure:  Invalid access to memory location.


 I did not get any errors while installing MySQL or RMySQL. It seems that
 there are other people with similar problems, although I could not find
 any hint on how to try to solve the problem.

It is there, unfortunately along with a lot of uninformed speculation.

 Any help, hint or advice would be greatly appreciated.

The most likely solution is to update (or downdate) your MySQL.  You 
possibly got RMySQL from the CRAN Extras site, and if so this is covered 
in the ReadMe there:

   The build of RMySQL_0.6-0 is known to work with MySQL 5.0.21 and 5.0.45,
   and known not to work (it crashes on startup) with 5.0.41.

Usually the message is the one you show, but I have seen R crash.  The 
issue is the MySQL client DLL: that from 5.0.21 or 5.0.45 works in 5.0.41.

All the reports of problems I have seen are for MySQL versions strictly 
between 5.0.21 and 5.0.45.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] S4 based package giving strange error at install time, but not at check time

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Rajarshi Guha wrote:

 Hi, I have an S4 based package that was loading fine on R
 2.5.0 on both OS X and
 Linux. I was checking the package against 2.5.1 and doing R CMD check
 does not give any warnings. So I next built the package and installed
 it. Though the package installed fine I noticed the following message:

 Loading required package: methods
 Error in loadNamespace(package, c(which.lib.loc, lib.loc),
 keep.source = keep.source) :
 in 'fingerprint' methods specified for export, but none
 defined: fold, euc.vector, distance, random.fingerprint,
 as.character, length, show
 During startup - Warning message:
 package 'fingerprint' in options("defaultPackages") was not found
   ^^^

Do you have this package in your startup files or the environment variable 
R_DEFAULT_PACKAGES?  R CMD check should not look there: whatever you are 
quoting above seems to.

 However, I can load the package in R with no errors being reported and
 it seems that the functions are working fine.

 Looking at the sources I see that my NAMESPACES file contains the
 following:

 importFrom(methods)

That should specify what to import, or be import(methods).  See 
'Writing R Extensions'.
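
For illustration, either of the following forms would be accepted in a
NAMESPACE file.  The particular functions named in the importFrom() line are
only a guess at what the package uses, not something taken from its sources.

```
# import everything that package 'methods' exports
import(methods)

# or import only the named functions (the names here are hypothetical)
importFrom(methods, setClass, setGeneric, setMethod, new)
```

Only one of the two directives is needed.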

 exportClasses(fingerprint)
 exportMethods(fold, euc.vector, distance, random.fingerprint,
 as.character, length, show)
 export(fp.sim.matrix, fp.to.matrix, fp.factor.matrix,
 fp.read.to.matrix, fp.read, moe.lf, bci.lf, cdk.lf)

 and all the exported methods are defined. As an example consider the
 'fold' method. It's defined as

 setGeneric("fold", function(fp) standardGeneric("fold"))
 setMethod("fold", "fingerprint",
   function(fp) {
 ## code for the function snipped
   })

 Since the method has been defined I can't see why I should see the
 error during install time, but nothing when the package is checked.

 Any pointers would be appreciated.

 ---
 Rajarshi Guha  [EMAIL PROTECTED]
 GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
 ---
 Bus error -- driver executed.



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] how to include bar values in a barplot?

2007-08-08 Thread Prof Brian Ripley
Please see

?format
?round

Note that text() is said to expect a character vector, so why did you 
supply a numeric vector?

   labels: a character vector or expression specifying the _text_ to be
   written.  An attempt is made to coerce other language objects
   (names and calls) to expressions, and vectors and other
   classed objects to character vectors by 'as.character'. If
   'labels' is longer than 'x' and 'y', the coordinates are
   recycled to the length of 'labels'.

and try as.character(vals) for yourself.

 Is there any way to round up those numbers?

See library(fortunes); fortune("Yoda")


On Wed, 8 Aug 2007, Donatas G. wrote:

 On Wednesday 08 August 2007 00:40:56 Donatas G. wrote:
 On Tuesday 07 August 2007 22:09:52 Donatas G. wrote:
 How do I include bar values in a barplot (or other R graphics, where this
 could be applicable)?

 To make sure I am clear I am attaching a barplot created with
 OpenOffice.org which has barplot values written on top of each barplot.

 After more than two hours search I finally found a solution:
 http://tolstoy.newcastle.edu.au/R/help/06/05/27286.html

 Hey, the solution happens to be only partial... If the values are not real
 numbers, and have a lot of digits after the dot, the graph might become
 unreadable...

 see this

 vals <-
 c(1,1.1236886,4.77554676,5.3345245,1,1.1236886,4.77554676,5.3345245,5.5345245,5.4345245,1.1236886,4.77554676,5.3345245,1.1236886,4.77554676,5.3345245)
 names(vals) <- LETTERS[1:16]
 mp <- barplot(vals, ylim = c(0, 6))
 text(mp, vals, labels = vals, pos = 3)

 Is there any way to round up those numbers?

 I tried using
 options(digits=2)
 , and it does change the display of a table, but it does not influence the
 barplot...

Well, it does not affect as.character, nor should it.
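
A sketch of what ?format and ?round lead to, using the first four values
from the question.  The choice of two decimal places is arbitrary, and the
null pdf device is there only so that the code runs without opening a
window.

```r
vals <- c(1, 1.1236886, 4.77554676, 5.3345245)
names(vals) <- LETTERS[1:4]

# build the labels as character strings yourself, rounded as wanted
labs <- format(round(vals, 2), nsmall = 2)

pdf(NULL)                              # null device: nothing is written
mp <- barplot(vals, ylim = c(0, 6))    # mp holds the bar midpoints
text(mp, vals, labels = labs, pos = 3)
invisible(dev.off())
```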


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Find out the workspace name

2007-08-08 Thread Prof Brian Ripley
On Wed, 8 Aug 2007, ONKELINX, Thierry wrote:

 ?getwd()

and ?setWindowTitle, which even has this as the first example.

help.search("window title") gets you there.
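
Along the lines of that help page, the working directory can be put in the
title bar as below.  setWindowTitle() exists only on Windows and is only
meaningful in Rgui, hence the guard; the " -" separator is just a choice
made here.

```r
suffix <- paste(" -", getwd())          # text appended to the usual title

if (.Platform$OS.type == "windows") {
  old <- utils::setWindowTitle(suffix)  # keeps the previous title in 'old'
}
```

Putting this in a startup file would label every session with the directory
it was started in.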


 [mailto:[EMAIL PROTECTED] Namens Luis Ridao Cruz

 Sometimes there might be several R sessions open at the same
 time. In Windows no name appears in the R main bar (just R
 Console)

 Is it possible to know the name of the workspace.
 I usually write it on the script I am working on but I wish
 to know without having to search in a text file.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Changing font in boxplots

2007-08-08 Thread Prof Brian Ripley
On Wed, 8 Aug 2007, G Iossa, School Biological Sciences wrote:

 Hi John,

 Thanks so much for such a quick reply.
 I have tried to set all to Times font running

 par(font.lab=6) (not 4, maybe this is a local setting on my machine?)

'6' is a setting specific to certain devices on Windows.  You should 
really be using font families (which are quite new and so not used in 
many of the introductions).

par(family="serif")

will change the default for all the text on subsequent plots to be in 
a serif font, which on the windows() device is (by default) Times.

The R posting guide does ask you to tell us your OS, so that points like 
this do not have to be guessed at.

 but now the boxplot shown has the x and y labels in Times New Roman and the
 x and y axis still in Arial. Any idea why R is not setting those in Times?

Because you did not ask it to.  The font of axis annotation is set by 
font.axis, not font.lab (which controls title()'s xlab and ylab and 
nothing in axis()).  See ?axis and ?par, both of which make this clear.

John Kane has claimed that what inline pars are used by boxplot() is 'not 
clear from ?boxplot', but the lack of clarity is his, not in the 
documentation. ?boxplot refers you to ?bxp, and that spells out exactly 
which inline pars are used.
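A minimal sketch of the font-family approach, assuming the built-in InsectSprays data as a stand-in for the poster's data. A null PDF device is opened only so the sketch runs non-interactively; on Windows the windows() device would normally be used.

```r
# Null PDF device: nothing is written to disk.
pdf(NULL)

# Set the family (and, if wanted, font.axis / font.lab) with par() first,
# so the settings apply to all text drawn on this device.
old <- par(family = "serif")
fam <- par("family")

# las and cex.axis are among the inline pars that ?bxp documents as
# being passed on to axis().
boxplot(count ~ spray, data = InsectSprays,
        xlab = "spray", ylab = "count",
        las = 1, cex.axis = 1)

par(old)
invisible(dev.off())
```

Setting the family once through par() avoids having to know which inline arguments each high-level plotting function forwards.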


 Thanks a lot for your advice,
 Graziella

 --On 08 August 2007 09:16 -0400 John Kane [EMAIL PROTECTED] wrote:

 I don't know if boxplot will accept a font argument.
 From ?boxplot it is not clear.
 You may need to set the par() command before the
 boxplot

 Example:
 par(font.lab=4)
 boxplot(mass ~ family, data=mydata, ylab="mass %",
 xlab="family", las=1, cex.axis=1)

 --- G Iossa, School Biological Sciences
 [EMAIL PROTECTED] wrote:

 Hi all,

 I am very new to R and this might be a simple question, but I have
 looked everywhere you suggest before writing to you.

 I am trying to change the font type from sans-serif to a serif
 (Times New Roman) on all labels and axes of my boxplot. I have used
 this function in other plots before, e.g.:

 plot(residuals~lnlifespan, data=mydata, pch=psymb, font=6,
 xlab="ln reproductive lifespan", ylab="residuals ln mass",
 font.lab=6, cex=1.5, cex.axis=1.5, cex.lab=1.5)

 and found that font.lab or font.axis=6 gives Times font. However,
 when I try for boxplot:

 boxplot(mass ~ family, data=mydata, ylab="mass %", xlab="family",
 font.axis=6, font=6, par(las=1), cex.axis=1)

 it does not work (R does not give any warning messages). I have also
 tried family="Times" but without success. Any idea why it is not
 doing it and what I can do to get Times font on my boxplot?
 I run R on Windows.

 Thanks a lot,
 Graziella


 *
 Dr. Graziella Iossa

 Mammal Research Unit
 School Biological Sciences
 University of Bristol
 Woodland Road
 Bristol BS8 1UG, UK

 E-mail: [EMAIL PROTECTED]
 Tel 0044 (0)117 9288918
 Fax 0044 (0)117 3317985
 http://www.bio.bris.ac.uk/research/mammal/index.html
 http://www.bio.bris.ac.uk/people/Iossa.htm



Re: [R] Error: Cannot Coerce POSIXt to POSIXct when building package

2007-08-08 Thread Prof Brian Ripley
On Wed, 8 Aug 2007, Praveen Kanakamedala wrote:

 A newbie here - please forgive me if this is a basic question.  We have an
 in-house package built in R 2.2.1 (yes, we're a little behind the times at
 our firm) and would like to rebuild it using R 2.5.1.  However, when I try
 to build the package from source, I keep getting this error:

 Error in as(slotVal, slotClass, strict = FALSE) :
    no method or default for coercing "POSIXt" to "POSIXct"
 Error : unable to load R code in package 'Mango'
 Error: package/namespace load failed for 'Mango'


 I tried defining a new method as.POSIXct in the package to coerce POSIXt
 to POSIXct and then added the as.POSIXct method to the NAMESPACE file.  The
 build still doesn't work (I get the same error message). Any idea what I am
 doing wrong? The coercion statement looks like this and works in R GUI:

How did you get this?  There should be no objects of class 'POSIXt' alone, 
and I get e.g.

> now <- Sys.time()
> as(now, "POSIXct")
Error in asMethod(object) : explicit coercion of old-style class (POSIXt, 
POSIXct) is not defined

That can be fixed (see ?as), but you seem to have a malformed object in 
one of your slots.
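A minimal sketch of what a well-formed slot value looks like, assuming a recent R in which the methods package already registers "POSIXct" as an old-style class. The S4 class name Stamp is hypothetical and only serves to illustrate the check.

```r
library(methods)

# A well-formed date-time carries both classes, never "POSIXt" alone.
when <- as.POSIXct("2007-08-08 12:00:00", tz = "GMT")
cls <- class(when)

# Hypothetical S4 class with a date-time slot (not from the thread).
setClass("Stamp", representation(when = "POSIXct"))
s <- new("Stamp", when = when)

# A quick check that could be run on a suspect slot value.
is_wellformed <- inherits(s@when, "POSIXct") && inherits(s@when, "POSIXt")
```

If a slot value fails this kind of check, the object was probably built by setting the class attribute by hand rather than through as.POSIXct().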

As often applies,

 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 #from is a vector of dates in the format %d-%b-%Y)
 from <- as.POSIXct(strptime(from, format = "%d%b%Y"), tz = "GMT")

 Here is my environment info:

 R version 2.5.1 (2007-06-27)
 i386-pc-mingw32

 locale:
 LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
 Kingdom.1252;LC_MONETARY=English_United
 Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

 attached base packages:
 [1] tcltk     stats     graphics  grDevices utils     datasets  methods
 [8] base

 other attached packages:
   fSeries      nnet      mgcv   fBasics fCalendar   fEcofin   spatial      MASS
    251.70    7.2-34    1.3-25    251.70    251.70    251.70    7.2-34    7.2-34
 I would sincerely appreciate any help.



Re: [R] cannot add lines to plot

2007-08-07 Thread Prof Brian Ripley
On Tue, 7 Aug 2007, Zeno Adams wrote:


 Hello,

 I want to plot a time series and add lines to the plot later on.
 However, this seems to work only as long as I plot the series against
 the default index. As soon as I plot against an object
 of class chron or POSIXt (i.e. I want to add a date/time axis), the
 lines do not appear anymore. The command to add the lines is executed
 without an error message.

 (THIS DOES NOT ADD THE LINES)
 plot(datum2[(3653):(3653+i)], dlindus[(3653):(3653+i)],
 col = hcl(h=60,c=35,l=60), ylim=c(-8,8), type = "l", xlab=(""),
 ylab=("Return"), main = ("Industry"))
 lines(gvarindus, type="l", lwd=2)
 lines(quantindustlow, col ="black", type = "l", lty=3)
 lines(quantindusthigh, col ="black", type = "l", lty=3)

 (THIS ADDS THE LINES, but then I don't have a date axis)
 plot(dlindus[(3653):(3653+i)], col = hcl(h=60,c=35,l=60), ylim=c(-8,8),
 type = "l", xlab=(""), ylab=("Return"), main = ("Industry"))
 lines(gvarindus, type="l", lwd=2)
 lines(quantindustlow, col ="black", type = "l", lty=3)
 lines(quantindusthigh, col ="black", type = "l", lty=3)

 This sounds like a fairly simple problem, but I cannot find any answer
 in the R-help archives.

Look at the help for lines: the standard call is lines(x, y, col = "black", 
type = "l", lty = 3) and you have omitted x.  See ?xy.coords for what 
happens then.

I think the reason you did not find this in the archives is that this is a 
rare misreading (or non-reading) of the help pages.
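A minimal sketch with simulated data, assuming Date values on the x axis; the variable names are illustrative and not the poster's. It shows why a lines() call without x lands outside the plotted region.

```r
set.seed(1)
dates <- as.Date("2007-01-01") + 0:99
ret   <- rnorm(100)
upper <- rep(quantile(ret, 0.95), 100)
lower <- rep(quantile(ret, 0.05), 100)

pdf(NULL)   # null device so the sketch runs non-interactively
plot(dates, ret, type = "l", col = "grey50",
     xlab = "", ylab = "Return", main = "Industry")

# Supply the same x values that were used in plot().
lines(dates, upper, lty = 3)
lines(dates, lower, lty = 3)

# lines(upper, lty = 3) would be drawn at x = 1, 2, ..., 100,
# far to the left of the numeric range of the dates.
xr <- range(as.numeric(dates))
invisible(dev.off())
```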



Re: [R] Interaction factor and numeric variable versus separate regressions

2007-08-07 Thread Prof Brian Ripley
These are not the same model.  You want x*f, and then you will find
the differences in intercepts and slopes from group 1 as the coefficients.

Remember too that the combined model pools error variances and the 
separate model has separate error variance for each group.

To understand model formulae, study Bill Venables' exposition in chapter 6 
of MASS.
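A minimal sketch of that relationship, using simulated data of the same shape as in the question below (the seed is an arbitrary choice). It checks that the x*f fit reproduces the per-group intercepts and slopes.

```r
set.seed(42)
x <- rep(1:20, 3)
f <- gl(3, 20, labels = paste("lev", 1:3, sep = ""))
y <- c(rnorm(20) + 6.8,
       rnorm(20) + (1:20 * 1.7 + 1),
       rnorm(20) + (1:20 * 6.7 + 3.7))
d <- data.frame(x = x, y = y, f = f)

# Combined model: intercept and slope for group 1, plus differences from group 1.
full <- coef(lm(y ~ x * f, data = d))

# Separate fits, one column of coefficients per group.
sep <- sapply(split(d, d$f), function(g) coef(lm(y ~ x, data = g)))

# Reassemble the per-group intercepts and slopes from the combined fit.
int_full   <- full["(Intercept)"] + c(0, full["flev2"], full["flev3"])
slope_full <- full["x"] + c(0, full["x:flev2"], full["x:flev3"])
```

The point estimates agree, but the standard errors will not, because the combined model uses one pooled error variance.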

On Tue, 7 Aug 2007, Sven Garbade wrote:

 Dear list members,

 I have problems to interpret the coefficients from a lm model involving
 the interaction of a numeric and factor variable compared to separate lm
 models for each level of the factor variable.

 ## data:
 y1 <- rnorm(20) + 6.8
 y2 <- rnorm(20) + (1:20*1.7 + 1)
 y3 <- rnorm(20) + (1:20*6.7 + 3.7)
 y <- c(y1,y2,y3)
 x <- rep(1:20,3)
 f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
 d <- data.frame(x=x,y=y, f=f)

 ## plot
 # xyplot(y~x|f)

 ## lm model with interaction
 summary(lm(y~x:f, data=d))

 Call:
 lm(formula = y ~ x:f, data = d)

 Residuals:
     Min      1Q  Median      3Q     Max
 -2.8109 -0.8302  0.2542  0.6737  3.5383

 Coefficients:
             Estimate Std. Error t value Pr(>|t|)
 (Intercept)  3.68799    0.41045   8.985 1.91e-12 ***
 x:flev1      0.20885    0.04145   5.039 5.21e-06 ***
 x:flev2      1.49670    0.04145  36.109  < 2e-16 ***
 x:flev3      6.70815    0.04145 161.838  < 2e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Residual standard error: 1.53 on 56 degrees of freedom
 Multiple R-Squared: 0.9984,   Adjusted R-squared: 0.9984
 F-statistic: 1.191e+04 on 3 and 56 DF,  p-value: < 2.2e-16

 ## separate lm fits
 lapply(by(d, d$f, function(x) lm(y ~ x, data=x)), coef)
 $lev1
 (Intercept)           x
  6.77022860 -0.01667528

 $lev2
 (Intercept)           x
    1.019078    1.691982

 $lev3
 (Intercept)           x
    3.274656    6.738396


 Can anybody give me a hint why the coefficients for the slopes
 (especially for lev1) are so different and how the coefficients from the
 lm model with interaction are related to the separate fits?

 Thanks, Sven



Re: [R] lda and maximum likelihood

2007-08-06 Thread Prof Brian Ripley
On Mon, 6 Aug 2007, [EMAIL PROTECTED] wrote:

 I am trying to compare several methods for classifying data into groups.
 For that purpose I'd like to develop model comparison and selection
 using AIC.

 In the lda function of the MASS library, the maximum likelihood of the
 function is not given in the output and the script is not available.

The source _is_ available: it is part of the R tarball, and in the VR 
bundle on CRAN.

 Does anyone know how to extract or compute the maximum likelihood used in
 the lda function?

It does not maximize a likelihood: what it does do is described in the 
book for which this is support software.



Re: [R] warnings()

2007-08-06 Thread Prof Brian Ripley
Possible routes:

1) Use options(warn=2) and traceback().

2) Search the *package* sources.  This is from package GRASS, I believe.
(Not all messages come from packages: they can come from R itself or from 
compiled code linked into a package.)
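A minimal sketch of route 1, assuming a stand-in function in place of the package code that warns; the name noisy is hypothetical.

```r
# Stand-in for package code that issues the warning.
noisy <- function() {
  warning("Too many open raster files")
  invisible(NULL)
}

old <- options(warn = 2)            # turn warnings into errors
res <- try(noisy(), silent = TRUE)  # the warning now stops execution
options(old)                        # restore the previous setting

msg <- conditionMessage(attr(res, "condition"))
# In an interactive session, calling traceback() right after the error
# shows the sequence of calls that led to it.
```

Once the warning has become an error, the call stack identifies which function (and hence which package) raised it.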


On Mon, 6 Aug 2007, javier garcia-pintado wrote:

 Hi,
 Is there a way to know which library is giving a warning?
 Specifically, I'm getting a set of warnings:

 Too many open raster files

 Thanks and best wishes,





Re: [R] Q: calling par() in .First()

2007-08-06 Thread Prof Brian Ripley
On Mon, 6 Aug 2007, Greg Snow wrote:

 Be aware that the effects of calls to par usually only last for the
 duration of the graphics device, not the R session.

They always apply to the current device only (and will create a current 
device if possible).

 If you put a call to par in your startup script, then it will open a 
 graphics device and set the option, but if you close that graphics 
 device and do another plot then a new graphics device will be started 
 with the default parameters rather than what you set in the startup 
 script.

 You can set some of the options (including background color) when
 starting a graphics device, that may be the better option.

You can also set a hook (see ?setHook) on plot.new (see its help page), 
which could be used to set par(bg=).  A hook on package grDevices would 
have avoided the reported error messages.  (Calling graphics::par in 
startup code works in R-devel but not in 2.5.x, but using hooks works in 
any fairly recent R.)
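A minimal sketch of the plot.new hook, assuming the "before.plot.new" hook name documented in ?plot.new. The colour is an arbitrary choice, and a null PDF device is used only so the sketch runs non-interactively.

```r
# Hook run just before each new plot frame is started.
# Note: action = "replace" discards any hooks already registered under this name.
setHook("before.plot.new", function() par(bg = "lightyellow"),
        action = "replace")

pdf(NULL)
plot(1:10)
bg_used <- par("bg")
invisible(dev.off())

# Remove the hook again so later plots are unaffected.
setHook("before.plot.new", NULL, action = "replace")
```

Because the hook runs for every new frame, the setting survives closing one device and opening another, which a single par() call at startup does not.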

 There was some discussion a while back on having global options for some
 of the graphics defaults, but I don't think anything has been
 implemented yet.

I don't believe there was agreement that was desirable.



Re: [R] Function for trim blanks from a string(s)?

2007-08-06 Thread Prof Brian Ripley
I am sure Marc knows that ?sub has examples of trimming trailing space and 
whitespace in various styles.

On Mon, 6 Aug 2007, Marc Schwartz wrote:

 On Mon, 2007-08-06 at 12:15 -0700, adiamond wrote:
 I feel like an idiot posting this because every language I've ever seen has a
 string function that trims blanks off strings (off the front or back or
 both).

Some very common languages do not, though.  It is an exercise in Kernighan 
& Ritchie (the original C reference), and an FAQ entry for Perl.

 Ideally, it would process whole data frames/matrices etc but I don't
 even see one that processes a single string.  But I've searched and I don't
 even see that.  There's a strtrim function but it does something completely
 different.

 If you want to do this while initially importing the data into R using
 one of the read.table() family of functions, see the 'strip.white'
 argument in ?read.table, which would do an entire data frame in one
 call.

 Otherwise, the easiest way to do it would be to use sub() or gsub()
 along the lines of the following:

 # Strip leading space
 sub("^ +", "", YourTextVector)

 # Strip trailing space
 sub(" +$", "", YourTextVector)

 # Strip both
 gsub("(^ +)|( +$)", "", YourTextVector)


 Examples of use:

 sub("^ +", "", "    Leading Space")
 [1] "Leading Space"

 sub(" +$", "", "Trailing Space    ")
 [1] "Trailing Space"

 gsub("(^ +)|( +$)", "", "    Leading and Trailing Space    ")
 [1] "Leading and Trailing Space"


 See ?sub which also has ?gsub

 Note that the above will only strip spaces, not all white space.

 You can then use the appropriate call in one of the *apply() family of
 functions to loop over columns/rows as may be appropriate.

Well, arrays are vectors and so can be done by

A[] <- sub(..., A)

and data frames with character columns by

A[] <- lapply(A, function(x) sub(..., x))
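A minimal sketch of both idioms. The helper name trim and the choice of the [[:space:]] class (which strips all white space, not only spaces) are assumptions made for illustration.

```r
# Strip leading and trailing white space from each element.
trim <- function(x) gsub("^[[:space:]]+|[[:space:]]+$", "", x)

# Character matrix: assigning into A[] keeps the dimensions.
A <- matrix(c(" a", "b ", " c ", "d"), nrow = 2)
A[] <- trim(A)

# Data frame with character columns: assigning into D[] keeps it a data frame.
D <- data.frame(u = c(" x", "y "), v = c("\tz", " w w "),
                stringsAsFactors = FALSE)
D[] <- lapply(D, trim)
```

Interior spaces (as in "w w") are left alone, since the pattern is anchored at the start and end of each string.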

 HTH,

 Marc Schwartz



Re: [R] data analysis

2007-08-06 Thread Prof Brian Ripley
On Tue, 7 Aug 2007, [EMAIL PROTECTED] wrote:

 On 06-Aug-07 19:26:59, lamack lamack wrote:
 Dear all, I have a factorial design where the
 response is an ordered categorical response.

 treatment (two levels: 1 and 2)
 time (four levels: 30, 60,90 and 120)
 ordered response (0,1,2,3)

 could someone suggest a correct analysis or some references?

 For your data below, I would be inclined to start from here,
 which gives the counts for the different responses:


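                     Response
 Trt  Time |    0    1    2    3 | Total
 ----------+---------------------+------
 Tr1    30 |    0    1    0    3 |     4
        60 |    0    2    1    1 |     4
        90 |    0    3    1    0 |     4
       120 |    0    3    1    0 |     4
 ----------+---------------------+------
 Tr2    30 |    0    2    0    2 |     4
        60 |    0    3    1    0 |     4
        90 |    0    3    0    1 |     4
       120 |    1    2    1    0 |     4
 ==========+=====================+======
 Tr1       |    0    9    3    4 |    16
 ----------+---------------------+------
 Tr2       |    1   10    2    3 |    16
 ==========+=====================+======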

 This suggests that, if anything is happening there at all,
 it is a tendency for high response to occur at shorter times,
 and low response at longer times, with little if any difference
 between the treatments.

 To approach this formally, I would consider adopting a
 re-randomisation approach, re-allocating the outcomes at
 random in such a way as to preserve the marginal totals,
 and evaluating a statistic T, defined in terms of the counts
 and such as to be sensitive to the kind of effect you seek.

 Then situate the value of T obtained from the above counts
 within the distribution of T obtained by this re-randomisation.

 There must be, somewhere in R, routines which can perform this
 kind of constrained re-randomisation,but I'm not sufficiently
 familiar with that area of R to know for sure about them.

?r2dtable  for 2D tables.  But there is a classic way to do this without 
using randomization and holding the time*treatment marginals fixed: 
log-linear models.

 I hope other readers who know about this area in R can come
 up with suggestions!

However, that approach is not taking into account that the response is 
ordered. First make sure the variables are factors: here in data frame 
'dat'.

dat <- read.table(..., header=TRUE, colClasses="factor")
library(MASS)
summary(polr(response ~ time*treatment, data = dat))

suggests there is nothing very significant here, and dropping the 
interaction

> summary(polr(response ~ time+treatment, data = dat))

Re-fitting to get Hessian

Call:
polr(formula = response ~ time + treatment, data = dat)

Coefficients:
                Value Std. Error   t value
time60     -1.7030709  1.0323027 -1.649779
time90     -2.1833059  1.0959290 -1.992196
time120    -2.7900588  1.1703586 -2.383935
treatment2 -0.8168075  0.7663541 -1.065836

shows a marginal effect of time:

> stepAIC(polr(response ~ time*treatment, data = dat))

selects a model with just 'time' as an explanatory variable.

> anova(polr(response ~ time, dat), polr(response ~ 1, dat))
Likelihood ratio tests of ordinal regression models

Response: response
  Model Resid. df Resid. Dev   Test    Df LR stat.    Pr(Chi)
1     1        29   66.58130
2  time        26   59.68091 1 vs 2     3 6.900383 0.07514162

again suggests that the effect of time is marginal.

References: obviously this is covered in MASS (see the R FAQ).
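A self-contained sketch of the comparison above, with the 32 observations from the thread typed in directly instead of read from a file. It assumes the MASS package is installed and treats the response as an ordered factor.

```r
library(MASS)

# Subjects 1-16 had treatment 1, subjects 17-32 treatment 2;
# within each treatment, four subjects at each of the four times.
dat <- data.frame(
  treatment = factor(rep(1:2, each = 16)),
  time      = factor(rep(rep(c(30, 60, 90, 120), each = 4), times = 2)),
  response  = factor(c(3, 3, 1, 3,  3, 1, 1, 2,  2, 1, 1, 1,  2, 1, 1, 1,
                       3, 3, 1, 1,  1, 2, 1, 1,  1, 1, 1, 3,  1, 2, 0, 1),
                     levels = 0:3, ordered = TRUE)
)

fit_null <- polr(response ~ 1, data = dat)
fit_time <- polr(response ~ time, data = dat)

# Likelihood ratio test of the effect of time.
lr <- anova(fit_null, fit_time)

# Counts by treatment, time and response, for comparison with the table.
counts <- with(dat, table(treatment, time, response))
```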


 best wishes,
 Ted.

 subject  treatment  time  response
  1       1           30   3
  2       1           30   3
  3       1           30   1
  4       1           30   3
  5       1           60   3
  6       1           60   1
  7       1           60   1
  8       1           60   2
  9       1           90   2
 10       1           90   1
 11       1           90   1
 12       1           90   1
 13       1          120   2
 14       1          120   1
 15       1          120   1
 16       1          120   1
 17       2           30   3
 18       2           30   3
 19       2           30   1
 20       2           30   1
 21       2           60   1
 22       2           60   2
 23       2           60   1
 24       2           60   1
 25       2           90   1
 26       2           90   1
 27       2           90   1
 28       2           90   3
 29       2          120   1
 30       2          120   2
 31       2          120   0
 32       2          120   1


Re: [R] Y-intercept Value

2007-08-06 Thread Prof Brian Ripley
?offset : you can specify a different intercept for each case, or a common 
one.

Or you could just use lm(y - 3 ~ 0 + x), but offset() works better for 
prediction.
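A minimal sketch of the two approaches with simulated data, assuming a fixed intercept of 3. The column name b0 is hypothetical and simply holds the intercept to impose.

```r
set.seed(1)
d <- data.frame(x = 1:20)
d$y  <- 3 + 2 * d$x + rnorm(20)
d$b0 <- 3   # intercept to be imposed, one value per case

fit_off <- lm(y ~ 0 + x + offset(b0), data = d)
fit_sub <- lm(y - 3 ~ 0 + x, data = d)

new <- data.frame(x = c(21, 22), b0 = 3)
p_off <- predict(fit_off, new)   # already includes the fixed intercept
p_sub <- predict(fit_sub, new)   # predicts y - 3; the 3 must be added back by hand
```

Both fits give the same slope; the difference only shows up at prediction time.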

On Mon, 6 Aug 2007, Benjamin Zuckerberg wrote:


 Hello everyone,

 Quick question...is there a way of specifying a y-intercept value
 within a lm statement.  For example, if I wanted to specify the
 regression to pass through the origin I would enter lm(y~0+x).  But
 can I specify an actual term such as 1,2,3,4, etc. as an intercept
 value?  Thank you!




Re: [R] Question regarding QT device

2007-08-05 Thread Prof Brian Ripley
grDevices::deviceIsInteractive is only in the unreleased R-devel version 
of R: which version are you using?

Please do study the R posting guide: we do ask for basic information for a 
good reason, and do ask for questions on packages (especially unreleased 
packages) to be sent to the maintainer.
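For a package that has to load on both released and development versions, one possible guard is sketched below. This is an assumption about how such a workaround might look, not the qtutils author's code, and register_qt is a hypothetical name.

```r
# TRUE if this version of R provides grDevices::deviceIsInteractive().
has_dii <- exists("deviceIsInteractive",
                  envir = asNamespace("grDevices"), inherits = FALSE)

# Hypothetical replacement for an unconditional call in .onLoad():
# register the device name only when the function is available.
register_qt <- function() {
  if (has_dii) {
    get("deviceIsInteractive", envir = asNamespace("grDevices"))("QT")
  }
  invisible(has_dii)
}
```

Looking the function up with get() avoids the grDevices:: reference that fails at load time when the object is not exported.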


On Sun, 5 Aug 2007, Saptarshi Guha wrote:

 Hi,
   After a few modifications in the makefiles, I successfully compiled
 the Qt device (written by Deepayan Sarkar) for OS X 10.4.9 on a
 Powerbook.
   However when loading into R

   If I remove this line from zzz.R in qtutils/R

 grDevices::deviceIsInteractive("QT")

   and then install
   library(qtutils)

   loads fine and the QT() call returns a QT window, however, if I
 switch to another application and then switch back to the R GUI, the
 menubar has disappeared.

   If I do not remove the line

 grDevices::deviceIsInteractive("QT")

   the following error appears and qtutils does not load
   Error : 'deviceIsInteractive' is not an exported object from
 'namespace:grDevices'
   Error : .onLoad failed in 'loadNamespace' for 'qtutils'
   Error: package/namespace load failed for 'qtutils'

   Could anyone provide some pointers to get that deviceIsInteractive
 to work?

   Thanks for your time
   Saptarshi

 Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha




Re: [R] plot to postscript orientation

2007-08-03 Thread Prof Brian Ripley
Do you have the Orientation menu set to 'Auto'?
The effect described seems to be that of 'Rotate media' being selected, 
which it should not be.

The files look fine to me in GSView 4.8 on Windows and other viewers on 
Linux.  I agree with Uwe that it is a viewer issue (most reported 
postscript/PDF problems are).
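Independently of the viewer settings, the arguments suggested elsewhere in this thread can be combined as in the sketch below. The output path and the page dimensions are assumptions chosen for illustration.

```r
out <- tempfile(fileext = ".eps")   # hypothetical output location

# Portrait orientation with the page size taken from width/height (inches),
# so the file carries no separate media size for a viewer to rotate.
postscript(out, horizontal = FALSE, paper = "special",
           width = 8, height = 6, onefile = FALSE)
filled.contour(volcano)             # 'volcano' ships with R
invisible(dev.off())

first_line <- readLines(out, n = 1)
```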

On Fri, 3 Aug 2007, John Kane wrote:


 I seem to see the same problem that Miruna gets just
 to confirm that it is not just her set-up.

 I'm using GSview4.8 if that helps

 --- Uwe Ligges [EMAIL PROTECTED] wrote:

 Miruna Petrescu-Prahova wrote:
  Hi

  I am trying to save some plots in a postscript file. When I generate
  the plots in the main window, they appear correctly - their
  orientation is landscape (i.e., horizontal). However, when I open the
  .ps file with GSview, the whole page appears vertically, and the plot
  appears horizontally, which means that the plot is only partially
  visible (example here
  https://webfiles.uci.edu/mpetresc/postscript.files/default.ps ). I
  searched the R-help mailing list archive and found 2 suggestions:
  setting the width and height and setting horizontal = FALSE. I have
  tried setting the width and height but it makes no difference. I have
  also tried using horizontal = FALSE. This rotates and elongates the
  plot, but it is still displayed horizontally on a vertical page, and
  so only partially visible (example here
  https://webfiles.uci.edu/mpetresc/postscript.files/horiz.false.ps).
  I am not sure what is wrong. Plots are created with filled.contour.

 I guess this is a misconfiguration of your GSview. The plots are fine
 for me. Anyway, you might also want to set the argument
 paper="special" in the postscript() call.

 Uwe Ligges

  Thanks
  Miruna

 Miruna Petrescu-Prahova
 Department of Sociology
 University of California, Irvine
 [EMAIL PROTECTED]



Re: [R] How to properly finalize external pointers?

2007-08-03 Thread Prof Brian Ripley

On Fri, 3 Aug 2007, Duncan Murdoch wrote:


On 8/3/2007 9:19 AM, Jens Oehlschlägel wrote:

Dear R .Call() insiders,

Can someone enlighten me how to properly finalize external pointers in C code 
(R-2.5.1 win)? What is the relation between R_ClearExternalPtr and the 
finalizer set in R_RegisterCFinalizer?

I succeeded registering a finalizer that works when an R object containing an 
external pointer is garbage collected. However, I have some difficulties 
figuring out how to do that in an explicit closing function.

I observed that
- calling R_ClearExternalPtr does not trigger the finalizer and is dangerous 
because it removes the pointer before the finalizer needs it at 
garbage-collection-time (no finalization = memory leak)
- calling the finalizer directly ensures finalization but now the finalizer is 
called twice (once again at garbage collection time, and I did not find 
documentation how to unregister the finalizer)
- It works to delete the SEXP external pointer object but only if not calling 
R_ClearExternalPtr (but why then do we need it?). Furthermore it is unfortunate 
to delay freeing the external pointer's memory if I know during runtime that it 
can be done immediately.

Shouldn't R_ClearExternalPtr call the finalizer and then unregister it? 
(this would also work when removing the SEXP external pointer object is 
difficult because it was handed over to the closing function directly 
as a parameter)


I think we want R_ClearExternalPtr to work even if the finalizer would
fail (e.g. to clean up when there was an error when trying to build the
external object).

So I'd suggest that when you want to get rid of an external object
immediately, you call the finalizer explicitly, then call
R_ClearExternalPtr.  The documentation doesn't address the question of
whether this will clear the registered finalizer so I don't know if
you'll get a second call to the finalizer during garbage collection, but
even if you do, isn't it easy enough to do nothing when you see the null
ptr, as you do below?


You will get a further finalizer call at GC, and I know of no way to 
unregister finalizers. So make sure the finalizer does nothing the second 
time.
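The same pattern can be sketched at the R level with reg.finalizer(). This is only an analogue of the C-level advice, using an environment in place of an external pointer, and all names in it are hypothetical.

```r
released <- 0L   # counts how many times the resource was really released

# Finalizer that is safe to call more than once.
release_handle <- function(h) {
  if (!isTRUE(h$open)) return(invisible(FALSE))   # already released: do nothing
  h$open <- FALSE
  released <<- released + 1L
  invisible(TRUE)
}

make_handle <- function() {
  h <- new.env()
  h$open <- TRUE
  reg.finalizer(h, release_handle)   # will still run at garbage collection
  h
}

h <- make_handle()
release_handle(h)        # explicit close
second <- release_handle(h)   # a repeated call, as at GC time, is a no-op
rm(h)
invisible(gc())          # the registered finalizer may run here
```

In C the equivalent guard is a test for a NULL address at the top of the finalizer, after the explicit close has cleared the pointer.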


The way connections are handled in R-devel provides an example (although 
it is work in progress).


Another possibility is to call the GC yourself if you know there are a lot 
of objects to clear up.



By the way, questions about programming at this level are better asked
in the R-devel group.


Indeed.

[...]



Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Prof Brian Ripley
You are reading the wrong part of the code for your argument list:

> foo["FileName"]
Error in `[.data.frame`(foo, "FileName") : undefined columns selected

[.data.frame is one of the most complex functions in R, and does many 
different things depending on which arguments are supplied.
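A minimal sketch of the distinction, together with one way to fail loudly on a misspelled name. The helper get_col is hypothetical and is not part of base R.

```r
foo <- data.frame(Filename = c("a", "b"), stringsAsFactors = FALSE)

# Hypothetical helper: stop with an informative message on an unknown column.
get_col <- function(df, name) {
  if (!name %in% names(df)) {
    stop("no column named '", name, "'; available: ",
         paste(names(df), collapse = ", "))
  }
  df[[name]]
}

ok     <- get_col(foo, "Filename")
bad    <- tryCatch(get_col(foo, "FileName"),
                   error = function(e) conditionMessage(e))
silent <- foo$FileName   # NULL, with no warning
strict <- tryCatch(foo["FileName"], error = function(e) "error")
```

Single-argument indexing, foo["FileName"], goes through the branch of the code that checks for undefined columns, whereas the $ form returns NULL without complaint.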


On Fri, 3 Aug 2007, Steven McKinney wrote:

 Hi all,

 What are current methods people use in R to identify
 mis-spelled column names when selecting columns
 from a data frame?

 Alice Johnson recently tackled this issue
 (see [BioC] posting below).

 Due to a mis-spelled column name (FileName
 instead of Filename) which produced no warning,
 Alice spent a fair amount of time tracking down
 this bug.  With my fumbling fingers I'll be tracking
 down such a bug soon too.

 Is there any options() setting, or debug technique
 that will flag data frame column extractions that
 reference a non-existent column?  It seems to me
 that the [.data.frame extractor used to throw an
 error if given a mis-spelled variable name, and I
 still see lines of code in [.data.frame such as

 if (any(is.na(cols)))
stop(undefined columns selected)



 In R 2.5.1 a NULL is silently returned.

 foo - data.frame(Filename = c(a, b))
 foo[, FileName]
 NULL

 Has something changed so that the code lines
 if (any(is.na(cols)))
     stop("undefined columns selected")
 in [.data.frame no longer work properly (if
 I am understanding the intention properly)?

 If not, could  [.data.frame check an
 options() variable setting (say
 warn.undefined.colnames) and throw a warning
 if a non-existent column name is referenced?




 > sessionInfo()
 R version 2.5.1 (2007-06-27)
 powerpc-apple-darwin8.9.1

 locale:
 en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
    plotrix       lme4     Matrix    lattice
      2.2-3  0.99875-4 0.999375-0     0.16-2




 Steven McKinney

 Statistician
 Molecular Oncology and Breast Cancer Program
 British Columbia Cancer Research Centre

 email: smckinney +at+ bccrc +dot+ ca

 tel: 604-675-8000 x7561

 BCCRC
 Molecular Oncology
 675 West 10th Ave, Floor 4
 Vancouver B.C.
 V5Z 1L3
 Canada




 -----Original Message-----
 From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
 Sent: Wed 8/1/2007 7:20 PM
 To: [EMAIL PROTECTED]
 Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame

 For interest's sake, I have found out why I wasn't getting my expected
 results when using read.AnnotatedDataFrame.
 Turns out the error was made in the ReadAffy command, where I specified
 the filenames to be read from my AnnotatedDataFrame object.  There was a
 typo error with a capital N ($FileName) rather than lowercase n
 ($Filename) as in my target file..whoops.  However this meant the
 filename argument was ignored without the error message(!) and instead
 of using the information in the AnnotatedDataFrame object (which
 included filenames, but not alphabetically) it read the .cel files in
 alphabetical order from the working directory - hence the wrong file was
 given the wrong label (given by the order of Annotated object) and my
 comparisons were confused without being obvious as to why or where.
 Our solution: specify that filename is as.character so assignment of
 file to target is correct (after correcting $Filename) now that we are using
 read.AnnotatedDataFrame rather than read.phenoData.

 Data <- ReadAffy(filenames=as.character(pData(pd)$Filename), phenoData=pd)

 Hurrah!

 It may be beneficial to others, that if the filename argument isn't
 specified, that filenames are read from the phenoData object if included
 here.

 Thanks!

 -----Original Message-----
 From: Martin Morgan [mailto:[EMAIL PROTECTED]
 Sent: Thursday, 26 July 2007 11:49 a.m.
 To: Johnstone, Alice
 Cc: [EMAIL PROTECTED]
 Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame

 Hi Alice --

 Johnstone, Alice [EMAIL PROTECTED] writes:

 Using R 2.5.0 and Bioconductor I have been following code to analyse
 Affymetrix expression data: 2 treatments vs control.  The original
 code was run last year and used the read.phenoData command, however
 with the newer version I get the error message Warning messages:
 read.phenoData is deprecated, use read.AnnotatedDataFrame instead The
 phenoData class is deprecated, use AnnotatedDataFrame (with
 ExpressionSet) instead

 I use the read.AnnotatedDataFrame command, but when it comes to the
 end of the analysis the comparison of the treatment to the controls
 gets mixed up compared to what you get using the original
 read.phenoData, i.e. it looks like the 3 groups get labelled wrong and so
 the comparisons are different (but they can still be matched up).
 My questions are,
 1) do you need to set up your target file differently when using
 read.AnnotatedDataFrame - what is the standard format?

 I can't quite tell where things are going wrong for you, so it would
 help if you can narrow 

Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Prof Brian Ripley
I've since seen your followup, so a more detailed explanation may help.
The path through the code for your argument list does not go where you 
quoted, and there is a reason for it.

Generally when you extract in R and ask for a non-existent index you get 
NA or NULL as the result (and no warning), e.g.

> y <- list(x=1, y=2)
> y[["z"]]
NULL

Because data frames 'must' have (column) names, they are a partial 
exception and when the result is a data frame you get an error if it would 
contain undefined columns.

But in the case of foo[, "FileName"], the result is a single column and so 
will not have a name: there seems no reason to be different from

> foo[["FileName"]]
NULL
> foo$FileName
NULL

which similarly select a single column.  At one time they were different 
in R, for no documented reason.
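
To make the distinction concrete, here is a minimal sketch using the same foo as above. The helper strict_col() is hypothetical (it is not part of R); it simply checks the name against names(df) before extracting, which is one way to get a loud failure on a mis-spelled column. The foo[, "FileName"] form is left out because, as noted, its behaviour has not been the same across versions of R.

```r
foo <- data.frame(Filename = c("a", "b"))

# List-style extraction of a non-existent column returns NULL, silently.
is.null(foo[["FileName"]])
is.null(foo$FileName)

# When the result would be a data frame, an undefined column is an error.
# tryCatch() captures the message instead of stopping the script.
res <- tryCatch(foo["FileName"], error = function(e) conditionMessage(e))
print(res)

# Hypothetical helper (an assumption for illustration, not base R):
# extract exactly one column by name and fail loudly if it is missing.
strict_col <- function(df, name) {
    stopifnot(is.data.frame(df), is.character(name), length(name) == 1)
    if (!name %in% names(df))
        stop("no column named '", name, "' in data frame")
    df[[name]]
}

strict_col(foo, "Filename")   # the correctly spelled column is returned
```

Calling strict_col(foo, "FileName") stops with an error rather than handing NULL on to the next function.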



Re: [R] Sorting data for multiple regressions

2007-08-03 Thread Prof Brian Ripley
Well, R has a by() function that does what you want, and its help page 
contains an example of doing regression by group.

(There are other ways.)
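
For instance, a minimal sketch along the lines of the ?by example, using lm() on the built-in warpbreaks data purely as a stand-in: the model formula, the grouping factor and the fitting function are assumptions to be replaced by your own. The second half shows one of the other ways, via split() and lapply().

```r
# One linear model per level of 'tension'; warpbreaks ships with R.
fits <- by(warpbreaks, warpbreaks[, "tension"],
           function(x) lm(breaks ~ wool, data = x))
print(fits)

# Another way: split the data frame into a list of pieces,
# then fit the same model to each piece.
pieces <- split(warpbreaks, warpbreaks$tension)
fits2 <- lapply(pieces, function(x) lm(breaks ~ wool, data = x))

# Collect the coefficients into a matrix, one row per group.
coefs <- t(sapply(fits2, coef))
print(coefs)
```

The INDICES argument of by() can also be a list of factors, so grouping by site, day of week and hour would be a matter of passing all three in a list.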

On Fri, 3 Aug 2007, Paul Young wrote:

 So I am trying to perform a robust regression (fastmcd in the robust
 package) on a dataset and I want to perform individual regressions based

fastmcd does not do regression ... or I would have adapted the ?by 
example to show you.

 on the groups within the data.  We have over 300 sites and we want to
 perform a regression based on the day of week and the hour for every
 site.  I was wondering if anyone knows of a 'by' command similar to the
 one used in SAS that automatically groups the data for the regressions.
 If not, does anyone have any tips on how to split the data into smaller
 sets and then perform the regression on each set.  I am new to R, so I
 don't know all of the common work arounds and such.  At the moment the
 only method I can think of is to split the data using condition
 statements and manually running the regression on each set.  Thanks for
 your help

 -Paul


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


