Re: [R] question about cumulating random effects in lmer

2006-04-24 Thread Dieter Menne
zhongmiao wang <zhongmiao at gmail.com> writes:

> I am studying the effect of schools on student achievement growth over
> time. School effect is random. The effects of schools in prior years
> are assumed to be persistent till the current year. Thus, the total
> school effect in the second year is like J=J2+J1. Does anyone know how
> to model this kind of cumulating random effects in lmer? Thanks in
> advance!

Compute new variables with the cumulative effect of the previous years and use 
these instead of J2, J1. If there are not too many years, using reshape to 
make a wide version of the data frame, and then adding up within a row is 
probably the most transparent way to do this.
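
A minimal sketch of the reshape-and-add idea, assuming a hypothetical
long-format data frame with one row per student and year and a numeric
school-exposure column J (the names id, year, J and all values are made up
for illustration and are not from the original data):

```r
# Hypothetical long-format data: one row per student per year
long <- data.frame(
  id   = rep(1:3, each = 3),
  year = rep(1:3, times = 3),
  J    = c(1, 0, 1,  0, 1, 1,  1, 1, 0)
)

# Wide version: one row per student, columns J.1, J.2, J.3
wide <- reshape(long, idvar = "id", timevar = "year", direction = "wide")

# Cumulative exposure up to and including each year, added up within a row
wide$Jcum.1 <- wide$J.1
wide$Jcum.2 <- wide$J.1 + wide$J.2
wide$Jcum.3 <- wide$J.1 + wide$J.2 + wide$J.3
wide
```

How the cumulative columns then enter the lmer formula depends on the model
at hand; this only shows the bookkeeping step.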

Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to set up starting values in lme

2006-04-24 Thread Dieter Menne
zhongmiao wang <zhongmiao at gmail.com> writes:

>
> I keep getting the error message Error in lme.formula(fixed = Score ~
> factor(time) - 1, data = ldata, random = list(dumid = mat),  :
> iteration limit reached without convergence (9)
> Is there a way to set up starting values, especially the starting
> values for the variance covariance of random effect? Thanks in
> advance!

For models with such a simple fixed part, lme is rather robust when the random 
part is correctly specified. Starting values ARE important in nlme, but in 
this case I assume that your externally computed mat is ill-conditioned. Did 
you try one of the supplied pdMat classes (e.g. pdDiag(~)) first? These cover 
quite a lot of ground, and if you have not, better try them first.
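
As a rough sketch of what trying a supplied class could look like, the
following uses the Orthodont data that ships with nlme as a stand-in for the
poster's ldata (the model itself is an assumption for illustration, not the
poster's model):

```r
library(nlme)

# Orthodont stands in for the poster's data; pdDiag() asks for a diagonal
# variance-covariance matrix of the random intercept and slope.
fit <- lme(distance ~ age, data = Orthodont,
           random = list(Subject = pdDiag(~ age)))
VarCorr(fit)
```

If a built-in structure like this converges, that points at the externally
supplied matrix rather than at the starting values.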

Dieter



[R] plot control

2006-04-24 Thread Alexander Nervedi
Dear R gurus,

I'd like to plot a distribution with the tick marks always at the quantiles 
of the y-axis, as opposed to the quantiles of the distribution I am 
plotting. plot seems to place these ticks based on some calculations that I 
can't see (?plot doesn't show the innards of plot).

Below is some functional code, but the tick marks are placed unattractively 
since I am referencing the quantiles of the distribution. Ideally, I'd like 
the tick marks to reference fixed points on the y-axis and then 
show the associated values.

I'd be very grateful for ideas, suggestion and leads.

- alex.

# some code

y1 <- rnorm(100)
y2 <- runif(100)
x <- 1:100

l <- length(y1)
mat <- scale(cbind(y1,y2))

plot(x, mat[,1], col = "blue", yaxt = "n", ylab="")
axis(2, at = sort(mat[,1])[c(0.25*l,0.5*l,0.75*l)],
labels = round(sort(y1)[c(0.25*l, 0.5*l,0.75*l)],2))

points(x, mat[,2], col = "red")
axis(4, at = sort(mat[,2])[c(0.25*l,0.5*l,0.75*l)],
labels = round(sort(y2)[c(0.25*l, 0.5*l,0.75*l)],2))



Re: [R] Missing values detected when there are no missing values

2006-04-24 Thread Petr Pikal
Hi

On 22 Apr 2006 at 23:29, Bob Green wrote:

Date sent:  Sat, 22 Apr 2006 23:29:02 +1000
To: r-help@stat.math.ethz.ch
From:   Bob Green [EMAIL PROTECTED]
Subject:[R] Missing values detected when there are no missing 
values

> I am hoping for some advice on the following matters.
>
> I have a csv data file with 153 variables x 92 rows.   To determine
> what the variables looked like I ran the summary command.  One
> variable had a large number of missing values  54/92.  For some
> reason, all subsequent 74 variables are reported as having 92 NA
> values, irrespective of whether the original csv variable was complete
> or not.

I have not seen any answer yet, so I will take a shot at one.

First, how do you know there are no missing values in your csv 
file?

>
> Below are the commands I ran:
>
>   study1dat <- read.csv("c:\\study1r.csv",header=T)
>   attach(study1dat)
>   names(study1dat)
>   summary(study1dat)

You showed what you did, but we cannot know much about study1r.csv, so 
my answer is only a guess. Assuming the csv was exported from 
Excel, could there be a problem in its construction? Perhaps some 
columns contain spaces that are not visible in Excel but are exported to 
the csv and read into R as NA values?

What does str(study1dat) say about your data?
And are there really "," value separators and "." decimal separators, 
as required by read.csv?

>
> The second puzzling issue, is that one variable with no missing values
> is reported in R as having 3 missing values, whereas there are no
> missing values in the csv file. The only errors in reading the data I
> received were:

Not when reading, but when attaching the data frame. Some names in your 
data frame are the same as names of objects in the mentioned packages. 
This is not an error; R just tells you that this has happened and that 
you should be aware of it.

HTH
Petr


 
> The following object(s) are masked from package:stats :
>   time
>
>  The following object(s) are masked from package:graphics :
>   screen
>
>  The following object(s) are masked from package:datasets :
>   sleep
>
>  The following object(s) are masked from package:base :
>   pipe
>
> I am happy to send the csv file if required. Any advice that can
> offered is appreciated,
>
> Bob
 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] bivariate weighted kernel density estimator

2006-04-24 Thread Adelchi Azzalini
On Sun, 23 Apr 2006 09:13:35 +0200, Erich Neuwirth wrote:

EN> Is there code for bivariate kernel density estimation?
EN> For bivariate kernels there is
EN> kde2d in MASS
EN> kde2d.g in GRASS
EN> KernSur in GenKern
EN> (list probably incomplete)
EN> but none of them seems to accept a weight parameter
EN> (like density does since R 2.2.0)
EN> 

sm.density in package sm accepts weights 
(with a Gaussian kernel)

best wishes,
Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/



Re: [R] garch warning

2006-04-24 Thread Antonio, Fabio Di Narzo
What package are you using?
If you're using tseries, NAs are not allowed in the time series, so you
have to trim out your first missing value with a command like:
x1 <- na.remove(x)
made available by tseries itself. But I'm not sure the first NA is your
problem... You should provide a reproducible example, if you can.

Antonio, Fabio Di Narzo.


2006/4/24, stat stat [EMAIL PROTECTED]:

> Dear r users,
>
>   Few days ago I posted the same topic but unable to receive any
> suggestion. So I am asking this same question.
>
>   I was trying to fit a garch(1,1) model to my dataset. But while
> executing I got a warning message NaNs produced in: sqrt(pred$e). And got
> the estimated sd's along with five NA, but as per my best knowledge I
> should get only one NA i.e. corresponding to the first observation only.
> If anyone tell me why I got this message it will be a great advantage for
> me.
>
>   With regards,
>
> thanks in advance



[R] boxplots instead of a scatterplot

2006-04-24 Thread Michael Graber
Dear R list,

I am a newbie to R and to programming itself, so my question may be easy for 
you to answer.
I wanted to create a scatterplot and I used the following code:

par(mar=c(10, 4.1,4.1,2.1))
plot(q$location,q$points, , las=2, cex.axis=0.5,xlab="", ylab="")

#location are character strings, there are about 70 locations
#points are numeric, there are more than 4 points for every location

My problem is that this code does not create a simple scatterplot with 
location on the x axis and points on the y axis. Instead, I get 
vertical boxplots for every location.

Thanks for any hint,

Michael Graber



Re: [R] plot control

2006-04-24 Thread Michael Dondrup
Hi,
it's not quite clear to me what you are trying to accomplish with this. You 
are referring to quantiles (do you mean quartiles in your case? 
see ?quantile). So, are you trying to compare theoretical quantiles of a 
distribution to the empirical quantiles?  Or are you just trying to split the 
left axis into 4 equal intervals? For the first case, try something like:
 y <- rnorm(100)
 y.right.axisticks <- quantile(y)
 y.left.ticks <- qnorm(c(0.25,0.5,0.75)) # makes no sense to compare with 
   # theor. quantiles of uniform distribution, right?  ;) 
 plot(y,yaxt='n')
 axis(2,at=y.left.ticks,labels=round(y.left.ticks,3))
 axis(4,at=y.right.axisticks,labels=round(y.right.axisticks,3))
or for the latter:
 yrange <- range(y)
 y.left.ticks <- seq(yrange[1], yrange[2], (yrange[2] - yrange[1])/4)
 plot(y,yaxt='n')
 axis(2,at=y.left.ticks,labels=round(y.left.ticks,3))
 axis(4,at=y.right.axisticks,labels=round(y.right.axisticks,3))
Is that what you had in mind?
See also:
?qqplot
for another way to compare quantiles.
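
If the goal is ticks at fixed positions on the scaled axis, each labelled
with the corresponding value in the original units, one possible sketch
follows. It assumes the scale() setup from the original post; the tick
positions in ticks are an arbitrary choice made for illustration:

```r
set.seed(1)
y1 <- rnorm(100)
y2 <- runif(100)
x  <- 1:100
mat <- scale(cbind(y1, y2))

# scale() stores the column means and sds it used as attributes
ctr <- attr(mat, "scaled:center")
scl <- attr(mat, "scaled:scale")

ticks <- c(-2, -1, 0, 1, 2)                   # fixed positions, scaled units
lab1 <- round(ticks * scl["y1"] + ctr["y1"], 2)  # back to y1 units
lab2 <- round(ticks * scl["y2"] + ctr["y2"], 2)  # back to y2 units

plot(x, mat[, 1], col = "blue", yaxt = "n", ylab = "", ylim = range(mat))
points(x, mat[, 2], col = "red")
axis(2, at = ticks, labels = lab1)
axis(4, at = ticks, labels = lab2)
```

Both axes then share the same tick positions, and only the labels differ.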

cheers




[R] Problem with data frame

2006-04-24 Thread Arun Kumar Saha
Dear r-users,

suppose I have n normal distributions with parameter N(0,i) i=1,2,...,n
respectively.

Now I want to generate 500 random number for each distribution. And want to
put all 500*n random numbers
in a single data frame.

I tried with following code:

n=20
random = data.frame(n)
for ( i in 2: length)
   {
random[,i] = random(500,mean=0,sd=i)
   }

but while executing this I am getting errors.

Can anyone give me any suggestion?
Thanks and  regards
Arun



[R] arrange data for simple regression analysis

2006-04-24 Thread Tomás Revilla
Hello, I want to arrange data from a table to perform a simple
regression. All the examples I saw deal with paired data, e.g. 'x' and
'y' have the same dimensions (e.g. 5 values for x and 5 for y).

But I have more than one 'y' for each 'x' value, e.g. the data file
has columns for x = 0, 30, 60, and 120. For each of them I have
several replicate responses (e.g. individuals), not always the same
number. After I read the data with read.table(), ending up with 4
columns, what is next? How can I regress this against c(0, 30, 60,
120)?

0   --   n1 y values
30 --  n2 y values
60 -- n3 y values
120  -- n4 y values

Thanks,

Tomas



Re: [R] Problem with data frame

2006-04-24 Thread Rich
Try this ...

 rDf <- data.frame(sapply(1:20, rnorm, n=500, mean=0)) 

Rich.

S  R Training  Consulting
mangosolutions
Tel+44 1249 467 467
Fax   +44 1249 467 468



Re: [R] Problem with data frame

2006-04-24 Thread Jacques Veslot
as.data.frame(replicate(20, rnorm(500)))
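
Note that the one-liner above draws every column with sd = 1. A sketch that
lets the spread change by column, as the question asks, might look like the
following (whether N(0,i) means standard deviation i or variance i is not
stated in the question, so both readings are shown; the object names are
made up):

```r
set.seed(1)
n <- 20

# Reading 1: column i has standard deviation i
by_sd <- as.data.frame(sapply(1:n, function(i) rnorm(500, mean = 0, sd = i)))

# Reading 2: column i has variance i, i.e. sd = sqrt(i)
by_var <- as.data.frame(sapply(1:n, function(i) rnorm(500, mean = 0, sd = sqrt(i))))

dim(by_sd)
```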

 


-- 
---
[EMAIL PROTECTED]
CNRS UMR 8090 - http://www-good.ibl.fr
Génomique et physiologie moléculaire des maladies métaboliques
I.B.L 2eme etage - 1 rue du Pr Calmette, B.P.245, 59019 Lille Cedex
Tel : 33 (0)3.20.87.10.44 Fax : 33 (0)3.20.87.10.31



Re: [R] distribution of the product of two correlated normal

2006-04-24 Thread Peter Ruckdeschel
Yu, Xuesong writes:
 
> Does anyone know what the distribution is for the product of two correlated
> normals? Say I have X~N(a, \sigma1^2) and Y~N(b, \sigma2^2), and
> \rho(X,Y) is not equal to 0; I want to know the pdf or cdf of XY. Thanks
> a lot in advance.
 

There is no closed-form expression (at least not to my knowledge) ---
but you could easily write some code for a numerical evaluation of the pdf / 
cdf:

###
#code by P. Ruckdeschel, [EMAIL PROTECTED] 04-24-06
###
#
#pdf of X1X2, X1~N(m1,s1^2), X2~N(m2,s2^2), corr(X1,X2)=rho, evaluated at t
#
#   eps is a very small number to catch errors in division by 0
###
#
dnnorm <- function(t, m1, m2, s1, s2, rho,  eps = .Machine$double.eps ^ 0.5){
a <- s1*sqrt(1-rho^2)
b <- s1*rho
c <- s2
# f is a closure: it reads t, m1, m2, a, b, c and eps from dnnorm's frame
f <- function(u)
 {
  nen0 <- m2+c*u
  #catch a division by 0
  nen <- ifelse(abs(nen0)>eps, nen0, ifelse(nen0>0, nen0+eps, nen0-eps))
  dnorm(u)/a/nen * dnorm( t/a/nen -(m1+b*u)/a)
 }
# the integrand is negative where m2+c*u < 0, hence the leading minus
-integrate(f, -Inf, -m2/c)$value+
 integrate(f, -m2/c,  Inf)$value
}

###
#
#cdf of X1X2, X1~N(m1,s1^2), X2~N(m2,s2^2), corr(X1,X2)=rho, evaluated at t
#
#   eps is a very small number to catch errors in division by 0
###
#
pnnorm <- function(t, m1, m2, s1, s2, rho,  eps = .Machine$double.eps ^ 0.5){
a <- s1*sqrt(1-rho^2)
b <- s1*rho
c <- s2
# fp and fm are closures: they read t, m1, m2, a, b, c and eps from pnnorm's frame
fp <- function(u)
 {nen0 <- m2+c*u ## for all u's used in integrate: never negative
  #catch a division by 0
  nen  <- ifelse(nen0>eps, nen0, nen0+eps)
  dnorm(u) * pnorm( t/a/nen- (m1+b*u)/a)
 }
fm <- function(u)
 {
  nen0 <- m2+c*u ## for all u's used in integrate: never positive
  #catch a division by 0
  nen  <- ifelse(nen0 < -eps, nen0, nen0-eps)
  dnorm(u) * pnorm(-t/a/nen+ (m1+b*u)/a)
 }
integrate(fm, -Inf, -m2/c)$value+
integrate(fp, -m2/c,  Inf)$value
}
##

If you have to evaluate dnnorm() or pnnorm() at a lot of values of t
for some given m1, m2, s1, s2, rho, then you should first evaluate
[p,d]nnorm() on a (smaller) grid of values for t
and then use something like approxfun() or splinefun() to get a
function that is much faster to evaluate.
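
A small sketch of the grid-then-interpolate idea. To keep it self-contained,
slow_cdf below is a hypothetical stand-in for an integrate()-based function
such as pnnorm(); the grid range and spacing are arbitrary choices:

```r
# Stand-in for a slow, integrate()-based cdf; not from the original post
slow_cdf <- function(t) integrate(dnorm, -Inf, t)$value

grid <- seq(-4, 4, by = 0.1)        # coarse grid of t values
vals <- sapply(grid, slow_cdf)      # the expensive step, done once
fast_cdf <- splinefun(grid, vals)   # cheap interpolating function

# Compare the interpolant with the exact normal cdf on a fine grid
tt <- seq(-3, 3, length.out = 1000)
max_err <- max(abs(fast_cdf(tt) - pnorm(tt)))
max_err
```

The interpolant is only trustworthy inside the range of the grid, so the
grid should cover all values of t that will be needed later.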

Hth, Peter



Re: [R] Problem with data frame

2006-04-24 Thread Gavin Simpson
On Mon, 2006-04-24 at 14:45 +0530, Arun Kumar Saha wrote:
 
> I tried with following code:
>
> n=20
> random = data.frame(n)
> for ( i in 2: length)
>    {
> random[,i] = random(500,mean=0,sd=i)
>    }
>
> but while executing this I am getting errors.

Did you check what you'd done above, or is what you posted not copied
and pasted directly from your R session (and therefore contains typos)?

  * data.frame(n) - doesn't make sense; you have data frame with one
row/column containing the number 20
  * What is length?
  * If you want i=1,2,...,n, why do you use i in 2: length?
  * What is random() [the function you are trying to use]?
  * What are the error messages - although one can have a good guess
if you actually tried to run that code. You are asked by the
posting guide to supply.

Does this help:

ran <- matrix(ncol = 20, nrow = 500)
for (i in 1:20)
{
  ran[, i] <- rnorm(500, sd = i)
}
ran <- as.data.frame(ran)

G

 
> Can anyone give me any suggestion?
> Thanks and  regards
> Arun
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*  Note new Address, Telephone  Fax numbers from 6th April 2006  *
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson 
ECRC  ENSIS  [t] +44 (0)20 7679 0522
UCL Department of Geography   [f] +44 (0)20 7679 0565
Pearson Building  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street  [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK.   [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] boxplots instead of a scatterplot

2006-04-24 Thread Petr Pikal
Hi


On 24 Apr 2006 at 10:40, Michael Graber wrote:

Date sent:  Mon, 24 Apr 2006 10:40:42 +0200
From:   Michael Graber [EMAIL PROTECTED]
To: R-Mailingliste r-help@stat.math.ethz.ch
Subject:[R] boxplots instead of a scatterplot

> Dear R list,
>
> I am a newbie to R and programming itself, so my question may be easy
> to answer for you. I wanted to create a scatterplot and i used the
> following code:
>
> par(mar=c(10, 4.1,4.1,2.1))
> plot(q$location,q$points, , las=2, cex.axis=0.5,xlab="", ylab="")
>
> #location are character strings, there are about 70 locations

location is probably a factor (not a character vector; try str(q) and 
look at what is stated for the location variable), which is why R 
produced boxplots automatically. 

> loc <- sample(letters[1:3],10, replace=T)
> x <- rnorm(10)
> plot(loc,x)
Error in plot.window(xlim, ylim, log, asp, ...) : 
need finite 'xlim' values
In addition: Warning messages:
1: NAs introduced by coercion 
2: no finite arguments to min; returning Inf 
3: no finite arguments to max; returning -Inf 

> plot(as.factor(loc),x)


You can either use

plot(as.numeric(q$location) ,q$points, , las=2, cex.axis=0.5,xlab="", 
ylab="")

but then you lose the information about location names, or use stripchart:

> stripchart(split(x,loc), vertical=T)

see ?stripchart
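
Both options can be sketched on made-up data. The data frame q below, with
three locations, is hypothetical and only mimics the structure described in
the question; the first option also puts the location names back on the axis:

```r
set.seed(1)
# Hypothetical stand-in for the poster's data frame
q <- data.frame(location = rep(c("north", "south", "west"), each = 5),
                points   = rnorm(15))
loc <- factor(q$location)

# Option 1: plot the integer codes, then label the axis with the level names
plot(as.integer(loc), q$points, xaxt = "n", xlab = "", ylab = "points")
axis(1, at = seq_along(levels(loc)), labels = levels(loc), las = 2)

# Option 2: stripchart groups by location itself
stripchart(points ~ location, data = q, vertical = TRUE, las = 2)
```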

HTH
Petr


> #points are numeric, there are more than 4 points for every location
>
> my problem is that this code does not create a simple scatterplot with
> location on the x axis and points on the y axis. Instead of this i get
> vertical boxplots for every location.
>
> Thanks for any hint,
>
> Michael Graber

Petr Pikal
[EMAIL PROTECTED]



Re: [R] arrange data for simple regression analysis

2006-04-24 Thread Petr Pikal
Hi

I'm not sure what you want to do, but what about

y <- colMeans(your.data, na.rm = TRUE)
x <- c(0,30,60,120)
fit <- lm(y~x)

Is this what you want?

It would be better to follow the advice in the posting guide and show a 
reproducible example.
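
An alternative to regressing the column means is to keep every replicate.
A minimal sketch, assuming the table was read into four columns padded with
NA where a dose has fewer replicates (the data frame wide, its column names
and its values are invented for illustration):

```r
# Hypothetical wide table: one column per x value, NA-padded
wide <- data.frame(X0   = c(1.1, 0.9, 1.0, NA),
                   X30  = c(2.0, 2.2, NA,  NA),
                   X60  = c(3.1, 2.9, 3.0, 3.2),
                   X120 = c(5.1, 4.9, 5.0, NA))

# stack() gives one row per observation: columns 'values' and 'ind'
long <- na.omit(stack(wide))

# Recover the numeric x value from the column name
long$x <- as.numeric(sub("^X", "", as.character(long$ind)))

fit <- lm(values ~ x, data = long)
coef(fit)
```

Unlike the fit on column means, this uses all replicates, so the residual
degrees of freedom reflect the actual number of observations.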

HTH
Petr



Petr Pikal
[EMAIL PROTECTED]



Re: [R] boxplots instead of a scatterplot

2006-04-24 Thread Jim Lemon
Michael Graber wrote:
> Dear R list,
>
> I am a newbie to R and programming itself, so my question may be easy to 
> answer for you.
> I wanted to create a scatterplot and i used the following code:
>
> par(mar=c(10, 4.1,4.1,2.1))
> plot(q$location,q$points, , las=2, cex.axis=0.5,xlab="", ylab="")
>
> #location are character strings, there are about 70 locations
> #points are numeric, there are more than 4 points for every location
>
> my problem is that this code does not create a simple scatterplot with 
> location on the x axis and points on the y axis. Instead of this i get 
> vertical boxplots for every location.
 
Hi Michael,

R is probably trying to interpret the locations as a factor and 
helpfully breaking down the numeric variable by this factor. One way to 
get around this is to fake a numeric for the x axis by using the indices 
of the character variable for plotting, then cram in the labels using 
staxlab in the plotrix package:

# nr.comb is an 'n choose r' function in a private package
# I think there is an equivalent in gregmisc
indies<-nr.comb(9,3)[1:70,]
locations<-
  apply(indies,1,nonsense.words<-function(x) 
paste(LETTERS[x],sep="",collapse=""))
testdf<-data.frame(locations=sample(locations,300,TRUE),numbers=rnorm(300))
location.indices<-
  sapply(testdf$locations,whichloc<-function(x) which(locations==x))
# leave lots of room for the labels
x11(width=18,height=7)
plot(location.indices,testdf$numbers,axes=FALSE)
box()
axis(2)
# you can also try vertical labels using par
staxlab(1,at=1:70,labels=locations,nlines=3)

Jim



Re: [R] arrange data for simple regression analysis

2006-04-24 Thread Gabor Grothendieck
Here are a couple of possibilities using the builtin iris data set.  Note
that although the coefficients come out the same, the degrees of
freedom, etc., would differ:

> n <- rep(1:3, 50)
> lm(Petal.Length ~ Petal.Width, iris, weight = n)

Call:
lm(formula = Petal.Length ~ Petal.Width, data = iris, weights = n)

Coefficients:
(Intercept)  Petal.Width
      1.057        2.262

> lm(Petal.Length ~ Petal.Width, iris[rep(1:nrow(iris), n),])

Call:
lm(formula = Petal.Length ~ Petal.Width, data = iris[rep(1:nrow(iris),
    n), ])

Coefficients:
(Intercept)  Petal.Width
      1.057        2.262



[R] R help

2006-04-24 Thread Erez
Hello,

I'm trying to create a large matrix and it exceeds the memory limits.
The matrix is 100,000 x 2874 and R is throwing me out; what shall I do?

Thanks
Erez



Re: [R] R help

2006-04-24 Thread Petr Pikal
Hi

more memory
new comp
new OS
or think about reformulating the problem with the help of a 
database and loading/processing the data in chunks.

A couple of similar questions were answered not long ago, so check the 
archives.
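
For a sense of scale, here is a back-of-the-envelope sketch of the storage
such a matrix needs, assuming it holds double-precision numbers at 8 bytes
each (an integer matrix would need half of that):

```r
rows <- 100000
cols <- 2874

bytes <- rows * cols * 8    # 8 bytes per double
gib   <- bytes / 2^30       # convert to GiB

round(gib, 2)
```

That is a bit over 2 GiB for a single copy, before any intermediate copies R
makes during computation, which is probably more than a single process on a
typical 32-bit system can address.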

HTH
Petr




Petr Pikal
[EMAIL PROTECTED]



[R] R help

2006-04-24 Thread Erez
Hi,
Is there any way to run an R script from C++?
Erez



[R] Coefficients for aliased models

2006-04-24 Thread Ross Darnell
I find the reporting of aliased terms in class lm objects a little 
inconsistent.

The best way to show  this is to give an example.

library(MASS)  # for mvrnorm
X <- mvrnorm(10,c(0,0),Sigma=matrix(c(1,0.5,0.5,1),nrow=2))
Xnew <- as.data.frame(cbind(X,X[,2])) # make copy of X[,2]
names(Xnew) <- c("y","X","Xcopy")
model <- lm(y~X+Xcopy,data=Xnew)
coef(model)  # reports NA for Xcopy estimate
summary(model) # ditto
coef(summary(model)) # drops NA for Xcopy estimate

Is this intentional? Is there a way of extracting the coefficient table 
from the summary output with the aliased term included?
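
One possible workaround, offered as a sketch rather than an official
mechanism: build an NA-filled table with one row per entry of coef(model)
and copy the rows of the summary table into it by name. The example data
below are simulated with base R only, and the object names are made up:

```r
set.seed(42)
x <- rnorm(10)
d <- data.frame(y = 0.5 * x + rnorm(10), X = x, Xcopy = x)  # Xcopy aliases X
model <- lm(y ~ X + Xcopy, data = d)

cf <- coef(summary(model))            # table without the aliased term

# Full-size table, one row per coefficient, filled with NA
full <- matrix(NA_real_,
               nrow = length(coef(model)), ncol = ncol(cf),
               dimnames = list(names(coef(model)), colnames(cf)))

# Copy the estimable rows across by name; the aliased row stays NA
full[rownames(cf), ] <- cf
full
```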

Many thanks

Ross Darnell



[R] R 2.3.0 is released

2006-04-24 Thread Peter Dalgaard
I've rolled up R-2.3.0.tar.gz a short while ago. This version contains
several changes and additions, mostly incremental. See the full list
of changes below.

You can get it (in a short while) from

http://cran.r-project.org/src/base/R-2/R-2.3.0.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you. Binaries
for various platforms will appear in due course.
 
There is also a version split for floppies. 

For the R Core Team

Peter Dalgaard

These are the md5sums for the freshly created files, in case you wish
to check that they are uncorrupted:

eb723b61539feef013de476e68b5c50a  COPYING
a6f89e2100d9b6cdffcea4f398e37343  COPYING.LIB
152bf40b34f471387c623c724e112a58  FAQ
70447ae7f2c35233d3065b004aa4f331  INSTALL
fcb3488d9d8e95e439f4bde1b730a615  NEWS
88bbd6781faedc788a1cbd434194480c  ONEWS
4f004de59e24a52d0f500063b4603bcb  OONEWS
11cc1e9df640ab52e608cf9e695f7354  R-2.3.0.tar.gz
2fb2766d3a35b1c4b525d61dec39f502  R-2.3.0.tar.gz-split.aa
51ac3cd512cbc0f265ca1c8318732c30  R-2.3.0.tar.gz-split.ab
0d5c03adcdc336e2881c1e5a080c8542  R-2.3.0.tar.gz-split.ac
d7a9431dff3a3a7fefd60ce0ac4b39aa  R-2.3.0.tar.gz-split.ad
096386cbc903ea5c5af2a91415b3535b  R-2.3.0.tar.gz-split.ae
7e05f409a33e08df384aa8ae8ec80f90  R-2.3.0.tar.gz-split.af
6b79a851552a70a491454be0cfdfa685  R-2.3.0.tar.gz-split.ag
474a171062b1ea432bfdcb68afd696b7  R-2.3.0.tar.gz-split.ah
08173075ecea19a8cc75a062bf3fa2ac  R-2.3.0.tar.gz-split.ai
c9cdbbed7dce6b1d5a2af4dc4c495fc1  R-2.3.0.tar.gz-split.aj
11cc1e9df640ab52e608cf9e695f7354  R-latest.tar.gz
433182754c05c2cf7a04ad0da474a1d0  README
020479f381d5f9038dcb18708997f5da  RESOURCES


Here is the relevant bit of the NEWS file:


CHANGES IN R VERSION 2.3.0


USER-VISIBLE CHANGES

o   In the grid package there are new 'arrow' arguments to
grid.line.to(), grid.lines(), and grid.segments()
(grid.arrows() has been deprecated).

The new 'arrow' arguments have been added BEFORE
the 'name', 'gp' and 'vp' arguments so existing code that
specifies any of these arguments *by position* (not by name)
will fail.

o   all.equal() is more stringent, see the PR#8191 bug fix below.

o   The data frame argument to transform() is no longer called 'x',
but '_data'.  Since this is an invalid name, it is less likely
to clash with names given to transformed variables. (People
were getting into trouble with transform(data, x=y+z).)
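
As a small illustration of the transform() item (the data frame df and its
columns are made up), creating a variable named x now works like any other
name, since the first argument is no longer called 'x':

```r
df <- data.frame(y = 1:3, z = c(10, 20, 30))

# 'x' is treated as a new column, not matched to the data argument
out <- transform(df, x = y + z)
out$x
```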


NEW FEATURES

o   arima.sim() has a new argument 'start.innov' for compatibility
with S-PLUS.  (If not supplied, the output is unchanged from
previous versions in R.)

o   arrows() has been changed to be more similar to segments():
for example col=NA omits the arrow rather than as previously
(undocumented) using par(col).

o   as.list() now accepts symbols (as given by as.symbol() aka
as.name()).

o   atan2() now allows one complex and one numeric argument.

o   The 'masked' warnings given by attach() and library() now only
warn for functions masking functions or non-functions masking
non-functions.

o   New function Axis(), a generic version of axis(), with Date and
POSIX[cl]t methods.  This is used by most of the standard
plotting functions (boxplot, contour, coplot, filled.contour,
pairs, plot.default, rug, stripchart) which will thus label x
or y axes appropriately.

o   pbeta() now uses TOMS708 in all cases and so is more accurate
in some (e.g. when lower.tail = FALSE and when one of the
shape parameters is very small).

o   [qr]beta(), [qr]f() and [qr]t() now have a non-centrality parameter.

o   [rc]bind and some more cases of subassignment are implemented
for raw matrices.  (PR8529 and 8530)

o   The number of lines of deparsed calls printed by browser() and
traceback() can be limited by the option deparse.max.lines.
(Wish of PR#8638.)

o   New canCoerce() utility function in methods package.

o   [pq]chisq() are considerably more accurate for moderate (up to
80) values of ncp, and lower.tail = FALSE is fully supported
in that region.  (They are somewhat slower than before.)

o   chol(pivot = TRUE) now gives a warning if used on a (numerically)
non-positive-definite matrix.

o   chooseCRANmirror() consults the CRAN master (if accessible) to
find an up-to-date list of mirrors.

o   cov.wt() is more efficient for 'cor = TRUE' and has a new 'method'
argument which allows 'Maximum Likelihood'.

o   do.call() gains an 'envir' argument.

o   eigen() applied to an asymmetric real matrix now uses a
tolerance to decide if the result is complex (rather than
expecting the imaginary parts of the eigenvalues to be exactly
zero).

o   New function embedFonts() for embedding fonts in PDF or
PostScript graphics files.

   

[R] Re : bivariate weighted kernel density estimator

2006-04-24 Thread justin bem
Have you tried the KernSmooth package?

----- Original Message -----
From: Adrian Baddeley [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
Sent: Monday, 24 April 2006, 2:39:13
Subject: [R]  bivariate weighted kernel density estimator

Erich Neuwirth writes:

 Is there code for bivariate kernel density estimation?
 ...
 but none of them seems to accept a weight parameter

In package 'spatstat' the function density.ppp performs weighted kernel 
smoothing, 
including bivariate kernel density estimation.

At the moment it only offers the Gaussian kernel
but we plan to include other kernels.

Adrian Baddeley

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html





[R] trellis.par.get without opening a device?

2006-04-24 Thread Dieter Menne
I am using Deepayan's Sweave trick to set graphics parameters for all
graphs:

ltheme = canonical.theme(color=TRUE)
sup = trellis.par.get("superpose.line")
ltheme$superpose.line$col = c('black','red','blue','#e3','green',
'gray')


This works perfectly; the only minor nuisance is that trellis.par.get opens
a device every time, producing a dummy Rplots.ps file or a window (when run
after Stangle).

Is there a way to suppress this? It is not really serious, though.
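
Since `sup` is never used in the snippet above, one possible workaround is to drop the trellis.par.get call and edit the list returned by canonical.theme() directly. This is only a sketch and has not been checked against Sweave/Stangle. The colour vector is illustrative, and registering the theme through lattice.options() is an assumption about how the theme is meant to be applied:

```r
library(lattice)

# canonical.theme() only returns a list of settings, so it can be
# edited without asking the current device for its parameters.
ltheme <- canonical.theme(color = TRUE)
ltheme$superpose.line$col <- c("black", "red", "blue", "green", "gray")

# Register the list as the default theme for devices opened later
# (one possible way to apply it, assumed here for illustration).
lattice.options(default.theme = ltheme)
```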

Dieter



Re: [R] rcorrp.cens

2006-04-24 Thread Stefano Mazzuco
Thank you Frank for your prompt reply

You're definitely right, it seems that comparing rank concordance is quite an
inefficient way to test the predictive power of a covariate. Thus the LR test
works better.
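
For the record, a minimal sketch of such a likelihood ratio test for two nested Cox models. It assumes the survival package and its `lung` data set, which are not part of this thread, and the covariates are chosen purely for illustration:

```r
library(survival)

# 'lung' ships with the survival package; age and sex have no missing values.
full    <- coxph(Surv(time, status) ~ age + sex, data = lung)
reduced <- coxph(Surv(time, status) ~ age, data = lung)

# loglik[2] is the maximised partial log-likelihood of each fit.
lr_stat <- 2 * (full$loglik[2] - reduced$loglik[2])
lr_df   <- length(coef(full)) - length(coef(reduced))
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(statistic = lr_stat, df = lr_df, p = p_value))
```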

Stefano


On 4/21/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:

 Stefano Mazzuco wrote:
  Hi R-users,
 
  I'm having some problems in using the Hmisc package.
 
  I'm estimating a cox ph model and want to test whether the drop in
  concordance index due to omitting one covariate is significant. I think
 (but
  I'm not sure) here are two ways to do that:
 
  1) predict two cox model (the full model and model without the covariate
 of
  interest) and estimate the concordance index (i.e. area under the ROC
 curve)
  with rcorr.cens for both models, then compute the difference
 
  2) predict the two cox models and estimate directly the difference
 between
  the two c-indices using rcorrp.cens. But it seems that rcorrp.cens gives
  me the drop of Dxy index.
 
  Do you have any hint?
 
  Thanks
  Stefano

 First of all, any method based on comparing rank concordances loses
 power and is discouraged.  Likelihood ratio tests (e.g., by embedding a
 smaller model in a bigger one) are much more powerful.  If you must base
 comparisons on rank concordance (e.g., ROC area=C, Dxy) then rcorrp.cens
 can work if the sample size is large enough so that uncertainty about
 regression coefficient estimates may be ignored.  rcorrp.cens doesn't
 give the drop in C; it gives the probability that one model is more
 concordant with the outcome than another, among pairs of paired
 predictions.

 The bootcov function in the Design package has a new version that will
 output bootstrap replicates of C for a model, and its help file tells
 you how to use that to compare C for two models.  This should only be
 done to show how low a power such a procedure has.  rcorrp.cens is likely to
 be more powerful than that, but likelihood ratio is what you want.  You
 will find many cases where one model increases C by only 0.02 but it has
 many more useful (more extreme) predictions.

 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University




[R] pnorm2

2006-04-24 Thread Tolga Uzuner
Hi,

Has pnorm2 been dropped from sn? I have some code using it which seems to 
fail with a new sn update...

Thanks,
Tolga



[R] Problem with the cluster package

2006-04-24 Thread Rouyer Tristan
Hi everybody,

I want to use the cluster package (Cluster Analysis Extended Rousseeuw et 
al.). I downloaded it from CRAN and installed it on my Linux system 
(Fedora Core 4). Everything seemed to be all right.
But when trying to run the examples, I obtained the following message:

 library(cluster)
 data(votes.repub)
 agn1 <- agnes(votes.repub, metric = "manhattan", stand = TRUE)
Error in .Fortran("twins", as.integer(n), as.integer(jp), x2, dv, dis = 
double(if (keep.diss) length(dv) else 1),  :
Fortran entry point "twins_" not in DLL for package "cluster"

When installing the package, I saw that the gfortran compiler was used. And the 
manual pages say that gfortran has problems with entry, 
namelist,...

Is my problem related to the Fortran compiler? Should I use another Fortran 
compiler? If so, which one?
Has anybody encountered the same problem?

Thanks in advance

-- 
~~~((°~
Tristan Rouyer
04 99 57 32 09
IFREMER - Centre de Recherche Halieutique Méditerranéen et Tropical
Avenue Jean Monnet - BP 171 - 34203 Sète cedex - France



[R] Sending an ESC command to the console from wihtin a script

2006-04-24 Thread Tolga Uzuner
Hi,

Is there a way to send an ESC command to the console from within a 
script window, without using the mouse ?

Thanks,
Tolga



Re: [R] Sending an ESC command to the console from wihtin a script

2006-04-24 Thread Gabor Grothendieck
RSiteSearch("clear screen")

will locate Windows code to send a ctrl-L to the screen that you
can modify.

On 4/24/06, Tolga Uzuner [EMAIL PROTECTED] wrote:
 Hi,

 Is there a way to send an ESC command to the console from within a
 script window, without using the mouse ?

 Thanks,
 Tolga



Re: [R] Sending an ESC command to the console from wihtin a script

2006-04-24 Thread Tolga Uzuner
Gabor Grothendieck wrote:

RSiteSearch("clear screen")

will locate Windows code to send a ctrl-L to the screen that you
can modify.

On 4/24/06, Tolga Uzuner [EMAIL PROTECTED] wrote:
  

Hi,

Is there a way to send an ESC command to the console from within a
script window, without using the mouse ?

Thanks,
Tolga

Many thanks,
Tolga



[R] Modelling heteroskedasticity in a multilevel model

2006-04-24 Thread Antonio Revilla
Dear list members,

I am facing a 3-level model, for which my research hypotheses suggest that 
the variance of both level-1 and level-2 residuals may be a function of a 
level-3 variable.

To be a bit clearer: I am fitting a longitudinal model for a panel of 
companies grouped in industries. I suggest that some industry variables may 
create 'unexpected' shocks at specific points in time; such shocks are not 
accounted for by the explanatory variables in the model, so they will 
presumably increase the variance of level-1 residuals. On the other hand, 
industry-level attributes may also affect the relative size of 
firm-level permanent effects (represented by level-2 residuals).

Do you know how I could model such a residual structure in R? I have been 
looking at the varFunc classes in the nlme package, but I am not sure whether 
such a function can perform the kind of analysis I actually need.

Thank you very much in advance,

Antonio



[R] omitting coefficients in summary.lm()

2006-04-24 Thread Dimitri Szerman
Hi,

I'm running a regression using lm(), in which one of the right-hand side
variables is a factor with many levels (say, 80). I am not interested in the
estimates of the resulting dummies, but I have to include them in my
regression equation. So, I don't want the estimates associated with these
dummies to be printed by summary.lm( ). Is there an easy way to do this?

Thank you,

Dimitri



[R] the 'copula' package

2006-04-24 Thread Casey Quinn
Is anybody using the copula package in R? The particular problem I'm 
facing is that R does not recognise the fitCopula function 
when I load the package and (try to) run something very simple:

fit1 <- fitCopula(x1 = list(u11,u12,u13,u14,u15,u16,u17,u18), tCopula, 
optim.control = list(NULL), method = "BFGS")

Anybody also using it, successfully or unsuccessfully? I'd appreciate a 
tip or two.

Casey Quinn
Centre for Health Economics
University of York
York YO10 5DD
England

Phone: +44 01904 32 1411
Fax:+44 01904 32 1402
Email:   [EMAIL PROTECTED]
Web:   http://www.york.ac.uk/inst/che/staff/quinn.htm



[R] regression modeling

2006-04-24 Thread Weiwei Shi
Hi, there:
I am looking for a regression modeling (like regression trees) approach for
a large-scale industry dataset. Any suggestion on a package from R or from
other sources which has a decent accuracy and scalability? Any
recommendation from experience is highly appreciated.

Thanks,

Weiwei

--
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] omitting coefficients in summary.lm()

2006-04-24 Thread P Ehlers
Dimitri,

coef(summary(your model)) pulls out the matrix of coefs/SEs/etc.
You could subset that.
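
A minimal sketch of that subsetting, using simulated data. The names `d`, `g`, `x`, `y` and `fit` are invented for the example, and the pattern "^g" assumes the factor is called `g`:

```r
set.seed(1)
d <- data.frame(g = factor(sample(letters[1:5], 200, replace = TRUE)),
                x = rnorm(200))
d$y <- 1 + 2 * d$x + as.integer(d$g) + rnorm(200)

fit  <- lm(y ~ x + g, data = d)
ctab <- coef(summary(fit))       # matrix of estimates, SEs, t and p values

# Dummy coefficients are named <factor name><level>, e.g. "gb", "gc".
keep <- !grepl("^g", rownames(ctab))
print(ctab[keep, , drop = FALSE])
```

The same logical index can be reused with printCoefmat() if the usual significance stars are wanted.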

Peter Ehlers


Dimitri Szerman wrote:
 Hi,
 
 I'm running a regression using lm(), in which one of the right-hand side
 variables is factor with many levels (say, 80). I am not intersted in the
 estimates of the resulting dummies, but I have to include them in my
 regression equation. So, I don't want the estimates associated with theses
 dummies to be printed by summary.lm( ). Is there an easy way to do this?
 
 Thank you,
 
 Dimitri
 


[R] Handling large dataset dataframe

2006-04-24 Thread Sachin J
Hi,
   
  I have a dataset consisting of 350,000 rows and 266 columns.  Out of 266 
columns 250 are dummy variable columns. I am trying to read this data set into 
an R data frame but am unable to do so due to memory size limitations (the object 
created is too large to handle in R).  Is there a way to handle such a 
large dataset in R?
   
  My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.
   
  Any pointers would be of great help.
   
  TIA
  Sachin




Re: [R] Handling large dataset dataframe

2006-04-24 Thread roger koenker
You can read chunks of it at a time and store it in sparse matrix
form using the packages SparseM or Matrix, but then you need
to think about what you want to do with it: least-squares sorts
of things are OK, but other options are somewhat limited...


url:    www.econ.uiuc.edu/~roger        Roger Koenker
email   [EMAIL PROTECTED]               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Champaign, IL 61820


On Apr 24, 2006, at 12:41 PM, Sachin J wrote:

 Hi,

   I have a dataset consisting of 350,000 rows and 266 columns.  Out  
 of 266 columns 250 are dummy variable columns. I am trying to read  
 this data set into R dataframe object but unable to do it due to  
 memory size limitations (object size created is too large to handle  
 in R).  Is there a way to handle such a large dataset in R.

   My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.

   Any pointers would be of great help.

   TIA
   Sachin

   


[R] layout and image.plot

2006-04-24 Thread Cal Stats
Hi,
  
I was using image.plot from the fields package along with layout.
  
  I had the following commands:
  
  layout(matrix(c(1,1,2,3,4,5),ncol=2,nrow=3,byrow=TRUE))
  followed by 5 image plots.
  
  I wanted image 1 on the top row, followed by the 4 images on the next two rows.
  
  It works with the ordinary plot() command but not with image.plot.
  
  Any suggestions?
  
  Harsh
  



Re: [R] Handling large dataset dataframe

2006-04-24 Thread Sachin J
Hi Roger,
   
  I want to carry out regression analysis on this dataset. So I believe I can't 
read the dataset in chunks. Any other solution?
   
  TIA
  Sachin
  

roger koenker [EMAIL PROTECTED] wrote:
  You can read chunks of it at a time and store it in sparse matrix
form using the packages SparseM or Matrix, but then you need
to think about what you want to do with it least squares sorts
of things are ok, but other options are somewhat limited...


url: www.econ.uiuc.edu/~roger Roger Koenker
email [EMAIL PROTECTED] Department of Economics
vox: 217-333-4558 University of Illinois
fax: 217-244-6678 Champaign, IL 61820


On Apr 24, 2006, at 12:41 PM, Sachin J wrote:

 Hi,

 I have a dataset consisting of 350,000 rows and 266 columns. Out 
 of 266 columns 250 are dummy variable columns. I am trying to read 
 this data set into R dataframe object but unable to do it due to 
 memory size limitations (object size created is too large to handle 
 in R). Is there a way to handle such a large dataset in R.

 My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.

 Any pointers would be of great help.

 TIA
 Sachin

 


Re: [R] Handling large dataset dataframe

2006-04-24 Thread Gabor Grothendieck
You just need the much smaller cross product matrix X'X and vector X'Y so you
can build those up as you read the data in in chunks.


On 4/24/06, Sachin J [EMAIL PROTECTED] wrote:
 Hi,

  I have a dataset consisting of 350,000 rows and 266 columns.  Out of 266 
 columns 250 are dummy variable columns. I am trying to read this data set 
 into R dataframe object but unable to do it due to memory size limitations 
 (object size created is too large to handle in R).  Is there a way to handle 
 such a large dataset in R.

  My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.

  Any pointers would be of great help.

  TIA
  Sachin




Re: [R] the 'copula' package

2006-04-24 Thread Roger D. Peng
What is the error message that you get?

-roger

Casey Quinn wrote:
 Is anybody using the Copula package in R? The particular problem I'm 
 facing is that R is not acknowledging the fitCopula command/function 
 when I load the package and (try to) run something very simple:
 
 fit1 <- fitCopula(x1 = list(u11,u12,u13,u14,u15,u16,u17,u18), tCopula, 
 optim.control = list(NULL), method = "BFGS")
 
 Anybody also using it, successfully or unsuccessfully? I'd appreciate a 
 tip or two.
 
 Casey Quinn
 Centre for Health Economics
 University of York
 York YO10 5DD
 England
 
 Phone: +44 01904 32 1411
 Fax:+44 01904 32 1402
 Email:   [EMAIL PROTECTED]
 Web:   http://www.york.ac.uk/inst/che/staff/quinn.htm
 
 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/



Re: [R] Handling large dataset dataframe

2006-04-24 Thread Richard M. Heiberger
Where is the excess size being identified?  Is it in the read, or in the lm()?

If it is in the reading of the data, then why are you reading the dummy 
variables?
Would it make sense to read a single column of a factor instead of 80 columns
of dummy variables?
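
A rough sketch of the size difference, on simulated data rather than the poster's file. The 80 levels and 10,000 rows are arbitrary choices for illustration:

```r
set.seed(1)
n <- 10000
f <- factor(sample(sprintf("lev%02d", 1:80), n, replace = TRUE))

# One-hot (dummy) coding of the same information: n rows by 80 columns.
dummies <- model.matrix(~ f - 1)

size_factor  <- as.numeric(object.size(f))
size_dummies <- as.numeric(object.size(dummies))
print(c(factor = size_factor, dummies = size_dummies))
```

The factor stores one integer code per row plus the level labels, whereas the dummy matrix stores one double per row and level.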



Re: [R] Handling large dataset dataframe

2006-04-24 Thread Sachin J
Hi Richard:
   
  Even if I don't read the dummy variable columns, i.e. just read the original 
dataset with 350,000 rows and 16 columns, when I try to run the regression 
using
   
  lm(y ~ c1 + factor(c2) + factor(c3)), where c2 and c3 are dummy variables,
   
  the procedure fails, saying there is not enough memory. But
   
   lm(y ~ c1 + factor(c2)) works fine. 
   
  Any thoughts?
   
  Thanks
  Sachin

Richard M. Heiberger [EMAIL PROTECTED] wrote:
  Where is the excess size being identified? Is it the read? or in the lm().

If it is in the reading of the data, then why are you reading the dummy 
variables?
Would it make sense to read a single column of a factor instead of 80 columns
of dummy variables?





Re: [R] Handling large dataset dataframe

2006-04-24 Thread Sachin J
Gabor:
   
  Can you elaborate?
   
  Thanx
  Sachin

Gabor Grothendieck [EMAIL PROTECTED] wrote:
  You just need the much smaller cross product matrix X'X and vector X'Y so you
can build those up as you read the data in in chunks.


On 4/24/06, Sachin J wrote:
 Hi,

 I have a dataset consisting of 350,000 rows and 266 columns. Out of 266 
 columns 250 are dummy variable columns. I am trying to read this data set 
 into R dataframe object but unable to do it due to memory size limitations 
 (object size created is too large to handle in R). Is there a way to handle 
 such a large dataset in R.

 My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.

 Any pointers would be of great help.

 TIA
 Sachin




[R] Modeling inverse relationship with copula

2006-04-24 Thread Horace Tso
Dear r list,

I posted this on the S list last week since I'm using some of the
FinMetrics functions on copula. Knowing there is a copula package in R,
I figure this would be an appropriate forum to ask this question.

I want to model an inverse relationship between two (non-normal,
non-symmetric) marginals with the Gumbel copula, or with any copula.
Say, x is lognormal and y is normal. Since Gumbel's delta must be greater
than one, how do I specify the equivalent of a negative correlation? 

If both are symmetric, I think I could get away with using a positive
delta, simulating the bivariate realizations and then flipping the sign on
one of them. Or am I completely off?
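
A sketch of the flipping idea carried out on the uniform scale, where it does not rely on the marginals being symmetric: if (U, V) has positive dependence, (U, 1 - V) has the mirrored negative dependence, and the marginal quantile functions are applied afterwards. Base R has no Gumbel sampler, so a Clayton copula sampled by conditional inversion stands in for it here. That substitution, the value of `theta` and all variable names are assumptions made for the illustration:

```r
set.seed(42)
n <- 2000
theta <- 2                       # Clayton parameter (illustrative value)

# Sample (u, v) from a Clayton copula by conditional inversion.
u <- runif(n)
w <- runif(n)
v <- (u^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)

# Reflect one uniform to turn positive dependence into negative dependence,
# then apply the marginal quantile functions.
x <- qlnorm(u)                   # lognormal margin
y <- qnorm(1 - v)                # normal margin

tau <- cor(x, y, method = "kendall")
print(tau)
```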

I did search through the archive but found no related posting. Thanks
in advance.

Horace W. Tso



Re: [R] Handling large dataset dataframe

2006-04-24 Thread Liaw, Andy
Instead of reading the entire data in at once, you read a chunk at a time,
and compute X'X and X'y on that chunk, and accumulate (i.e., add) them.
There are examples in S Programming, taken from independent replies by the
two authors to a post on S-news, if I remember correctly.
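
A minimal sketch of the accumulation, not the S Programming code. The chunks are taken from an in-memory data frame to keep the example self-contained; with a real file each chunk would come from a read on an open connection, and the factor levels would have to be fixed in advance so that every chunk yields the same columns. All names are invented for the example:

```r
set.seed(1)
n <- 1000
dat <- data.frame(c1 = rnorm(n),
                  c2 = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
dat$y <- 1 + 2 * dat$c1 + as.integer(dat$c2) + rnorm(n)

chunk_size <- 100
starts <- seq(1, n, by = chunk_size)
xtx <- 0
xty <- 0
for (s in starts) {
  rows  <- s:min(s + chunk_size - 1, n)
  chunk <- dat[rows, ]                 # stands in for reading the next block
  X <- model.matrix(~ c1 + c2, data = chunk)
  xtx <- xtx + crossprod(X)            # accumulate X'X
  xty <- xty + crossprod(X, chunk$y)   # accumulate X'y
}

# Solve the normal equations (X'X) beta = X'y.
beta <- solve(xtx, xty)
print(drop(beta))
```

Only the p x p matrix X'X and the p-vector X'y are kept between chunks, so memory use does not grow with the number of rows.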

Andy

From: Sachin J
 
 Gabor:

   Can you elaborate more.

   Thanx
   Sachin
 
 Gabor Grothendieck [EMAIL PROTECTED] wrote:
   You just need the much smaller cross product matrix X'X and 
 vector X'Y so you can build those up as you read the data in 
 in chunks.
 
 
 On 4/24/06, Sachin J wrote:
  Hi,
 
  I have a dataset consisting of 350,000 rows and 266 columns. Out of 
  266 columns 250 are dummy variable columns. I am trying to 
 read this 
  data set into R dataframe object but unable to do it due to memory 
  size limitations (object size created is too large to 
 handle in R). Is 
  there a way to handle such a large dataset in R.
 
  My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.
 
  Any pointers would be of great help.
 
  TIA
  Sachin
 
 


Re: [R] Sending an ESC command to the console from wihtin a script

2006-04-24 Thread Prof Brian Ripley
On Mon, 24 Apr 2006, Tolga Uzuner wrote:

 Hi,

 Is there a way to send an ESC command to the console from within a
 script window, without using the mouse ?

What OS is this?  If Windows, ESC interrupts a running command, and is not 
itself a command: rather it generates a software interrupt.

I don't see how you can send anything from a script window without using 
either a mouse or the keyboard, and ESC is on your keyboard and also on 
the Misc menu.

So what exactly are you trying to do?

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] rmeta: forest plot problem

2006-04-24 Thread Andrej Kastrin
Dear useRs,

I'm working on a meta-analysis using the rmeta package. Using the code below I 
plot the forest plot:

library(rmeta)
data(catheter)
a <- meta.MH(n.trt, n.ctrl, col.trt, col.ctrl, data=catheter, names=Name, 
subset=c(13,6,5,3,7,12,4,11,1,8,10,2))
summary(a) # odds ratio values and confidence intervals
metaplot(a$logOR, a$selogOR, nn=a$selogOR^-2, a$names, summn=a$logMH, 
sumse=a$selogMH, sumnn=a$selogMH^-2, logeffect=TRUE)

Now I would like to add numerical odds ratio values and the corresponding 
confidence intervals for each study on the second y axis (e.g. 
http://www.statsdirect.com/help/meta_analysis/cochrane_plot.htm). I tried 
the 'text' command, but the outcome was disastrous. If anyone can 
explain the best way to solve my problem, I would be very grateful.

Andrej



[R] Store results of for loop

2006-04-24 Thread Doran, Harold
I have a question that I'm sure will turn out to be straightforward. I want to
store the results of a loop for some operations from a patterned vector.
For example, the following doesn't give what I would hope for:

ss <- c(2,3,9)
results <- numeric(length(ss))
for (i in seq(along=ss)){
   results[i] <- i + 1
   }

The following does give what I expect, but creates a vector of length 9.

ss <- c(2,3,9)
results <- numeric(length(ss))
for (i in ss){
   results[i] <- i + 1
   }

What I am hoping for is that results should be a vector of length 3.

Harold




Re: [R] rmeta: forest plot problem

2006-04-24 Thread Thomas Lumley
On Mon, 24 Apr 2006, Andrej Kastrin wrote:

 Der useRs,

 I'm working on meta analysis using rmeta package. Using code below I
 plot the forest plot:

 library(rmeta)
 data (catheter)
 a <- meta.MH(n.trt, n.ctrl, col.trt, col.ctrl, data=catheter, names=Name,
 subset=c(13,6,5,3,7,12,4,11,1,8,10,2))
 summary(a) # odds ratio values and confidence intervals
 metaplot(a$logOR, a$selogOR, nn=a$selogOR^-2,a$names, summn=a$logMH,
 sumse=a$selogMH, sumnn=a$selogMH^-2, logeffect=TRUE)

 Now I would like to add numerical odds ratio values and corresponding
 confidence intervals for each study on the second y axis (eg.
 http://www.statsdirect.com/help/meta_analysis/cochrane_plot.htm. I try
 with 'text' command, but the outcome is disastrous. If anyone can
 explain the best way to solve my problem, I should be very grateful.


One of the examples from Paul Murrell's book (on his web page at
http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter1.html) shows how to 
do a picture like this with grid graphics. It's fairly easy to customize.

-thomas

PS: I have written a more general version of this but haven't added it to 
rmeta yet, and I don't have it with me today



Re: [R] Store results of for loop

2006-04-24 Thread Thomas Lumley
On Mon, 24 Apr 2006, Doran, Harold wrote:

 I have what I'm sure will turn out to be straightforward. I want to
 store the results of a loop for some operations from a patterned vector.
 For example, the following doesn't give what I would hope for

 ss <- c(2,3,9)
 results <- numeric(length(ss))
 for (i in seq(along=ss)){
   results[i] <- i + 1
   }

 The following does give what I expect, but creates a vector of length 9.

 ss <- c(2,3,9)
 results <- numeric(length(ss))
 for (i in ss){
   results[i] <- i + 1
   }

 What I am hoping for is that results should be a vector of length 3.


either
  results <- sapply(ss, function(i) i+1)
or
  for(i in seq(along=ss)){
   results[i] <- ss[i]+1
}
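
A self-contained check that the two forms agree, with the plain vectorised expression added for comparison. The names `results_loop`, `results_sapply` and `results_vec` are invented here:

```r
ss <- c(2, 3, 9)

# Index by position, but compute from the value stored at that position.
results_loop <- numeric(length(ss))
for (i in seq(along = ss)) {
  results_loop[i] <- ss[i] + 1
}

results_sapply <- sapply(ss, function(i) i + 1)
results_vec    <- ss + 1          # arithmetic in R is already vectorised

print(results_loop)
```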


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] Store results of for loop

2006-04-24 Thread Christos Hatzis
It is not very clear how you want to index your results vector.

If ss contains the indices of the results vector that you are trying to
change, this implies that you have a vector of length 9.  In this case 

results <- numeric(max(ss))
results[ss] <- ss + 1

will do the trick.

Or in case that ss contains the values that you want to augment by 1

results <- numeric(length(ss))
results <- ss + 1

Am I missing something?

-Christos

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
Sent: Monday, April 24, 2006 4:32 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Store results of for loop

I have what I'm sure will turn out to be straightforward. I want to store
the results of a loop for some operations from a patterned vector.
For example, the following doesn't give what I would hope for

ss <- c(2,3,9)
results <- numeric(length(ss))
for (i in seq(along=ss)){
   results[i] <- i + 1
   }

The following does give what I expect, but creates a vector of length 9.

ss <- c(2,3,9)
results <- numeric(length(ss))
for (i in ss){
   results[i] <- i + 1
   }

What I am hoping for is that results should be a vector of length 3.

Harold




Re: [R] Store results of for loop

2006-04-24 Thread Marc Schwartz (via MN)
On Mon, 2006-04-24 at 16:31 -0400, Doran, Harold wrote:
 I have what I'm sure will turn out to be straightforward. I want to
 store the results of a loop for some operations from a patterned vector.
 For example, the following doesn't give what I would hope for
 
 ss <- c(2,3,9)
 results <- numeric(length(ss))
 for (i in seq(along=ss)){
    results[i] <- i + 1
 }

Harold,

Here you are getting:

 results
[1] 2 3 4

because 'i' is 1:3, thus:

 1:3 + 1
[1] 2 3 4


 The following does give what I expect, but creates a vector of length 9.
 
 ss <- c(2,3,9)
 results <- numeric(length(ss))
 for (i in ss){
    results[i] <- i + 1
 }

Here you are getting:

 results
[1]  0  3  4 NA NA NA NA NA 10

because 'i' is set to 'ss' which is c(2, 3, 9). Thus, 'results' is being
indexed as results[c(2, 3, 9)]. 

You are adding 1 to 'ss' in the loop, thus:

 ss + 1
[1]  3  4 10

In short:

  results[ss] <- ss + 1

which yields:

 results
[1]  0  3  4 NA NA NA NA NA 10


 What I am hoping for is that results should be a vector of length 3.

I suspect what you want is:

 ss <- c(2, 3, 9)
 results <- numeric(length(ss))

 for (i in seq(along = ss))
 {
   results[i] <- ss[i] + 1
 }

 results
[1]  3  4 10


You might also want to look at ?sapply, where you could do something
like this:

 sapply(ss, function(x) x + 1)
[1]  3  4 10
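
The variants discussed in this thread can be compared side by side. This is
only a minimal sketch in base R; the names grown and vectorized are made up
for illustration and do not appear in the original posts. It also shows that
plain vectorized arithmetic gives the length-3 result without any loop.

```r
ss <- c(2, 3, 9)

# Loop over positions, so results keeps the same length as ss.
results <- numeric(length(ss))
for (i in seq_along(ss)) {
  results[i] <- ss[i] + 1
}

# Loop over values: assigning to index 9 extends the vector to length 9,
# and the positions never assigned are filled with NA.
grown <- numeric(length(ss))
for (i in ss) {
  grown[i] <- i + 1
}

# Arithmetic in R is vectorized, so no loop is needed here.
vectorized <- ss + 1
```

Here results and vectorized both have length 3, while grown has length 9.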


HTH,

Marc Schwartz



Re: [R] bivariate weighted kernel density estimator

2006-04-24 Thread Erich Neuwirth
Dear Roger,
thanks for the suggestion.
That is the solution, just modifying kde2d.
I did it slightly differently.
Since I have up to 20 points,
diag(Z) from your code becomes too large.
But since t(matrix(dnorm(ay),n,nx)) only needs to be
multiplied with the weights rowwise and
* applied to vectors repeats the shorter vector cyclically,
Z * t(matrix(dnorm(ay),n,nx))
does the same thing as
diag(Z) %*% t(matrix(dnorm(ay),n,nx))

and does not need too much memory.
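
The equivalence described above can be checked on a small case. This is a
sketch with made-up sizes (nx = 5, n = 4) and a random matrix M standing in
for t(matrix(dnorm(ay), n, nx)); those names and values are assumptions for
illustration and are not taken from kde2d.

```r
set.seed(42)
nx <- 5   # number of data points (rows of the transposed kernel matrix)
n  <- 4   # number of grid points per axis

Z <- runif(nx)                      # one weight per data point
M <- matrix(rnorm(nx * n), nx, n)   # stand-in for t(matrix(dnorm(ay), n, nx))

# Explicit diagonal matrix: needs an nx-by-nx matrix in memory.
via_diag <- diag(Z) %*% M

# Recycling: Z is reused down each column, so row i is scaled by Z[i].
via_recycle <- Z * M

same <- isTRUE(all.equal(via_diag, via_recycle))
```

The recycled form never builds the nx-by-nx diagonal matrix, which is what
keeps the memory use small when nx is large.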

I also have to add an apology:
In the original posting I stated in the subject that I need WEIGHTED
kernel density estimators,
but did not mention "weighted" in the text.
That was imprecise, and I probably misled some list
participants because of it.

Erich


Roger Bivand wrote:
 kde2d.G is just kde2d with two changes - it takes the grid from the GRASS
 region, and it allows weights in the Z argument. Please have a look at the
 code and see if you can't simply retro-fit it to kde2d:

 if (!is.null(Z)) {
     if (length(Z) != nx)
         stop("Data vectors must be the same length")
     z1 <- matrix(dnorm(ax), n, nx) %*% diag(Z) %*%
         t(matrix(dnorm(ay), n, nx))/(nx * h[1] * h[2])
     z <- z1/z
 }

 This was put into the function to make a very crude kernel density
 interpolator with the grid cell values scaled in the units of the Z
 variable.


-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



Re: [R] Handling large dataset dataframe

2006-04-24 Thread Sachin J
Hi Andy:
   
  I searched through R-archive to find out how to handle large data set using 
readLines and other related R functions. I couldn't find any single post which 
elaborates the process. Can you provide me with an example or any pointers to 
the postings elaborating the process. 
   
  Thanx in advance
  Sachin
   
  
Liaw, Andy [EMAIL PROTECTED] wrote:
  Instead of reading the entire data in at once, you read a chunk at a time,
and compute X'X and X'y on that chunk, and accumulate (i.e., add) them.
There are examples in S Programming, taken from independent replies by the
two authors to a post on S-news, if I remember correctly.

Andy

From: Sachin J
 
 Gabor:
 
 Can you elaborate more.
 
 Thanx
 Sachin
 
 Gabor Grothendieck wrote:
 You just need the much smaller cross product matrix X'X and 
 vector X'Y so you can build those up as you read the data in 
 in chunks.
 
 
 On 4/24/06, Sachin J wrote:
  Hi,
 
  I have a dataset consisting of 350,000 rows and 266 columns. Out of 
  266 columns 250 are dummy variable columns. I am trying to 
 read this 
  data set into R dataframe object but unable to do it due to memory 
  size limitations (object size created is too large to 
 handle in R). Is 
  there a way to handle such a large dataset in R.
 
  My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.
 
  Any pointers would be of great help.
 
  TIA
  Sachin
 
 


[R] String substitution on package install?

2006-04-24 Thread Jeff Gentry

Hello ...

I was working with some older code today that started throwing errors I'd
never seen before.  The source appears to be some sort of substitution of
the text of the code at install time; I was hoping that someone might be
able to point me to what I'm doing wrong.

If I take the following function:
foo <- function() {
  test <- "This is a test"
  grep("^FOO_\\w+_OK$", test)
}

and put it in some file (say foo.R), then source() that file, the
function appears properly.  However, if I put that file in a package, and
do a R CMD INSTALL on that package, it appears as such:

function ()
{
    test <- "This is a test"
    grep("^FOO_\\", test)
}

The switcharoo on the text in the grep() call was the source of the
errors, btw.

I originally saw this today on a cut of R-devel from late last week, but
then did a svn up from just now and saw it again (I realize that neither
are officially R-2.3.0 but the time window was less than a few days).  I
tried it on R-2.1.1 (the only older version I had sitting around for
whatever reason) and did not see this happening.

Is this something that others can even replicate, or is it particular to
something about my setup.  And even if so, is this a case where I was
doing something wrong all along which now has gotten fixed and made my
wrong thing a Really Wrong Thing?  

Thanks
-J



Re: [R] the 'copula' package

2006-04-24 Thread jun yan
Here is an example that works:

 mycop <- tCopula(param=0.5, dim=8, dispstr="ex", df=5)
 x <- rcopula(mycop, 1000)
 myfit <- fitCopula(x, mycop, c(0.6, 10), optim.control=list(trace=1),
                    method="Nelder-Mead")
 myfit
The ML estimation is based on  1000  observations.
       Estimate Std. Error  z value Pr(>|z|)
rho.1 0.4989052 0.01192036 41.85320        0
df    5.2976624 0.32442429 16.32943        0
The maximized loglikelihood is  2038.907
The convergence code is  0

On 4/24/06, Casey Quinn [EMAIL PROTECTED] wrote:

 Is anybody using the Copula package in R? The particular problem I'm
 facing is that R is not acknowledging the fitCopula command/function
 when I load the package and (try to) run something very simple:

 fit1 <- fitCopula(x1 = list(u11,u12,u13,u14,u15,u16,u17,u18), tCopula,
 optim.control = list(NULL), method = "BFGS")

 Anybody also using it, successfully or unsuccessfully? I'd appreciate a
 tip or two.

 Casey Quinn
 Centre for Health Economics
 University of York
 York YO10 5DD
 England

 Phone: +44 01904 32 1411
 Fax:+44 01904 32 1402
 Email:   [EMAIL PROTECTED]
 Web:   http://www.york.ac.uk/inst/che/staff/quinn.htm



[R] Change the language of the labels in a graph

2006-04-24 Thread Lapointe, Pierre
Hello,

How do you change the language of the labels in a graph?  In this example, I
want to get French labels by changing Sys.putenv.  I should get Mai
instead of May.

Sys.putenv(LANGUAGE="fr")
x <- as.Date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"), "%d%b%Y")
y <- 1:4
plot(x,y)


Regards,

Pierre Lapointe




Re: [R] Handling large dataset dataframe [Broadcast]

2006-04-24 Thread Liaw, Andy
Here's a skeletal example.  Embellish as needed:
 
p <- 5
n <- 300
set.seed(1)
dat <- cbind(rnorm(n), matrix(runif(n * p), n, p))
write.table(dat, file="c:/temp/big.txt", row.names=FALSE, col.names=FALSE)
 
xtx <- matrix(0, p + 1, p + 1)
xty <- numeric(p + 1)
f <- file("c:/temp/big.txt", open="r")
for (i in 1:3) {
    ## read the next 100 rows; column 1 is y, the rest are predictors
    x <- matrix(scan(f, nlines=100), 100, p + 1, byrow=TRUE)
    xtx <- xtx + crossprod(cbind(1, x[, -1]))
    xty <- xty + crossprod(cbind(1, x[, -1]), x[, 1])
}
close(f)
solve(xtx, xty)
coef(lm.fit(cbind(1, dat[,-1]), dat[,1]))  ## check result

unlink("c:/temp/big.txt")  ## clean up.
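
A variant sketch of the same idea for a file whose row count is not known in
advance. The temporary file from tempfile(), the chunk size of 100 and the
names chunk_size, beta_chunked and beta_full are assumptions made for
illustration; they are not part of the example above.

```r
p <- 5
n <- 250                      # deliberately not a multiple of the chunk size
set.seed(1)
dat <- cbind(rnorm(n), matrix(runif(n * p), n, p))

path <- tempfile(fileext = ".txt")
write.table(dat, file = path, row.names = FALSE, col.names = FALSE)

chunk_size <- 100
xtx <- matrix(0, p + 1, p + 1)
xty <- numeric(p + 1)

con <- file(path, open = "r")
repeat {
  # Read up to chunk_size lines; an empty result means end of file.
  vals <- scan(con, nlines = chunk_size, quiet = TRUE)
  if (length(vals) == 0) break
  x <- matrix(vals, ncol = p + 1, byrow = TRUE)
  X <- cbind(1, x[, -1, drop = FALSE])   # intercept plus predictors
  xtx <- xtx + crossprod(X)              # accumulate X'X
  xty <- xty + crossprod(X, x[, 1])      # accumulate X'y
}
close(con)
unlink(path)

beta_chunked <- drop(solve(xtx, xty))
beta_full <- unname(coef(lm.fit(cbind(1, dat[, -1]), dat[, 1])))
```

Only one chunk of rows plus the (p + 1) by (p + 1) cross-product matrix is
held in memory at any time, so the same loop applies to files far larger
than this toy case.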
 
Andy



[R] [O/T] undergrads and R

2006-04-24 Thread Erin Hodgess
Dear R People:

Are your undergraduate students receptive to learning R, as a rule?

Most of the time, mine really like it.  But this semester, they act as
though they are being eaten by rats when learning R.  They are not
trying at all.

Any similar experiences?  If anyone has any good ideas, I would be 
THRILLED to hear them, as I am using R in Summer School.

Thanks,
Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]



Re: [R] number of matches when using Match()

2006-04-24 Thread Brian Quinif
  Speaking of standard errors, when correcting for heteroscedasticity,
  how many matches do you use (this is the Var.cal option).  It seems to
  me that it might make sense to use the same number of matches as
  above, but that's just a guess...

 These are related but separate issues.  The number of matches is all
 about covariate balance (bias reduction).  And the Var.cal option is
 related to the heterogeneity of the causal effect.  It could be that
 the data is such that one needs to do 1-to-1 matching to get good
 covariate balance, but that the causal effect is homogeneous so
 Var.cal can be set to 0 etc.

Ok, but in my case, I think that the treatment effect *is*
heterogeneous, and I even partition my sample based on a number of
characteristics and find very different effects for these subsamples.
Given that, it seems that I certainly should not use Var.cal=0.

My question is how do I go about deciding what I should set Var.cal
equal to?  Should it be 1, or perhaps the number of matches I use for
the treatment effect?

Regards,

Brian



[R] GUI font size

2006-04-24 Thread Erin Hodgess
Dear R People:

On the Edit menu, there is a GUI preference tab.

On the Font option, the highest value is 18.

Has anyone ever had the font size set larger than that with any 
success, please?

Thanks,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]
PS Windows, R 2-2-1



Re: [R] [O/T] undergrads and R

2006-04-24 Thread John Fox
Dear Erin,

I wrote the Rcmdr package because my undergrad intro stats students are much
more comfortable with point-and-click interfaces. You're in a computer and
math department, however, while I'm in sociology -- I would have thought
that your students wouldn't have trouble with command-driven software.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Erin Hodgess
 Sent: Monday, April 24, 2006 5:28 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] [O/T] undergrads and R
 
 Dear R People:
 
 Are your undergraduate students receptive to learning R, as a rule?
 
 Most of the time, mine really like it.  But this semester, 
 they act as though they are being eaten by rats when learning 
 R.  They are not trying at all.
 
 Any similar experiences?  If anyone has any good ideas, I 
 would be THRILLED to hear them, as I am using R in Summer School.
 
 Thanks,
 Sincerely,
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences University 
 of Houston - Downtown
 mailto: [EMAIL PROTECTED]
 


Re: [R] GUI font size

2006-04-24 Thread John Fox
Dear Erin,

This is, I guess, under Windows. You can set the point size to a larger
value by editing the Rconsole file in R's \etc directory.

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Erin Hodgess
 Sent: Monday, April 24, 2006 5:45 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] GUI font size
 
 Dear R People:
 
 On the Edit menu, there is a GUI preference tab.
 
 On the Font option, the highest value is 18.
 
 Has anyone ever had the font size set larger than that will 
 any success, please?
 
 Thanks,
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences University 
 of Houston - Downtown
 mailto: [EMAIL PROTECTED]
 PS Windows, R 2-2-1
 


Re: [R] [O/T] undergrads and R

2006-04-24 Thread Clint Bowman
It wasn't R but I've had a similar experience where a class came together
to cause an uncharacteristic reaction to material which had been welcomed
by previous classes (and also by later ones.)

I'd say just put it down to a statistical fluctuation.

Clint

Clint Bowman            INTERNET:   [EMAIL PROTECTED]
Air Dispersion Modeler  INTERNET:   [EMAIL PROTECTED]
Air Quality Program     VOICE:      (360) 407-6815
Department of Ecology   FAX:        (360) 407-7534

                        USPS:       PO Box 47600, Olympia, WA 98504-7600
                        Parcels:    300 Desmond Drive, Lacey, WA 98503-1274

On Mon, 24 Apr 2006, Erin Hodgess wrote:

 Dear R People:

 Are your undergraduate students receptive to learning R, as a rule?

 Most of the time, mine really like it.  But this semester, they act as
 though they are being eaten by rats when learning R.  They are not
 trying at all.

 Any similar experiences?  If anyone has any good ideas, I would be
 THRILLED to hear them, as I am using R in Summer School.

 Thanks,
 Sincerely,
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: [EMAIL PROTECTED]



Re: [R] GUI font size

2006-04-24 Thread Michael Prager
Erin,

I was able to set it larger by making the change manually in the 
Rconsole file.

For details of Rgui (Windows) configuration, try

  ?Rconsole

from the prompt.

MHP


Erin Hodgess wrote on 4/24/2006 6:44 PM:
 Dear R People:

 On the Edit menu, there is a GUI preference tab.

 On the Font option, the highest value is 18.

 Has anyone ever had the font size set larger than that will any 
 success, please?

 Thanks,
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: [EMAIL PROTECTED]
 PS Windows, R 2-2-1

   

-- 
Michael H. Prager, Ph.D.
Population Dynamics Team
NOAA Center for Coastal Habitat and Fisheries Research
NMFS Southeast Fisheries Science Center
Beaufort, North Carolina  28516  USA
http://shrimp.ccfhrb.noaa.gov/~mprager/



[R] JGR problem

2006-04-24 Thread Fernando Saldanha
I just installed JGR version 1.3 and I am having a problem. When I try
to load the library RQuantLib I get an error message. The same program
runs perfectly well when I just run rgui.exe. I have pasted the screen
below. The version of R I am using is 2.2.1.

Thanks for any help.

FS


 rm(list=ls())

 # Import libraries
 library(tseries)
Loading required package: quadprog
Loading required package: zoo

'tseries' version: 0.9-30

'tseries' is a package for time series analysis and computational finance.

See 'library(help=tseries)' for details.

 library(zoo)
 library(RQuantLib)
Error in dyn.load(x, as.logical(local), as.logical(now)) :
unable to load shared library 'C:/Program
Files/R/R-2.2.1/library/RQuantLib/libs/RQuantLib.dll':
  LoadLibrary failure:  Invalid access to memory location.
Error in library(RQuantLib) : .First.lib failed for 'RQuantLib'



Re: [R] GUI font size

2006-04-24 Thread Duncan Murdoch
On 4/24/2006 6:44 PM, Erin Hodgess wrote:
 Dear R People:
 
 On the Edit menu, there is a GUI preference tab.
 
 On the Font option, the highest value is 18.
 
 Has anyone ever had the font size set larger than that will any 
 success, please?

This actually looks like a bug in the low level code (graphapp).  The 
control is supposed to allow you to type in your own choice of font 
size, but because of what looks like a typo, it restricts choices to the 
ones in the list.

As others have said, you can set a value in the Rconsole file, but with 
this bug, it won't display properly.

Since this bug has been around for at least 7 years, I'm going to check 
the fix for it fairly carefully before I commit it, but I expect it to 
be fixed in the next release.

I may increase the list of sizes in the dropdown box at the same time.

Duncan Murdoch



Re: [R] Change the language of the labels in a graph

2006-04-24 Thread Gabor Grothendieck
This works for me on my Windows XP system:

Sys.putenv(LANGUAGE="FR"); Sys.setlocale("LC_ALL","FR")


On 4/24/06, Lapointe, Pierre [EMAIL PROTECTED] wrote:
 Hello,

 How do you change the language of the labels in a graph.  In this example, I
 want to get French labels by changing Sys.putenv.  I should get Mai
 instead of May.

 Sys.putenv(LANGUAGE="fr")
 x <- as.Date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"), "%d%b%Y")
 y <- 1:4
 plot(x,y)


 Regards,

 Pierre Lapointe




Re: [R] [O/T] undergrads and R

2006-04-24 Thread Richard M. Heiberger
This semester for the first time I have been using the combination
of R, R Commander (John Fox's package providing a menu-driven
interface to R), and RExcel (Erich Neuwirth's package for interfacing
R with Excel).  The audience is the introductory Statistics class for Business
undergraduates.  The short summary is that I think the combination works well
for this audience.

I will be talking on my experience at the useR! conference in June.  I added
several additional menu items to Rcmdr for our group.  I sent the January
ones (prior to the beginning of the semester) to John Fox in January.
I will send another batch of menu items, those constructed during the semester,
as soon as the semester is complete.

The goal is to hide most of the programming from the students.  But not all
of it.  I think it is very important for any user of a menu system to
have at least a rudimentary idea of the programming steps behind the menu.
Rcmdr supports this goal since it functions by generating R language statements
from the menu selections and displaying the generated statements.
For example, I will casually change the cex or ylim of a generated plot
statement.  I post the script window (generated and edited statements) from
each class to the course website.  I do not post the output window.



[R] Overlapping assignment

2006-04-24 Thread Rob Steele
Is it valid to assign a slice of an array or vector to an overlapping 
slice of the same object?  R seems to do the right thing but I can't 
find anything where it promises to.

  a <- 1:12
  a[4:12] <- a[1:9]
  a
  [1] 1 2 3 1 2 3 4 5 6 7 8 9

  b <- 1:12
  b[1:9] <- b[4:12]
  b
  [1]  4  5  6  7  8  9 10 11 12 10 11 12
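
One way to check the behaviour shown above is to compare the direct
assignment with a version that copies the source slice first. This is a
minimal sketch, not a statement about what the documentation guarantees; the
names rhs, tmp, a_copy, b_copy, same_forward and same_backward are made up
for illustration. In R the right-hand side is evaluated to a value before
the replacement is carried out, so both routes are expected to agree.

```r
# Forward shift: target slice overlaps and lies after the source slice.
a <- 1:12
a_copy <- a
rhs <- a[1:9]            # explicit copy of the source slice
a_copy[4:12] <- rhs      # assign from the copy
a[4:12] <- a[1:9]        # assign directly from the overlapping slice
same_forward <- identical(a, a_copy)

# Backward shift: target slice overlaps and lies before the source slice.
b <- 1:12
b_copy <- b
tmp <- b[4:12]
b_copy[1:9] <- tmp
b[1:9] <- b[4:12]
same_backward <- identical(b, b_copy)
```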



[R] Help needed

2006-04-24 Thread Anamika Chaudhuri
Hi,
   
  I am trying to change a SAS macro to R.
   
  here is my code. I get an error at the last line.
  attach(fram)
  dset1<-cbind(AGE,BMI,DEATH)
 BMIGRP<-cut(BMI,breaks=3,right=TRUE)
 AGEGRP<-floor(AGE/10)-2
 dset<-cbind(AGEGRP,BMIGRP,DEATH)
 maxage<-max(dset[,1])
 minage<-min(dset[,1])
 #maxcls<-dset[,2]
 #mincls<-dset[,2]
 nage<-maxage-minage+1
 nclass<-maxcls-mincls+1
 nsub<-nrow(dset)
 weight <- matrix(,nage,1)
 for ( i in minage:maxage )
+   {weight[i-minage+1,1] = sum(std = i)}
 
   atrisk = matrix(,nclass,nage)
   wevents = matrix(,nclass,nage)
 #reduce data set to frequency table 
  for( i in (minage : maxage))
+  for( j in (mincls : maxcls))
+ #atrisk1<-aggregate(dset[,c(AGEGRP,BMIGRP)],list(RANDID=dset1$RANDID,sum)
+ 
+ {atrisk[j-mincls+1,i-minage+1] =
+  sum((dset[,1]=i) & (dset[,2]=j))
+ wevents[j-mincls+1,i-minage+1]=
+   sum((dset[,1]=i) & (dset[,2]=j) & (dset[,3]=1))}
Error: subscript out of bounds
In addition: Warning messages:
1: numerical expression has 11627 elements: only the first used in: 
mincls:maxcls 
2: numerical expression has 11627 elements: only the first used in: 
mincls:maxcls 
   
  Any help will be greatly appreciated.
   
  Thanks,
  Anamika




Re: [R] Handling large dataset dataframe [Broadcast]

2006-04-24 Thread Gabor Grothendieck
The other thing you could try after doing this is to sample
some rows from your data and see if the subset gives
nearly the same answer as the entire data set.

On 4/24/06, Liaw, Andy [EMAIL PROTECTED] wrote:
 Here's a skeletal example.  Embellish as needed:

 p <- 5
 n <- 300
 set.seed(1)
 dat <- cbind(rnorm(n), matrix(runif(n * p), n, p))
 write.table(dat, file="c:/temp/big.txt", row.names=FALSE, col.names=FALSE)

 xtx <- matrix(0, p + 1, p + 1)
 xty <- numeric(p + 1)
 f <- file("c:/temp/big.txt", open="r")
 for (i in 1:3) {
     x <- matrix(scan(f, nlines=100), 100, p + 1, byrow=TRUE)
     xtx <- xtx + crossprod(cbind(1, x[, -1]))
     xty <- xty + crossprod(cbind(1, x[, -1]), x[, 1])
 }
 close(f)
 solve(xtx, xty)
 coef(lm.fit(cbind(1, dat[,-1]), dat[,1]))  ## check result

 unlink("c:/temp/big.txt")  ## clean up.

 Andy



Re: [R] Help needed

2006-04-24 Thread Richard M. Heiberger
The lines
 #maxcls<-dset[,2]
 #mincls<-dset[,2]
which you have shown commented out select a full column.
You probably want the min and max of that column.

With your definitions, mincls:maxcls has the same type of behavior as
(1:3):(1:3)
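
A minimal sketch of the fix suggested above, using a short made-up vector
cls in place of dset[,2] (the data and the name seq_cls are hypothetical).
Taking min() and max() turns the bounds into single numbers, so the class
count and the loop range are scalars again.

```r
cls <- c(2, 1, 3, 2, 1)        # stand-in for the BMI group column dset[,2]

mincls <- min(cls)             # a single number, not the whole column
maxcls <- max(cls)
nclass <- maxcls - mincls + 1  # number of classes, here 3

seq_cls <- mincls:maxcls       # the range the inner loop would run over
atrisk <- matrix(NA, nclass, 1)
```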
