Re: [R] Neural Nets (nnet) - evaluating success rate of predictions

2007-05-08 Thread hadley wickham
On 5/7/07, Bert Gunter [EMAIL PROTECTED] wrote:
 Folks:

 If I understand correctly, the following may be pertinent.

 Note that the procedure:

 min.nnet = nnet[k] such that error rate of nnet[k] = min[i] {error
 rate(nnet(training data) from ith random start) }

 does not guarantee a classifier with a lower error rate on **new** data than
 any single one of the random starts. That is because you are using the same
 training set to choose the model (= nnet parameters) as you are using to
 determine the error rate. I know it's tempting to think that choosing the
 best among many random starts always gets you a better classifier, but it
 need not. The error rate on the training set for any classifier -- be it a
 single one or one derived in some way from many -- is a biased estimate of
 the true error rate, so that choosing a classifer on this basis does not
 assure better performance for future data. In particular, I would guess that
 choosing the best among many (hundreds/thousands) random starts is probably
 almost guaranteed to produce a poor predictor (ergo the importance of
 parsimony/penalization).  I would appreciate comments from anyone, pro or
 con, with knowledge and experience of these things, however, as I'm rather
 limited on both.

I agree - it's never a good idea to use the same data for creating
your classifier and determining it's effectiveness (I meant to say
pick the one with the lowest error rate on your TEST data).

The reason to choose from many random starts is that fitting a given
neural network _model_ (ie. input x, n nodes, ...) is very hard due to
the large overparameterisation of the problem space.  For example, the
parameters for one node in a given layer can be exchanged with the
parameters of another node (as well as the parameters that use those
nodes in next layer), without changing the overall model.  This makes
it very hard to optimise, and nnet in R often gets stuck in local
minima.
Looking at what individual nodes are doing, you often see examples
where some nodes contribute nothing to the overall classification.
The random starts aren't to find different models but to find the
parameters for the given model that fits best.  And following this
line of argument, you would probably want to use the internal
criterion value, rather than some external measure of accuracy.

 The simple answer to the question of obtaining the error rate using
 validation data is: Do whatever you like to choose/fit a classifier on the
 training set. **Once you are done,** the estimate of your error rate is the
 error rate you get on applying that classifier to the validation set. But
 you can do this only once! If you don't like that error rate and go back to
 finding a a better predictor in some way, then your validation data have now
 been used to derive the classifier and thus has become part of the training
 data, so any further assessment of the error rate of a new classifier on it
 is now also a biased estimate. You need yet new validation data for that.

Understanding that that estimate is biased is important, but in
practice, do people really care that much?  If you have looked at a
single plot of your data and used that to inform your choice of the
classifier your estimates will already be biased (but if you have used
other knowledge of the data or subject area, you might expect them to
be biased in a positive direction). Are the estimates of model really
the most important thing?  Surely an understanding of the problem/data
is what you are really trying to gain.

Hadley

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new column

2007-05-08 Thread Schmitt, Corinna
Hallo,

I just now a solution for da data frame. I'm not sure if this is what you want. 
Just try if it helps. Here an example of my code where I added a column:

df - rbind( c(fred,mary,4),c(fred,mary,7),
 c(fred,mary,9),c(barney,liz,3),
 c(barney,liz,5))
df - data.frame(df)
colnames(df) - c(father,mother,child.age)

# adding column
df - data.frame(df,weddingdate=c(Dec 12th, 1980,Dec 12th, 1980,
Dec 12th, 1980,Apr 9th, 2003,  
Apr 9th, 2003))
df
  
The R-Gui Result:
  father mother child.ageweddingdate
1   fred   mary 4 Dec 12th, 1980
2   fred   mary 7 Dec 12th, 1980
3   fred   mary 9 Dec 12th, 1980
4 barneyliz 3  Apr 9th, 2003
5 barneyliz 5  Apr 9th, 2003

Caution: the number of entries in adding column must correspond to the number 
of rows in your existing data frame df (here 5)

Try this soultion,
Corinna


-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von raymond chiruka
Gesendet: Montag, 7. Mai 2007 16:28
An: r
Betreff: [R] creating a new column

hie l would like to create a 6th column actual surv time from the following 
data 
  
  the condition being
  if  censoringTimesurvivaltime then actual survtime =survival time
  else actual survtime =censoring time
  
  the code l used to create the data is
  
   s=2
   while(s!=0){ n=20
 m-matrix(nrow=n,ncol=4)
 colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime)
for(i in 1:20)  
m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002))
m-cbind(m,0)
 m[m[,3]m[,4],5]-1
 colnames(m)[5]-censoring
  print(m)
   s=s-1
  treatmentgrp strata censoringTime survivalTime censoring
   [1,] 1  1   1.0121591137.80922 0
   [2,] 2  2  32.971439 247.21786 0
   [3,] 2  1  85.758253 797.04949 0
   [4,] 1  1  16.999171  78.92309 0
   [5,] 2  1 272.909896 298.21483 0
   [6,] 1  2 138.230629 935.96765 0
   [7,] 2  2  91.529859 141.08405 0
  
  
  l keep getting an error message when i try to  create the 6th column
  
  
  
  
  
-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bad optimization solution

2007-05-08 Thread Paul Smith
It seems that there is here a problem of reliability, as one never
knows whether the solution provided by R is correct or not. In the
case that I reported, it is fairly simple to see that the solution
provided by R (without any warning!) is incorrect, but, in general,
that is not so simple and one may take a wrong solution as a correct
one.

Paul


On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
 Your function, (x1-x2)^2, has zero gradient at all the starting values such
 that x1 = x2, which means that the gradient-based search methods will
 terminate there because they have found a critical point, i.e. a point at
 which the gradient is zero (which can be a maximum or a minimum or a saddle
 point).

 However, I do not why optim converges to the boundary maximum, when analytic
 gradient is supplied (as shown by Sundar).

 Ravi.

 
 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html



 
 


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
 Sent: Monday, May 07, 2007 6:26 PM
 To: R-help
 Subject: Re: [R] Bad optimization solution

 On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote:
   I think the problem is the starting point.  I do not remember the
 details
   of the BFGS method, but I am almost sure the (.5, .5) starting point is
   suspect, since the abs function is not differentiable at 0.  If you
 perturb
   the starting point even slightly you will have no problem.
  
Paul Smith
[EMAIL PROTECTED]

 To
Sent by:  R-help r-help@stat.math.ethz.ch
[EMAIL PROTECTED]
 cc
at.math.ethz.ch
  
 Subject
  [R] Bad optimization solution
05/07/2007 04:30
PM
  
  
  
  
  
  
  
  
   Dear All
  
   I am trying to perform the below optimization problem, but getting
   (0.5,0.5) as optimal solution, which is wrong; the correct solution
   should be (1,0) or (0,1).
  
   Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux).
  
   Thanks in advance,
  
   Paul
  
   --
   myfunc - function(x) {
 x1 - x[1]
 x2 - x[2]
 abs(x1-x2)
   }
  
  
 optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=
 list(fnscale=-1))
 
  Yes, with (0.2,0.9), a correct solution comes out. However, how can
  one be sure in general that the solution obtained by optim is correct?
  In ?optim says:
 
   Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows
   _box constraints_, that is each variable can be given a lower
   and/or upper bound. The initial value must satisfy the
   constraints. This uses a limited-memory modification of the BFGS
   quasi-Newton method. If non-trivial bounds are supplied, this
   method will be selected, with a warning.
 
  which only demands that the initial value must satisfy the constraints.

 Furthermore, X^2 is everywhere differentiable and notwithstanding the
 reported problem occurs with

 myfunc - function(x) {
   x1 - x[1]
   x2 - x[2]
   (x1-x2)^2
 }

 optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=
 list(fnscale=-1))

 Paul

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] package ltm -- version 0.8-0

2007-05-08 Thread Dimitris Rizopoulos
Dear R-users,

I'd like to announce the release of the new version of package `ltm' 
(i.e., ltm_0.8-0 soon available from CRAN) for Item Response Theory 
analyses. This package provides a flexible framework for analyzing 
dichotomous and polytomous data under IRT, including the Rasch model, 
the Two-Parameter Logistic model, Birnbaum's Three-Parameter model, 
the Latent Trait model with up to two latent variables (allowing also 
for nonlinear terms), and Samejima's Graded Response model. 
Furthermore, supporting functions for descriptive statistics, 
goodness-of-fit, ability estimation and plotting are available.

New features include:

  * The new functions person.fit() and item.fit() compute p-values for 
person- and item-fit statistics for IRT models for dichotomous data. 
The `simulate.p.value' argument enables the computation of p-values 
based on a Monte Carlo procedure.

  * The new function unidimTest() checks the unidimensionality 
assumption for dichotomous data IRT models, using a Modified Parallel 
Analysis.

  * The new function testEquatingData() prepares data-sets for test 
equating by common items. In particular, two types of common item 
equating are included: alternate form equating (where common and 
unique items are analyzed simultaneously) and across sample equating 
(where different sets of unique items are analyzed separately based on 
previously calibrated anchor items).

  * grm() now works with the available cases when incomplete data 
(i.e., in the presence of NAs) are analyzed.

  * better algorithms, for Missing At Random missing data mechanisms, 
have been written for grm(), ltm(), rasch() and tpm().

  * a residuals() method has been added for `grm', `ltm', `rasch', and 
`tpm' objects that computes Pearson-type residuals.

  * factor.scores() and fitted() methods for classes `grm', `ltm', 
`rasch', and `tpm' allow now for NAs in the `resp.patterns' argument, 
enabling thus the computation of ability estimates and fitted values 
for incomplete response patterns.

  * the fitted() method now allows also for the computation of 
marginal and conditional (on the latent variable(s)) probabilities; 
this feature is controlled by the new `type' argument.

  * for more details and other news, check the CHANGES file that ships 
with the package.

More information as well as .R files illustrating the capabilities of 
the package can be found in the Rwiki page of `ltm' available at: 
http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:ltm.

Future plans include the development of functions for fitting Bock's 
Nominal Response model and the option for Differential Item 
Functioning.

I'd like also to thank all users of `ltm' for providing valuable 
feedback, and welcome any additional feedback (questions, suggestions, 
bug-reports, etc.).

Best,
Dimitris


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Weighted least squares

2007-05-08 Thread hadley wickham
Dear all,

I'm struggling with weighted least squares, where something that I had
assumed to be true appears not to be the case.  Take the following
data set as an example:

df - data.frame(x = runif(100, 0, 100))
df$y - df$x + 1 + rnorm(100, sd=15)

I had expected that:

summary(lm(y ~ x, data=df, weights=rep(2, 100)))
summary(lm(y ~ x, data=rbind(df,df)))

would be equivalent, but they are not.  I suspect the difference is
how the degrees of freedom is calculated - I had expected it to be
sum(weights), but seems to be sum(weights  0).  This seems
unintuitive to me:

summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50)))
summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50)))

What am I missing?  And what is the usual way to do a linear
regression when you have aggregated data?

Thanks,

Hadley

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Representing a statistic as a colour on a 2d plot

2007-05-08 Thread Jim Lemon
mister_bluesman wrote:
 Hello.
 
 I have a 2d plot which looks like this:
 
 http://www.nabble.com/file/8242/1.JPG 
 
 This plot is derived from a file that holds statistics about each point on
 the plot and looks like this:
 
   abc   d  e
 a00.4980.4730.524   0.528 
 b   0.498  0   0   0  0
 c   0.473  0   0   0  0
 d   0.524  0   0   0  0
 e   0.528  0   0   0  0
 
 However, I have another file called 2.txt, with the following contents:
 
 a  b  c  d  e   
 0.5   0.7  0.32 0.34 0.01
 
 What I would like to know is how do I convert these values in 2.txt to
 colours or colour intensities so that the x's in the diagram above can be
 colour coded as such.

Yo bluesman,

check color.scale in the plotrix package, cat
it'll color your points to the values they're at

Jim

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do i calculate emp Returns

2007-05-08 Thread Soare Marcian-Alin
Hello,

I want to build empirical Returns of stockfonds, but I cant find a function
in R, which calculate this.

Alin

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new column

2007-05-08 Thread Schmitt, Corinna
Hallo,
I just tried, if my solution might be possible for you. Here is the result:

 s=2
while(s!=0){ n=20
+  m-matrix(nrow=n,ncol=4)
+  colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime)
+ for(i in 1:20)  
m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002))
+ m-cbind(m,0)
+  m[m[,3]m[,4],5]-1
+  colnames(m)[5]-censoring
+   print(m)
+s=s-1
+ }

#bilding a data frame from m

x=data.frame(m)

# adding a column, where nrow(x) = number of row in x and in the 
# new coulmn should stand the entries 1...20.

x=data.frame(x,actual surv time=c(1:nrow(x)))

 x
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
2 1  2242.468604   1061.30474 02
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532480.12355 0   11
122  2 15.313303139.19791 0   12
131  2 17.205624641.15764 0   13
142  1 81.753924107.02202 0   14
151  2 60.774221665.27500 0   15
162  1  8.712562142.90775 0   16
171  1 54.542722   1904.88060 0   17
182  2 85.626140214.66811 0   18
192  1 31.257923739.96591 0   19
201  1 85.910141306.14860 0   20


See it works! For more information of data frames, see ?data.frame 
Here are some additional examples.


 x[2,6]  # Extraction of special entries: x[2,6]= 2 - row 2, column 6
[1] 2
 x[2,6]=100  # Changing entires: x[2,6]=100
 x
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
2 1  2242.468604   1061.30474 0  100
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532480.12355 0   11
122  2 15.313303139.19791 0   12
131  2 17.205624641.15764 0   13
142  1 81.753924107.02202 0   14
151  2 60.774221665.27500 0   15
162  1  8.712562142.90775 0   16
171  1 54.542722   1904.88060 0   17
182  2 85.626140214.66811 0   18
192  1 31.257923739.96591 0   19
201  1 85.910141306.14860 0   20
 t=x
 row2=t[-2,] # deleting a row
 row2
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532

[R] trouble with help

2007-05-08 Thread Schmitt, Corinna
Hallo,

I just updated to the new version of R by installing everything new. Now
I have a problem with the help command:

 help.start()
updating HTML package listing
updating HTML search index
If nothing happens, you should open
'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself

The browser was started but nothing was displayes.

Can anyone help me,

Corinna

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Piecewise cubic Hermite interpolation

2007-05-08 Thread Vladimir Eremeev

Which function implements the piecewise cubic Hermite interpolation?

I am looking for equivalent of matlab's interp1 with the method = 'pchip'

Here is the reference

http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/interp1.html;
-- 
View this message in context: 
http://www.nabble.com/Piecewise-cubic-Hermite-interpolation-tf3708765.html#a10373214
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sample() and memory usage

2007-05-08 Thread Victor Gravenholt
As a part of a simulation, I need to sample from a large vector repeatedly.
For some reason sample() builds up the memory usage ( 500 MB for this 
example) when used inside a for loop as illustrated here:

X - 1:10
P - runif(10)
for(i in 1:500) Xsamp - sample(X,3,replace=TRUE,prob=P)

Even worse, I am not able to free up memory without quitting R.
I quickly run out of memory when trying to perform the simulation. Is 
there any way to avoid this to happen?

The problem seem to appear only when specifying replace=TRUE and 
probability weights for the vector being sampled, and this happens both 
on Windows XP and Linux (Ubuntu).


Victor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new column

2007-05-08 Thread Schmitt, Corinna
Hallo,

if you work with matrix you can use cbind to add columns.

Corinna

  


-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von raymond chiruka
Gesendet: Montag, 7. Mai 2007 16:28
An: r
Betreff: [R] creating a new column

hie l would like to create a 6th column actual surv time from the following 
data 
  
  the condition being
  if  censoringTimesurvivaltime then actual survtime =survival time
  else actual survtime =censoring time
  
  the code l used to create the data is
  
   s=2
   while(s!=0){ n=20
 m-matrix(nrow=n,ncol=4)
 colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime)
for(i in 1:20)  
m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002))
m-cbind(m,0)
 m[m[,3]m[,4],5]-1
 colnames(m)[5]-censoring
  print(m)
   s=s-1
  treatmentgrp strata censoringTime survivalTime censoring
   [1,] 1  1   1.0121591137.80922 0
   [2,] 2  2  32.971439 247.21786 0
   [3,] 2  1  85.758253 797.04949 0
   [4,] 1  1  16.999171  78.92309 0
   [5,] 2  1 272.909896 298.21483 0
   [6,] 1  2 138.230629 935.96765 0
   [7,] 2  2  91.529859 141.08405 0
  
  
  l keep getting an error message when i try to  create the 6th column
  
  
  
  
  
-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trouble with help

2007-05-08 Thread Vladimir Eremeev

Did you try manually opening the file mentioned in the browser
(Z:\Software\R\R-2.5.0\doc\html\index.html) ?
Does this file exist? 
Was the html help installed?


Schmitt, Corinna wrote:
 
 Hallo,
 
 I just updated to the new version of R by installing everything new. Now
 I have a problem with the help command:
 
 help.start()
 updating HTML package listing
 updating HTML search index
 If nothing happens, you should open
 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself
 
 The browser was started but nothing was displayes.
 
 Can anyone help me,
 
 Corinna
 
 

-- 
View this message in context: 
http://www.nabble.com/trouble-with-help-tf3708685.html#a10373271
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bad optimization solution

2007-05-08 Thread Jasjeet Singh Sekhon

The issue is that you are using a derivative based optimizer for a
problem for which it is well known that such optimizers will not
perform well.  You should consider using a global optimizer.  For
example, rgenoud combines a genetic search algorithm with a BFGS
optimizer and it works well for your problem:

library(rgenoud)

myfunc - function(x) {
  x1 - x[1]
   x2 - x[2]
   abs(x1-x2)
 }

optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=list(fnscale=-1))

genoud(myfunc, nvars=2, 
Domains=rbind(c(0,1),c(0,1)),max=TRUE,boundary.enforcement=2)

myfunc - function(x) {
  x1 - x[1]
  x2 - x[2]
  (x1-x2)^2
}

optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=list(fnscale=-1))
genoud(myfunc, nvars=2, 
Domains=rbind(c(0,1),c(0,1)),max=TRUE,boundary.enforcement=2)

Cheers,
Jas.

===
Jasjeet S. Sekhon 
  
Associate Professor 
Travers Department of Political Science
Survey Research Center  
UC Berkeley 

http://sekhon.berkeley.edu/
V: 510-642-9974  F: 617-507-5524
===






Paul Smith writes:
  It seems that there is here a problem of reliability, as one never
  knows whether the solution provided by R is correct or not. In the
  case that I reported, it is fairly simple to see that the solution
  provided by R (without any warning!) is incorrect, but, in general,
  that is not so simple and one may take a wrong solution as a correct
  one.
  
  Paul
  
  
  On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
   Your function, (x1-x2)^2, has zero gradient at all the starting values such
   that x1 = x2, which means that the gradient-based search methods will
   terminate there because they have found a critical point, i.e. a point at
   which the gradient is zero (which can be a maximum or a minimum or a saddle
   point).
  
   However, I do not why optim converges to the boundary maximum, when 
   analytic
   gradient is supplied (as shown by Sundar).
  
   Ravi.
  
   
   ---
  
   Ravi Varadhan, Ph.D.
  
   Assistant Professor, The Center on Aging and Health
  
   Division of Geriatric Medicine and Gerontology
  
   Johns Hopkins University
  
   Ph: (410) 502-2619
  
   Fax: (410) 614-9625
  
   Email: [EMAIL PROTECTED]
  
   Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
  
  
  
   
   
  
  
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith
   Sent: Monday, May 07, 2007 6:26 PM
   To: R-help
   Subject: Re: [R] Bad optimization solution
  
   On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote:
 I think the problem is the starting point.  I do not remember the
   details
 of the BFGS method, but I am almost sure the (.5, .5) starting point is
 suspect, since the abs function is not differentiable at 0.  If you
   perturb
 the starting point even slightly you will have no problem.

  Paul Smith
  [EMAIL PROTECTED]
  
   To
  Sent by:  R-help 
 r-help@stat.math.ethz.ch
  [EMAIL PROTECTED]
   cc
  at.math.ethz.ch

   Subject
[R] Bad optimization solution
  05/07/2007 04:30
  PM








 Dear All

 I am trying to perform the below optimization problem, but getting
 (0.5,0.5) as optimal solution, which is wrong; the correct solution
 should be (1,0) or (0,1).

 Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 
 (Linux).

 Thanks in advance,

 Paul

 --
 myfunc - function(x) {
   x1 - x[1]
   x2 - x[2]
   abs(x1-x2)
 }


   optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=
   list(fnscale=-1))
   
Yes, with (0.2,0.9), a correct solution comes out. However, how can
one be sure in general that the solution obtained by optim is correct?
In ?optim says:
   
 Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows
 _box constraints_, that is each variable can be given a lower
 and/or upper bound. The initial value must satisfy the
 constraints. This uses a limited-memory modification of the BFGS
 quasi-Newton method. If non-trivial bounds are supplied, this
 method will be selected, with a warning.
   
which only demands that the initial value must satisfy the constraints.
  
   Furthermore, X^2 is everywhere differentiable and notwithstanding the
   reported problem occurs with
 

[R] sample() and memory usage

2007-05-08 Thread Victor Gravenholt
As a part of a simulation, I need to sample from a large vector repeatedly.
For some reason sample() builds up the memory usage ( 500 MB for this 
example) when used inside a for loop as illustrated here:

X - 1:10
P - runif(10)
for(i in 1:500) Xsamp - sample(X,3,replace=TRUE,prob=P)

Even worse, I am not able to free up memory without quitting R.
I quickly run out of memory when trying to perform the simulation. Is 
there any way to avoid this to happen?

The problem seem to appear only when specifying replace=TRUE and 
probability weights for the vector being sampled, and this happens both 
on Windows XP and Linux (Ubuntu).


Victor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trouble with help

2007-05-08 Thread Schmitt, Corinna
How can I install the html help?

Corinna
.de
http://www.igb.fraunhofer.de



  


-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Vladimir Eremeev
Gesendet: Dienstag, 8. Mai 2007 12:22
An: r-help@stat.math.ethz.ch
Betreff: Re: [R] trouble with help


Did you try manually opening the file mentioned in the browser
(Z:\Software\R\R-2.5.0\doc\html\index.html) ?
Does this file exist? 
Was the html help installed?


Schmitt, Corinna wrote:
 
 Hallo,
 
 I just updated to the new version of R by installing everything new. Now
 I have a problem with the help command:
 
 help.start()
 updating HTML package listing
 updating HTML search index
 If nothing happens, you should open
 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself
 
 The browser was started but nothing was displayes.
 
 Can anyone help me,
 
 Corinna
 
 

-- 
View this message in context: 
http://www.nabble.com/trouble-with-help-tf3708685.html#a10373271
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trouble with help

2007-05-08 Thread Vladimir Eremeev

You should select it in the list of components to install, when the installer
asks.
It happens after you 3rd time press Next in the installation wizard
window.
Choose Custom installation and select all you need.


Schmitt, Corinna wrote:
 
 How can I install the html help?
 
 Corinna
 
 -Ursprüngliche Nachricht-
 Von: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Im Auftrag von Vladimir Eremeev
 Gesendet: Dienstag, 8. Mai 2007 12:22
 An: r-help@stat.math.ethz.ch
 Betreff: Re: [R] trouble with help
 
 Did you try manually opening the file mentioned in the browser
 (Z:\Software\R\R-2.5.0\doc\html\index.html) ?
 Does this file exist? 
 Was the html help installed?
 
 

-- 
View this message in context: 
http://www.nabble.com/trouble-with-help-tf3708685.html#a10374007
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trouble with help

2007-05-08 Thread Schmitt, Corinna
Hallo,

I found the solution. I installed everything new and in the setup I choose 
Changing Startoptions and there I could mark html help.

It now works,
Corinna


  


-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Schmitt, Corinna
Gesendet: Dienstag, 8. Mai 2007 13:03
An: Vladimir Eremeev; r
Betreff: Re: [R] trouble with help

How can I install the html help?

Corinna
.de
http://www.igb.fraunhofer.de



  


-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Vladimir Eremeev
Gesendet: Dienstag, 8. Mai 2007 12:22
An: r-help@stat.math.ethz.ch
Betreff: Re: [R] trouble with help


Did you try manually opening the file mentioned in the browser
(Z:\Software\R\R-2.5.0\doc\html\index.html) ?
Does this file exist? 
Was the html help installed?


Schmitt, Corinna wrote:
 
 Hallo,
 
 I just updated to the new version of R by installing everything new. Now
 I have a problem with the help command:
 
 help.start()
 updating HTML package listing
 updating HTML search index
 If nothing happens, you should open
 'Z:\Software\R\R-2.5.0\doc\html\index.html' yourself
 
 The browser was started but nothing was displayes.
 
 Can anyone help me,
 
 Corinna
 
 

-- 
View this message in context: 
http://www.nabble.com/trouble-with-help-tf3708685.html#a10373271
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Neural Nets (nnet) - evaluating success rate of predictions

2007-05-08 Thread Antti Arppe
Nathaniel,

On Mon, 7 May 2007, [EMAIL PROTECTED] wrote:
 Date: Sun, 6 May 2007 12:02:31 + (GMT)
 From: nathaniel Grey [EMAIL PROTECTED]

 However what I really want to know is how well my nueral net is
 doing at classifying my binary output variable. I am new to R and I
 can't figure out how you can assess the success rates of
 predictions.

I've been recently tacking this myself, though with respect to
polytomous (2) outcomes. The following approaches are based on
Menard (1995), Cohen et al. (2002) and Manning  Schütze (1999).

First you have to decide what is the critical probability that you use
to classify the cases into class A (and consequently not(class[A])).
The simplest level is 0.5, but other levels might also be motivated,
see e.g. Cohen et al. (2002: 516-519).

You can then treat the classification task as two distinct types,
namely classification and prediction models, which have an effect on
how the efficiency and accuracy of prediction is exactly measured
(Menard 1995: 24-26). In a pure prediction model, we set no a priori
expectation or constraint on the overall frequencies of the predicted
classes. To the contrary, in a classification model our expectation is
that the predicted outcome classes on the long run will end up having
the same proportions as are evident in the training data.

As the starting point for evaluating prediction efficiency is to
compile a 2x2 prediction/classification table. Frequency counts on the
(decending) diagonal in the table indicate correctly predicted and
classified cases, whereas all counts off the diagonal are incorrect.
For the two alternatives overerall, we can divide the predicted
classifications into the four types presented below, on which the
basic measures of prediction efficiency are based. (Manning and
Schütze 1999: 267-271)

Original/Predicted  Class   not(Class)(=Other)
Class   TP ~ True Positive) FN ~ False Negative
not(Class) (=Other) FP ~ False Positive TN ~ True Negative

You can then go on to calculate recall and precision, or spesificity
or sensitivity. Recall is the proportion of original occurrences of
some particular class for which the prediction is correct (formula 1
below, see Manning and Schütze 1999: 269, formula 8.4), whereas
precision is the proportion of the all the predictions of some
particular class, which turn out to be correct (formula 2 below, see
Manning and Schütze 1999: 268, formula 8.3). Sensitivity is in fact
exactly equal to recall, whereas specificity is understood as the
proportion of non-cases correctly predicted or classified as
non-cases, i.e. rejected (formula 3 below) Furthermore, there is a
third pair of evaluation measures that one could also calculate,
namely accuracy and error (formula 4 below) (Manning and Schütze 1999:
268-270).

(1) Recall = TP / (TP + FN) (=Sensitivity)

(2) Precision = TP / (TP + FP)

(3) Specificity = TN / (TN + FN)

(4) Accuracy = (TP + TN) / N = diag(n[k,k])

However, as has been noted in some earlier responses these
aforementioned general measures do not in any way take into
consideration whether prediction and classification according to a
model, with the help of explanatory variables, performs any better
than knowing the overall proportions of the outcome classes.
For this purpose, the asymmetric summary measures of association based
on Proportionate Reduction of Error (PRE) are good candidates for
evaluating prediction accuracy, where we expect that the prediction or
classification process on the basis of the models should exceed some
baselines or thresholds. However, one cannot use the Goodman-Kruskal
lambda and tau as such, but make some adjustments to account for the
possibility of incorrect prediction.

With this approach one compares prediction/classification errors with
the model, error(model), to the baseline level of
prediction/classification errors without the error(model, baseline),
according to formula 10 below. (Menard 1995: 28-30). The formula for
the error with the model remains the same, irrespective of whether we
are evaluating prediction or classification accuracy, presented in
(5), but the errors without the model vary according to the intended
objective, presented in (6) and (7). Subsequently, the measure for the
proportionate reduction of prediction error is presented in (9) below,
and being analogous to the Goodman-Kruskal lambda it is designated as
lambda(prediction). Similarly, the measure for proportionate reduction
of classification error is presented in (10), and being analogous with
the Goodman-Kruskal tau it is likewise designated as
tau(classification). For both measures, positive values indicate
better than baseline classification, while negative values worse
performance.

(5)  error(model) = N - SUM{k=1...K}n[k,k] = N - SUM{diag(n)],
where n is the 2x2 prediction/classification matrix

(6)  error(baseline, prediction) = N - max(R[k]),
with R[k] = marginal row sums for each row k of altogether K 

[R] minimum from matrix

2007-05-08 Thread oarabile


I have a very large matrix with columns that have some of their
entries as zero


A small example if a=

 [,1]  [,2]  [,3]  [,4]
 [,1] 0 2 0 0
 [,2] 1 3 0 3
 [,3] 2 0 3 5
 [,4] 0 4 0 0

and what to get the minimum number from each column but that number
should not be zero. If I use apply (a,2,min) I will get a vector of
zeros as the minimum but what I want it for example from column 1 I
should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I
wonder if someone can give an idea on how to go about it.

thanks in advance for your help.

Oarabile

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum from matrix

2007-05-08 Thread Gabor Csardi
apply(a, 2, function(x) min(x[x!=0]) )

should do it. Might need some improvement if all numbers in a column
can be zero, try it.

Gabor

On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote:
 
 
 I have a very large matrix with columns that have some of their
 entries as zero
 
 
 A small example if a=
 
  [,1]  [,2]  [,3]  [,4]
  [,1] 0 2 0 0
  [,2] 1 3 0 3
  [,3] 2 0 3 5
  [,4] 0 4 0 0
 
 and what to get the minimum number from each column but that number
 should not be zero. If I use apply (a,2,min) I will get a vector of
 zeros as the minimum but what I want it for example from column 1 I
 should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I
 wonder if someone can give an idea on how to go about it.
 
 thanks in advance for your help.
 
 Oarabile
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] irtoys

2007-05-08 Thread Ivailo Partchev
I have just submitted irtoys_0.1.0, a package potentially useful for 
those working with IRT models. It can fit the 1PL, 2PL, and 3PL models 
through a simple and unified syntax, using either the R package ltm, 
Brad Hanson's ICL program, or the commercially available BILOG-MG. The 
purpose is basically to facilitate teaching, and especially comparisons 
across models and/or programs. Various graphs, item fit statistics, 
scaling methods, ability estimation, simulation facilities etc. are also 
included. Please notice that most options in estimating an IRT model are 
kept to the typical default values, and only a small, common subset of 
the ltm, ICL, and BILOG syntax is supported. Comments are very welcome.

Ivailo Partchev
Institute of Psychology
University of Jena
Germany

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum from matrix

2007-05-08 Thread Dimitris Rizopoulos
try this:

apply(a, 2, function(x) min(x[x  0]))


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, May 08, 2007 10:50 AM
Subject: [R] minimum from matrix




 I have a very large matrix with columns that have some of their
 entries as zero


 A small example if a=

 [,1]  [,2]  [,3]  [,4]
 [,1] 0 2 0 0
 [,2] 1 3 0 3
 [,3] 2 0 3 5
 [,4] 0 4 0 0

 and what to get the minimum number from each column but that number
 should not be zero. If I use apply (a,2,min) I will get a vector of
 zeros as the minimum but what I want it for example from column 1 I
 should get 1 i.e for all the matrix I should get a vector (1,2,3,3). 
 I
 wonder if someone can give an idea on how to go about it.

 thanks in advance for your help.

 Oarabile

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weighted least squares

2007-05-08 Thread Adaikalavan Ramasamy
See below.

hadley wickham wrote:
 Dear all,
 
 I'm struggling with weighted least squares, where something that I had
 assumed to be true appears not to be the case.  Take the following
 data set as an example:
 
 df - data.frame(x = runif(100, 0, 100))
 df$y - df$x + 1 + rnorm(100, sd=15)
 
 I had expected that:
 
 summary(lm(y ~ x, data=df, weights=rep(2, 100)))
 summary(lm(y ~ x, data=rbind(df,df)))

You assign weights to different points according to some external 
quality or reliability measure not number of times the data point was 
measured.

Look at the estimates and standard error of the two models below:

  coefficients( summary(f.w - lm(y ~ x, data=df, weights=rep(2, 100))) )
  Estimate Std. Error   t value Pr(|t|)
  (Intercept) 1.940765 3.30348066  0.587491 5.582252e-01
  x   0.982610 0.05893262 16.673448 2.264258e-30

  coefficients( summary( f.u - lm(y ~ x, data=rbind(df,df) ) ) )
  Estimate Std. Errort value Pr(|t|)
  (Intercept) 1.940765 2.32408609  0.8350659 4.046871e-01
  x   0.982610 0.04146066 23.6998165 1.012067e-59

You can see that they have same coefficient estimates but the second one 
  has smaller variances.

The repeated values artificially deflates the variance and thus inflates 
the precision. This is why you cannot treat replicate data as 
independent observations.


 would be equivalent, but they are not.  I suspect the difference is
 how the degrees of freedom is calculated - I had expected it to be
 sum(weights), but seems to be sum(weights  0).  This seems
 unintuitive to me:
 
 summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50)))
 summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50)))
 
 What am I missing?  And what is the usual way to do a linear
 regression when you have aggregated data?

I would be best to use the individual data points instead of aggregated 
data as it allows you to estimate the within-group variations as well.

If you had individual data points, you could try something as follows. 
Please check the codes as I am no expert in the area of repeated measures.

  x  - runif(100, 0, 100)
  y1 - x + rnorm(100, mean=1, sd=15)
  y2 - y1 + rnorm(100, sd=5)

  df - data.frame( y=c(y1, y2),
x=c(x,x),
subject=factor(rep( paste(p, 1:100, sep=), 2 ) ))

  library(nlme)
  summary( lme( y ~ x, random = ~ 1 | subject, data=df ) )

Try reading Pinheiro and Bates (http://tinyurl.com/yvvrr7) or related 
material for more information. Hope this helps.

 Thanks,
 
 Hadley

Regards, Adai

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sample function and memory usage

2007-05-08 Thread Victor Gravenholt
As a part of a simulation, I need to sample from a large vector repeatedly.
For some reason sample() builds up the memory usage ( 500 MB for this 
example) when used inside a for loop as illustrated here:

X - 1:10
P - runif(10)
for(i in 1:500) Xsamp - sample(X,3,replace=TRUE,prob=P)

Even worse, I am not able to free up memory without quitting R.
I quickly run out of memory when trying to perform the simulation. Is 
there any way to avoid this to happen?

The problem seem to appear only when specifying both replace=TRUE and 
probability weights for the vector being sampled, and this happens both 
on Windows XP and Linux (Ubuntu).


Victor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weighted least squares

2007-05-08 Thread hadley wickham
On 5/8/07, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
 See below.

 hadley wickham wrote:
  Dear all,
 
  I'm struggling with weighted least squares, where something that I had
  assumed to be true appears not to be the case.  Take the following
  data set as an example:
 
  df - data.frame(x = runif(100, 0, 100))
  df$y - df$x + 1 + rnorm(100, sd=15)
 
  I had expected that:
 
  summary(lm(y ~ x, data=df, weights=rep(2, 100)))
  summary(lm(y ~ x, data=rbind(df,df)))

 You assign weights to different points according to some external
 quality or reliability measure not number of times the data point was
 measured.

That is one type of weighting - but what if I have already aggregated
data?  That is a perfectly valid type of weighting too.

 Look at the estimates and standard error of the two models below:

   coefficients( summary(f.w - lm(y ~ x, data=df, weights=rep(2, 100))) )
   Estimate Std. Error   t value Pr(|t|)
   (Intercept) 1.940765 3.30348066  0.587491 5.582252e-01
   x   0.982610 0.05893262 16.673448 2.264258e-30

   coefficients( summary( f.u - lm(y ~ x, data=rbind(df,df) ) ) )
   Estimate Std. Errort value Pr(|t|)
   (Intercept) 1.940765 2.32408609  0.8350659 4.046871e-01
   x   0.982610 0.04146066 23.6998165 1.012067e-59

 You can see that they have same coefficient estimates but the second one
   has smaller variances.

 The repeated values artificially deflates the variance and thus inflates
 the precision. This is why you cannot treat replicate data as
 independent observations.

Hardly artificially - I have repeated observations.

  would be equivalent, but they are not.  I suspect the difference is
  how the degrees of freedom is calculated - I had expected it to be
  sum(weights), but seems to be sum(weights  0).  This seems
  unintuitive to me:
 
  summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50)))
  summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50)))
 
  What am I missing?  And what is the usual way to do a linear
  regression when you have aggregated data?

 I would be best to use the individual data points instead of aggregated
 data as it allows you to estimate the within-group variations as well.

There is no within group variation - these are observations that occur
with same values many times in the dataset, so have been aggregated
into the a contingency table-like format.

 If you had individual data points, you could try something as follows.
 Please check the codes as I am no expert in the area of repeated measures.

   x  - runif(100, 0, 100)
   y1 - x + rnorm(100, mean=1, sd=15)
   y2 - y1 + rnorm(100, sd=5)

   df - data.frame( y=c(y1, y2),
 x=c(x,x),
 subject=factor(rep( paste(p, 1:100, sep=), 2 ) ))

   library(nlme)
   summary( lme( y ~ x, random = ~ 1 | subject, data=df ) )

 Try reading Pinheiro and Bates (http://tinyurl.com/yvvrr7) or related
 material for more information. Hope this helps.

I'm not interested in a mixed model, and I don't have individual data points.

Hadley

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum from matrix

2007-05-08 Thread Schmitt, Corinna
Hallo,

I added one row:

 a=rbind(a,1:4)
 a
 [,1] [,2] [,3] [,4]
[1,]0120
[2,]2304
[3,]0030
[4,]0350
[5,]1234

And how looks like the command for the minimum of the rows? The result should 
be minOfRows = 0 0 0 0 1

Thanks,
Corinna


  


-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Gabor Csardi
Gesendet: Dienstag, 8. Mai 2007 14:03
An: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Betreff: Re: [R] minimum from matrix

apply(a, 2, function(x) min(x[x!=0]) )

should do it. Might need some improvement if all numbers in a column
can be zero, try it.

Gabor

On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote:
 
 
 I have a very large matrix with columns that have some of their
 entries as zero
 
 
 A small example if a=
 
  [,1]  [,2]  [,3]  [,4]
  [,1] 0 2 0 0
  [,2] 1 3 0 3
  [,3] 2 0 3 5
  [,4] 0 4 0 0
 
 and what to get the minimum number from each column but that number
 should not be zero. If I use apply (a,2,min) I will get a vector of
 zeros as the minimum but what I want it for example from column 1 I
 should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I
 wonder if someone can give an idea on how to go about it.
 
 thanks in advance for your help.
 
 Oarabile
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum from matrix

2007-05-08 Thread Gabor Csardi
Corinna,

what is going on here? I've answered Oarabile's question, and then 
you reply to me with this. I'm completely lost. 

1. What is your question? Minimum of every row? 
   This was written in the original mail along with Oarabile's
   question. Ok it wasn't rows but columns.
   For rows it is apply(a, 1, min)
2. Why did you reply to my message? 
3. And if you did so, why didn't you read the original message 
   you're replying to?
4. Why did you add one row to the original matrix? Why is the new
   matrix significantly different form the original one?

Gabor

On Tue, May 08, 2007 at 02:49:50PM +0200, Schmitt, Corinna wrote:
 Hallo,
 
 I added one row:
 
  a=rbind(a,1:4)
  a
  [,1] [,2] [,3] [,4]
 [1,]0120
 [2,]2304
 [3,]0030
 [4,]0350
 [5,]1234
 
 And how looks like the command for the minimum of the rows? The result should 
 be minOfRows = 0 0 0 0 1
 
 Thanks,
 Corinna
 
 
   
 
 
 -Ursprüngliche Nachricht-
 Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im Auftrag von Gabor Csardi
 Gesendet: Dienstag, 8. Mai 2007 14:03
 An: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Betreff: Re: [R] minimum from matrix
 
 apply(a, 2, function(x) min(x[x!=0]) )
 
 should do it. Might need some improvement if all numbers in a column
 can be zero, try it.
 
 Gabor
 
 On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote:
  
  
  I have a very large matrix with columns that have some of their
  entries as zero
  
  
  A small example if a=
  
   [,1]  [,2]  [,3]  [,4]
   [,1] 0 2 0 0
   [,2] 1 3 0 3
   [,3] 2 0 3 5
   [,4] 0 4 0 0
  
  and what to get the minimum number from each column but that number
  should not be zero. If I use apply (a,2,min) I will get a vector of
  zeros as the minimum but what I want it for example from column 1 I
  should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I
  wonder if someone can give an idea on how to go about it.
  
  thanks in advance for your help.
  
  Oarabile
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 -- 
 Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A function for raising a matrix to a power?

2007-05-08 Thread Paul Gilbert


Ravi Varadhan wrote:
 Paul,
 
 Your solution based on SVD does not work 

Ooops. I really am getting rusty. The idea is based on eigenvalues 
which, of course, are not always the same as singular values.

Paul
even for the matrix in your example
 (the reason it worked for e=3, was that it is an odd power and since P is a
 permutation matrix.  It will also work for all odd powers, but not for even
 powers).
   
 However, a spectral decomposition will work for all symmetric matrices (SVD
 based exponentiation doesn't work for any matrix).  
 
 Here is the function to do exponentiation based on spectral decomposition:
 
 expM.sd - function(X,e){Xsd - eigen(X); Xsd$vec %*% diag(Xsd$val^e) %*%
 t(Xsd$vec)}
 
 P - matrix(c(.3,.7, .7, .3), ncol=2)
 P
  [,1] [,2]
 [1,]  0.3  0.7
 [2,]  0.7  0.3
 P %*% P
  [,1] [,2]
 [1,] 0.58 0.42
 [2,] 0.42 0.58
 expM(P,2)  
  [,1] [,2]
 [1,] 0.42 0.58
 [2,] 0.58 0.42
 expM.sd(P,2)
  [,1] [,2]
 [1,] 0.58 0.42
 [2,] 0.42 0.58
 
 Exponentiation based on spectral decomposition should be generally more
 accurate (not sure about the speed).  A caveat is that it will only work for
 real, symmetric or complex, hermitian matrices.  It will not work for
 asymmetric matrices.  It would work for asymmetric matrix if the
 eigenvectors are made to be orthonormal.
 
 Ravi.
 
 
 ---
 
 Ravi Varadhan, Ph.D.
 
 Assistant Professor, The Center on Aging and Health
 
 Division of Geriatric Medicine and Gerontology 
 
 Johns Hopkins University
 
 Ph: (410) 502-2619
 
 Fax: (410) 614-9625
 
 Email: [EMAIL PROTECTED]
 
 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
 
  
 
 
 
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Paul Gilbert
 Sent: Monday, May 07, 2007 5:16 PM
 To: Atte Tenkanen
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] A function for raising a matrix to a power?
 
 You might try this, from 9/22/2006 with subject line Exponentiate a matrix:
 
 I am getting a bit rusty on some of these things, but I seem to recall 
 that there is a numerical advantage (speed and/or accuracy?) to 
 diagonalizing:
 
   expM - function(X,e) { v - La.svd(X); v$u %*% diag(v$d^e) %*% v$vt }
 
   P - matrix(c(.3,.7, .7, .3), ncol=2)
   P %*% P %*% P
   [,1]  [,2]
 [1,] 0.468 0.532
 [2,] 0.532 0.468
   expM(P,3)
   [,1]  [,2]
 [1,] 0.468 0.532
 [2,] 0.532 0.468
 
 I think this also works for non-integer, negative, large, and complex 
 exponents:
 
   expM(P, 1.5)
   [,1]  [,2]
 [1,] 0.3735089 0.6264911
 [2,] 0.6264911 0.3735089
   expM(P, 1000)
  [,1] [,2]
 [1,]  0.5  0.5
 [2,]  0.5  0.5
   expM(P, -3)
 [,1][,2]
 [1,] -7.3125  8.3125
 [2,]  8.3125 -7.3125
   expM(P, 3+.5i)
   [,1]  [,2]
 [1,] 0.4713+0.0141531i 0.5287-0.0141531i
 [2,] 0.5287-0.0141531i 0.4713+0.0141531i
  
 
 Paul Gilbert
 
 Doran, Harold wrote:
 
   Suppose I have a square matrix P
  
   P - matrix(c(.3,.7, .7, .3), ncol=2)
  
   I know that
  
  
   P * P
  
   Returns the element by element product, whereas
  
  
  
   P%*%P
  
  
   Returns the matrix product.
  
   Now, P2 also returns the element by element product. But, is there a
   slick way to write
  
   P %*% P %*% P
  
   Obviously, P3 does not return the result I expect.
  
   Thanks,
   Harold
  
 
 
 Atte Tenkanen wrote:
 Hi,

 Is there a function for raising a matrix to a power? For example if you
 like to compute A%*%A%*%A, is there an abbreviation similar to A^3?
 Atte Tenkanen

 A=rbind(c(1,1),c(-1,-2))
 A
  [,1] [,2]
 [1,]11
 [2,]   -1   -2
 A^3
  [,1] [,2]
 [1,]11
 [2,]   -1   -8

 But:

 A%*%A%*%A
  [,1] [,2]
 [1,]12
 [2,]   -2   -5

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 La version française suit le texte anglais.
 
 
 
 
 This email may contain privileged and/or confidential inform...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


La version française suit le texte anglais.



This email may contain 

[R] minimum of each row in a matrix

2007-05-08 Thread Schmitt, Corinna
Hallo,

I just followed the discussion of the title: minimum from matrix.
I have a similar problem.

 a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4)
 a=rbind(a,1:4)
 a
 [,1] [,2] [,3] [,4]
[1,]0120
[2,]2304
[3,]0030
[4,]0350
[5,]1234

 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) )
 minOfColumns
[1] 1 1 2 4
 maxOfColumns=apply(a, 2, function(x) max(x) )
 maxOfColumns
[1] 2 3 5 4


How looks like the command for the minimum of the rows? I tried several
possibilities with apply but did not get the wanted result. The result
should be minOfRows = 0 0 0 0 1

Thanks,
Corinna

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum of each row in a matrix

2007-05-08 Thread Sarah Goslee
Check out the help for apply, particularly the MARGIN argument:

minOfRows=apply(a, 1, function(x) min(x[x!=0]) )
maxOfRows=apply(a, 1, function(x) max(x) )

Sarah

On 5/8/07, Schmitt, Corinna [EMAIL PROTECTED] wrote:
 Hallo,

 I just followed the discussion of the title: minimum from matrix.
 I have a similar problem.

  a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4)
  a=rbind(a,1:4)
  a
  [,1] [,2] [,3] [,4]
 [1,]0120
 [2,]2304
 [3,]0030
 [4,]0350
 [5,]1234
 
  minOfColumns=apply(a, 2, function(x) min(x[x!=0]) )
  minOfColumns
 [1] 1 1 2 4
  maxOfColumns=apply(a, 2, function(x) max(x) )
  maxOfColumns
 [1] 2 3 5 4
 

 How looks like the command for the minimum of the rows? I tried several
 possibilities with apply but did not get the wanted result. The result
 should be minOfRows = 0 0 0 0 1

 Thanks,
 Corinna



-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum of each row in a matrix

2007-05-08 Thread Vladimir Eremeev

Looks like you are reading manuals and these mailings insufficiently
carefully.

?apply says that if its second argument is 1, it gives you what you want.

Gabor Csardi has also written you this.

If you have several vectors, not a single matrix, you can use pmin:
pmin(a[1,],a[2,],a[3,],a[4,],a[5,])
However, this variant is inefficient in this particular case (lots of
typing, and hardcoded).


Schmitt, Corinna wrote:
 
 Hallo,
 
 I just followed the discussion of the title: minimum from matrix.
 I have a similar problem.
 
 a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4)
 a=rbind(a,1:4)
 a
  [,1] [,2] [,3] [,4]
 [1,]0120
 [2,]2304
 [3,]0030
 [4,]0350
 [5,]1234

 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) )
 minOfColumns
 [1] 1 1 2 4
 maxOfColumns=apply(a, 2, function(x) max(x) )
 maxOfColumns
 [1] 2 3 5 4

 
 How looks like the command for the minimum of the rows? I tried several
 possibilities with apply but did not get the wanted result. The result
 should be minOfRows = 0 0 0 0 1
 
 

-- 
View this message in context: 
http://www.nabble.com/minimum-of-each-row-in-a-matrix-tf3709616.html#a10375966
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum of each row in a matrix

2007-05-08 Thread Gabor Csardi
Corinna, 

what about reading my previous mail, which you suggested to forget?

Gabor

ps. it is apply(a, 1, min)

On Tue, May 08, 2007 at 03:21:26PM +0200, Schmitt, Corinna wrote:
 Hallo,
 
 I just followed the discussion of the title: minimum from matrix.
 I have a similar problem.
 
  a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4)
  a=rbind(a,1:4)
  a
  [,1] [,2] [,3] [,4]
 [1,]0120
 [2,]2304
 [3,]0030
 [4,]0350
 [5,]1234
 
  minOfColumns=apply(a, 2, function(x) min(x[x!=0]) )
  minOfColumns
 [1] 1 1 2 4
  maxOfColumns=apply(a, 2, function(x) max(x) )
  maxOfColumns
 [1] 2 3 5 4
 
 
 How looks like the command for the minimum of the rows? I tried several
 possibilities with apply but did not get the wanted result. The result
 should be minOfRows = 0 0 0 0 1
 
 Thanks,
 Corinna
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plots - scatterplots - find index of a point in a list

2007-05-08 Thread gatemaze
Hi,

is it possible to find the index of a point of a plot (e.g. scatterplot) in
an easy way?

Eg.
x - c(1:5); y - c(1:5);
plot(x, y);

On the plot if I move my cursor on top of a point or click on it is it
possible to have its index printed or its exact value? Any clues?

Thanks.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum of each row in a matrix

2007-05-08 Thread Schmitt, Corinna
Hi Peter,

thanks for the quick help. Yes it does what I want.

Thanks, Corinna



  



Von: Peter Konings [mailto:[EMAIL PROTECTED] 
Gesendet: Dienstag, 8. Mai 2007 15:29
An: Schmitt, Corinna
Betreff: Re: [R] minimum of each row in a matrix



Hi Corinna,

does
apply(a, 1, min)
do what you want?

HTH
Peter.

On 5/8/07, Schmitt, Corinna  [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]  wrote:

Hallo,

I just followed the discussion of the title: minimum from matrix.
I have a similar problem.

 a=matrix(c(0,2, 0, 0, 1, 3, 0, 3, 2, 0, 3, 5, 0, 4, 0, 0),ncol=4)
 a=rbind(a,1:4)
 a
 [,1] [,2] [,3] [,4]
[1,]0120
[2,]2304
[3,]0030
[4,]0350
[5,]1234

 minOfColumns=apply(a, 2, function(x) min(x[x!=0]) ) 
 minOfColumns
[1] 1 1 2 4
 maxOfColumns=apply(a, 2, function(x) max(x) )
 maxOfColumns
[1] 2 3 5 4


How looks like the command for the minimum of the rows? I tried several
possibilities with apply but did not get the wanted result. The result 
should be minOfRows = 0 0 0 0 1

Thanks,
Corinna

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code. 




[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot time series

2007-05-08 Thread jessica . gervais

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plots - scatterplots - find index of a point in a list

2007-05-08 Thread John Kane
Try ?locator
--- [EMAIL PROTECTED] wrote:

 Hi,
 
 is it possible to find the index of a point of a
 plot (e.g. scatterplot) in
 an easy way?
 
 Eg.
 x - c(1:5); y - c(1:5);
 plot(x, y);
 
 On the plot if I move my cursor on top of a point or
 click on it is it
 possible to have its index printed or its exact
 value? Any clues?
 
 Thanks.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot time series

2007-05-08 Thread Roland Rau
Hi Jessica

[EMAIL PROTECTED] wrote:
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

your email is not really very informative.
Did you read the posting guide?
As suggested there, did you read the relevant section in 'An 
Introduction to R' (hint: did you look at section 12.1.1?)?
If you have still problems after these steps, it might be useful to 
provide a commented, minimal, self-contained, reproducible code example.

Best,
Roland

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weighted least squares

2007-05-08 Thread Adaikalavan Ramasamy
Sorry, you did not explain that your weights correspond to your 
frequency in the original post. I assumed they were repeated 
measurements with within group variation.

I was merely responding to your query why the following differed.
summary(lm(y ~ x, data=df, weights=rep(2, 100)))
summary(lm(y ~ x, data=rbind(df,df)))

Let me also clarify my statement about artificial. If one treats 
repeated observations as independent, then they obtain estimates with 
inflated precision. I was not calling your data artificial in any way.

Using frequency as weights may be valid. Your data points appear to 
arise from discrete distribution, so I am not entirely sure if you can 
use the linear model which assumes the errors are normally distributed.

Regards, Adai



hadley wickham wrote:
 On 5/8/07, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
 See below.

 hadley wickham wrote:
  Dear all,
 
  I'm struggling with weighted least squares, where something that I had
  assumed to be true appears not to be the case.  Take the following
  data set as an example:
 
  df - data.frame(x = runif(100, 0, 100))
  df$y - df$x + 1 + rnorm(100, sd=15)
 
  I had expected that:
 
  summary(lm(y ~ x, data=df, weights=rep(2, 100)))
  summary(lm(y ~ x, data=rbind(df,df)))

 You assign weights to different points according to some external
 quality or reliability measure not number of times the data point was
 measured.
 
 That is one type of weighting - but what if I have already aggregated
 data?  That is a perfectly valid type of weighting too.
 
 Look at the estimates and standard error of the two models below:

   coefficients( summary(f.w - lm(y ~ x, data=df, weights=rep(2, 100))) )
   Estimate Std. Error   t value Pr(|t|)
   (Intercept) 1.940765 3.30348066  0.587491 5.582252e-01
   x   0.982610 0.05893262 16.673448 2.264258e-30

   coefficients( summary( f.u - lm(y ~ x, data=rbind(df,df) ) ) )
   Estimate Std. Errort value Pr(|t|)
   (Intercept) 1.940765 2.32408609  0.8350659 4.046871e-01
   x   0.982610 0.04146066 23.6998165 1.012067e-59

 You can see that they have same coefficient estimates but the second one
   has smaller variances.

 The repeated values artificially deflates the variance and thus inflates
 the precision. This is why you cannot treat replicate data as
 independent observations.
 
 Hardly artificially - I have repeated observations.
 
  would be equivalent, but they are not.  I suspect the difference is
  how the degrees of freedom is calculated - I had expected it to be
  sum(weights), but seems to be sum(weights  0).  This seems
  unintuitive to me:
 
  summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50)))
  summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50)))
 
  What am I missing?  And what is the usual way to do a linear
  regression when you have aggregated data?

 I would be best to use the individual data points instead of aggregated
 data as it allows you to estimate the within-group variations as well.
 
 There is no within group variation - these are observations that occur
 with same values many times in the dataset, so have been aggregated
 into the a contingency table-like format.
 
 If you had individual data points, you could try something as follows.
 Please check the codes as I am no expert in the area of repeated 
 measures.

   x  - runif(100, 0, 100)
   y1 - x + rnorm(100, mean=1, sd=15)
   y2 - y1 + rnorm(100, sd=5)

   df - data.frame( y=c(y1, y2),
 x=c(x,x),
 subject=factor(rep( paste(p, 1:100, sep=), 2 ) ))

   library(nlme)
   summary( lme( y ~ x, random = ~ 1 | subject, data=df ) )

 Try reading Pinheiro and Bates (http://tinyurl.com/yvvrr7) or related
 material for more information. Hope this helps.
 
 I'm not interested in a mixed model, and I don't have individual data 
 points.
 
 Hadley
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting informative error messages

2007-05-08 Thread Spencer Graves
Hi, Tony: 

  Are you familiar with the 'debug' command?  I agree that more 
informative error messages and 'traceback' would be nice, but I've found 
the 'debug' facility quite useful.  [I even sometimes prepare a shell of 
a function 'fn', then say debug(fn) and fn(), and complete writing the 
function in its native environment where I can more easily check what 
each step does.]  I've heard that 'debug' does not work will with S4 
class generics, but I have not so far had to deal with that.  {There is 
also a 'debug' package, which is completely separate from the debug 
command in the 'base' package.  I've heard that it has more extensive 
capabilities, but I've never used it.}

  I suspect you may already know 'debug', but for those who don't, I 
think it's worth noting its utility for this kind of thing. 

  Hope this helps. 
  Spencer Graves

Tony Plate wrote:
 Certain errors seem to generate messages that are less informative than 
 most -- they just tell you which function an error happened in, but 
 don't indicate which line or expression the error occurred in.

 Here's a toy example:

   f - function(x) {a - 1; y - x[list(1:3)]; b - 2; return(y)}
   options(error=NULL)
   f(1:3)
 Error in f(1:3) : invalid subscript type
   traceback()
 1: f(1:3)
  

 In this function, it's clear that the error is in subscripting 'x', but 
 it's not always so immediately obvious in lengthier functions.

 Is there anything I can do to get a more informative error message in 
 this type of situation?  I couldn't find any help in the section 
 Debugging R Code in R-exts (or anything at all relevant in R-intro).

 (Different values for options(error=...) and different formatting of the 
 function made no difference.)

 -- Tony Plate

   sessionInfo()
 R version 2.5.0 (2007-04-23)
 i386-pc-mingw32

 locale:
 LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
 States.1252;LC_MONETARY=English_United 
 States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 [7] base

 other attached packages:
 tap.misc
 1.0
  

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bad optimization solution

2007-05-08 Thread RAVI VARADHAN
Paul,

The problem lies neither with R nor with numercial methods.  The onus is always 
on the user to understand what the numerical schemes can do and what they can't 
do.  One should never blindly take the results given by a numerical scheme and 
run with it.  In your example, the optimization method is doing what it was 
designed to do: to find a critical point of a function where the gradient is 
zero.  It is your responsibility to ensure that the result makes sense, and if 
it doesn't, to understand why it doesn't make sense.  In your problem, maxima 
((1,0) and (0,1)) lie on the boundary of the parameter space, and the gradient 
at the maxima (defined as the limit from within the domain) are clearly not 
zero.  Another problem with your example is that the hessian for your function 
is singular, it has eigenvalues of 0 and 2.  In short, there is no substitute 
to using your analytic powers!

Ravi.

- Original Message -
From: Paul Smith [EMAIL PROTECTED]
Date: Tuesday, May 8, 2007 4:33 am
Subject: Re: [R] Bad optimization solution
To: R-help r-help@stat.math.ethz.ch


 It seems that there is here a problem of reliability, as one never
  knows whether the solution provided by R is correct or not. In the
  case that I reported, it is fairly simple to see that the solution
  provided by R (without any warning!) is incorrect, but, in general,
  that is not so simple and one may take a wrong solution as a correct
  one.
  
  Paul
  
  
  On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
   Your function, (x1-x2)^2, has zero gradient at all the starting 
 values such
   that x1 = x2, which means that the gradient-based search methods will
   terminate there because they have found a critical point, i.e. a 
 point at
   which the gradient is zero (which can be a maximum or a minimum or 
 a saddle
   point).
  
   However, I do not why optim converges to the boundary maximum, when 
 analytic
   gradient is supplied (as shown by Sundar).
  
   Ravi.
  
   
 
   ---
  
   Ravi Varadhan, Ph.D.
  
   Assistant Professor, The Center on Aging and Health
  
   Division of Geriatric Medicine and Gerontology
  
   Johns Hopkins University
  
   Ph: (410) 502-2619
  
   Fax: (410) 614-9625
  
   Email: [EMAIL PROTECTED]
  
   Webpage:  
  
  
  
   
 
   
  
  
   -Original Message-
   From: [EMAIL PROTECTED]
   [ On Behalf Of Paul Smith
   Sent: Monday, May 07, 2007 6:26 PM
   To: R-help
   Subject: Re: [R] Bad optimization solution
  
   On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote:
 I think the problem is the starting point.  I do not remember the
   details
 of the BFGS method, but I am almost sure the (.5, .5) starting 
 point is
 suspect, since the abs function is not differentiable at 0.  If 
 you
   perturb
 the starting point even slightly you will have no problem.

  Paul Smith
  [EMAIL PROTECTED]
  
   To
  Sent by:  R-help 
 r-help@stat.math.ethz.ch
  [EMAIL PROTECTED]
   cc
  at.math.ethz.ch

   Subject
[R] Bad optimization solution
  05/07/2007 04:30
  PM








 Dear All

 I am trying to perform the below optimization problem, but getting
 (0.5,0.5) as optimal solution, which is wrong; the correct solution
 should be (1,0) or (0,1).

 Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 
 (Linux).

 Thanks in advance,

 Paul

 --
 myfunc - function(x) {
   x1 - x[1]
   x2 - x[2]
   abs(x1-x2)
 }


   
 optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=
   list(fnscale=-1))
   
Yes, with (0.2,0.9), a correct solution comes out. However, how can
one be sure in general that the solution obtained by optim is correct?
In ?optim says:
   
 Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows
 _box constraints_, that is each variable can be given a lower
 and/or upper bound. The initial value must satisfy the
 constraints. This uses a limited-memory modification of the 
 BFGS
 quasi-Newton method. If non-trivial bounds are supplied, this
 method will be selected, with a warning.
   
which only demands that the initial value must satisfy the constraints.
  
   Furthermore, X^2 is everywhere differentiable and notwithstanding the
   reported problem occurs with
  
   myfunc - function(x) {
 x1 - x[1]
 x2 - x[2]
 (x1-x2)^2
   }
  
   
 optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=
   list(fnscale=-1))
  
   Paul
  
   

[R] strange behavior in data frames with duplicated column names

2007-05-08 Thread William Revelle
Dear R gurus,

There is an interesting problem with accessing specific items in a 
column of data frame that has incorrectly been given a duplicate 
name, even though addressing the item by row and column number.
Although the column is correctly listed, an item addressed by row and 
column number gives the item with the correct row and the original 
not the duplicated column.

Here are the instructions to get this problem

x - matrix(seq(1:12),ncol=3)
colnames(x) - c(A,B,A)   #a redundant name for column 2
x.df - data.frame(x)
x.df#the redundant name is corrected
x.df[,3]#show the column -- this always works
x.df[2,3]   #this works here
#now incorrectly label the columns with a duplicate name
colnames(x.df) - c(A,B,A)  #the redundant name is not detected
x.df
x.df[,3] #this works as above and shows the column
x.df[2,3]#but this gives the value of the first column, not the third  ---
rownames(x.df) - c(First,Second,Third,Third)  #detects duplicate name
x.df
x.df[4,] #correct second row and corrected column names!
x.df[4,3]#wrong column
x.df #still has the original names with the duplication


and corresponding output:

  x - matrix(seq(1:12),ncol=3)
  colnames(x) - c(A,B,A)   #a redundant name for column 2
  x.df - data.frame(x)
  x.df#the redundant name is corrected
   A B A.1
1 1 5   9
2 2 6  10
3 3 7  11
4 4 8  12
  x.df[,3]#show the column -- this always works
[1]  9 10 11 12
  x.df[2,3]   #this works here
[1] 10
  #now incorrectly label the columns with a duplicate name
  colnames(x.df) - c(A,B,A)  #the redundant name is not detected
  x.df
   A B  A
1 1 5  9
2 2 6 10
3 3 7 11
4 4 8 12
  x.df[,3] #this works as above and shows the column
[1]  9 10 11 12
  x.df[2,3]#but this gives the value of the first column, not the 
third  ---
[1] 2
  rownames(x.df) - c(First,Second,Third,Third)  #detects 
duplicate name
Error in `row.names-.data.frame`(`*tmp*`, value = c(First, Second,  :
duplicate 'row.names' are not allowed
  x.df
   A B  A
1 1 5  9
2 2 6 10
3 3 7 11
4 4 8 12
  x.df[4,] #correct second row and corrected column names!
   A B A.1
4 4 8  12
  x.df[4,3]#wrong column
[1] 4
  x.df #still has the original names with the duplication

  unlist(R.Version())
  platform 
archos
  i386-apple-darwin8.9.1 
i386 darwin8.9.1
system 
status major
   i386, darwin8.9.1 
Patched   2
 minor 
year month
 5.0 
2007  04
   day 
svn rev  language
  25 
41315   R
version.string
R version 2.5.0 Patched (2007-04-25 r41315)



Bill

-- 
William Revelle http://personality-project.org/revelle.html
Professor   http://personality-project.org/personality.html
Department of Psychology   http://www.wcas.northwestern.edu/psych/
Northwestern University http://www.northwestern.edu/
Use R for statistics: http://personality-project.org/r

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Looking for a comprehensive descriptive statistics package

2007-05-08 Thread David Kaplan
Hi all,

I'm looking for a package that will give me comprehensive set of 
descriptive statistics, including skew and kurtosis.  Also, is there a 
similar package that will provide multivariate descriptive statistics as 
well?  Thanks in advance,

David


-- 
===
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room, 1061
1025 W. Johnson Street
Madison, WI 53706

email: [EMAIL PROTECTED]
homepage: http://www.education.wisc.edu/edpsych/facstaff/kaplan/kaplan.htm
Phone: 608-262-0836

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weighted least squares

2007-05-08 Thread John Fox
Dear Hadley,

I think that the problem is that the term weights has different meanings,
which, although they are related, are not quite the same. 

The weights used by lm() are (inverse-)variance weights, reflecting the
variances of the errors, with observations that have low-variance errors
therefore being accorded greater weight in the resulting WLS regression.
What you have are sometimes called case weights, and I'm unaware of a
general way of handling them in R, although you could regenerate the
unaggregated data. As you discovered, you get the same coefficients with
case weights as with variance weights, but different standard errors.
Finally, there are sampling weights, which are inversely proportional to
the probability of selection; these are accommodated by the survey package. 

To complicate matters, this terminology isn't entirely standard.

I hope this helps,
 John


John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of hadley wickham
 Sent: Tuesday, May 08, 2007 5:09 AM
 To: R Help
 Subject: [R] Weighted least squares
 
 Dear all,
 
 I'm struggling with weighted least squares, where something 
 that I had assumed to be true appears not to be the case.  
 Take the following data set as an example:
 
 df - data.frame(x = runif(100, 0, 100)) df$y - df$x + 1 + 
 rnorm(100, sd=15)
 
 I had expected that:
 
 summary(lm(y ~ x, data=df, weights=rep(2, 100))) summary(lm(y 
 ~ x, data=rbind(df,df)))
 
 would be equivalent, but they are not.  I suspect the 
 difference is how the degrees of freedom is calculated - I 
 had expected it to be sum(weights), but seems to be 
 sum(weights  0).  This seems unintuitive to me:
 
 summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50))) 
 summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50)))
 
 What am I missing?  And what is the usual way to do a linear 
 regression when you have aggregated data?
 
 Thanks,
 
 Hadley
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plots - scatterplots - find index of a point in a list

2007-05-08 Thread gatemaze
On 08/05/07, John Kane [EMAIL PROTECTED] wrote:

 Try ?locator


Thanks. Your tip also lead to another function: ?identify to add my two
cents.

--- [EMAIL PROTECTED] wrote:

  Hi,
 
  is it possible to find the index of a point of a
  plot (e.g. scatterplot) in
  an easy way?
 
  Eg.
  x - c(1:5); y - c(1:5);
  plot(x, y);
 
  On the plot if I move my cursor on top of a point or
  click on it is it
  possible to have its index printed or its exact
  value? Any clues?
 
  Thanks.
 
[[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
  reproducible code.
 



   Be smarter than spam. See how smart SpamGuard is at giving junk
 email the boot with the All-new Yahoo! Mail at
 http://mrd.mail.yahoo.com/try_beta?.intl=ca



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bad optimization solution

2007-05-08 Thread S Ellison
Paul,
You have picked a function that is not smoothly differentiable and also started 
at one of many 'stationary' points in a system with multiple solutions. In 
practice, I think it'll get a zero gradient as the algorithm does things 
numerically and you have a symmetric function. It probably then chooses 
gradient-related step sizes of zero and goes nowhere, converging instantly. The 
same happens at (0.1,0.1) and anywhere else along x=y.

The problem affects pretty much all gradient-only algorithms handed stationary 
points in a symmetric function.

Solution? Ermm.. don't do that with a gradient method, I suspect, though 
wiser heads may have more to say on the topic.

S

 Paul Smith [EMAIL PROTECTED] 07/05/2007 22:30:32 
Dear All

I am trying to perform the below optimization problem, but getting
(0.5,0.5) as optimal solution, which is wrong; the correct solution
should be (1,0) or (0,1).

Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux).

Thanks in advance,

Paul

--
myfunc - function(x) {
  x1 - x[1]
  x2 - x[2]
  abs(x1-x2)
}

optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=list(fnscale=-1))

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

***
This email and any attachments are confidential. Any use, co...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bad optimization solution

2007-05-08 Thread Paul Smith
Thanks, Ravi, for your clear explanation!

Paul


On 5/8/07, RAVI VARADHAN [EMAIL PROTECTED] wrote:
 Paul,

 The problem lies neither with R nor with numercial methods.  The onus is 
 always on the user to understand what the numerical schemes can do and what 
 they can't do.  One should never blindly take the results given by a 
 numerical scheme and run with it.  In your example, the optimization method 
 is doing what it was designed to do: to find a critical point of a function 
 where the gradient is zero.  It is your responsibility to ensure that the 
 result makes sense, and if it doesn't, to understand why it doesn't make 
 sense.  In your problem, maxima ((1,0) and (0,1)) lie on the boundary of the 
 parameter space, and the gradient at the maxima (defined as the limit from 
 within the domain) are clearly not zero.  Another problem with your example 
 is that the hessian for your function is singular, it has eigenvalues of 0 
 and 2.  In short, there is no substitute to using your analytic powers!

 Ravi.

 - Original Message -
 From: Paul Smith [EMAIL PROTECTED]
 Date: Tuesday, May 8, 2007 4:33 am
 Subject: Re: [R] Bad optimization solution
 To: R-help r-help@stat.math.ethz.ch


  It seems that there is here a problem of reliability, as one never
   knows whether the solution provided by R is correct or not. In the
   case that I reported, it is fairly simple to see that the solution
   provided by R (without any warning!) is incorrect, but, in general,
   that is not so simple and one may take a wrong solution as a correct
   one.
 
   Paul
 
 
   On 5/8/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
Your function, (x1-x2)^2, has zero gradient at all the starting
  values such
that x1 = x2, which means that the gradient-based search methods will
terminate there because they have found a critical point, i.e. a
  point at
which the gradient is zero (which can be a maximum or a minimum or
  a saddle
point).
   
However, I do not why optim converges to the boundary maximum, when
  analytic
gradient is supplied (as shown by Sundar).
   
Ravi.
   

  
---
   
Ravi Varadhan, Ph.D.
   
Assistant Professor, The Center on Aging and Health
   
Division of Geriatric Medicine and Gerontology
   
Johns Hopkins University
   
Ph: (410) 502-2619
   
Fax: (410) 614-9625
   
Email: [EMAIL PROTECTED]
   
Webpage:
   
   
   

  

   
   
-Original Message-
From: [EMAIL PROTECTED]
[ On Behalf Of Paul Smith
Sent: Monday, May 07, 2007 6:26 PM
To: R-help
Subject: Re: [R] Bad optimization solution
   
On 5/7/07, Paul Smith [EMAIL PROTECTED] wrote:
  I think the problem is the starting point.  I do not remember the
details
  of the BFGS method, but I am almost sure the (.5, .5) starting
  point is
  suspect, since the abs function is not differentiable at 0.  If
  you
perturb
  the starting point even slightly you will have no problem.
 
   Paul Smith
   [EMAIL PROTECTED]
   
To
   Sent by:  R-help 
  r-help@stat.math.ethz.ch
   [EMAIL PROTECTED]
cc
   at.math.ethz.ch
 
Subject
 [R] Bad optimization solution
   05/07/2007 04:30
   PM
 
 
 
 
 
 
 
 
  Dear All
 
  I am trying to perform the below optimization problem, but getting
  (0.5,0.5) as optimal solution, which is wrong; the correct solution
  should be (1,0) or (0,1).
 
  Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6
  (Linux).
 
  Thanks in advance,
 
  Paul
 
  --
  myfunc - function(x) {
x1 - x[1]
x2 - x[2]
abs(x1-x2)
  }
 
 

  optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method=L-BFGS-B,control=
list(fnscale=-1))

 Yes, with (0.2,0.9), a correct solution comes out. However, how can
 one be sure in general that the solution obtained by optim is correct?
 In ?optim says:

  Method 'L-BFGS-B' is that of Byrd _et. al._ (1995) which allows
  _box constraints_, that is each variable can be given a lower
  and/or upper bound. The initial value must satisfy the
  constraints. This uses a limited-memory modification of the
  BFGS
  quasi-Newton method. If non-trivial bounds are supplied, this
  method will be selected, with a warning.

 which only demands that the initial value must satisfy the 
  constraints.
   
Furthermore, X^2 is everywhere differentiable and 

Re: [R] Weighted least squares

2007-05-08 Thread S Ellison
Doubling the length of the data doubles the apparent number of observations. 
You would expect the standard error to reduce by sqrt(2) (which it just about 
does, though I'm not clear on why its not exact here)

Weights are not as simple as they look. You have given all your data the same 
weight, so the answer is independent of the weights (!). Try again with 
weights=rep(4,100) etc. Equal weights simply cancel out in the lm process. In 
fact, some linear regression algorithms rescale all weights to sum to 1; in 
others, weights are scaled to average 1; done 'naturally' the weights simply 
appear in two places which cancel out in the final covariance matrix 
calculation (eg in the weighted 'residual sd' and in the hessian for the 
chi-squared function, if I remember correctly). 

Bottom line - equal weights make no difference in lm, so choose what you like. 
1 is a good number, though.

Steve e

 hadley wickham [EMAIL PROTECTED] 08/05/2007 10:08:34 
Dear all,

I'm struggling with weighted least squares, where something that I had
assumed to be true appears not to be the case.  Take the following
data set as an example:


***
This email and any attachments are confidential. Any use, co...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting informative error messages

2007-05-08 Thread Prof Brian Ripley
It is not clear to me what you want here.

Errors are tagged by a 'call', and f(1:3) is the innermost 'call' (special 
primitives do not set a context and so do not count if you consider '[' 
to be a function).

The message could tell you what the type was, but it does not and we have 
lost the pool of active contributors we once had to submit tested patches 
for things like that.


On Mon, 7 May 2007, Tony Plate wrote:

 Certain errors seem to generate messages that are less informative than
 most -- they just tell you which function an error happened in, but
 don't indicate which line or expression the error occurred in.

 Here's a toy example:

  f - function(x) {a - 1; y - x[list(1:3)]; b - 2; return(y)}
  options(error=NULL)
  f(1:3)
 Error in f(1:3) : invalid subscript type
  traceback()
 1: f(1:3)
 

 In this function, it's clear that the error is in subscripting 'x', but
 it's not always so immediately obvious in lengthier functions.

 Is there anything I can do to get a more informative error message in
 this type of situation?  I couldn't find any help in the section
 Debugging R Code in R-exts (or anything at all relevant in R-intro).

 (Different values for options(error=...) and different formatting of the
 function made no difference.)

 -- Tony Plate

  sessionInfo()
 R version 2.5.0 (2007-04-23)
 i386-pc-mingw32

 locale:
 LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
 States.1252;LC_MONETARY=English_United
 States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 [7] base

 other attached packages:
 tap.misc
1.0
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] The match of ave() for FUN=SD

2007-05-08 Thread Felix Wave
Hello,
I have many subsets of x. I want to get the standard deviation
for each subset with the same factor levels. For FUN=mean and
FUN=median I am using ave(). 
Can anybody tell me the match of ave() for using FUN=SD?

At the beginning I used aggregate(), also for mean and median. But 
aggregegate make arithmetical errors in computing huge records.


Thank's a lot. 

Felix



R-CODE:
---
MEAN   - ave(INPUT[,3], INPUT[,1], INPUT[,2], FUN = mean)
mMEAN  - matrix( c(INPUT[,1], INPUT[,2], MEAN), ncol=3, byrow=FALSE)


SD  - na.omit( aggregate(INPUT[,3], by=list(INPUT[,2],INPUT[,1] ), FUN=sd 
) ) 
mSD - matrix( c( SD[,2],SD[,1],  SD[,3]), ncol=3, 
byrow=FALSE)
mSD[,1] - (mSD[,1] - 1 )#eleminate the converting-fault list- 
matrix
mSD[,2] - (mSD[,2] - 1 ) / 10   #eleminate the converting-fault list- 
matrix



the arithmetical errors:

list - matrix
0   - 11
1   - 2
2.1 - 22
3.5 - 36

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question on bivariate GEE fit

2007-05-08 Thread souvik banerjee
Hi,
 I have a bivariate longitudinal dataset.  As an example say,
i have the data frame with column names

var1  var2  Unit  time  trt

(trt represents the treatment)

 Now suppose I want to fit a joint model of the form for the *i* th unit
 var1jk = alpha1 + beta1*timejk  + gamma1* trtjk + delta1* timejk:trtjk +
error1jk

 var2 = alpha2 + beta2*timejk  + gamma2* trtjk + delta2* timejk:trtjk +
error2jk

where j index time and k index the treatment received

Using indicator variables I have been able to fit and run the code for
a bivariate model using unstructured covariance matrix. However,
I want to fit a model for a structured variance covariance matrix.
The error structure for the grouping unit is as follows

sigma = ( sigma1  sigma12 )
 ( sigma12   sigma2)

sigma1, sigma2 and sigma12 are matrices with
where
sigma1 = sig1 * AR1(rho1)
sigma2 = sig2* AR1(rho2)
sigma12 = sig12 * AR1(rho12)

My question is whether there is any method to fit such data using
packages like gee or geepack (or may be any other package ) in R.  The
function  genZcor() of geepack can be used to construct correlation but
I have been unable to use it in the present context.
Any help is greatly appreciated.

Regards
Souvik Banerjee
Lecturer
Department of statistics
Memari College
Burdwan
India

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot time series

2007-05-08 Thread jessica . gervais
Dear All,

I sended my first mail as HTML by accident.
It has probably been stripped off... (see first mail below)
During that time, I was actually able to find a solution to my problem : I
wanted to plot times on a graph representing precipitation=f(time)


here is an example:

time-c(2004-10-18 17:20:00,2004-10-18 17:50:00 ,2004-10-18
18:40:00,2004-10-18 19:50:00,2004-10-18 20:00:00 ,2004-10-18
20:10:00,2004-10-18 21:20:00 ,2004-10-19 22:00:00 ,2004-10-20
23:40:00)

precipitation-c(0.1,0.5,0.0,0.8,1,2,5,9,1)

tt-strptime(time,%Y-%m-%d %H:%M:%S)
plot(tt,precipitation,xlab=time, xaxt=n)
r-as.POSIXct(round(range(tt),hours))
axis.POSIXct(1,tt,at=seq(r[1],r[2],by=hour),format=%Y-%m-%d %H:%M)

If you have better solution, I would be happy to know them,

Thanks in advance,

Jessica


-Jessica Gervais/CRTE/TUDOR wrote: -

To: R-help@stat.math.ethz.ch
From: Jessica Gervais/CRTE/TUDOR
Date: 05/08/2007 03:37PM
Subject: plot time series

Dear all,

I have a question concerning plotting time measurements.

I have a time serie which record precipitation at different time steps for
different meteo sations.
Data are stored into a table :
first column is time (time steps between 2 measurement are variables)
secondcolumn is the measurement
I would like to plot precipitation=f(time) and write time in axis. I can
not use the plot.ts function as time steps between 2 measurements are
variables. I don't want to do a plot.ts as time steps are variable and also
as I
would like the different times to be written on the x-axis.I also would
like space
between each time step on the x-axis to be representativ of the real
amount of time.

Here is a example

time-c(2004-10-18 17:20:00,2004-10-18 17:50:00 ,2004-10-18
18:40:00,2004-10-18 19:50:00,2004-10-18 20:00:00 ,2004-10-18
20:10:00,2004-10-18 21:20:00 ,2004-10-18 22:00:00 ,2004-10-18
23:40:00)

(is it automatically recognized as a time format?)

precipitation-c(0.1,0.5,0.0,0.8,1,2,5,0.2,3)

plot(time,precipitation)
Error in plot.window(xlim, ylim, log, asp, ...) :
need finite 'xlim' values
In addition: Warning messages:
1: NAs introduced by coercion
2: no non-missing arguments to min; returning Inf
3: no non-missing arguments to max; returning -Inf

Do anyone knows how to plot this kind os time dependant datas ?

Thanks in advance,

Regards,

Jessica

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with systemfit

2007-05-08 Thread adschai
Hi - I encounter two problems with
SystemFit. I have a matrix of 20 variables (38 observations each).
I am trying to fit using 2SLS in Systemfit because I want to be able
to simulate new observations with the resulting model. The first
problem I found is that it ran out of memory given that I have 2GB of
RAM. How can I circumvent this? Can I do some mixing or bootstrapping
with small sample each time instead?Another problem is I got
system is computationally singular when I reduce the size of
observation to 10 instead of 38. Any help would be appreciated.
I think my variables are not really well correlated to that extent. adschai

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum from matrix

2007-05-08 Thread Oarabile Molaodi
Thanks it worked.
Oarabile

Gabor Csardi wrote:

apply(a, 2, function(x) min(x[x!=0]) )

should do it. Might need some improvement if all numbers in a column
can be zero, try it.

Gabor

On Tue, May 08, 2007 at 09:50:43AM +0100, [EMAIL PROTECTED] wrote:
  

I have a very large matrix with columns that have some of their
entries as zero


A small example if a=

 [,1]  [,2]  [,3]  [,4]
 [,1] 0 2 0 0
 [,2] 1 3 0 3
 [,3] 2 0 3 5
 [,4] 0 4 0 0

and what to get the minimum number from each column but that number
should not be zero. If I use apply (a,2,min) I will get a vector of
zeros as the minimum but what I want it for example from column 1 I
should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I
wonder if someone can give an idea on how to go about it.

thanks in advance for your help.

Oarabile

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minimum from matrix

2007-05-08 Thread Oarabile Molaodi
Thanks it worked

Oarabile

Dimitris Rizopoulos wrote:

 try this:

 apply(a, 2, function(x) min(x[x  0]))


 I hope it helps.

 Best,
 Dimitris

 
 Dimitris Rizopoulos
 Ph.D. Student
 Biostatistical Centre
 School of Public Health
 Catholic University of Leuven

 Address: Kapucijnenvoer 35, Leuven, Belgium
 Tel: +32/(0)16/336899
 Fax: +32/(0)16/337015
 Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


 - Original Message - From: [EMAIL PROTECTED]
 To: r-help@stat.math.ethz.ch
 Sent: Tuesday, May 08, 2007 10:50 AM
 Subject: [R] minimum from matrix




 I have a very large matrix with columns that have some of their
 entries as zero


 A small example if a=

 [,1]  [,2]  [,3]  [,4]
 [,1] 0 2 0 0
 [,2] 1 3 0 3
 [,3] 2 0 3 5
 [,4] 0 4 0 0

 and what to get the minimum number from each column but that number
 should not be zero. If I use apply (a,2,min) I will get a vector of
 zeros as the minimum but what I want it for example from column 1 I
 should get 1 i.e for all the matrix I should get a vector (1,2,3,3). I
 wonder if someone can give an idea on how to go about it.

 thanks in advance for your help.

 Oarabile

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help with memory problem in SystemFit

2007-05-08 Thread adschai
Hi - I encounter two problems with SystemFit. I have a matrix of 20 variables 
(38 observations each). I am trying to fit using 2SLS in Systemfit 
because I want to be able to simulate new observations with the resulting 
model. The first problem I found is that it ran out of memory given that I have 
2GB of RAM. How can I circumvent this? Can I do some mixing or bootstrapping 
with small sample each time instead?Another problem is I got system is 
computationally singular when I reduce the size of observation to 10 
instead of 38. Any help would be appreciated. I think my variables are not 
really well correlated to that extent. adschai

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ordered logistic regression with random effects. Howto?

2007-05-08 Thread Cody_Hamilton

Paul,

I believe the model you describe below can be fitted in GENMOD and GLIMMIX
in SAS.

Alternatively, as Brian Ripley suggests, you could use MCMC.  BUGS has a
nice example of a multinomial logit model in the second example manual.
While this example considers only fixed effects, it's not difficult to
extend the model to include a random effect (see the 'Seeds' example in the
first example manual).

Regards,
   -Cody



   
 Paul Johnson
 [EMAIL PROTECTED] 
 .com  To 
 Sent by:  r-help@stat.math.ethz.ch
 [EMAIL PROTECTED]  cc 
 at.math.ethz.ch   
   Subject 
   [R] ordered logistic regression 
 05/07/2007 07:03  with random effects. Howto? 
 PM
   
   
   
   
   




I'd like to estimate an ordinal logistic regression with a random
effect for a grouping variable.   I do not find a pre-packaged
algorithm for this.  I've found methods glmmML (package: glmmML) and
lmer (package: lme4) both work fine with dichotomous dependent
variables. I'd like a model similar to polr (package: MASS) or lrm
(package: Design) that allows random effects.

I was thinking there might be a trick that might allow me to use a
program written for a dichotomous dependent variable with a mixed
effect to estimate such a model.  The proportional odds logistic
regression is often written as a sequence of dichotomous comparisons.
But it seems to me that, if it would work, then somebody would have
proposed it already.

I've found some commentary about methods of fitting ordinal logistic
regression with other procedures, however, and I would like to ask for
your advice and experience with this. In this article,

Ching-Fan Sheu, Fitting mixed-effects models for repeated ordinal
outcomes with the NLMIXED procedure Behavior Research Methods,
Instruments,  Computers, 2002, 34(2): 151-157.

the other gives an approach that works in SAS's NLMIXED procedure.  In
this approach, one explicitly writes down the probability that each
level will be achieved (using the linear predictor and constants for
each level).  I THINK I could find a way to translate this approach
into a model that can be fitted with either nlme or lmer.  Has someone
done it?

What other strategies for fitting mixed ordinal models are there in R?

Finally, a definitional question.  Is a multi-category logistic
regression (either ordered or not) a member of the glm family?  I had
thought the answer is no, mainly because glm and other R functions for
glms never mention multi-category qualitative dependent variables and
also because the distribution does not seem to fall into the
exponential family.  However, some textbooks do include the
multicategory models in the GLM treatment.


--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new column

2007-05-08 Thread raymond chiruka
thanks for the help But l ened up using
  
act.surv.time-pmin(m[,censoringTime],m[,survivalTime])
  
m-cbind(m,act.surv.time)
which seems to work
thanks

  

Schmitt, Corinna [EMAIL PROTECTED] wrote:  Hallo,
I just tried, if my solution might be possible for you. Here is the result:

 s=2
while(s!=0){ n=20
+  m-matrix(nrow=n,ncol=4)
+  colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime)
+ for(i in 1:20)  
m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002))
+ m-cbind(m,0)
+  m[m[,3]m[,4],5]-1
+  colnames(m)[5]-censoring
+   print(m)
+s=s-1
+ }

#bilding a data frame from m

x=data.frame(m)

# adding a column, where nrow(x) = number of row in x and in the 
# new coulmn should stand the entries 1...20.

x=data.frame(x,actual surv time=c(1:nrow(x)))

 x
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
2 1  2242.468604   1061.30474 02
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532480.12355 0   11
122  2 15.313303139.19791 0   12
131  2 17.205624641.15764 0   13
142  1 81.753924107.02202 0   14
151  2 60.774221665.27500 0   15
162  1  8.712562142.90775 0   16
171  1 54.542722   1904.88060 0   17
182  2 85.626140214.66811 0   18
192  1 31.257923739.96591 0   19
201  1 85.910141306.14860 0   20


See it works! For more information of data frames, see ?data.frame 
Here are some additional examples.


 x[2,6]  # Extraction of special entries: x[2,6]= 2 - row 2, column 6
[1] 2
 x[2,6]=100  # Changing entires: x[2,6]=100
 x
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
2 1  2242.468604   1061.30474 0  100
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532480.12355 0   11
122  2 15.313303139.19791 0   12
131  2 17.205624641.15764 0   13
142  1 81.753924107.02202 0   14
151  2 60.774221665.27500 0   15
162  1  8.712562142.90775 0   16
171  1 54.542722   1904.88060 0   17
182  2 85.626140214.66811 0   18
192  1 31.257923739.96591 0   19
201  1 85.910141306.14860 0   20
 t=x
 row2=t[-2,] # deleting a row
 row2
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 0   

Re: [R] plot time series

2007-05-08 Thread jessica . gervais
Dear all,

I actually would like to improve the label orientation on the x-axis (turn
them to 45 degrees)
I tried the par(las=2) ... but doesn't work...

Do anyone knows how to do ?

Jessica

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weighted least squares

2007-05-08 Thread S Ellison
Hadley,

You asked
 .. what is the usual way to do a linear 
 regression when you have aggregated data?

Least squares generally uses inverse variance weighting. For aggregated data 
fitted as mean values, you just need the variances for the _means_. 

So if you have individual means x_i and sd's s_i that arise from aggregated 
data with n_i observations in group i, the natural weighting is by inverse 
squared standard error of the mean. The appropriate weight for x_i would then 
be n_i/(s_i^2). In R, that's n/(s^2), as n and s would be vectors with the same 
length as x. If all the groups had the same variance, or nearly so, s is a 
scalar; if they have the same number of observations, n is a scalar. 

Of course, if they have the same variance and same number of observations, they 
all have the same weight and you needn't weight them at all: see previous 
posting!

Steve E



***
This email and any attachments are confidential. Any use, co...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] statistics/correlation question NOT R question

2007-05-08 Thread Leeds, Mark \(IED\)
This is not an R question but if anyone can help me, it's much
appreciated.

Suppose I have a series ( stationary ) y_t and a series x_t ( stationary
)and x_t has variance sigma^2_x and epsilon is normal 
(0, sigma^2_epsilon )

and the two series have the relation

y_t = Beta*x_t + epsilon

My question is if there are particular values that sigma^2_x and
sigma^2_epsilon have to take in order for corr(x_t,y_t) to equal Beta ?

I attempted to figure this out using two different methods and in one
case I end up involving sigma^2_epsilon and in the other I don't
and I'm not sure if either method is correct. I think I need to use
results form the conditional bivariate normal but i'm really not sure.
Also, it's not a homework problem because I am too old to have homework.
Thanks for any insights/solutions.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting informative error messages

2007-05-08 Thread Spencer Graves
Dear Prof. Ripley: 

  1.  I very much appreciate your contributions to the R project. 

  2.  Whether with release 2.4.0 or earlier, I noticed that 
'traceback()' had become less informative.  This loss was more than 
offset when I learned to use the 'debug' function in the 'base' 
package:  It increased my productivity so much that it helped complete 
my transition from S-Plus:  The last few times I had to use S-Plus, I 
ported them to R, got the code working using 'debug', then ported the 
results back to S-Plus.  That was quicker for me than debugging directly 
in S-Plus.

  3.  Thanks again for your many contributions to the R project and 
to my education more generally. 

  Best Wishes,
  Spencer Graves

Prof Brian Ripley wrote:
 It is not clear to me what you want here.

 Errors are tagged by a 'call', and f(1:3) is the innermost 'call' (special 
 primitives do not set a context and so do not count if you consider '[' 
 to be a function).

 The message could tell you what the type was, but it does not and we have 
 lost the pool of active contributors we once had to submit tested patches 
 for things like that.


 On Mon, 7 May 2007, Tony Plate wrote:

   
 Certain errors seem to generate messages that are less informative than
 most -- they just tell you which function an error happened in, but
 don't indicate which line or expression the error occurred in.

 Here's a toy example:

 
 f - function(x) {a - 1; y - x[list(1:3)]; b - 2; return(y)}
 options(error=NULL)
 f(1:3)
   
 Error in f(1:3) : invalid subscript type
 
 traceback()
   
 1: f(1:3)
 
 In this function, it's clear that the error is in subscripting 'x', but
 it's not always so immediately obvious in lengthier functions.

 Is there anything I can do to get a more informative error message in
 this type of situation?  I couldn't find any help in the section
 Debugging R Code in R-exts (or anything at all relevant in R-intro).

 (Different values for options(error=...) and different formatting of the
 function made no difference.)

 -- Tony Plate

 
 sessionInfo()
   
 R version 2.5.0 (2007-04-23)
 i386-pc-mingw32

 locale:
 LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
 States.1252;LC_MONETARY=English_United
 States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 [7] base

 other attached packages:
 tap.misc
1.0
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] censoring

2007-05-08 Thread raymond chiruka
in R when carring out the log rank test is the censored variable denoted by 1 
or 0 or its of no consequence.
  
  thanks
  
 
-

always stay connected to friends.
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strange behavior in data frames with duplicated column names

2007-05-08 Thread Prof Brian Ripley
First, you should not be using colnames-, which is for a matrix, on a 
data frame.  Use names- for data frames (and as.data.frame to convert to 
a data frame).

Second, whereas duplicate row names are not allowed in a data frame, 
duplicate column names are but at your own risk.

Third, there is a 'optimization too far' here which I will change in 2.5.0 
patched.  Often with R development there is a tradeoff between speed and 
generality.

On Tue, 8 May 2007, William Revelle wrote:

 Dear R gurus,

 There is an interesting problem with accessing specific items in a
 column of data frame that has incorrectly been given a duplicate
 name, even though addressing the item by row and column number.
 Although the column is correctly listed, an item addressed by row and
 column number gives the item with the correct row and the original
 not the duplicated column.

 Here are the instructions to get this problem

 x - matrix(seq(1:12),ncol=3)
 colnames(x) - c(A,B,A)   #a redundant name for column 2
 x.df - data.frame(x)
 x.df#the redundant name is corrected
 x.df[,3]#show the column -- this always works
 x.df[2,3]   #this works here
 #now incorrectly label the columns with a duplicate name
 colnames(x.df) - c(A,B,A)  #the redundant name is not detected
 x.df
 x.df[,3] #this works as above and shows the column
 x.df[2,3]#but this gives the value of the first column, not the third  
 ---
 rownames(x.df) - c(First,Second,Third,Third)  #detects duplicate name
 x.df
 x.df[4,] #correct second row and corrected column names!
 x.df[4,3]#wrong column
 x.df #still has the original names with the duplication


 and corresponding output:

  x - matrix(seq(1:12),ncol=3)
  colnames(x) - c(A,B,A)   #a redundant name for column 2
  x.df - data.frame(x)
  x.df#the redundant name is corrected
   A B A.1
 1 1 5   9
 2 2 6  10
 3 3 7  11
 4 4 8  12
  x.df[,3]#show the column -- this always works
 [1]  9 10 11 12
  x.df[2,3]   #this works here
 [1] 10
  #now incorrectly label the columns with a duplicate name
  colnames(x.df) - c(A,B,A)  #the redundant name is not detected
  x.df
   A B  A
 1 1 5  9
 2 2 6 10
 3 3 7 11
 4 4 8 12
  x.df[,3] #this works as above and shows the column
 [1]  9 10 11 12
  x.df[2,3]#but this gives the value of the first column, not the
 third  ---
 [1] 2
  rownames(x.df) - c(First,Second,Third,Third)  #detects
 duplicate name
 Error in `row.names-.data.frame`(`*tmp*`, value = c(First, Second,  :
   duplicate 'row.names' are not allowed
  x.df
   A B  A
 1 1 5  9
 2 2 6 10
 3 3 7 11
 4 4 8 12
  x.df[4,] #correct second row and corrected column names!
   A B A.1
 4 4 8  12
  x.df[4,3]#wrong column
 [1] 4
  x.df #still has the original names with the duplication

  unlist(R.Version())
  platform
 archos
  i386-apple-darwin8.9.1
 i386 darwin8.9.1
system
 status major
   i386, darwin8.9.1
 Patched   2
 minor
 year month
 5.0
 2007  04
   day
 svn rev  language
  25
 41315   R
version.string
 R version 2.5.0 Patched (2007-04-25 r41315)



 Bill



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] statistics/correlation question NOT R question

2007-05-08 Thread Horace Tso
Mark, I suppose you make the usual assumptions, ie. E[x]=0, E[x*epsilon]=0, the 
correlation is just simply,

corr(x,y) = beta * ( var(x) / var(y) )

And you could get var(y) from var(x) and var(epsilon).

HTH.

Horace



 Leeds, Mark (IED) [EMAIL PROTECTED] 5/8/2007 10:25:11 AM 
This is not an R question but if anyone can help me, it's much
appreciated.

Suppose I have a series ( stationary ) y_t and a series x_t ( stationary
)and x_t has variance sigma^2_x and epsilon is normal 
(0, sigma^2_epsilon )

and the two series have the relation

y_t = Beta*x_t + epsilon

My question is if there are particular values that sigma^2_x and
sigma^2_epsilon have to take in order for corr(x_t,y_t) to equal Beta ?

I attempted to figure this out using two different methods and in one
case I end up involving sigma^2_epsilon and in the other I don't
and I'm not sure if either method is correct. I think I need to use
results form the conditional bivariate normal but i'm really not sure.
Also, it's not a homework problem because I am too old to have homework.
Thanks for any insights/solutions.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] censoring

2007-05-08 Thread Roland Rau
Well, I guess it makes quite a difference in survival analysis whether 
you know that a person was alive/censored or experienced the event of 
interest at a certain point of time/age. You could have tried it easily 
for yourself by slightly modifying the example on the help page of 
'survdiff'.

library(survival)
survdiff(Surv(futime, fustat) ~ rx,data=ovarian)
survdiff(Surv(futime, 1-fustat) ~ rx,data=ovarian)


If you want to work more in survival analysis, I can recommend the book 
John P. Klein / Melvin L. Moeschberger (2003): Survival Analysis. 
Techniques for Censored and Truncated Data. Springer. (but it gives no 
recipes in R).

Hope this helps,
Roland


raymond chiruka wrote:
 in R when carring out the log rank test is the censored variable denoted by 1 
 or 0 or its of no consequence.
   
   thanks
   
  
 -
 
 always stay connected to friends.
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a comprehensive descriptive statistics package

2007-05-08 Thread Roland Rau
Hi,

summary(yourdata) is often a very good starting point for descriptive 
data statistics. Or you can write your own little function which returns 
what you actually like to see (the code below was written very quickly. 
No care is taken for the presence of missing values or anything else).


exampledata - rnorm(1)
summary(exampledata)
desc - function(mydata) {
   require(e1071)
   quantls - quantile(x=mydata, probs=seq(from=0, to=1, by=0.25))
   themean - mean(mydata)
   thesd - sd(mydata)
   kurt - kurtosis(mydata)
   skew - skewness(mydata)
   retlist - list(Quantiles=quantls, Mean=themean, StandDev=thesd,
   Skewness=skew, Kurtosis=kurt)
   return(retlist)
}

descstats - desc(exampledata)
descstats



I hope this helps,
Roland


David Kaplan wrote:
 Hi all,
 
 I'm looking for a package that will give me comprehensive set of 
 descriptive statistics, including skew and kurtosis.  Also, is there a 
 similar package that will provide multivariate descriptive statistics as 
 well?  Thanks in advance,
 
 David
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a new column

2007-05-08 Thread Bos, Roger
I haven't been following your discussion, but based on your method, the 
following should also work and would be the one-line equivalent:

m$act.surv.time-pmin(m[,censoringTime],m[,survivalTime]) 

In other words, you can add a column by just assigning a value to a new column 
name.  You can also use the following syntax:

m$act.surv.time-pmin(m$censoringTime,m$survivalTime) 

I just thought you might want to see different version of accomplishing the 
same thing.

Roger.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of raymond chiruka
Sent: Tuesday, May 08, 2007 12:36 PM
To: Schmitt, Corinna; r
Subject: Re: [R] creating a new column

thanks for the help But l ened up using
  
act.surv.time-pmin(m[,censoringTime],m[,survivalTime])
  
m-cbind(m,act.surv.time)
which seems to work
thanks

  

Schmitt, Corinna [EMAIL PROTECTED] wrote:  Hallo, I just tried, if my 
solution might be possible for you. Here is the result:

 s=2
while(s!=0){ n=20
+  m-matrix(nrow=n,ncol=4)
+  colnames(m)=c(treatmentgrp,strata,censoringTime,survivalTime)
+ for(i in 1:20)  
m[i,]-c(sample(c(1,2),1,replace=TRUE),sample(c(1,2),1,replace=TRUE),rexp(1,.007),rexp(1,.002))
+ m-cbind(m,0)
+  m[m[,3]m[,4],5]-1
+  colnames(m)[5]-censoring
+   print(m)
+s=s-1
+ }

#bilding a data frame from m

x=data.frame(m)

# adding a column, where nrow(x) = number of row in x and in the # new coulmn 
should stand the entries 1...20.

x=data.frame(x,actual surv time=c(1:nrow(x)))

 x
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
2 1  2242.468604   1061.30474 02
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532480.12355 0   11
122  2 15.313303139.19791 0   12
131  2 17.205624641.15764 0   13
142  1 81.753924107.02202 0   14
151  2 60.774221665.27500 0   15
162  1  8.712562142.90775 0   16
171  1 54.542722   1904.88060 0   17
182  2 85.626140214.66811 0   18
192  1 31.257923739.96591 0   19
201  1 85.910141306.14860 0   20


See it works! For more information of data frames, see ?data.frame Here are 
some additional examples.


 x[2,6]  # Extraction of special entries: x[2,6]= 2 - row 2, column 6
[1] 2
 x[2,6]=100  # Changing entires: x[2,6]=100
 x
   treatmentgrp strata censoringTime survivalTime censoring actual.surv.time
1 1  2377.486125   1070.66287 01
2 1  2242.468604   1061.30474 0  100
3 1  2 40.904656 51.88263 03
4 2  2 44.025595590.15317 04
5 1  1253.093279247.32141 15
6 2  2 50.486272257.25016 06
7 1  1337.591250554.05931 07
8 2  2 74.905075873.14563 08
9 1  2 57.196581765.43142 09
101  2370.147307   1646.65368 0   10
111  2152.738532480.12355 0   11
122  2 15.313303139.19791 0   12
131  2 17.205624641.15764 0   13
142  1 81.753924107.02202 0   14
151  2 60.774221665.27500 0   15
162  1  8.712562142.90775 0   16
171  1 54.542722   1904.88060 0   17
182  2 85.626140214.66811 0   18
192  1 31.257923739.96591 0   19
201  1 85.910141

[R] Separate x-axis ticks for panels in xyplot()?

2007-05-08 Thread Michael Kubovy
Dear r-helpers,

In an xyplot I have

xyplot(y ~ x | c, groups = g,
scales = list(
x = list(
at = v1,
labels = c('A', 'B', 'C', 'D'),
rot = 45
)
)
)

g consists of two groups. How do I define a different set of x ticks  
and labels for the second panel? I'm looking for something like

xyplot(y ~ x | c, groups = g,
scales = list(
x = list(ifelse(g == g1, {
at = v1,
labels = c('A', 'B', 'C', 'D'),
rot = 45},
{at = v2,
labels = c('E', 'F' 'G),
rot = 45}
)
)
)


_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Representing a statistic as a colour on a 2d plot

2007-05-08 Thread mister_bluesman

Thanks people! Sorry to sound thick but after I download the poltrix package,
how do i install it?

Thank you!
-- 
View this message in context: 
http://www.nabble.com/Representing-a-statistic-as-a-colour-on-a-2d-plot-tf3703885.html#a10383035
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot time series

2007-05-08 Thread John Kane
The short answer is that you can't.  

The longer answer is that you need to replace them
with text. Have a look at the FAQ 7.27 How can I
create rotated axis labels?

It provides a pretty good example.
--- [EMAIL PROTECTED] wrote:

 Dear all,
 
 I actually would like to improve the label
 orientation on the x-axis (turn
 them to 45 degrees)
 I tried the par(las=2) ... but doesn't work...
 
 Do anyone knows how to do ?
 
 Jessica
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Representing a statistic as a colour on a 2d plot

2007-05-08 Thread John Kane
How did you download it?  In Windows if you used the
Packages  Install packages route it is installed.
Just type  library(plotrix) to load it.  No idea for
Linux or Mac.
--- mister_bluesman [EMAIL PROTECTED]
wrote:

 
 Thanks people! Sorry to sound thick but after I
 download the poltrix package,
 how do i install it?
 
 Thank you!
 -- 
 View this message in context:

http://www.nabble.com/Representing-a-statistic-as-a-colour-on-a-2d-plot-tf3703885.html#a10383035
 Sent from the R help mailing list archive at
 Nabble.com.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Representing a statistic as a colour on a 2d plot

2007-05-08 Thread Roland Rau
mister_bluesman wrote:
 Thanks people! Sorry to sound thick but after I download the poltrix package,
 how do i install it?
 
 Thank you!

which platform are you on?

On Windows I would suggest you use the Menu of the GUI Install packages 
from local zip files in the main category 'packages'
On Linux I usually do on the command line:
R CMD INSTALL plotrix_2.2.tar.gz

I hope this helps?

Best,
Roland

P.S. I am pretty sure it is described somewhere in the included manuals.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Representing a statistic as a colour on a 2d plot

2007-05-08 Thread mister_bluesman

Thanks for you all your help guys. Much appreciated.


-- 
View this message in context: 
http://www.nabble.com/Representing-a-statistic-as-a-colour-on-a-2d-plot-tf3703885.html#a10383998
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function

2007-05-08 Thread Leeds, Mark \(IED\)
I wrote an ugly algorithm to set certain elements of a matrix to 1
without looping and below works and you can see what
The output is below the code.

K-6
lagnum-2

restrictmat-matrix(0,nrow=K,ncol=K*3)
restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
(col(restrictmat)-row(restrictmat)) %% K == 0)]-1
restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0

 restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18]
[1,]100000100 0 0 0
0 0 0 0 0 0
[2,]010000010 0 0 0
0 0 0 0 0 0
[3,]001000001 0 0 0
0 0 0 0 0 0
[4,]000100000 1 0 0
0 0 0 0 0 0
[5,]000010000 0 1 0
0 0 0 0 0 0
[6,]000001000 0 0 1
0 0 0 0 0 0

For lagnum equals 1 , it also works :

 restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18]
[1,]100000000 0 0 0
0 0 0 0 0 0
[2,]010000000 0 0 0
0 0 0 0 0 0
[3,]001000000 0 0 0
0 0 0 0 0 0
[4,]000100000 0 0 0
0 0 0 0 0 0
[5,]000010000 0 0 0
0 0 0 0 0 0
[6,]000001000 0 0 0
0 0 0 0 0 0

But I am thinking that there has to be a better way particularly because
I'll get an error if I set lagnum to 3. 
Any improvements or total revampings are appreciated. The number of
columns will always be a multiple of the number of rows
So K doesn't have to be 6. that was just to show what the commands do.
thanks.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Skipping rows and columns in large matrix

2007-05-08 Thread Silvia Lomascolo

I work on Windows, R version 2.4.1
I need to leave out many rows and columns when reading a matrix.  The 'skip'
commands in read.table, seem to be just for skipping rows, if I understand
well, but I need to skip many columns as well. I tried something very simple
and left one row and one column out, but I don't know how to do it for many
rows and columns at a time.

Example:

zzz # euclidean distances matrix

  X1   X2   X3   X4   X5
X10   11   12 1   15
X2   110 5   12 3
X3   125 0 8 4
X4 1  12 8 0   16
X5   153 4160

skip.data = zzz [-3, -3]   #  gives me the same matrix without row 3 and
column 3... how can I tell R to skip more than one row and column?  Columns
and rows to be left out are not contiguous  The matrix I'm using is fairly
large (335x335), so excel won't read the whole thing, and it's not a
fixed-width txt file, so I cannot use read.fwf.  I also don't have access to
Unix, as was suggested to other users to solve this problem.


-- 
View this message in context: 
http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function

2007-05-08 Thread Bert Gunter
Suggestion:

You might make it easier for folks to help if you explained in clear and
simple terms what you are trying to do. Code is hard to deconstruct.


Bert Gunter
Genentech Nonclinical Statistics


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED)
Sent: Tuesday, May 08, 2007 2:22 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Looking for a cleaner way to implement a setting certainindices
of a matrix to 1 function

I wrote an ugly algorithm to set certain elements of a matrix to 1
without looping and below works and you can see what
The output is below the code.

K-6
lagnum-2

restrictmat-matrix(0,nrow=K,ncol=K*3)
restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
(col(restrictmat)-row(restrictmat)) %% K == 0)]-1
restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0

 restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18]
[1,]100000100 0 0 0
0 0 0 0 0 0
[2,]010000010 0 0 0
0 0 0 0 0 0
[3,]001000001 0 0 0
0 0 0 0 0 0
[4,]000100000 1 0 0
0 0 0 0 0 0
[5,]000010000 0 1 0
0 0 0 0 0 0
[6,]000001000 0 0 1
0 0 0 0 0 0

For lagnum equals 1 , it also works :

 restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18]
[1,]100000000 0 0 0
0 0 0 0 0 0
[2,]010000000 0 0 0
0 0 0 0 0 0
[3,]001000000 0 0 0
0 0 0 0 0 0
[4,]000100000 0 0 0
0 0 0 0 0 0
[5,]000010000 0 0 0
0 0 0 0 0 0
[6,]000001000 0 0 0
0 0 0 0 0 0

But I am thinking that there has to be a better way particularly because
I'll get an error if I set lagnum to 3. 
Any improvements or total revampings are appreciated. The number of
columns will always be a multiple of the number of rows
So K doesn't have to be 6. that was just to show what the commands do.
thanks.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mantel-Haenszel relative risk with Greenland-Robins variance estimate

2007-05-08 Thread Frank E Harrell Jr
Does anyone know of an R function for computing the Greenland-Robins 
variance for Mantel-Haenszel relative risks?

Thanks
Frank
-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] to draw a smooth arc

2007-05-08 Thread Paulo Barata

Prof. Ripley,

Maybe the fact that few R plot regions have a 1:1 aspect ratio
is not a major problem here. One has to deal with the same
issue when drawing a circle parametrically. Depending on the
window size, a (cos(t),sin(t)) plot appears as an ellipse.
To get a circle parametrically, one has to resize the plot region,
or define it (possibly by trial and error) with functions like
windows() or win.graph() (package grDevices), to have a 1:1 aspect
ratio.

Maybe more important is to have some sure way to draw smooth arcs
and arrows. Could it be done even if the plot region does not have
a 1:1 aspect ratio, when the arc (intended to be an arc of a circle)
would show as an arc of an ellipse? Would the smooth ellipse arc
then turn into a smooth circle arc, when the plot region is resized?

Paulo Barata
(Rio de Janeiro - Brazil)

---

Prof Brian Ripley wrote:
 There is now an xspline() function in R-devel, with an example showing 
 how to add arrows.
 
 I thought a bit more about a 'circular arc' function, but there really 
 is a problem with that.  Few R plot regions have a 1:1 aspect ratio 
 including some that are intended to do so (see the rw-FAQ).  symbols() 
 is designed to draw circles in device coordinates, but attempting to 
 specify circular arcs by endpoints in user coordinates is fraught.
 
 On Wed, 2 May 2007, Paul Murrell wrote:
 
 Hi


 Paulo Barata wrote:
 Dr. Murrell and all,

 One final suggestion: a future function arc() in package graphics,
 with centre-radius-angle parameterisation, could also include an
 option to draw arrows at either end of the arc, as one can find
 in function arrows().


 ... and in grid.xspline() and grid.curve().

 Paul


 Thank you.

 Paulo Barata

 ---
 Paul Murrell wrote:
 Hi


 Paulo Barata wrote:
 Dr. Snow and Prof. Ripley,

 Dr. Snow's suggestion, using clipplot (package TeachingDemos),
 is maybe a partial solution to the problem of drawing an arc of
 a circle (as long as the line width of the arc is not that large,
 as pointed out by Prof. Ripley). If the arc is symmetrical around
 a vertical line, then it is not so difficult to draw it that way.
 But an arc that does not have this kind of symmetry would possibly
 require some geometrical computations to find the proper rectangle
 to be used for clipping.

 I would like to suggest that in a future version of R some function
 be included in the graphics package to draw smooth arcs with
 given center, radius, initial and final angles. I suppose
 that the basic ingredients are available in function symbols
 (graphics).

 Just to back up a few previous posts ...

 There is something like this facility already available via the
 grid.xspline() function in the grid package.  This provides very
 flexible curve drawing (including curves very close to Bezier curves)
 based on the X-Splines implemented in xfig.  The grid.curve() function
 provides a convenience layer that allows for at least certain
 parameterisations of arcs (you specify the arc end points and the 
 angle).

 These functions are built on functionality within the core graphics
 engine, so exposing a similar interface (e.g., an xspline() function)
 within traditional graphics would be relatively straightforward.

 The core functionality draws the curves as line segments (but
 automatically figures out how many segments to use so that the curve
 looks smooth);  it does NOT call curve-drawing primitives in the
 graphics device (like PostScript's curveto).

 In summary:  there is some support for smooth curves, but we could 
 still
 benefit from a specific arc() function with the standard
 centre-radius-angle parameterisation and we could also benefit from
 exposing the native strengths of different graphics devices (rather 
 than
 the current lowest-common-denominator approach).

 Paul


 Thank you very much.

 Paulo Barata
 (Rio de Janeiro - Brazil)

 --- 

 Prof Brian Ripley wrote:
 On Tue, 1 May 2007, Greg Snow wrote:

 Here is an approach that clips the circle you like from symbols 
 down to
 an arc (this will work as long as the arc is less than half a 
 circle,
 for arcs greater than half a circle, you could draw the whole circle
 then use this to draw an arc of the bacground color over the 
 section you
 don't want):

 library(TeachingDemos)
 plot(-5:5, -5:5, type='n')
 clipplot( symbols(0,0,circles=2, add=TRUE), c(0,5), c(0,5) )
 I had considered this approach: clipping a circle to a rectangle 
 isn't
 strictly an arc, as will be clear if the line width is large.
 Consider

 clipplot(symbols(0, 0 ,circles=2, add=TRUE, lwd=5), c(-1,5), c(-1,5))

 Note too that what happens with clipping is device-dependent.  If R's
 internal clipping is used, the part-circle is converted to a polygon.


 __
 

Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function

2007-05-08 Thread Leeds, Mark \(IED\)
That's a good idea : I didn't realize that my matrices would look so bad
in the final email. All I want
To do is output 1's in the diagonal elements and zero's everywhere else
but the matrix is not square so by diagonals I
Really mean if

Lagnum = 1 then the elements are (1,1), (2,2), (3,3),(4,4),(5,5),(6,6)

Lagnum = 2 then the elements (1,1), (2,2),
(3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6)

Lagnum = 3 then the elements (1,1), (2,2),
(3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6),(13,1),(1
4,2),(15,3),(16,4),(17,5),
(18,6)

And lagnum always has to be greater than or equal to 1 and less than or
equal to (number of cols/number of rows ). Thanks
For your advice.


-Original Message-
From: Bert Gunter [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 08, 2007 5:34 PM
To: Leeds, Mark (IED); r-help@stat.math.ethz.ch
Subject: RE: [R] Looking for a cleaner way to implement a setting
certainindices of a matrix to 1 function

Suggestion:

You might make it easier for folks to help if you explained in clear and
simple terms what you are trying to do. Code is hard to deconstruct.


Bert Gunter
Genentech Nonclinical Statistics


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED)
Sent: Tuesday, May 08, 2007 2:22 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Looking for a cleaner way to implement a setting
certainindices of a matrix to 1 function

I wrote an ugly algorithm to set certain elements of a matrix to 1
without looping and below works and you can see what The output is below
the code.

K-6
lagnum-2

restrictmat-matrix(0,nrow=K,ncol=K*3)
restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
(col(restrictmat)-row(restrictmat)) %% K == 0)]-1
restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0

 restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18]
[1,]100000100 0 0 0
0 0 0 0 0 0
[2,]010000010 0 0 0
0 0 0 0 0 0
[3,]001000001 0 0 0
0 0 0 0 0 0
[4,]000100000 1 0 0
0 0 0 0 0 0
[5,]000010000 0 1 0
0 0 0 0 0 0
[6,]000001000 0 0 1
0 0 0 0 0 0

For lagnum equals 1 , it also works :

 restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[,13] [,14] [,15] [,16] [,17] [,18]
[1,]100000000 0 0 0
0 0 0 0 0 0
[2,]010000000 0 0 0
0 0 0 0 0 0
[3,]001000000 0 0 0
0 0 0 0 0 0
[4,]000100000 0 0 0
0 0 0 0 0 0
[5,]000010000 0 0 0
0 0 0 0 0 0
[6,]000001000 0 0 0
0 0 0 0 0 0

But I am thinking that there has to be a better way particularly because
I'll get an error if I set lagnum to 3. 
Any improvements or total revampings are appreciated. The number of
columns will always be a multiple of the number of rows So K doesn't
have to be 6. that was just to show what the commands do.
thanks.


This is not an offer (or solicitation of an offer) to
buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Skipping rows and columns in large matrix

2007-05-08 Thread jim holtman
zzz - zzz[-c(1,45,23,321), -c(23,34,45,67)]


On 5/8/07, Silvia Lomascolo [EMAIL PROTECTED] wrote:

 I work on Windows, R version 2.4.1
 I need to leave out many rows and columns when reading a matrix.  The 'skip'
 commands in read.table, seem to be just for skipping rows, if I understand
 well, but I need to skip many columns as well. I tried something very simple
 and left one row and one column out, but I don't know how to do it for many
 rows and columns at a time.

 Example:

 zzz # euclidean distances matrix

  X1   X2   X3   X4   X5
 X10   11   12 1   15
 X2   110 5   12 3
 X3   125 0 8 4
 X4 1  12 8 0   16
 X5   153 4160

 skip.data = zzz [-3, -3]   #  gives me the same matrix without row 3 and
 column 3... how can I tell R to skip more than one row and column?  Columns
 and rows to be left out are not contiguous  The matrix I'm using is fairly
 large (335x335), so excel won't read the whole thing, and it's not a
 fixed-width txt file, so I cannot use read.fwf.  I also don't have access to
 Unix, as was suggested to other users to solve this problem.


 --
 View this message in context: 
 http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function

2007-05-08 Thread Anders Nielsen
Hi Mark, 

Is this of any help?

  resMat-function(K=6,lag=2,ncol=3*K){
X-matrix(0,K,ncol)
X[,1:(K*lag)]-diag(K)
return(X)
  } 

Cheers, 

Anders. 

On Tuesday 08 May 2007 11:21 am, Leeds, Mark (IED) wrote:
 I wrote an ugly algorithm to set certain elements of a matrix to 1
 without looping and below works and you can see what
 The output is below the code.
 
 K-6
 lagnum-2
 
 restrictmat-matrix(0,nrow=K,ncol=K*3)
 restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
 (col(restrictmat)-row(restrictmat)) %% K == 0)]-1
 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0
 
  restrictmat
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000100 0 0 0
 0 0 0 0 0 0
 [2,]010000010 0 0 0
 0 0 0 0 0 0
 [3,]001000001 0 0 0
 0 0 0 0 0 0
 [4,]000100000 1 0 0
 0 0 0 0 0 0
 [5,]000010000 0 1 0
 0 0 0 0 0 0
 [6,]000001000 0 0 1
 0 0 0 0 0 0
 
 For lagnum equals 1 , it also works :
 
  restrictmat
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000000 0 0 0
 0 0 0 0 0 0
 [2,]010000000 0 0 0
 0 0 0 0 0 0
 [3,]001000000 0 0 0
 0 0 0 0 0 0
 [4,]000100000 0 0 0
 0 0 0 0 0 0
 [5,]000010000 0 0 0
 0 0 0 0 0 0
 [6,]000001000 0 0 0
 0 0 0 0 0 0
 
 But I am thinking that there has to be a better way particularly because
 I'll get an error if I set lagnum to 3. 
 Any improvements or total revampings are appreciated. The number of
 columns will always be a multiple of the number of rows
 So K doesn't have to be 6. that was just to show what the commands do.
 thanks.
 
 
 This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function

2007-05-08 Thread jim holtman
Is this what you want?

 n - 6
 lagnum - 2
 result - NULL
 for (i in 1:lagnum)  result - cbind(result, diag(n))
 result
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]100000100 0 0 0
[2,]010000010 0 0 0
[3,]001000001 0 0 0
[4,]000100000 1 0 0
[5,]000010000 0 1 0
[6,]000001000 0 0 1




On 5/8/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
 That's a good idea : I didn't realize that my matrices would look so bad
 in the final email. All I want
 To do is output 1's in the diagonal elements and zero's everywhere else
 but the matrix is not square so by diagonals I
 Really mean if

 Lagnum = 1 then the elements are (1,1), (2,2), (3,3),(4,4),(5,5),(6,6)

 Lagnum = 2 then the elements (1,1), (2,2),
 (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6)

 Lagnum = 3 then the elements (1,1), (2,2),
 (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6),(13,1),(1
 4,2),(15,3),(16,4),(17,5),
 (18,6)

 And lagnum always has to be greater than or equal to 1 and less than or
 equal to (number of cols/number of rows ). Thanks
 For your advice.


 -Original Message-
 From: Bert Gunter [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, May 08, 2007 5:34 PM
 To: Leeds, Mark (IED); r-help@stat.math.ethz.ch
 Subject: RE: [R] Looking for a cleaner way to implement a setting
 certainindices of a matrix to 1 function

 Suggestion:

 You might make it easier for folks to help if you explained in clear and
 simple terms what you are trying to do. Code is hard to deconstruct.


 Bert Gunter
 Genentech Nonclinical Statistics


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED)
 Sent: Tuesday, May 08, 2007 2:22 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Looking for a cleaner way to implement a setting
 certainindices of a matrix to 1 function

 I wrote an ugly algorithm to set certain elements of a matrix to 1
 without looping and below works and you can see what The output is below
 the code.

 K-6
 lagnum-2

 restrictmat-matrix(0,nrow=K,ncol=K*3)
 restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
 (col(restrictmat)-row(restrictmat)) %% K == 0)]-1
 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0

  restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000100 0 0 0
 0 0 0 0 0 0
 [2,]010000010 0 0 0
 0 0 0 0 0 0
 [3,]001000001 0 0 0
 0 0 0 0 0 0
 [4,]000100000 1 0 0
 0 0 0 0 0 0
 [5,]000010000 0 1 0
 0 0 0 0 0 0
 [6,]000001000 0 0 1
 0 0 0 0 0 0

 For lagnum equals 1 , it also works :

  restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000000 0 0 0
 0 0 0 0 0 0
 [2,]010000000 0 0 0
 0 0 0 0 0 0
 [3,]001000000 0 0 0
 0 0 0 0 0 0
 [4,]000100000 0 0 0
 0 0 0 0 0 0
 [5,]000010000 0 0 0
 0 0 0 0 0 0
 [6,]000001000 0 0 0
 0 0 0 0 0 0

 But I am thinking that there has to be a better way particularly because
 I'll get an error if I set lagnum to 3.
 Any improvements or total revampings are appreciated. The number of
 columns will always be a multiple of the number of rows So K doesn't
 have to be 6. that was just to show what the commands do.
 thanks.
 

 This is not an offer (or solicitation of an offer) to
 buy/se...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

 This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 

Re: [R] Mantel-Haenszel relative risk with Greenland-Robins variance estimate

2007-05-08 Thread Cody_Hamilton

Would this function help:
http://www.csm.ornl.gov/~frome/ES/RRMHex/MHanalysis.txt ?

Regards, -Cody



   
 Frank E Harrell   
 Jr
 [EMAIL PROTECTED]  To 
 bilt.edu R list r-help@stat.math.ethz.ch   
 Sent by:   cc 
 [EMAIL PROTECTED] 
 at.math.ethz.ch   Subject 
   [R] Mantel-Haenszel relative risk   
   with Greenland-Robins variance  
 05/08/2007 02:38  estimate
 PM
   
   
   
   
   




Does anyone know of an R function for computing the Greenland-Robins
variance for Mantel-Haenszel relative risks?

Thanks
Frank
--
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plotting a point graph with data in X-axis

2007-05-08 Thread Milton Cezar Ribeiro
Dear all,

I have two data frame, on with a complete list of my field survey with 
frequency data of a sample species. This data frame looks like:


simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12)))
simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=)
simulation$freq-sample(1:40,32)
attach(simulation)
plot(year.month, freq)

As you can see, I have a collumn with the year and month of my samples, and a 
freq variable with simulated data. I would like to plot this data but when I 
try to use the plot showed above, I get a error message. 

After bypass this problem, I would like add points in my graph with simulated 
data for only a random number of survey month, but I need that the full range 
of surveys be kept on the X-axis. Just to simulate a sample I am using:

simulation.sample-simulation[sample(1:length(year.month),8, replace=F),]
simulation.sample$freq-sample(1:40,8)

Any ideas?

Kind regards

Miltinho

__


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a cleaner way to implement a setting certain indices of a matrix to 1 function

2007-05-08 Thread Leeds, Mark \(IED\)
thanks anders : that works perfectly. I'll have to study it better to
understand but it's much appreciated.
 

-Original Message-
From: Anders Nielsen [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 08, 2007 5:55 PM
To: Leeds, Mark (IED)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Looking for a cleaner way to implement a setting
certain indices of a matrix to 1 function

Hi Mark, 

Is this of any help?

  resMat-function(K=6,lag=2,ncol=3*K){
X-matrix(0,K,ncol)
X[,1:(K*lag)]-diag(K)
return(X)
  } 

Cheers, 

Anders. 

On Tuesday 08 May 2007 11:21 am, Leeds, Mark (IED) wrote:
 I wrote an ugly algorithm to set certain elements of a matrix to 1 
 without looping and below works and you can see what The output is 
 below the code.
 
 K-6
 lagnum-2
 
 restrictmat-matrix(0,nrow=K,ncol=K*3)
 restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
 (col(restrictmat)-row(restrictmat)) %% K == 0)]-1 
 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0
 
  restrictmat
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] 
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000100 0 0 0
 0 0 0 0 0 0
 [2,]010000010 0 0 0
 0 0 0 0 0 0
 [3,]001000001 0 0 0
 0 0 0 0 0 0
 [4,]000100000 1 0 0
 0 0 0 0 0 0
 [5,]000010000 0 1 0
 0 0 0 0 0 0
 [6,]000001000 0 0 1
 0 0 0 0 0 0
 
 For lagnum equals 1 , it also works :
 
  restrictmat
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] 
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000000 0 0 0
 0 0 0 0 0 0
 [2,]010000000 0 0 0
 0 0 0 0 0 0
 [3,]001000000 0 0 0
 0 0 0 0 0 0
 [4,]000100000 0 0 0
 0 0 0 0 0 0
 [5,]000010000 0 0 0
 0 0 0 0 0 0
 [6,]000001000 0 0 0
 0 0 0 0 0 0
 
 But I am thinking that there has to be a better way particularly 
 because I'll get an error if I set lagnum to 3.
 Any improvements or total revampings are appreciated. The number of 
 columns will always be a multiple of the number of rows So K doesn't 
 have to be 6. that was just to show what the commands do.
 thanks.
 
 
 This is not an offer (or solicitation of an offer) to 
 buy/se...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mantel-Haenszel relative risk with Greenland-Robins variance estimate

2007-05-08 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
 Would this function help:
 http://www.csm.ornl.gov/~frome/ES/RRMHex/MHanalysis.txt ?
 
 Regards, -Cody

I think so.  Thank you Cody.  If you have time would you mind defining, 
probably offline, the input data elements?  I assume that one of them is 
a stratification factor other than occupational group.

Thanks again
Frank

 
 
 

  Frank E Harrell   
  Jr
  [EMAIL PROTECTED]  To 
  bilt.edu R list r-help@stat.math.ethz.ch   
  Sent by:   cc 
  [EMAIL PROTECTED] 
  at.math.ethz.ch   Subject 
[R] Mantel-Haenszel relative risk   
with Greenland-Robins variance  
  05/08/2007 02:38  estimate
  PM





 
 
 
 
 Does anyone know of an R function for computing the Greenland-Robins
 variance for Mantel-Haenszel relative risks?
 
 Thanks
 Frank
 --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Skipping rows and columns in large matrix

2007-05-08 Thread Silvia Lomascolo

yes, this works.  It didn't think of concatenating... thanks!



jim holtman wrote:
 
 zzz - zzz[-c(1,45,23,321), -c(23,34,45,67)]
 
 
 On 5/8/07, Silvia Lomascolo [EMAIL PROTECTED] wrote:

 I work on Windows, R version 2.4.1
 I need to leave out many rows and columns when reading a matrix.  The
 'skip'
 commands in read.table, seem to be just for skipping rows, if I
 understand
 well, but I need to skip many columns as well. I tried something very
 simple
 and left one row and one column out, but I don't know how to do it for
 many
 rows and columns at a time.

 Example:

 zzz # euclidean distances matrix

  X1   X2   X3   X4   X5
 X10   11   12 1   15
 X2   110 5   12 3
 X3   125 0 8 4
 X4 1  12 8 0   16
 X5   153 4160

 skip.data = zzz [-3, -3]   #  gives me the same matrix without row 3 and
 column 3... how can I tell R to skip more than one row and column? 
 Columns
 and rows to be left out are not contiguous  The matrix I'm using is
 fairly
 large (335x335), so excel won't read the whole thing, and it's not a
 fixed-width txt file, so I cannot use read.fwf.  I also don't have access
 to
 Unix, as was suggested to other users to solve this problem.


 --
 View this message in context:
 http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem you are trying to solve?
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10385469
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] draw two plots on a single panel

2007-05-08 Thread Patrick Wang
Hi,

I have 2 dataset,

plot(data1)
plot(data2),

but it comes as two graphs, can I draw both on a single panel so I can
compare them?

Thanks
Pat

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] draw two plots on a single panel

2007-05-08 Thread Gabor Csardi
See ?points and ?lines for a simple solution. Eg:

plot(data1)
points(data2)

if mean this by 'panel'.

Gabor

On Tue, May 08, 2007 at 04:49:47PM -0700, Patrick Wang wrote:
 Hi,
 
 I have 2 dataset,
 
 plot(data1)
 plot(data2),
 
 but it comes as two graphs, can I draw both on a single panel so I can
 compare them?
 
 Thanks
 Pat
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting a point graph with data in X-axis

2007-05-08 Thread jim holtman
You have to have a valid 'date' object on the x-axis.  Try this:

simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12)))
simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=)
simulation$freq-sample(1:40,32)
# create POSIXct time
simulation$time - ISOdate(simulation$my.year, simulation$my.month,1)
attach(simulation)
plot(time, freq)


On 5/8/07, Milton Cezar Ribeiro [EMAIL PROTECTED] wrote:
 Dear all,

 I have two data frame, on with a complete list of my field survey with 
 frequency data of a sample species. This data frame looks like:


 simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12)))
 simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=)
 simulation$freq-sample(1:40,32)
 attach(simulation)
 plot(year.month, freq)

 As you can see, I have a collumn with the year and month of my samples, and a 
 freq variable with simulated data. I would like to plot this data but when I 
 try to use the plot showed above, I get a error message.

 After bypass this problem, I would like add points in my graph with simulated 
 data for only a random number of survey month, but I need that the full range 
 of surveys be kept on the X-axis. Just to simulate a sample I am using:

 simulation.sample-simulation[sample(1:length(year.month),8, replace=F),]
 simulation.sample$freq-sample(1:40,8)

 Any ideas?

 Kind regards

 Miltinho

 __


[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weighted least squares

2007-05-08 Thread Adaikalavan Ramasamy
http://en.wikipedia.org/wiki/Weighted_least_squares gives a formulaic 
description of what you have said.

I believe the original poster has converted something like this

y x
0   1.1
0   2.2
0   2.2
0   2.2
1   3.3
1   3.3
2   4.4
 ...

into something like the following

y x freq
0   1.11
0   2.23
1   3.32
2   4.41
 ...

Now, the variance of means of each row in table above is ZERO because 
the individual elements that comprise each row are identical. Therefore 
your method of using inverse-variance will not work here.

Then is it valid then to use lm( y ~ x, weights=freq ) ?

Regards, Adai



S Ellison wrote:
 Hadley,
 
 You asked
 .. what is the usual way to do a linear 
 regression when you have aggregated data?
 
 Least squares generally uses inverse variance weighting. For aggregated data 
 fitted as mean values, you just need the variances for the _means_. 
 
 So if you have individual means x_i and sd's s_i that arise from aggregated 
 data with n_i observations in group i, the natural weighting is by inverse 
 squared standard error of the mean. The appropriate weight for x_i would then 
 be n_i/(s_i^2). In R, that's n/(s^2), as n and s would be vectors with the 
 same length as x. If all the groups had the same variance, or nearly so, s is 
 a scalar; if they have the same number of observations, n is a scalar. 
 
 Of course, if they have the same variance and same number of observations, 
 they all have the same weight and you needn't weight them at all: see 
 previous posting!
 
 Steve E
 
 
 
 ***
 This email and any attachments are confidential. Any use, co...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting a point graph with data in X-axis

2007-05-08 Thread Gabor Grothendieck
- use proper spacing to make it easier to read
- start off with set.seed to make it reproducible
- omit cbind and combine all the rep's into one rep in first line
- make the date column a known date class (here Date),


set.seed(1)
sim - data.frame(
   my.year = rep(2000:2002, c(8, 12, 12)),
   my.month = c(5:12, 1:12, 1:12),
   freq = sample(1:40, 32)
)
sim$date - as.Date(paste(sim$my.year, sim$my.month, 1, sep = -))
plot(freq ~ date, sim)

On 5/8/07, Milton Cezar Ribeiro [EMAIL PROTECTED] wrote:
 Dear all,

 I have two data frame, on with a complete list of my field survey with 
 frequency data of a sample species. This data frame looks like:


 simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12)))
 simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=)
 simulation$freq-sample(1:40,32)
 attach(simulation)
 plot(year.month, freq)

 As you can see, I have a collumn with the year and month of my samples, and a 
 freq variable with simulated data. I would like to plot this data but when I 
 try to use the plot showed above, I get a error message.

 After bypass this problem, I would like add points in my graph with simulated 
 data for only a random number of survey month, but I need that the full range 
 of surveys be kept on the X-axis. Just to simulate a sample I am using:

 simulation.sample-simulation[sample(1:length(year.month),8, replace=F),]
 simulation.sample$freq-sample(1:40,8)

 Any ideas?

 Kind regards

 Miltinho

 __


[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Skipping rows and columns in large matrix

2007-05-08 Thread Gabor Grothendieck
Columns that correspond to NULLs in the colClasses arg of read.table
are ignored.  ?read.table

On 5/8/07, Silvia Lomascolo [EMAIL PROTECTED] wrote:

 I work on Windows, R version 2.4.1
 I need to leave out many rows and columns when reading a matrix.  The 'skip'
 commands in read.table, seem to be just for skipping rows, if I understand
 well, but I need to skip many columns as well. I tried something very simple
 and left one row and one column out, but I don't know how to do it for many
 rows and columns at a time.

 Example:

 zzz # euclidean distances matrix

  X1   X2   X3   X4   X5
 X10   11   12 1   15
 X2   110 5   12 3
 X3   125 0 8 4
 X4 1  12 8 0   16
 X5   153 4160

 skip.data = zzz [-3, -3]   #  gives me the same matrix without row 3 and
 column 3... how can I tell R to skip more than one row and column?  Columns
 and rows to be left out are not contiguous  The matrix I'm using is fairly
 large (335x335), so excel won't read the whole thing, and it's not a
 fixed-width txt file, so I cannot use read.fwf.  I also don't have access to
 Unix, as was suggested to other users to solve this problem.


 --
 View this message in context: 
 http://www.nabble.com/Skipping-rows-and-columns-in-large-matrix-tf3712332.html#a10384400
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting a point graph with data in X-axis

2007-05-08 Thread Adaikalavan Ramasamy
R understands only numerical and Date class values for axis. So either


a) plot them using the sequence 1, ..., 32 and then explicitly label 
them. Here is an example:

  n - length(year.month)
  plot( 1:n, freq, xaxt=n)
  mtext( text=year.month, side=1, at=1:n, las=2 )


b) or create the dates in Date format. This option is preferable if the 
dates were varying unequally.

  x - seq( as.Date(2000-05-01), as.Date(2002-12-01), by=1 month )
  plot(x, simulation$freq)


BTW, you could also have created year.month via
   paste( rep( 2000:2002, c(8,12,12) ),
  formatC( c(5:12,1:12,1:12), width=2, flag=0 ) , sep=_ )


Regards, Adai




Milton Cezar Ribeiro wrote:
 Dear all,
 
 I have two data frame, on with a complete list of my field survey with 
 frequency data of a sample species. This data frame looks like:
 
 
 simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12)))
 simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=)
 simulation$freq-sample(1:40,32)
 attach(simulation)
 plot(year.month, freq)
 
 As you can see, I have a collumn with the year and month of my samples, and a 
 freq variable with simulated data. I would like to plot this data but when I 
 try to use the plot showed above, I get a error message. 
 
 After bypass this problem, I would like add points in my graph with simulated 
 data for only a random number of survey month, but I need that the full range 
 of surveys be kept on the X-axis. Just to simulate a sample I am using:
 
 simulation.sample-simulation[sample(1:length(year.month),8, replace=F),]
 simulation.sample$freq-sample(1:40,8)
 
 Any ideas?
 
 Kind regards
 
 Miltinho
 
 __
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looking for a cleaner way to implement a setting certainindices of a matrix to 1 function

2007-05-08 Thread Gabor Grothendieck
Try this:

kronecker(t(rep(1:0, c(lagnum, 3-lagnum))), diag(K))


On 5/8/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
 That's a good idea : I didn't realize that my matrices would look so bad
 in the final email. All I want
 To do is output 1's in the diagonal elements and zero's everywhere else
 but the matrix is not square so by diagonals I
 Really mean if

 Lagnum = 1 then the elements are (1,1), (2,2), (3,3),(4,4),(5,5),(6,6)

 Lagnum = 2 then the elements (1,1), (2,2),
 (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6)

 Lagnum = 3 then the elements (1,1), (2,2),
 (3,3),(4,4),(5,5),(6,6),(7,1),(8,2),(9,3),(10,4),(11,5),(12,6),(13,1),(1
 4,2),(15,3),(16,4),(17,5),
 (18,6)

 And lagnum always has to be greater than or equal to 1 and less than or
 equal to (number of cols/number of rows ). Thanks
 For your advice.


 -Original Message-
 From: Bert Gunter [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, May 08, 2007 5:34 PM
 To: Leeds, Mark (IED); r-help@stat.math.ethz.ch
 Subject: RE: [R] Looking for a cleaner way to implement a setting
 certainindices of a matrix to 1 function

 Suggestion:

 You might make it easier for folks to help if you explained in clear and
 simple terms what you are trying to do. Code is hard to deconstruct.


 Bert Gunter
 Genentech Nonclinical Statistics


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Leeds, Mark (IED)
 Sent: Tuesday, May 08, 2007 2:22 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Looking for a cleaner way to implement a setting
 certainindices of a matrix to 1 function

 I wrote an ugly algorithm to set certain elements of a matrix to 1
 without looping and below works and you can see what The output is below
 the code.

 K-6
 lagnum-2

 restrictmat-matrix(0,nrow=K,ncol=K*3)
 restrictmat[((col(restrictmat) - row(restrictmat) = 0 ) 
 (col(restrictmat)-row(restrictmat)) %% K == 0)]-1
 restrictmat[,(lagnum*K+1):ncol(restrictmat)]-0

  restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000100 0 0 0
 0 0 0 0 0 0
 [2,]010000010 0 0 0
 0 0 0 0 0 0
 [3,]001000001 0 0 0
 0 0 0 0 0 0
 [4,]000100000 1 0 0
 0 0 0 0 0 0
 [5,]000010000 0 1 0
 0 0 0 0 0 0
 [6,]000001000 0 0 1
 0 0 0 0 0 0

 For lagnum equals 1 , it also works :

  restrictmat
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [,13] [,14] [,15] [,16] [,17] [,18]
 [1,]100000000 0 0 0
 0 0 0 0 0 0
 [2,]010000000 0 0 0
 0 0 0 0 0 0
 [3,]001000000 0 0 0
 0 0 0 0 0 0
 [4,]000100000 0 0 0
 0 0 0 0 0 0
 [5,]000010000 0 0 0
 0 0 0 0 0 0
 [6,]000001000 0 0 0
 0 0 0 0 0 0

 But I am thinking that there has to be a better way particularly because
 I'll get an error if I set lagnum to 3.
 Any improvements or total revampings are appreciated. The number of
 columns will always be a multiple of the number of rows So K doesn't
 have to be 6. that was just to show what the commands do.
 thanks.
 

 This is not an offer (or solicitation of an offer) to
 buy/se...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

 This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting a point graph with data in X-axis

2007-05-08 Thread Gabor Grothendieck
One other idea.  If your dates are continguous, as they are here,
you might want to use a ts series for this.  Using the same sim
as in my prior post:

 set.seed(1)
 x - with(sim, ts(freq, start = c(my.year[1], my.month[1]), freq = 12))
 x
 Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2000  11  15  22  34   8  32  33  38
2001  21   2   7   6  20  40  35  13  18  23   9  17
2002  19   5  12   3  28  29   1  16  27   4  25  39
 plot(x)


On 5/8/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 - use proper spacing to make it easier to read
 - start off with set.seed to make it reproducible
 - omit cbind and combine all the rep's into one rep in first line
 - make the date column a known date class (here Date),


 set.seed(1)
 sim - data.frame(
   my.year = rep(2000:2002, c(8, 12, 12)),
   my.month = c(5:12, 1:12, 1:12),
   freq = sample(1:40, 32)
 )
 sim$date - as.Date(paste(sim$my.year, sim$my.month, 1, sep = -))
 plot(freq ~ date, sim)

 On 5/8/07, Milton Cezar Ribeiro [EMAIL PROTECTED] wrote:
  Dear all,
 
  I have two data frame, on with a complete list of my field survey with 
  frequency data of a sample species. This data frame looks like:
 
 
  simulation-data.frame(cbind(my.year=c(rep(2000,8),rep(2001,12),rep(2002,12)),my.month=c(5:12,1:12,1:12)))
  simulation$year.month-paste(simulation$my.year,_,ifelse(simulation$my.month=10,simulation$my.month,paste(0,simulation$my.month,sep=)),sep=)
  simulation$freq-sample(1:40,32)
  attach(simulation)
  plot(year.month, freq)
 
  As you can see, I have a collumn with the year and month of my samples, and 
  a freq variable with simulated data. I would like to plot this data but 
  when I try to use the plot showed above, I get a error message.
 
  After bypass this problem, I would like add points in my graph with 
  simulated data for only a random number of survey month, but I need that 
  the full range of surveys be kept on the X-axis. Just to simulate a sample 
  I am using:
 
  simulation.sample-simulation[sample(1:length(year.month),8, replace=F),]
  simulation.sample$freq-sample(1:40,8)
 
  Any ideas?
 
  Kind regards
 
  Miltinho
 
  __
 
 
 [[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Separate x-axis ticks for panels in xyplot()?

2007-05-08 Thread Richard M. Heiberger
This will get you started.

Rich




tmp - data.frame(y=rnorm(20), x=rnorm(20),
  c=factor(rep(letters[1:5], 4)),
  g=factor(rep(1:2, each=10)))
v1 - seq(-1.5, 0, .5)
v2 - 0:2


xyplot(y ~ x | c, groups = g, data=tmp,
   scales = list(x = list(
   relation=sliced,
   at = rep(list(v1, v2), length=5),
   labels = rep(list(
 c('A', 'B', 'C', 'D'),
 c('E', 'F', 'G')), length=5),
   rot = 45)))

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] voronoi.mosaic chokes?

2007-05-08 Thread Andrew Pierce
Hi all,

I am running R 2.5.0 under Windows XP Media Center Edition.  Here's a 
problem that's been stumping me for a few days now, and I can't find 
anything useful in the archives.  I am using voronoi.mosaic (tripack 
package) to create proximity polygons for a study of vegetation 
competition and dynamics.  The points lists are read in from a file for 
each plot, then 8 duplicates are translated around the edges of the plot 
(Toroidal edge correction).  This is completed using the torus(...) 
function that I wrote myself.

VMuncorrected is the voronoi  mosaic that is not toroidally edge corrected
VMcorrected is the voronoi mosaic that is toroidally edge corrected

 treemap = read.table('af1.txt', sep = \t, header = 1)
 VMuncorrected = voronoi.mosaic(treemap$X, treemap$Y)

###Use the torus function to create 8 copies around the study region
 toroid = torus(treemap$X, treemap$Y, 25, 25)

 VMcorrected = voronoi.mosaic(toroid[,1], toroid[,2], duplicate = remove)

Here's the problem. When I read in the points for the data file listed 
above ('af1.txt'), both calls to voronoi.mosaic() work fine.  (The 
second one takes about 1.5 seconds because there are 1147 points in the 
toroidally corrected set).

However, when I read in the points from the next file ('af2.txt'), the 
first call to voronoi.mosaic() works.  The next call (to torus()) also 
works fine.  But the second call to voronoi.mosaic() causes R to freeze 
completely requiring Ctrl-Alt-Del.

I have 10 sets of points and this problem happens for about 5 of them.

Factors I have ruled out:
-too many points in the call (one set had 1147 and worked fine while the 
next set had 801 and froze R)
-duplicate points (taken care of by voronoi.mosaic(..., duplicate = 
remove) and also independently verified by exporting the data.  no 
duplicates in either the original or the toroidally corrected set)
-points too close together in space (minimum distance between two points 
in 'af1.txt' is 0.1414 and works fine.  minimum distance in the second 
set, 'af2.txt', is 0.2236, and this set causes R to freeze)
-not enough memory (each data set is run in a new R session-i.e. R was 
quit between each attempt)
-'flukiness' (the problem happens the same way every time for the 
problem data sets, and the code runs fine every time for the non-problem 
data sets)
-file formats (each text file has the same number of columns, all the 
labels for the columns are identical, and the columns are always in the 
same order)
-outdated versions (I am using R 2.5.0 and updated the tripack package 
within the last week.  also, I update packages about once a month)

This is a very frustrating problem because I get no errors indicating 
any problem with the data, and I have checked the data over and over for 
any type of error and found none.  If anyone has ANY helpful 
suggestions, I would love to hear about any and all of them.

Andrew

p.s. - for those of you who are really intrigued, I can email you the 
.txt files and the code for the torus() function.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] voronoi.mosaic chokes?

2007-05-08 Thread Christos Hatzis
Have you tried 'debug'?

With one of the datasets that crashes your system, run the first
voronoi.mosaic, then the torus function and then

debug(voronoi.mosaic)

and run through the second call of voronoi.mosaic.  You can step though the
code and at least will you find the point where it bombs.

-Christos

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
 


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Pierce
 Sent: Wednesday, May 09, 2007 12:21 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] voronoi.mosaic chokes?
 
 Hi all,
 
 I am running R 2.5.0 under Windows XP Media Center Edition.  
 Here's a problem that's been stumping me for a few days now, 
 and I can't find anything useful in the archives.  I am using 
 voronoi.mosaic (tripack
 package) to create proximity polygons for a study of 
 vegetation competition and dynamics.  The points lists are 
 read in from a file for each plot, then 8 duplicates are 
 translated around the edges of the plot (Toroidal edge 
 correction).  This is completed using the torus(...) function 
 that I wrote myself.
 
 VMuncorrected is the voronoi  mosaic that is not toroidally 
 edge corrected VMcorrected is the voronoi mosaic that is 
 toroidally edge corrected
 
  treemap = read.table('af1.txt', sep = \t, header = 1)  
 VMuncorrected = voronoi.mosaic(treemap$X, treemap$Y)
 
 ###Use the torus function to create 8 copies around the study 
 region  toroid = torus(treemap$X, treemap$Y, 25, 25)
 
  VMcorrected = voronoi.mosaic(toroid[,1], toroid[,2], 
 duplicate = remove)
 
 Here's the problem. When I read in the points for the data 
 file listed above ('af1.txt'), both calls to voronoi.mosaic() 
 work fine.  (The second one takes about 1.5 seconds because 
 there are 1147 points in the toroidally corrected set).
 
 However, when I read in the points from the next file 
 ('af2.txt'), the first call to voronoi.mosaic() works.  The 
 next call (to torus()) also works fine.  But the second call 
 to voronoi.mosaic() causes R to freeze completely requiring 
 Ctrl-Alt-Del.
 
 I have 10 sets of points and this problem happens for about 5 of them.
 
 Factors I have ruled out:
 -too many points in the call (one set had 1147 and worked 
 fine while the next set had 801 and froze R) -duplicate 
 points (taken care of by voronoi.mosaic(..., duplicate =
 remove) and also independently verified by exporting the 
 data.  no duplicates in either the original or the toroidally 
 corrected set) -points too close together in space (minimum 
 distance between two points in 'af1.txt' is 0.1414 and works 
 fine.  minimum distance in the second set, 'af2.txt', is 
 0.2236, and this set causes R to freeze) -not enough memory 
 (each data set is run in a new R session-i.e. R was quit 
 between each attempt) -'flukiness' (the problem happens the 
 same way every time for the problem data sets, and the code 
 runs fine every time for the non-problem data sets) -file 
 formats (each text file has the same number of columns, all 
 the labels for the columns are identical, and the columns are 
 always in the same order) -outdated versions (I am using R 
 2.5.0 and updated the tripack package within the last week.  
 also, I update packages about once a month)
 
 This is a very frustrating problem because I get no errors 
 indicating any problem with the data, and I have checked the 
 data over and over for any type of error and found none.  If 
 anyone has ANY helpful suggestions, I would love to hear 
 about any and all of them.
 
 Andrew
 
 p.s. - for those of you who are really intrigued, I can email 
 you the .txt files and the code for the torus() function.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >