date:20050701

Re: [R] compiling under windows

2005-07-01 Thread Prof Brian Ripley

On Thu, 30 Jun 2005, Philip Bermingham wrote:

 What is the best way to set up a project in visual studio, work on R and
 re compile?  Is it better to use a different compiler or programming
 environment?  I specifically want to work on C and Fortran extensions.

See the `R Installation and Administration' manual.

It is possible to use Visual C++ (there is no Fortran in Visual Studio, 
although a third-party* extension has been available), but it is easier 
and more reliable to use the compilers used to build R itself. Information 
on using VC++ is in the file README.packages, in the top-level directory 
of a binary installation and in R_HOME/src/gnuwin32 in the sources.


* Originally from DEC then Compaq and now apparently from Intel.



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Nolinear mixed-effects models (nlme)

2005-07-01 Thread Alex Bach

Hello,

I am trying to fit a nonlinear model of the form of:

A*x^b*exp(-c*x)

This represents a lactation curve. I have a bunch of cows, so I want  
COW to be a random effect.

I have been trying the following code with very little success:

fm1 <- nlme(yield ~ A*(DIM^B)*(exp(-C*DIM)),
            data = group,
            fixed = A + B + C ~ 1,
            start = c(A = 20, B = 0.3, C = 0.03))

Does anyone know how to add the random effect of the cow? I have used  
the command groupedData to have Cow as subject (i.e., yield~DIM |  
cow). Is this a valid and sufficient approach? I have the feeling it  
is not sufficient.

Also, does anyone know whether the formulation of the fixed effects  
is correct?

Thank you very much,

Alex
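
A minimal sketch of one way the cow random effect might be specified, assuming (hypothetically) that only the scale parameter A varies between cows. The data below are simulated stand-ins for the poster's `group` data frame, and the simulation parameter values are illustrative assumptions rather than anything taken from the post; the `random = A ~ 1 | cow` argument is the part of interest.

```r
library(nlme)  # recommended package shipped with R

set.seed(1)
# Simulated stand-in data: 12 cows, yield recorded every 20 days in milk (DIM).
cows <- 12
dim_days <- seq(5, 305, by = 20)
A_cow <- 20 + rnorm(cows, sd = 2)  # cow-specific scale parameter (assumed random effect)
group <- data.frame(
  cow = factor(rep(seq_len(cows), each = length(dim_days))),
  DIM = rep(dim_days, times = cows)
)
group$yield <- rep(A_cow, each = length(dim_days)) *
  group$DIM^0.2 * exp(-0.004 * group$DIM) +
  rnorm(nrow(group), sd = 0.5)

# Fixed effects for A, B and C; a random effect on A, grouped by cow.
fm1 <- nlme(yield ~ A * DIM^B * exp(-C * DIM),
            data = group,
            fixed = A + B + C ~ 1,
            random = A ~ 1 | cow,
            start = c(A = 20, B = 0.2, C = 0.004))
fixef(fm1)
```

If the data are a groupedData object built with yield ~ DIM | cow, the grouping should be picked up from the data, so `random = A ~ 1` would probably suffice; leaving `random` out altogether makes every fixed-effect parameter random as well, which may be more than the data can support.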




[R] p-values for classification

2005-07-01 Thread Arne.Muller

Dear All,

I'm classifying some data with various methods (binary classification). I'm 
interpreting the results via a confusion matrix from which I calculate the 
sensitivity and the fdr. The classifiers are trained on 575 data points and my 
test set has 50 data points.

I'd like to calculate p-values for obtaining =fdr and =sensitivity for each 
classifier. I was thinking about shuffling/bootstrapping the labels of the test 
set, classifying them, and calculating the p-value from the resulting normally 
distributed random fdr and sensitivity.

The problem is that it's rather slow when running many rounds of 
shuffling/classification (I'd like to do this for many classifiers and 
parameter combinations). In addition, classification of the 50 test data points 
with shuffled labels realistically produces only a very limited number of 
possible fdr's and sensitivities, and I'm wondering if I can really believe 
these values to be normal.

Basically I'm looking for a way to calculate the p-values analytically. I'd be 
happy for any suggestions, web addresses or references.

Kind regards,

Arne


Re: [R] [OT] gmail filter for R-help and R-devel lists

2005-07-01 Thread ecoinfo

One filter is enough:
 to: r-help
 Xiaohua

 On 6/30/05, Matthew Nelson [EMAIL PROTECTED] wrote: 
 
 A clarification: this only works properly when the To addresses are
 in separate filters.
 
 Sorry for the confusion.
 
 Matt
 
 On 6/30/05, Matthew Nelson [EMAIL PROTECTED] wrote:
  Doug,
 
  I was able to accomplish this for r-help by filtering on the To
  field with the following addresses:
  r-help@stat.math.ethz.ch, [EMAIL PROTECTED]
 
  I created this a week ago and it has so far filtered every mailing
  list message successfully. Gmail conversations are a wonderful way
  to catch up on list activity after periods of neglect.
 
  Best regards,
  Matt
 
  On 6/30/05, Douglas Bates [EMAIL PROTECTED] wrote:
   This is slightly off-topic but I would be interested in whether anyone
   has succeeded in creating a filter expression for Google's gmail
   system that will select messages sent through the R-help and R-devel
   lists. It seems as if it should be easy to select on '[R]' or '[Rd]'
   in the subject line but I haven't been able to work out the exact
   syntax that would do this and not select messages that have an 'R'
   anywhere in the subject.
  
  
 
 
 



-- 
Xiaohua Dai, Dr.
Centre for Systems Research, Durban Institute of Technology
P.O.Box 953, Durban 4000, South Africa
Tel: +27-31-2042737(O) Fax: +27-31-2042736(O)
Mobile: +27-723682954


[R] loop over large dataset

2005-07-01 Thread Federico Calboli

Hi All,

I'd like to ask for a few clarifications. I am doing some calculations
over some biggish datasets. One has ~ 23000 rows, and 6 columns, the
other has ~62 rows and 6 columns.

I am using these datasets to perform a simulation of haplotype
coalescence over a pedigree (the datasets themselves are pedigree
information). I created a new dataset (same number of rows as the
pedigree dataset, 2 columns) and I use a looping function to assign
haplotypes according to a standard biological reproductive process (i.e.
meiosis, sexual reproduction).

My code is something like:

off = function(sire, dam){ # simulation of reproduction, two inds
    sch.toll = round(runif(1, min = 1, max = 2))  # pick column 1 or 2 of the sire
    dch.toll = round(runif(1, min = 1, max = 2))  # pick column 1 or 2 of the dam
    s.gam = sire[,sch.toll]
    d.gam = dam[,dch.toll]
    offspring = cbind(s.gam,d.gam)
    # offspring
}

for (i in 1:dim(new)[1]){
    if(ped[i,3] != 0 && ped[i,5] != 0){
        zz = off(as.matrix(t(new[ped[i,3],])),as.matrix(t(new[ped[i,5],])))
        new[i,1] = zz[1]
        new[i,2] = zz[2]
    }
}

I am also attributing a generation index to each row with a trivial
calculation:

for(i in atres){
  genz[i] = (genz[ped[i,3]] + genz[ped[i,5]])/2 + 1
  #print(genz[i])
}

My question then. On the 23000 rows dataset the calculations take about
5 minutes. On the 62 rows one I kill the process after ~24 hours,
and the job is not finished. Why such an immense discrepancy in
execution times (the code is the same, the datasets are stored in two
separate .RData files)?

Any light would be appreciated.

Federico

PS I am running R 2.1.0 on Debian Sarge, on a Dual 3 GHz Xeon machine
with 2 gig RAM. The R process uses 99% of the CPU, but hardly any RAM
for what I gather from top.



-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com


[R] the format of the result

2005-07-01 Thread ronggui

I wrote a function to get the frequency and proportion of a variable.

freq <- function(x, digits = 3)
{
    naa <- is.na(x)
    nas <- sum(naa)
    if (any(naa))
        x <- x[!naa]
    n <- length(x)
    ta <- table(x)
    prop <- prop.table(ta) * 100
    res <- rbind(ta, prop)
    rownames(res) <- c("Freq", "Prop")
    cat("Missing value(s) are", nas, ".\n")
    cat("Valid case(s) are", n, ".\n")
    cat("Total case(s) are", (n + nas), ".\n\n")
    print(res, digits = (digits + 2))
    cat("\n")
}

> freq(sample(letters[1:3], 48, TRUE), 2)
Missing value(s) are 0 .
Valid case(s) are 48 .
Total case(s) are 48 .

         a     b     c
Freq 11.00 20.00 17.00
Prop 22.92 41.67 35.42

and I want the result to look like

     a      b      c
Freq 11.00  20.00  17.00
Prop 22.92% 41.67% 35.42%

How should I change my function to get what I want?
-- 
Department of Sociology
Fudan University,Shanghai
Blog:http://sociology.yculblog.com
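
One possible approach, sketched under the assumption that a character matrix is acceptable as the result: format the two rows separately, paste a "%" onto the proportions, and print without quotes. The function name `freq_pct` is hypothetical and the missing-value messages from the original function are left out for brevity.

```r
freq_pct <- function(x, digits = 2) {
  x <- x[!is.na(x)]                      # drop missing values
  ta <- table(x)
  prop <- prop.table(ta) * 100
  # Format each row as text so that only the proportions carry a "%" sign.
  res <- rbind(
    Freq = formatC(as.vector(ta), format = "f", digits = digits),
    Prop = paste0(formatC(as.vector(prop), format = "f", digits = digits), "%")
  )
  colnames(res) <- names(ta)
  print(res, quote = FALSE, right = TRUE)
  invisible(res)
}

out <- freq_pct(c("a", "a", "b", "c"))
```

Because the result is character rather than numeric, it is suited to display only; any further arithmetic would need the original table.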


[R] Simple indexing conundrum

2005-07-01 Thread Martin Henry H. Stevens

My apologies in advance for my thickness but I can't seem to solve the 
following, seemingly simple, data manipulation problem:

I have a data frame that contains multiple factors and multiple 
continuous response variables, but duplicates of some factor 
combinations. The duplicates contain bad data, so I would like to 
eliminate the duplicates. I would like to retain the entire rows 
identified by the maximum value of one particular continuous response 
variable.

For instance,

> data(airquality)
> str(airquality)
`data.frame':   153 obs. of  6 variables:
 $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
 $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
 $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
 $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
 $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
 $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...

I would like to subset airquality, retaining only the rows containing 
the maximum Solar.R for each month.

Any solution would be greatly appreciated.

Regards,
Hank



Dr. Martin Henry H. Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/botany/bot/henry.html
http://www.muohio.edu/ecology/
http://www.muohio.edu/botany/
E Pluribus Unum


Re: [R] p-values for classification

2005-07-01 Thread Prof Brian Ripley

Not really an R question.

Most classifiers will produce predicted probabilities, and you can check 
their accuracy.  There are lots of details in my PRNN book, and some 
examples in MASS4.

I suggest you adjust your training and test sets to be more nearly equal, 
or use cross-validation.

I don't see how shuffling the labels will help: you want to know how well 
a classifier does when there is a real relationship between the 
explanatory variables and the class.  To take a simple example, suppose 
the classes are clearly linearly separable.  Then a logistic discriminant 
will have nigh-perfect performance on the actual data, but very poor 
performance on permuted labels.  You would do a lot better to simulate 
from a good fitted model, the so-called parametric bootstrapping.
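
The contrast between actual and permuted labels can be illustrated with a small simulation. This is only a sketch under assumed settings: one explanatory variable, two well-separated classes, and the training and test sizes mentioned in the question; the helper `sim` is hypothetical.

```r
set.seed(1)
n_train <- 575
n_test <- 50

# Hypothetical data generator: class 1 is shifted by 3 standard deviations.
sim <- function(n) {
  cls <- rbinom(n, 1, 0.5)
  data.frame(cls = cls, x = rnorm(n, mean = 3 * cls))
}
train <- sim(n_train)
test <- sim(n_test)

# Logistic discriminant fitted on the training set.
fit <- glm(cls ~ x, family = binomial, data = train)
pred <- as.integer(predict(fit, newdata = test, type = "response") > 0.5)

acc_actual <- mean(pred == test$cls)            # real relationship present
acc_permuted <- mean(pred == sample(test$cls))  # relationship destroyed
c(actual = acc_actual, permuted = acc_permuted)
```

The permuted accuracy hovers around chance level whatever the classifier, which is why it says little about performance when a real relationship exists.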


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] R integration with Microsoft Powerpoint

2005-07-01 Thread John Sorkin

Please allow me an unusual question.
 
Is there any way that R can be closely integrated with a Microsoft
Powerpoint presentation? I would like to embed R calculations in
Powerpoint so that I will start Powerpoint, be prompted to enter some
parameters, and an R function will run and return values and graphs.
 
Thanks,
John 
 
R 2.1.1
windows 2k
 
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
 
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
 
410-605-7119 
- NOTE NEW EMAIL ADDRESS:
[EMAIL PROTECTED]


[R] [R-pkgs] New CRAN package relax: R Editor for Literate Analysis and lateX

2005-07-01 Thread Peter Wolf

Now package relax is on CRAN.

The name relax is short for
 
R Editor for Literate Analysis and lateX

The main element of package relax is the function relax(), which starts an
all-in-one editor for data analysis and easy creation of LaTeX-based
documents with R.

Calling relax() creates a tcl/tk widget with a report field. The report
field enables you to enter R expressions as well as pieces of text to
document your ideas. Computations and plots can be included quickly. After
finishing your work, the sequence of text chunks, code chunks and integrated
graphics and/or R output will constitute the basis of your work. To achieve
a higher quality, relax integrates LaTeX compilation for professional
formatting and pretty printing.

Dependencies:

* R (>= 2.1.0), tcltk
* relax runs on windows systems, LaTeX / ghostscript has to be installed
* on Linux systems you have to install the img-package for tcltk

For further info see: 
http://www.wiwi.uni-bielefeld.de/~wolf/software/relax/relax.html

maintainer:

Hans Peter Wolf
Department of Economics
University of Bielefeld
[EMAIL PROTECTED]



R Editor for Literate Analysis and lateX

--- the all-in-one editor for data analysis
and easy creation of LaTeX based documents with R


[R] barplot legend

2005-07-01 Thread Navarre Sabine

Hi,
 
Is it possible to put the legend out of a barplot?
 
Thanks
 
Sabine



Re: [R] loop over large dataset

2005-07-01 Thread Liaw, Andy

My suggestion is that you try to vectorize the computation as much as you
can.

From what you've shown, `new' and `ped' need to have the same number of
rows, right?

Your `off' function seems to be randomly choosing between columns 1 and 2
from its two input matrices (one row each?).  You may want to do the
sampling all at once instead of looping over the rows.  E.g.,

> (m <- matrix(1:10, ncol=2))
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
> (colSample <- sample(1:2, nrow(m), replace=TRUE))
[1] 1 1 2 1 1
> (x <- m[cbind(1:nrow(m), colSample)])
[1] 1 2 8 4 5

So you might want to do something like (obviously untested):

todo <- ped[,3] * ped[,5] != 0  ## indicator of which rows to work on
n.todo <- sum(todo)  ## how many are there?
sire <- new[ped[todo, 3], ]
dam <- new[ped[todo, 5], ]
## matrix indexing with cbind(), as above, picks one column per row
s.gam <- sire[cbind(1:nrow(sire), sample(1:2, nrow(sire), replace=TRUE))]
d.gam <- dam[cbind(1:nrow(dam), sample(1:2, nrow(dam), replace=TRUE))]
new[todo, 1:2] <- cbind(s.gam, d.gam)

Andy
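
The vectorised assignment can be tried on a toy pedigree. The following is a sketch only: the pedigree, the haplotype codes, and the assumption that `new` is a matrix are all made up for illustration. It also assumes that every parent's haplotypes are already filled in, so for a pedigree with several generations the step would presumably have to be applied one generation at a time.

```r
set.seed(42)
# Hypothetical toy pedigree: column 3 holds the sire's row, column 5 the dam's
# row, and 0 marks a founder with no recorded parents.
ped <- matrix(0, nrow = 6, ncol = 6)
ped[4:6, 3] <- c(1, 1, 2)
ped[4:6, 5] <- c(2, 3, 3)

# Haplotype matrix: founders in rows 1-3 carry known codes, offspring start at 0.
new <- matrix(c(11, 21, 31, 0, 0, 0,
                12, 22, 32, 0, 0, 0), ncol = 2)

todo <- ped[, 3] != 0 & ped[, 5] != 0        # rows with both parents known
n <- sum(todo)
sire <- new[ped[todo, 3], , drop = FALSE]
dam <- new[ped[todo, 5], , drop = FALSE]

# One random column (1 or 2) per row, selected by matrix indexing.
s.gam <- sire[cbind(seq_len(n), sample(1:2, n, replace = TRUE))]
d.gam <- dam[cbind(seq_len(n), sample(1:2, n, replace = TRUE))]
new[todo, ] <- cbind(s.gam, d.gam)
new
```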



[R] It is time to say thank you.

2005-07-01 Thread John Sorkin

I would like to express my thanks to the many people who got together
and developed the R project. The idea, and work, of organizing and, for
no compensation, supporting an open software project must have been (and
still be) daunting.  It is clear that the availability of a free,
high-quality environment for programming and statistical
analysis has allowed people around the world to perform analyses that
they previously could not do. The continued time and effort that the R
community gives to supporting R is greatly appreciated.
 
Many thanks to the organizers, developers, and supporters of the R
project!
 
John
 
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
 
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
 
410-605-7119 
- NOTE NEW EMAIL ADDRESS:
[EMAIL PROTECTED]


Re: [R] Simple indexing conundrum

2005-07-01 Thread Liaw, Andy

Is this close to what you want?

> air.sub <- do.call(rbind, lapply(split(airquality, airquality$Month),
+     function(d) d[which.max(d$Solar.R), ]))
> air.sub
  Ozone Solar.R Wind Temp Month Day
5    14     334 11.5   64     5  16
6    NA     332 13.8   80     6  14
7    40     314 10.9   83     7   6
8    28     273 11.5   82     8  13
9    24     259  9.7   73     9  10

Andy


Re: [R] barplot legend

2005-07-01 Thread Marc Schwartz

On Fri, 2005-07-01 at 14:04 +0200, Navarre Sabine wrote:
 Hi,
  
 Is it possible to put the legend out of a barplot?
  
 Thanks
  
 Sabine


I presume that you mean outside the plot region?

If so, you can use something like the following:

# Adjust the plot margins to make room for the 
# legend on the right side. See ?par
par(mar = c(5, 4, 4, 10) + 0.1)

barplot(1:10)
box()

# Set xpd to allow legend placement outside
# plot region. See ?par
par(xpd = TRUE)

# Left click on the right side of the window where you want
# the legend. See ?locator
l <- locator(1)

# Now put the legend where you clicked
# See ?legend
legend(l$x, l$y, legend = "Legend Here")

HTH,

Marc Schwartz
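
For scripts where clicking with locator() is not practical, a non-interactive variant is sketched below. It assumes a keyword position with a negative `inset`, which shifts the legend to the right by that fraction of the plot width; the value -0.3 is a guess that may need adjusting until the legend clears the bars. The null pdf device is used here only so the sketch runs without opening a window.

```r
pdf(NULL)  # null device: nothing is written to disk

# Widen the right margin and allow drawing outside the plot region.
op <- par(mar = c(5, 4, 4, 10) + 0.1, xpd = TRUE)
barplot(1:10, col = "grey")
box()
usr <- par("usr")  # plot-region limits in user coordinates

# A negative x inset moves the legend right, past the edge of the plot region.
lg <- legend("topright", inset = c(-0.3, 0),
             legend = "Legend Here", fill = "grey")

par(op)
invisible(dev.off())
```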


Re: [R] Simple indexing conundrum

2005-07-01 Thread Jim Brennan

Here is a different approach, which I send only because the result is
slightly different: two rows are returned for Month 9 and the original row
numbers are retained.

> max2 <- function(x) max(x, na.rm = TRUE)
> MonthMax <- with(airquality, ave(Solar.R, Month, FUN = max2))
> new <- subset(airquality, Solar.R == MonthMax)
> new
    Ozone Solar.R Wind Temp Month Day
16     14     334 11.5   64     5  16
45     NA     332 13.8   80     6  14
67     40     314 10.9   83     7   6
105    28     273 11.5   82     8  13
133    24     259  9.7   73     9  10
135    21     259 15.5   76     9  12
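
The difference between the two approaches comes down to how ties are handled: which.max() keeps only the first row attaining the maximum, whereas comparing against the per-month maximum keeps every tied row. A self-contained side-by-side sketch, using the built-in airquality data (the object names `first_max`, `month_max` and `all_max` are chosen here for illustration):

```r
# First row per month that attains the maximum Solar.R (ties dropped).
first_max <- do.call(rbind, lapply(split(airquality, airquality$Month),
                                   function(d) d[which.max(d$Solar.R), ]))

# Per-month maximum repeated along the rows, then keep every row that matches.
month_max <- ave(airquality$Solar.R, airquality$Month,
                 FUN = function(x) max(x, na.rm = TRUE))
all_max <- subset(airquality, Solar.R == month_max)

nrow(first_max)
nrow(all_max)
```

Rows with a missing Solar.R drop out of the second result because subset() treats an NA condition as FALSE.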


[R] 10^k axis labels {was .. (log scale on y-axis)}

2005-07-01 Thread Martin Maechler

 Gabor == Gabor Grothendieck [EMAIL PROTECTED]
 on Thu, 30 Jun 2005 07:28:30 -0400 writes:

Gabor On 6/29/05, Jing Shen [EMAIL PROTECTED] wrote:

 I am planning to plot my data on log scale (y-axis). There is a
 parameter in plot function, which is
 plot( ..., log="y", ...)
 While, the problem is that it is with base of e. Is there a way to let
 me change it to 10 instead of e?
 

Gabor Is your question how to get the axis labels to be powers of 10?
Gabor In that case,

Gabor plot(1:100, log = "y", yaxt = "n")  # do not show y axis 
Gabor axis(2, c(1,10,100))  # draw y axis with required labels

and if you're there, you might be interested in the following
which provides a somewhat automated way to show 
a * 10 ^ k tick-labels instead of the scientific a e k ones.
{ For some time, I had wanted that something like this could
  become an easy option for builtin axis(*), but then I also know
  that we should rather strive to build future-proof tools, which
  hence should be applicable to 'grid' as well as to old-style
  'graphics'  and all this got me stuck in the process ...
}

Martin Maechler, ETH Zurich

---

axTexpr <- function(side, at = axTicks(side, axp=axp, usr=usr, log=log),
                    axp = NULL, usr = NULL, log = NULL)
{
    ## Purpose: Do a 10^k labeling instead of a ek
    ##          this auxiliary should return 'at' and 'label' (expression)
    ## --
    ## Arguments: as for axTicks()
    ## --
    ## Author: Martin Maechler, Date:  7 May 2004, 18:01
    eT <- floor(log10(abs(at)))# at == 0 case is dealt with below
    mT <- at / 10^eT
    ss <- lapply(seq(along = at),
                 function(i) if(at[i] == 0) quote(0) else
                 substitute(A %*% 10^E, list(A=mT[i], E=eT[i])))
    do.call(expression, ss)
}

par(mar=.1+c(5,5,4,1))##  For the horizontal y-axis labels, need more space
plot(x,y, axes= FALSE, frame=TRUE)  # x, y: the data to be plotted
aX <- axTicks(1); axis(1, at=aX, labels= axTexpr(1, aX))
if(FALSE) # rather the next one
{ aY <- axTicks(2); axis(2, at=aY, labels= axTexpr(2, aY))}
## or rather (horizontal labels on y-axis):
aY <- axTicks(2); axis(2, at=aY, labels= axTexpr(2, aY), las=2)
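
For the simpler case of a log-scaled axis whose ticks sit exactly at powers of ten, a shorter variant is sketched below. It is not part of the function above; the data and the exponent range 0 to 4 are assumptions chosen for illustration, and the labels are built by parsing strings of the form "10^k" into plotmath expressions.

```r
pdf(NULL)  # null device so the sketch runs without opening a window

y <- 10^seq(0, 4, length.out = 50)          # made-up data spanning 1 to 10^4
plot(seq_along(y), y, log = "y", yaxt = "n")

k <- 0:4                                     # exponents to label
labs <- parse(text = paste0("10^", k))       # plotmath expressions 10^0 ... 10^4
axis(2, at = 10^k, labels = labs, las = 2)   # horizontal labels on the y axis

invisible(dev.off())
```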


Re: [R] FW: plot legend outside the grid

2005-07-01 Thread Uwe Ligges

In principle you are there: just after opening the device, set par(mar) 
appropriately (with a large margin to place the legend in) before you start 
plotting.
In the legend() call, specify coordinates that lie in the margin you have 
already expanded...


Uwe Ligges
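
A sketch of that recipe applied to a stripchart, with made-up data standing in for the marbles-buried measurements. The group sizes, the margin width, and the 0.1 offset are all assumptions; the null pdf device is used only so the code runs without a window, where the original used png().

```r
pdf(NULL)
set.seed(1)
# Hypothetical stand-in data: marbles buried, by genotype.
d <- data.frame(
  marbles_buried = c(rnorm(10, 64, 2), rnorm(8, 60, 2), rnorm(12, 66, 2)),
  genotype = factor(rep(c("wt", "het", "hom"), times = c(10, 8, 12)),
                    levels = c("wt", "het", "hom"))
)
cols <- c("blue", "red", "green")
pchs <- c(1, 4, 2)

op <- par(mar = c(5, 4, 4, 9) + 0.1)  # large right margin, set before plotting
stripchart(marbles_buried ~ genotype, data = d, method = "jitter",
           vertical = TRUE, col = cols, pch = pchs)
usr <- par("usr")                     # plot-region limits in user coordinates

counts <- table(d$genotype)
labs <- paste0(names(counts), " (n=", counts, ")")

par(xpd = TRUE)                       # allow drawing in the margin
# The x coordinate lies just beyond the right edge of the plot region.
lg <- legend(x = usr[2] + 0.1, y = usr[4], legend = labs, col = cols, pch = pchs)

par(op)
invisible(dev.off())
```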



Ghosh, Sandeep wrote:

 
 -Original Message-
 From: Ghosh, Sandeep 
 Sent: Thursday, June 30, 2005 5:43 PM
 To: 'Berton Gunter'
 Subject: plot legend outside the grid 
 
 
 Thanks for the pointers... I managed to get everything to look and feel the 
 way I want except for the legend to plot outside the grid... Thanks for the 
 note on the par, but I'm not able to get it to plot outside the plot grid..
 
 dataFrame <- as.data.frame(t(structure(c(
 64,'wt',
 62,'wt',
 66,'wt',

[SNIP]


 63,'hom',
 64,'hom',
 67,'hom'), .Dim=c(2,98))));
 
 colnames(dataFrame) <- c('marbles_buried', 'genotype');
 
 png('mb.png', width=400, height=400, pointsize=8);
 
 dataFrame[c("marbles_buried")] <- lapply(dataFrame[c("marbles_buried")], 
 function(x) as.numeric(levels(x)[x]));
 
 par(xpd=FALSE)
 
 with (dataFrame, stripchart(marbles_buried ~ genotype, method="jitter", 
 vertical=TRUE,  col = c('blue', 'red', 'green'), xlab='Genotype', ylab = 
 "Marbles Buried", main='MBA WTs Vs HOMs', pch=c(1,4,2), jitter=1/3.5, cex=1))
 
 meds <- as.vector(with(dataFrame, by(marbles_buried, genotype, mean)))
 segments((1:3)-0.25, meds, (1:3)+0.25, meds, col = c('blue', 'red', 'green'));
 
 dataWt <- subset(dataFrame, genotype=='wt', 
 select=c(marbles_buried,genotype));
 dataHet <- subset(dataFrame, genotype=='het', 
 select=c(marbles_buried,genotype));
 dataHom <- subset(dataFrame, genotype=='hom', 
 select=c(marbles_buried,genotype));
 
 wtCount <- length(dataWt$marbles_buried);
 hetCount <- length(dataHet$marbles_buried);
 homCount <- length(dataHom$marbles_buried);
 wtLegend <- paste ("wt", "(n=", wtCount, ")");
 hetLegend <- paste ("het", "(n=", hetCount, ")");
 homLegend <- paste ("hom", "(n=", homCount, ")");
 
 par(xpd=TRUE)
 legend(1, max(as.vector(dataFrame$marbles_buried)), c(wtLegend, hetLegend, 
 homLegend), col=c('blue', 'red', 'green'), pch=c(1,4,2));
 
 -Thanks
 Sandeep.
 
 -Original Message-
 From: Berton Gunter [mailto:[EMAIL PROTECTED]
 Sent: Thursday, June 30, 2005 2:55 PM
 To: Ghosh, Sandeep
 Subject: RE: [R] Help with stripplot
 
 
 Of course!
 
 stripchart() is a base graphics function and therefore has available to it
 the base graphics functionality, like (the base graphics function, **not**
 the lattice argument) legend(). See ?legend in the graphics package. Note
 the use of locator() for positioning the legend.
 
 Note: By default the legend will be clipped to the plot region. If you wish
 to have a legend outside the plot region set the xpd parameter of par to
 TRUE or NA prior to plotting.
 
 -- Bert 
 
 
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Ghosh, Sandeep
Sent: Thursday, June 30, 2005 12:22 PM
To: Deepayan Sarkar; r-help@stat.math.ethz.ch
Subject: Re: [R] Help with stripplot

Another question: in stripchart, is there a way to draw a 
legend? I need a legend that gives the mice count for each 
genotype wt/het/hom, something like the xyplot support 
for key/auto.key.

-Sandeep
 
 

Re: [R] Simple indexing conundrum

2005-07-01 Thread Martin Henry H. Stevens

Thanks guys! Both solutions do what I need. Thanks.
Hank
On Jul 1, 2005, at 8:45 AM, Jim Brennan wrote:

 Here is a different approach, which I send only because the result is
 slightly different: two rows are returned for Month 9, and the original
 row numbers are retained.

 max2 <- function(x) max(x, na.rm = TRUE)
 MonthMax <- with(airquality, ave(Solar.R, Month, FUN = max2))
 new <- subset(airquality, Solar.R == MonthMax)
 new
     Ozone Solar.R Wind Temp Month Day
 16     14     334 11.5   64     5  16
 45     NA     332 13.8   80     6  14
 67     40     314 10.9   83     7   6
 105    28     273 11.5   82     8  13
 133    24     259  9.7   73     9  10
 135    21     259 15.5   76     9  12

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: July 1, 2005 8:31 AM
 To: 'Martin Henry H. Stevens'; R-Help
 Subject: Re: [R] Simple indexing conundrum

 Is this close to what you want?

 air.sub <- do.call(rbind, lapply(split(airquality, airquality$Month),
                                  function(d) d[which.max(d$Solar.R), ]))
 air.sub
   Ozone Solar.R Wind Temp Month Day
 5    14     334 11.5   64     5  16
 6    NA     332 13.8   80     6  14
 7    40     314 10.9   83     7   6
 8    28     273 11.5   82     8  13
 9    24     259  9.7   73     9  10

 Andy
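
The two replies in this thread differ only in how ties are handled. A minimal sketch contrasting them on the built-in airquality data (the object names `first_max`, `month_max` and `all_max` are introduced here for illustration):

```r
# which.max() keeps only the first row attaining the monthly maximum
by_month <- split(airquality, airquality$Month)
first_max <- do.call(rbind,
                     lapply(by_month, function(d) d[which.max(d$Solar.R), ]))

# ave() broadcasts each month's maximum back to every row, so comparing
# against it keeps all tied rows; subset() drops rows where Solar.R is NA
month_max <- ave(airquality$Solar.R, airquality$Month,
                 FUN = function(v) max(v, na.rm = TRUE))
all_max <- subset(airquality, Solar.R == month_max)

nrow(first_max)  # one row per month
nrow(all_max)    # one extra row because of the tie in Month 9
```

Which behaviour is wanted depends on whether tied rows count as duplicates to discard or as data to keep.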

 From: Martin Henry H. Stevens

 My apologies in advance for my thickness but I can't seem to
 solve the
 following, seemingly simple, data manipulation problem:

 I have a data frame that contains multiple factors and multiple
 continuous response variables, but duplicates of some factor
 combinations. The duplicates contain bad data, so I would like to
 eliminate the duplicates. I would like to retain the entire rows
 identified by the maximum value of one particular continuous response
 variable.

 For instance,

 data(airquality)

 str(airquality)
 `data.frame':   153 obs. of  6 variables:
   $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
   $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
   $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
   $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
   $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
   $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...

 I would like to subset airquality, retaining only the rows,
 containing
 the maximum Solar.R for each month.

 Any solution would be greatly appreciated.

 Regards,
 Hank



 Dr. Martin Henry H. Stevens, Assistant Professor
 338 Pearson Hall
 Botany Department
 Miami University
 Oxford, OH 45056

 Office: (513) 529-4206
 Lab: (513) 529-4262
 FAX: (513) 529-4243
 http://www.cas.muohio.edu/botany/bot/henry.html
 http://www.muohio.edu/ecology/
 http://www.muohio.edu/botany/
 E Pluribus Unum







Dr. Martin Henry H. Stevens, Assistant Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/botany/bot/henry.html
http://www.muohio.edu/ecology/
http://www.muohio.edu/botany/
E Pluribus Unum


Re: [R] 10^k axis labels {was .. (log scale on y-axis)}

2005-07-01 Thread Gabor Grothendieck

On 7/1/05, Martin Maechler [EMAIL PROTECTED] wrote:
  Gabor == Gabor Grothendieck [EMAIL PROTECTED]
  on Thu, 30 Jun 2005 07:28:30 -0400 writes:
 
Gabor On 6/29/05, Jing Shen [EMAIL PROTECTED] wrote:
 
 I am planning to plot my data on log scale (y-axis). There is a
 parameter in plot function, which is
 plot( ..., log="y", ...)
 While, the problem is that it is with base of e. Is there a way to let
 me change it to 10 instead of e?

 
Gabor Is your question how to get the axis labels to be powers of 10?
Gabor In that case,
 
Gabor plot(1:100, log = "y", yaxt = "n")  # do not show y axis
Gabor axis(2, c(1,10,100))  # draw y axis with required labels
 
 and if you're there, you might be interested in the following
 which provides a somewhat automated way to show
 a * 10 ^ k tick-labels instead of the scientific a e k ones.
 { For some time, I had wanted that something like this could
  become an easy option for builtin axis(*), but then I also know
  that we should rather strive to build future-proof tools, which
  hence should be applicable to 'grid' as well as to old-style
  'graphics'  and all this got me stuck in the process ...
 }
 
 Martin Maechler, ETH Zurich
 
 ---
 
 axTexpr <- function(side, at = axTicks(side, axp=axp, usr=usr, log=log),
                     axp = NULL, usr = NULL, log = NULL)
 {
     ## Purpose: Do a 10^k labeling instead of a ek
     ##          this auxiliary should return 'at' and 'label' (expression)
     ## --
     ## Arguments: as for axTicks()
     ## --
     ## Author: Martin Maechler, Date:  7 May 2004, 18:01
     eT <- floor(log10(abs(at)))# at == 0 case is dealt with below
     mT <- at / 10^eT
     ss <- lapply(seq(along = at),
                  function(i) if(at[i] == 0) quote(0) else
                  substitute(A %*% 10^E, list(A=mT[i], E=eT[i])))
     do.call(expression, ss)
 }
 
 par(mar=.1+c(5,5,4,1))##  For the horizontal y-axis labels, need more space
 plot(x,y, axes= FALSE, frame=TRUE)
 aX <- axTicks(1); axis(1, at=aX, label= axTexpr(1, aX))
 if(FALSE) # rather the next one
 { aY <- axTicks(2); axis(2, at=aY, label= axTexpr(2, aY))}
 ## or rather (horizontal labels on y-axis):
 aY <- axTicks(2); axis(2, at=aY, label= axTexpr(2, aY), las=2)

This may not be as good as what you have (although it's arguably
prettier in the specific example below) and may suffice in many,
though possibly not all, cases -- I mention it since it's very simple
and, in fact, requires no auxiliary routines.  It uses your idea of 
employing axTicks.  The key trick is to use axTicks twice in axis:

x <- 10 ^ seq(-2,10) # test data

plot(x, log = "y", yaxt = "n")
axis(2, axTicks(2), axTicks(2))
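
If the labels should read as powers of ten rather than plain numbers, the same tick positions can be turned into plotmath expressions. This is only a sketch: it assumes that every tick returned by axTicks() falls on an exact power of ten, as it does for this test data, and the names `at`, `k` and `labs` are introduced here for illustration.

```r
x <- 10 ^ seq(-2, 10)                 # test data spanning 12 decades

plot(x, log = "y", yaxt = "n")
at <- axTicks(2)                      # tick positions chosen by R
k  <- round(log10(at))                # exponents; assumes at == 10^k

# Build one plotmath expression 10^k per tick
labs <- do.call(expression, lapply(k, function(e) bquote(10^.(e))))
axis(2, at = at, labels = labs, las = 2)
```

For ticks that are not exact powers of ten (for example 2 or 5 times a power of ten on a short axis), rounding the exponent would mislabel them, and a mantissa-and-exponent label is the safer choice.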


Re: [R] Reconstructing LD function

2005-07-01 Thread usenet

On Mon, 2005-06-27 at 13:18, Prof Brian Ripley wrote:
 On Mon, 27 Jun 2005 [EMAIL PROTECTED] wrote:

  in an LDA analysis with n groups n-1 LD functions result. Implicitly this
  defines an LD function for the last group. Does there exist code already
  to explicitly construct this LD function?

Thank you for the quick reply.

 What `LDA analysis' are you discussing here?  (LDA is usually
 `linear discriminant analysis', so what did you mean and what R function
 are you now referring to?)

 R has lda in package MASS, and that works with n LD functions.  To reduce
 it to n-1, subtract the last one from the others, in which case LD_n == 0.
Indeed I have been using the MASS::lda package.

 Anything you do in LD analysis only depends on differences in LD
 functions, and there really are n of them.  With two groups one is
 conventionally taken to be zero (the first, usually, not the last).
How is the classification decision reached from the LD functions? Are
those what is known as linear Fisherian discriminant functions? If so,
I'm not positive about why one of these functions can be set to 0.

Thank you in advance for the clarification.

Best wishes,

Stefan


Re: [R] Reconstructing LD function

2005-07-01 Thread Prof Brian Ripley

On Fri, 1 Jul 2005 [EMAIL PROTECTED] wrote:

 On Mon, 2005-06-27 at 13:18, Prof Brian Ripley wrote:
 On Mon, 27 Jun 2005 [EMAIL PROTECTED] wrote:

 in an LDA analysis with n groups n-1 LD functions result. Implicitly this
 defines an LD function for the last group. Does there exist code already
 to explicitly construct this LD function?

 Thank you for the quick reply.

 What `LDA analysis' are you discussing here?  (LDA is usually
 `linear discriminant analysis', so what did you mean and what R function
 are you now referring to?)

 R has lda in package MASS, and that works with n LD functions.  To reduce
 it to n-1, subtract the last one from the others, in which case LD_n == 0.
 Indeed I have been using the MASS::lda package.

 Anything you do in LD analysis only depends on differences in LD
 functions, and there really are n of them.  With two groups one is
 conventionally taken to be zero (the first, usually, not the last).

 How is the classification decision reached from the LD functions? Are
 those what is known as linear Fisherian discriminant functions? If so,
 I'm not positive about why one of these functions can be set to 0.

What I did say:

`Anything you do in LD analysis only depends on differences in LD 
functions'

So subtracting any one function from the others does not change the 
differences.

LD is not about classification, and Fisher did not do classification, nor 
did he use more than 2 classes.  I suspect your difficulty is going to be 
clearing your preconceptions. lda() is support software for a book which 
does explain the relationship between Fisher's discrimination and 
classification: please consult it for the background.
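
A small numeric illustration of the point about differences. The values in `ld` are invented for the example and are not output from lda(); they stand for n = 3 LD function values at one observation.

```r
# Hypothetical LD function values for three groups at one observation
ld <- c(g1 = 2.4, g2 = 0.9, g3 = -1.3)

# Subtract the last function from all of them, so that LD_3 becomes 0
ld_shifted <- ld - ld[["g3"]]

# All pairwise differences, before and after the shift
diffs         <- outer(ld, ld, "-")
diffs_shifted <- outer(ld_shifted, ld_shifted, "-")

all.equal(diffs, diffs_shifted)   # the differences are unchanged
```

Because only the differences enter into anything done afterwards, the n-1 remaining functions carry the same information as the original n.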


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] Nolinear mixed-effects models (nlme)

2005-07-01 Thread Douglas Bates

On 6/30/05, Alex Bach [EMAIL PROTECTED] wrote:
 Hello,
 
 I am trying to fit a nonlinear model of the form of:
 
 A*x^b*exp(-c*x)
 
 This represents a lactation curve. I have a bunch of cows, so I want
 COW to be a random effect.

You need to decide which of the model parameters (i.e. A, B and C)
should have a random effect grouped by COW and to specify this in your
call to nlme.
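
As a sketch of what such a call might look like, the example below simulates data and puts a random effect on A only. Everything about the data is an assumption made for illustration: the data frame `sim`, the number of cows, the noise levels, and the choice of A as the parameter that varies by cow. Which parameters should vary between cows is a modelling decision for the real data.

```r
library(nlme)

# Simulated lactation records: 12 hypothetical cows, one curve each
set.seed(42)
dim_days <- seq(5, 150, by = 5)
cows     <- paste0("cow", 1:12)
A_cow    <- 20 + rnorm(length(cows), sd = 2)   # cow-specific scale

sim <- do.call(rbind, lapply(seq_along(cows), function(i) {
  data.frame(cow   = cows[i],
             DIM   = dim_days,
             yield = A_cow[i] * dim_days^0.3 * exp(-0.03 * dim_days) +
                     rnorm(length(dim_days), sd = 0.5))
}))
sim$cow <- factor(sim$cow)

# Fixed effects for A, B and C; a random effect on A grouped by cow
fm <- nlme(yield ~ A * DIM^B * exp(-C * DIM),
           data   = sim,
           fixed  = A + B + C ~ 1,
           random = A ~ 1 | cow,
           start  = c(A = 20, B = 0.3, C = 0.03))

fixef(fm)      # estimated population-level A, B, C
VarCorr(fm)    # between-cow and residual standard deviations
```

With a groupedData object the grouping can be taken from the object's formula, but stating it in `random` makes the structure explicit. Adding B or C to the random part (for example `random = A + B ~ 1 | cow`) is possible, though convergence usually gets harder as more parameters vary.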

 
 I have been trying the following code with very little success:
 
   fm1 <- nlme(yield ~ A*(DIM^B)*(exp(-C*DIM)),
 + data = group,
 + fixed = A + B + C ~ 1,
 + start = c(A = 20, B = 0.3, C = 0.03))
 
 Does anyone know how to add the random effect of the cow? I have used
 the command groupedData to have Cow as subject (i.e., yield~DIM |
 cow). Is this a valid and sufficient approach? I have the feeling it
 is not sufficient.
 
 Also, does anyone know whether the formulation of the fixed effects
 is correct?.
 
 Thank you very much,
 
 Alex
 
 
 



[R] plot svm

2005-07-01 Thread bgmail

Hello
I'm working with DNA microarrays and want to classify them with SVM.  I 
want to plot the results but have found it impossible. I found other 
tutorials and examples (with the iris and cats data) where you can plot the 
results with plot.svm, but you need to write a formula and I don't know 
how to do this with the golubEsets data, for example.

plot ( svm1,  golubTrain,  formula)

For example, Iris Data:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
7          4.6         3.4          1.4         0.3  setosa
8          5.0         3.4          1.5         0.2  setosa

m2 <- svm(Species~., data = iris)
plot(m2, iris, Petal.Width ~ Petal.Length, slice = list(Sepal.Width = 3, 
Sepal.Length = 4))

I should be grateful if you would send me information about how to plot 
the golubEsets data (for example the formula, because I have tested 
several options but none of them works). My data are very similar 
(expression values with several conditions), so I could plot my results 
if I knew how to plot the golub data.

Thanks a lot

Beatriz


Re: [R] R integration with Microsoft Powerpoint

2005-07-01 Thread Prof Brian Ripley

On Fri, 1 Jul 2005, John Sorkin wrote:

 Please allow me an unusual question.

 Is there any way that R can be closely integrated with a Microsoft
 Powerpoint presentation? I would like to embed R calculations in
 Powerpoint so that I will start Powerpoint, be prompted to enter some
 parameters, and an R function will run and return values and graphs.

R can be driven by COM, so if Powerpoint supports COM (possibly via VBA)
this would be possible.  It is likely, as other MS Office applications do.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] [OT] gmail filter for R-help and R-devel lists

2005-07-01 Thread Suresh Krishna


i dont use gmail, but this method *may* run into problems if people are 
replying to a message and r-help is on the cc: line.

thunderbird has a to: or cc: option for this... is gmail's to: field a 
default for to: or cc: ?

-s.

Deepayan Sarkar wrote:
 On 6/30/05, Douglas Bates [EMAIL PROTECTED] wrote:
 
This is slightly off-topic but I would be interested in whether anyone
has succeeded in creating a filter expression for Google's gmail
system that will select messages sent through the R-help and R-devel
lists.  It seems as if it should be easy to select on '[R]' or '[Rd]'
in the subject line but I haven't been able to work out the exact
syntax that would do this and not select messages that have an 'R'
anywhere in the subject.
 
 
 I filter on the To field, which mostly works:
 
 Matches: to:(r-help@stat.math.ethz.ch)
 Do this: Skip Inbox, Apply label r-help
 
 Matches: to:([EMAIL PROTECTED])
 Do this: Skip Inbox, Apply label r-help
 
 Deepayan
 

[R] the format of the result

2005-07-01 Thread khobson





See ?sprintf

#e.g. Replace your prop line with:
prop <- sprintf("%.2f%%", prop.table(ta)*100)
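
A minimal sketch of what that line produces, on a small made-up vector (the data and the names `ta` and `pct` here are for illustration only). In the format string, `%.2f` prints two decimals and `%%` is a literal percent sign.

```r
# Hypothetical input: 2 a's, 2 b's and 4 c's
ta  <- table(c("a", "a", "b", "b", "c", "c", "c", "c"))

# Percentages as character strings with a trailing percent sign
pct <- sprintf("%.2f%%", as.vector(prop.table(ta)) * 100)
names(pct) <- names(ta)   # sprintf() returns a bare character vector

pct
```

Since the result is character rather than numeric, it cannot be combined with the counts through rbind() and print(..., digits=) without the counts being converted to character as well.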

_
mailto:[EMAIL PROTECTED]
Kenneth Ray Hobson, P.E.
Oklahoma DOT - QA  IAS Manager
200 N.E. 21st Street
Oklahoma City, OK  73105-3204
(405) 522-4985, (405) 522-0552 fax

Visit our website at:
http://www.okladot.state.ok.us/materials/materials.htm


Re: [R] [OT] gmail filter for R-help and R-devel lists

2005-07-01 Thread ecoinfo

Yes, Douglas has proved this.

On 7/1/05, Suresh Krishna [EMAIL PROTECTED] wrote: 
 
 
 i dont use gmail, but this method *may* run into problems if people are
 replying to a message and r-help is on the cc: line.
 
 thunderbird has a to: or cc: option for this... is gmail's to: field a
 default for to: or cc: ?
 
 -s.
 
 Deepayan Sarkar wrote:
  On 6/30/05, Douglas Bates [EMAIL PROTECTED] wrote:
 
 This is slightly off-topic but I would be interested in whether anyone
 has succeeded in creating a filter expression for Google's gmail
 system that will select messages sent through the R-help and R-devel
 lists. It seems as if it should be easy to select on '[R]' or '[Rd]'
 in the subject line but I haven't been able to work out the exact
 syntax that would do this and not select messages that have an 'R'
 anywhere in the subject.
 
 
  I filter on the To field, which mostly works:
 
  Matches: to:(r-help@stat.math.ethz.ch)
  Do this: Skip Inbox, Apply label r-help
 
  Matches: to:([EMAIL PROTECTED])
  Do this: Skip Inbox, Apply label r-help
 
  Deepayan
 



[R] R integration with Microsoft Powerpoint

2005-07-01 Thread khobson





Sure.  Just run R in a BAT file.  You just reference the BAT file in
PowerPoint like any other EXE application via an OLE link.   Of course you
can always use VBA code in Powerpoint to Shell() to the BAT program.

In R, type ?BATCH to see how the BAT file's content line should be coded to
run the R program.

mailto:[EMAIL PROTECTED]
Kenneth Ray Hobson, P.E.
Oklahoma DOT - QA  IAS Manager
200 N.E. 21st Street
Oklahoma City, OK  73105-3204
(405) 522-4985, (405) 522-0552 fax

Visit our website at:
http://www.okladot.state.ok.us/materials/materials.htm


Re: [R] ranking predictive features in logsitic regression

2005-07-01 Thread David Firth

On 30 Jun, 2005, at 21:20, Stephen Choularton wrote:

 Hi

 Is there some function in R that multiplies each coefficient by the
 standard deviation of the corresponding variable and produces a 
 ranking?



Possibly you meant un-signed coefficients?  In which case something like

   function(model) rank(abs(coef(model)) * apply(model.matrix(model), 2, sd))

should do what you asked about.
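
A short usage sketch of that one-liner on a logistic regression. The model (am ~ wt + hp on the built-in mtcars data) and the name `rank_features` are chosen here purely for illustration. The intercept is dropped before ranking, because its column in the model matrix is constant and so has standard deviation zero.

```r
# Rank predictors by |coefficient| * sd(column of the model matrix)
rank_features <- function(model) {
  scaled <- abs(coef(model)) * apply(model.matrix(model), 2, sd)
  rank(scaled[names(scaled) != "(Intercept)"])
}

# Hypothetical example model: transmission type from weight and horsepower
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
rank_features(fit)   # larger rank = larger standardised coefficient
```

The caveats given in the message still apply: the ranking ignores the precision of each estimate and any correlation between predictors.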

The relimp package provides approximate inference for comparisons of 
this kind.

I should say that I don't think that such a ranking will often be very 
useful, though.  Some coefficients will be determined with greater 
precision than others, and there may be correlations to worry about, or 
variables may only make sense when considered in groups (eg factor 
effects, or interactions with corresponding main effects, etc.)

David


[R] OT: How to instaill gcc in cygwin?

2005-07-01 Thread Wensui Liu

Dear Listers,

I know it is far off topic. But I do know there must be some people
here who know it very well.

Sorry for bothering others.

Thanks.

-- 
WenSui Liu, MS MA
Senior Decision Support Analyst
Division of Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center


[R] Calculate 3D Fixed Kernel Home Range

2005-07-01 Thread Jared Stabach

I have x,y data on three animals (~150 data points each).  I have calculated 
the fixed kernel home range using the 'adehabitat' library and the LSCV 
smoothing factor.  Can anyone provide me with some help on how to display 
the density estimate of the Utilization Distribution 3-dimensionally?

Thanks in advance,

Jared


Re: [R] the format of the result

2005-07-01 Thread Marc Schwartz (via MN)

On Fri, 2005-07-01 at 19:40 +0800, ronggui wrote:
 I wrote a function to get the frequency and proportion of a variable.
 
 freq <- function(x,digits=3)
 {naa <- is.na(x)
 nas <- sum(naa)
 if (any(naa))
 x <- x[!naa]
 n <- length(x)
 ta <- table(x)
 prop <- prop.table(ta)*100
 res <- rbind(ta,prop)
 rownames(res) <- c("Freq","Prop")
 cat("Missing value(s) are",nas,".\n")
 cat("Valid case(s) are",n,".\n")
 cat("Total case(s) are",(n+nas),".\n\n")
 print(res,digits=(digits+2))
 cat("\n")
 }
 
  freq(sample(letters[1:3],48,T),2)
 Missing value(s) are 0 .
 Valid case(s) are 48 .
 Total case(s) are 48 .
 
  a b c
 Freq 11.00 20.00 17.00
 Prop 22.92 41.67 35.42
 
 and i want the result to be like
  a  b c
 Freq 11.00  20.00  17.00
 Prop 22.92% 41.67% 35.42%
 
 how should i change my function to get what i want?


Here is a modification of the function that I think should work. Note
that part of the output formatting process has to take into account the
a priori unknowns involving your 'digits' argument, the lengths of the
dimnames resulting from the table and the lengths of the frequency
counts in the table. Thus, a fair amount of the code is establishing the
'width' argument, which is then used in formatC() so that the output can
be column aligned properly.

Note that by default, table() will exclude NA, so you do not need to
subset 'x' before using table().

Also, note that I change Prop to Pct.


freq <- function(x, digits = 3)
{
  n <- length(x)
  missing <- sum(is.na(x))
  ta <- table(x)
  pct <- prop.table(ta) * 100

  width <- max(nchar(unlist(dimnames(ta))) + 1,
               nchar(ta) + digits + 1,
               5 + digits)

  Vals <- paste(formatC(unlist(dimnames(ta)), format = "s",
                        width = width),
                collapse = "  ")

  Freq <- paste(formatC(ta, format = "f", digits = digits,
                        width = width),
                collapse = "  ")

  Pct <- paste(formatC(pct, format = "f", digits = digits,
                       width = width),
               "%", sep = "", collapse = " ")

  cat("Missing value(s) are", missing, ".\n")
  cat("Valid case(s) are", n - missing, ".\n")
  cat("Total case(s) are", n, ".\n\n")

  cat("    ", Vals, "\n")
  cat("Freq", Freq, "\n")
  cat("Pct ", Pct, "\n")
  cat("\n")
}


Thus:


 freq(sample(letters[1:3], 48, TRUE), 2)
Missing value(s) are 0 .
Valid case(s) are 48 .
Total case(s) are 48 .

           a        b        c
Freq   28.00     8.00    12.00
Pct    58.33%   16.67%   25.00%


 freq(sample(c(letters[1:3], NA), 1000, TRUE), 2)
Missing value(s) are 257 .
Valid case(s) are 743 .
Total case(s) are 1000 .

           a        b        c
Freq  250.00   218.00   275.00
Pct    33.65%   29.34%   37.01%


 freq(iris$Species)
Missing value(s) are 0 .
Valid case(s) are 150 .
Total case(s) are 150 .

          setosa   versicolor    virginica
Freq      50.000       50.000       50.000
Pct       33.333%      33.333%      33.333%


 freq(iris$Species, 0)
Missing value(s) are 0 .
Valid case(s) are 150 .
Total case(s) are 150 .

          setosa   versicolor    virginica
Freq          50           50           50
Pct           33%          33%          33%


HTH,

Marc Schwartz


[R] Lines for plot (Sweave)

2005-07-01 Thread Doran, Harold

Dear List:

I am generating a series of plots iteratively using Sweave. In short, a
dataframe is subsetted row by row and variable graphics are created
conditional on the data in each row. In this particular case, this code
ends up generating 17,000 individual plots. 

In some cases, all student data (this is working with student
achievement data) are available and my code below works very well in the
sense that a line connects all points. However, in some cases there are
missing data and I need to modify my code so that lines are connected
through all points even when data are missing.

Here is a snip of relevant code. In the actual program, the data in
stu.vector and avg.vector are obtained from the dataframe as the
programs loops through each row. 

stu.vector <- c(2500, 2510,   NA , 2600)
avg.vector <- c(2635, 2589, 2628, 2685)
x <- c(0,1,2,3)
graph.min <- min(stu.vector,avg.vector ,na.rm=TRUE)-150
graph.max <- max(stu.vector,avg.vector ,na.rm=TRUE)+150
plot(x, stu.vector, ylim=c(graph.min,graph.max),  xlab=" ",
ylab="Scaled Score", xaxt='n', pch=2, col='blue', main="Math Growth Rate")
points(x, avg.vector, pch=1, col='red')
lines(x, stu.vector, lty=1, col='blue')
lines(x, avg.vector, lty=3, col='red')

If the NA did not exist in the object stu.vector then all points would
be connected with lines. However, in some cases data are missing and I
need to connect the data in stu.vector with lines. So in this case, the
line would connect points 1 and 2, and then 2 and 4 even though point 3
is missing. 

Can anyone suggest how I might do this? 

Thanks,
Harold



Re: [R] Lines for plot (Sweave)

2005-07-01 Thread Gabor Grothendieck

On 7/1/05, Doran, Harold [EMAIL PROTECTED] wrote:
 Dear List:
 
 I am generating a series of plots iteratively using Sweave. In short, a
 dataframe is subsetted row by row and variable graphics are created
 conditional on the data in each row. In this particular case, this code
 ends up generating 17,000 individual plots.
 
 In some cases, all student data (this is working with student
 achievement data) are available and my code below works very well in the
 sense that a line connects all points. However, in some cases there are
 missing data and I need to modify my code so that lines are connected
 through all points even when data are missing.
 
 Here is a snip of relevant code. In the actual program, the data in
 stu.vector and avg.vector are obtained from the dataframe as the
 programs loops through each row.
 
 stu.vector <- c(2500, 2510,   NA , 2600)
 avg.vector <- c(2635, 2589, 2628, 2685)
 x <- c(0,1,2,3)
 graph.min <- min(stu.vector,avg.vector ,na.rm=TRUE)-150
 graph.max <- max(stu.vector,avg.vector ,na.rm=TRUE)+150
 plot(x, stu.vector, ylim=c(graph.min,graph.max),  xlab=" ",
 ylab="Scaled Score", xaxt='n', pch=2, col='blue', main="Math Growth Rate")
 points(x, avg.vector, pch=1, col='red')
 lines(x, stu.vector, lty=1, col='blue')
 lines(x, avg.vector, lty=3, col='red')
 
 If the NA did not exist in the object stu.vector then all points would
 be connected with lines. However, in some cases data are missing and I
 need to connect the data in stu.vector with lines. So in this case, the
 line would connect points 1 and 2, and then 2 and 4 even though point 3
 is missing.

Replace the first lines statement with:

lines(approx(x, stu.vector), lty=1, col='blue')
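
This works because approx() drops incomplete (x, y) pairs before interpolating linearly between the remaining points. A minimal sketch using the data from the question (the names `bridged` and `ok` are introduced here for illustration), together with an equivalent alternative that simply omits the missing point:

```r
stu.vector <- c(2500, 2510, NA, 2600)
x <- c(0, 1, 2, 3)

# approx() ignores the NA pair; xout = x asks for values at the original x
bridged <- approx(x, stu.vector, xout = x)
bridged$y   # the gap at x = 2 is filled by linear interpolation

# Alternative: draw the line through the observed points only
ok <- !is.na(stu.vector)
plot(x, stu.vector, pch = 2, col = "blue")
lines(x[ok], stu.vector[ok], lty = 1, col = "blue")
```

Both approaches draw the same straight segment from point 2 to point 4; neither plots a symbol at the missing position.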


[R] zlim for levelplot

2005-07-01 Thread Tao Shi



Re: [R] Lines for plot (Sweave)

2005-07-01 Thread Doran, Harold

Fabulous, it works great. I didn't know about approx().

Thank you 

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Friday, July 01, 2005 1:54 PM
To: Doran, Harold
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Lines for plot (Sweave)

On 7/1/05, Doran, Harold [EMAIL PROTECTED] wrote:
 Dear List:

 I am generating a series of plots iteratively using Sweave. In short, 
 a dataframe is subsetted row by row and variable graphics are created 
 conditional on the data in each row. In this particular case, this 
 code ends up generating 17,000 individual plots.

 In some cases, all student data (this is working with student 
 achievement data) are available and my code below works very well in 
 the sense that a line connects all points. However, in some cases 
 there are missing data and I need to modify my code so that lines are 
 connected through all points even when data are missing.

 Here is a snip of relevant code. In the actual program, the data in 
 stu.vector and avg.vector are obtained from the dataframe as the 
 programs loops through each row.

 stu.vector <- c(2500, 2510,   NA , 2600)
 avg.vector <- c(2635, 2589, 2628, 2685)
 x <- c(0,1,2,3)
 graph.min <- min(stu.vector,avg.vector ,na.rm=TRUE)-150
 graph.max <- max(stu.vector,avg.vector ,na.rm=TRUE)+150
 plot(x, stu.vector, ylim=c(graph.min,graph.max),  xlab=" ",
 ylab="Scaled Score", xaxt='n', pch=2, col='blue', main="Math Growth Rate")
 points(x, avg.vector, pch=1, col='red')
 lines(x, stu.vector, lty=1, col='blue')
 lines(x, avg.vector, lty=3, col='red')

 If the NA did not exist in the object stu.vector then all points would

 be connected with lines. However, in some cases data are missing and I

 need to connect the data in stu.vector with lines. So in this case, 
 the line would connect points 1 and 2, and then 2 and 4 even though 
 point 3 is missing.

Replace the first lines statement with:

lines(approx(x, stu.vector), lty=1, col='blue')


Re: [R] R integration with Microsoft Powerpoint

2005-07-01 Thread khobson





Of course there are many ways to do it.

The user input could come from R dialogs via the tcltk package or the
InputBox() dialogs from VBA in Powerpoint.

I chose the output as PDF.  The R source code, called cars.r, might go
something like:

pdf(file=paste(getwd(), "/", "cars.pdf", sep=""),
  width = 8.5, height = 11, onefile = TRUE, family = "Helvetica",
  title = "R Graphics Output", fonts = NULL, version = "1.1")
plot(cars)
lines(lowess(cars))
graphics.off()
shell(paste(getwd(), "/", "cars.pdf", sep=""), wait=FALSE) # view PDF
stop("all done")

The cars.bat file might go something like:

"C:\Program Files\R\rw2010\bin\R.exe" CMD BATCH c:\myfiles\r\cars.r
#Change the drives and paths to R.exe and the cars.r files.

The cars.bat file was played from PowerPoint by creating the object and
double-clicking it in the slideshow.  There are other ways to do it of course.
In PowerPoint, click the menu item Insert | Object | Create and browse to
and select the cars.r file.  I set the object as an icon and used the R.exe
icon.

VBA scripting to play the cars.bat file is not all that involved either.
___
mailto:[EMAIL PROTECTED]
Kenneth Ray Hobson, P.E.
Oklahoma DOT - QA  IAS Manager
200 N.E. 21st Street
Oklahoma City, OK  73105-3204
(405) 522-4985, (405) 522-0552 fax

Visit our website at:
http://www.okladot.state.ok.us/materials/materials.htm


[R] R integration with Microsoft Powerpoint

2005-07-01 Thread khobson





...snip...In PowerPoint, click the menu item Insert | Object | Create and
browse to
and select the cars.r file. ...snip...
In the previous post snippet above, replace cars.r with cars.bat.

To run the cars.bat program via VBA, I would typically insert a button.  To
do so in PowerPoint, right click the toolbar, select Control Toolbar and
then click the button icon.  Right click and drag and draw the button onto
the slide.  Double click the button object and add code something like:

Private Sub CommandButton1_Click()
    Shell ("c:\myfiles\r\cars.bat")
End Sub

When passing input to a program like R, I typically use VBA's Input() and
write the results to a TXT file.  This is then easily read into R.
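
Reading such a file back into R might look like the sketch below. The file name and the name=value layout are assumptions made for illustration; the format that the VBA side writes is not specified here, so the example creates the file itself.

```r
# Hypothetical input file, one "name=value" pair per line.
# It is written here only so that the example is self-contained.
infile <- file.path(tempdir(), "input.txt")
writeLines(c("title=Stopping distance", "span=0.5"), infile)

# Read the pairs into a named character vector
kv <- read.table(infile, sep = "=", col.names = c("name", "value"),
                 stringsAsFactors = FALSE, colClasses = "character")
params <- setNames(kv$value, kv$name)

# Use the values when drawing the plot
plot(cars, main = params[["title"]])
lines(lowess(cars, f = as.numeric(params[["span"]])))
```

Values arrive as text, so numeric inputs need an explicit as.numeric() before use.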

mailto:[EMAIL PROTECTED]
Kenneth Ray Hobson, P.E.
Oklahoma DOT - QA  IAS Manager
200 N.E. 21st Street
Oklahoma City, OK  73105-3204
(405) 522-4985, (405) 522-0552 fax

Visit our website at:
http://www.okladot.state.ok.us/materials/materials.htm


[R] lapply

2005-07-01 Thread Weiwei Shi

Hi, all:
I have a program here, but it runs slowly and I am wondering whether I can
change something to make it run faster.

Two lists, scd and c1, look like this:
> scd[1:2]
[[1]]
[1] 54  241

[[2]]
[1] 52 53
...
> c1[1:2]
[[1]]
 [1] 13  30  92  93  13  94  30  95  96  97  98  99
[13] 8   19  31  100 101 29

[[2]]
[1] 13 55

> length(scd)
[1] 2542
> length(c1)
[1] 31859

My goal:
for each element of scd, I need to know how many elements of c1 contain it
as a whole.

My code is
N <- length(scd) # num of word_comb
M <- length(c1) # num of class1
g1 <- lapply(1:N, function(i) lapply(1:M, function(j) all(scd[[i]]
%in% c1[[j]])))
a <- lapply(1:N, function(i) sum(g1[[i]] == TRUE))

My questions:
1. The calculation of g1 is very slow.
2. How can I do the following using apply?
tab <- array(as.integer(0), dim=c(2,2,N))
for (i in 1:N){
tab[2,1,i] <- a[[i]]
}
tab[2,2,] <- M - tab[2,1,]
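
One possible speed-up, sketched on toy data (the three-element lists below are invented for illustration): vapply() returns plain vectors instead of nested lists, and the counts can then be assigned to the array in one step, without a loop. This still performs N*M subset checks, so any gain comes from reduced overhead rather than from a better algorithm.

```r
# Toy stand-ins for the real scd and c1 lists
scd <- list(c(54, 241), c(52, 53), c(13, 55))
c1  <- list(c(13, 30, 92), c(13, 55), c(52, 53, 13, 55))
M   <- length(c1)

# For each element of scd, count the elements of c1 that contain all of it
a <- vapply(scd, function(s) {
  sum(vapply(c1, function(cl) all(s %in% cl), logical(1)))
}, integer(1))

# Fill the 2 x 2 x N array with vectorised assignments
tab <- array(0L, dim = c(2, 2, length(scd)))
tab[2, 1, ] <- a
tab[2, 2, ] <- M - a
```

On the toy data, a is c(0, 1, 2): the first pair occurs in no element of c1, the second in one, and the third in two.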

Thanks,
-- 
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III


Re: [R] OT: How to instaill gcc in cygwin?

2005-07-01 Thread Dirk Eddelbuettel

Wensui Liu <liuwensui at gmail.com> writes:
> I know it is far off topic. But I do know there must be some people
> here who know it very well.

Click the 'install now' button at cygwin.org, and when the selection box 
appears in the install process, also select gcc.  There is a _lot_ of stuff
available for cygwin that the default install ignores.

That said, it won't help you for R, as you cannot build R under Cygwin. See the
R Extensions and R Admin manuals for details -- you'd want MinGW, a cousin of
Cygwin, on the PC.

Hth, Dirk


Re: [R] Lines for plot (Sweave)

2005-07-01 Thread Don MacQueen

You can use:

   lines(x[!is.na(stu.vector)], stu.vector[!is.na(stu.vector)], lty=1,
col='blue')
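
As a self-contained sketch of that idea, using the example data from the original question (the variable name ok is introduced here for illustration only):

```r
stu.vector <- c(2500, 2510, NA, 2600)
x <- c(0, 1, 2, 3)

ok <- !is.na(stu.vector)   # TRUE where a score is present

plot(x, stu.vector, pch = 2, col = "blue", ylab = "Scaled Score")
# Only the non-missing pairs are passed on, so the line runs
# from point 1 to point 2 and then straight to point 4
lines(x[ok], stu.vector[ok], lty = 1, col = "blue")
```

Computing the index once keeps x and the scores in step with each other.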

-Don

At 1:43 PM -0400 7/1/05, Doran, Harold wrote:
Dear List:

I am generating a series of plots iteratively using Sweave. In short, a
dataframe is subsetted row by row and variable graphics are created
conditional on the data in each row. In this particular case, this code
ends up generating 17,000 individual plots.

In some cases, all student data (this is working with student
achievement data) are available and my code below works very well in the
sense that a line connects all points. However, in some cases there are
missing data and I need to modify my code so that lines are connected
through all points even when data are missing.

Here is a snippet of the relevant code. In the actual program, the data in
stu.vector and avg.vector are obtained from the dataframe as the
program loops through each row.

stu.vector <- c(2500, 2510,   NA , 2600)
avg.vector <- c(2635, 2589, 2628, 2685)
x <- c(0,1,2,3)
graph.min <- min(stu.vector,avg.vector ,na.rm=TRUE)-150
graph.max <- max(stu.vector,avg.vector ,na.rm=TRUE)+150
plot(x, stu.vector, ylim=c(graph.min,graph.max),  xlab=" ",
ylab="Scaled Score", xaxt='n', pch=2, col='blue', main="Math Growth Rate")
points(x, avg.vector, pch=1, col='red')
lines(x, stu.vector, lty=1, col='blue')
lines(x, avg.vector, lty=3, col='red')

If the NA did not exist in the object stu.vector then all points would
be connected with lines. However, in some cases data are missing and I
need to connect the data in stu.vector with lines. So in this case, the
line would connect points 1 and 2, and then 2 and 4 even though point 3
is missing.

Can anyone suggest how I might do this?

Thanks,
Harold




-- 
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA


[R] Setting lattice boxplot's lines to black

2005-07-01 Thread Mario Alfonso Morales Rivera

Hi R users.

I'm using the lattice library and I need a print version of my graphics.

How do I set my boxplot's lines to black?

tpl <- trellis.par.get("plot.line")
tpl$col <- "black"
trellis.par.set("plot.line", tpl)

This doesn't work: the boxplot's lines aren't black.

Thanks a lot.

This is my script.

library(lattice)

# Set the background color to white
tbg <- trellis.par.get("background")
tbg$col <- "white"
trellis.par.set("background", tbg)

# Set the strip background color to white
tsbg <- trellis.par.get("strip.background")
tsbg$col <- "white"
trellis.par.set("strip.background", tsbg)

# Set the symbol color to black
tps <- trellis.par.get("plot.symbol")
tps$col <- "black"
trellis.par.set("plot.symbol", tps)

# Set the line color to black
tpl <- trellis.par.get("plot.line")
tpl$col <- "black"
trellis.par.set("plot.line", tpl)

print(bwplot(Sepal.Length~Species ,data=iris))

# This works with xyplot but doesn't work with bwplot: the boxplot's
# lines aren't black.




-- 
Mario Alfonso Morales Rivera
Profesor Auxiliar.
Departamento de Matemáticas y Estadística.
Universidad de Córdoba.
Colombia


[R] Generating correlated data from uniform distribution

2005-07-01 Thread Menghui Chen

Dear R users,

I want to generate two random variables (X1, X2) from uniform
distribution (-0.5, 0.5) with a specified correlation coefficient r.
Does anyone know how to do it in R?

Many thanks!

Menghui


Re: [R] Setting lattice boxplot's lines to black

2005-07-01 Thread Deepayan Sarkar

On 7/1/05, Mario Alfonso Morales Rivera [EMAIL PROTECTED] wrote:
> Hi R users.
>
> I'm using the lattice library and I need a print version for my graphics.
>
> How do I set my boxplot's lines to black?

(As I said in a private reply,) you seem to want a black and white plot, so use

trellis.device(color = FALSE)

to initialize the device.
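
If only the box-and-whisker elements need to change, another route is sketched below. It assumes that bwplot() takes its colours from the box.rectangle, box.umbrella and box.dot settings rather than from plot.line, which would explain why changing plot.line had no visible effect; treat it as an illustration, not as the recommended method.

```r
library(lattice)

# Settings assumed to control the box, the whiskers, the median dot
# and the outlier symbols in bwplot()
bw <- list(
  box.rectangle = list(col = "black"),
  box.umbrella  = list(col = "black"),
  box.dot       = list(col = "black"),
  plot.symbol   = list(col = "black")
)

p <- bwplot(Sepal.Length ~ Species, data = iris, par.settings = bw)
print(p)
```

Passing the list through par.settings applies it to this plot only and leaves the device defaults untouched.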

Deepayan


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Jim Brennan

> dat <- matrix(runif(2000),2,1000)
> rho <- .77
> R <- matrix(c(1,rho,rho,1),2,2)
> ch <- chol(R)
> dat2 <- t(ch)%*%dat
> cor(dat2[1,],dat2[2,])
[1] 0.7513892
> dat <- matrix(runif(2),2,1)
> rho <- .28
> R <- matrix(c(1,rho,rho,1),2,2)
> ch <- chol(R)
> dat2 <- t(ch)%*%dat
> cor(dat2[1,],dat2[2,])
[1] 0.2681669
> dat <- matrix(runif(20),2,10)
> rho <- .28
> R <- matrix(c(1,rho,rho,1),2,2)
> ch <- chol(R)
> dat2 <- t(ch)%*%dat
> cor(dat2[1,],dat2[2,])
[1] 0.2814035

See  ?chol


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Tony Plate

Isn't this a little trickier with non-normal variables?  It sounds like 
Menghui Chen wants variables that have uniform marginal distribution, 
and a specified correlation.

When I look at histograms (or just the quantiles) of the rows of dat2 in 
your example, I see something for dat2[2,] that does not look much like 
it comes from a uniform distribution.

> dat <- matrix(runif(2000),2,1000)
> rho <- .77
> R <- matrix(c(1,rho,rho,1),2,2)
> ch <- chol(R)
> dat2 <- t(ch)%*%dat
> cor(dat2[1,],dat2[2,])
[1] 0.7513892
> hist(dat2[1,])
> hist(dat2[2,])
>
> quantile(dat2[1,])
         0%         25%         50%         75%        100%
0.000655829 0.246216035 0.507075912 0.745158441 0.16418
> quantile(dat2[2,])
       0%       25%       50%       75%      100%
0.0393046 0.4980066 0.7150426 0.9208855 1.3864704
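
The shape of the second row has a simple explanation. For a 2 x 2 correlation matrix, t(chol(R)) leaves the first row unchanged and replaces the second with rho*u1 + sqrt(1 - rho^2)*u2, a weighted sum of two uniforms, which is no longer uniform and can exceed 1. A minimal sketch (the seed and the names manual and upper are arbitrary choices made for illustration):

```r
set.seed(42)
rho  <- 0.77
dat  <- matrix(runif(2000), 2, 1000)
ch   <- chol(matrix(c(1, rho, rho, 1), 2, 2))
dat2 <- t(ch) %*% dat

# The second row is a weighted sum of the two uniform rows
manual <- rho * dat[1, ] + sqrt(1 - rho^2) * dat[2, ]
all.equal(dat2[2, ], manual)

# Its upper bound lies above 1, so it cannot be uniform on (0, 1)
upper <- rho + sqrt(1 - rho^2)
upper
max(dat2[2, ])
```

With rho = 0.77 the bound is about 1.41, which is consistent with the maximum of roughly 1.39 in the quantiles above.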
 

-- Tony Plate


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Spencer Graves

  How about tetrachoric correlations?  Generate correlated normal 
observations, then convert to uniform using pnorm:

rho <- 0.9
Cor <- array(c(1, rho, rho, 1), dim=c(2,2))

library(mvtnorm)

set.seed(1)
Y <- rmvnorm(1, sigma=Cor)

X <- pnorm(Y)-0.5
plot(X)
hist(X[,1])
hist(X[,2])
cor(X)

  Enjoy.
  spencer graves


-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Jim Brennan

Yes, you are right; I guess this works only for normal data. Free advice
sometimes comes with too little consideration :-)
Sorry about that, and thanks to Spencer for the correct way.
-Original Message-
From: Tony Plate [mailto:[EMAIL PROTECTED] 
Sent: July 1, 2005 6:01 PM
To: Jim Brennan
Cc: 'Menghui Chen'; r-help@stat.math.ethz.ch
Subject: Re: [R] Generating correlated data from uniform distribution

Isn't this a little trickier with non-normal variables?  It sounds like 
Menghui Chen wants variables that have uniform marginal distribution, 
and a specified correlation.


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Peter Dalgaard

Jim Brennan [EMAIL PROTECTED] writes:

> Yes you are right I guess this works only for normal data. Free advice
> sometimes comes with too little consideration :-)

Worth every cent...

> Sorry about that and thanks to Spencer for the correct way.

Hmm, but is it? Or rather, what is the relation between the
correlation of the normals  and that of the transformed variables? 
Looks nontrivial to me.

Incidentally, here's a way that satisfies the criteria, but in a
rather weird way:

N <- 1
rho <- .6
x <- runif(N, -.5,.5)
y <- x * sample(c(1,-1), N, replace=TRUE, prob=c((1+rho)/2,(1-rho)/2))
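
Why this meets the criteria: the random sign has mean rho and is independent of x, so cov(x, y) = rho * var(x), while y = +x or -x is still uniform on (-0.5, 0.5). A quick check, with a sample size and seed chosen here purely for illustration:

```r
set.seed(1)
N   <- 100000   # illustrative sample size
rho <- 0.6

x <- runif(N, -0.5, 0.5)
# Sign is +1 with probability (1 + rho)/2, so its expectation is rho
s <- sample(c(1, -1), N, replace = TRUE,
            prob = c((1 + rho) / 2, (1 - rho) / 2))
y <- x * s

cor(x, y)    # close to rho
range(y)     # stays inside (-0.5, 0.5)
```

The "weird" part is visible in plot(x, y): every point lies on one of the two diagonals y = x and y = -x.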

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


[R] scope argument in step function

2005-07-01 Thread Young Cho

Thanks a lot for your help in advance. I am switching from Matlab to R and I
guess I need some time to get rolling. I was wondering why this code:

> fit.0 <- lm( Response ~ 1, data = ds3)
> step(fit.0,scope=list(upper=~.,lower=~1),data=ds3)
Start:  AIC= -32.66 
 Response ~ 1 

Call:
lm(formula = Response ~ 1, data = ds3)
Coefficients:
(Intercept)  
      1.301  

does not behave the same as the following:

> cnames <- dimnames(ds3)[[2]]
> cnames <- cnames[-444]# last col is Response
> 
> fmla <- as.formula(paste(" ~ ",paste(cnames,collapse="+")))
> step(fit.0,scope=list(upper=fmla,lower=~1),data=ds3)
Start:  AIC= -32.66 
 Response ~ 1  
> fmla <- as.formula(paste(" ~ ",paste(cnames,collapse="+")))
> fit.s <- step(fit.0,scope=list(upper=fmla,lower=~1),data=ds3)

Step:  AIC= -Inf 
 Response ~ ENTP9324 + CH1W0281 
           Df Sum of Sq     RSS  AIC
<none>                        0 -Inf
- CH1W0281  3   0.00381 0.00381 -115
- ENTP9324  9         1       1  -34

The data frame ds3 is 17 by 444, and I understand it is not a smart thing to
run stepwise regression on it. What I wondered is whether, when I pass
'upper=~.', step() thinks the full model is the current one and so adds
nothing more. If this is the right answer, is there a better way than
creating the fmla argument as above?
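
A small sketch of the difference, using the built-in mtcars data as a stand-in for ds3 (the names response, preds and upper are introduced here for illustration). In a scope formula, "." is expanded against the current model, so upper = ~ . on an intercept-only fit leaves nothing to add; reformulate() is one compact way to build the explicit upper formula from column names.

```r
response <- "mpg"
preds <- setdiff(names(mtcars), response)
upper <- reformulate(preds)            # ~ cyl + disp + hp + ...

fit0 <- lm(mpg ~ 1, data = mtcars)

# "." refers to the terms already in fit0, i.e. none: nothing is added
fit_dot <- step(fit0, scope = list(upper = ~ ., lower = ~ 1), trace = 0)

# An explicit upper formula gives step() candidate terms to add
fit_fwd <- step(fit0, scope = list(upper = upper, lower = ~ 1), trace = 0)

formula(fit_dot)
formula(fit_fwd)
```

The first search returns the intercept-only model unchanged, while the second selects some of the predictors.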
 
Thanks!
 
-Young.
 



Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Jim Brennan

OK, now I am skeptical, especially when you say "in a weird way" :-)
This may be OK, but look at plot(x,y) and I am suspicious. Is it still
all right with this kind of relationship?

For large N it appears that Spencer's method returns a slightly lower
correlation for the uniforms than for the normals, so maybe there is a
problem!?!

Hope we are all learning something and Menghui gets/has what he wants. :-)


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread Spencer Graves

	  Peter is absolutely correct:  The correlation I used was for a 
hidden normal process, not for the resultant correlated uniforms.  This 
is similar to but different from "tetrachoric correlations", about 
which there is a substantial literature (including an R package 
"polycor").

	  Why do you want correlated uniforms?  What do they represent 
physically?  Does it matter if you can match exactly a particular 
correlation coefficient, or is it enough to say that they are uniformly 
distributed random variables such that their normal scores have a 
specified correlation coefficient?  There is so much known about the 
multivariate normal distribution and so little about correlated uniforms 
that it might be more useful to know the correlations of latent normals, 
for which your uniforms are what are measured.
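
If an exact match is wanted, the relation is known in closed form for this construction: when two standard normals have correlation r, the uniforms obtained through pnorm() have correlation (6/pi) * asin(r/2). Inverting gives r = 2 * sin(pi * rho_u / 6) for a target rho_u. A sketch in base R, with the names rho_u and r and the sample size chosen here for illustration:

```r
set.seed(1)
N     <- 100000                      # illustrative sample size
rho_u <- 0.6                         # target correlation of the uniforms
r     <- 2 * sin(pi * rho_u / 6)     # correlation needed for the normals

z1 <- rnorm(N)
z2 <- r * z1 + sqrt(1 - r^2) * rnorm(N)   # cor(z1, z2) is r in expectation

x1 <- pnorm(z1) - 0.5                # uniform on (-0.5, 0.5)
x2 <- pnorm(z2) - 0.5

cor(z1, z2)   # close to r (about 0.618)
cor(x1, x2)   # close to rho_u
```

The adjustment is small (0.618 against 0.6 here), which matches the observation that the uniforms come out slightly less correlated than the normals.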

  spencer graves

 

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915


Re: [R] Generating correlated data from uniform distribution

2005-07-01 Thread alejandro munoz

Dear Menghui,

You may consider looking in Luc Devroye's "Non-Uniform Random Variate
Generation". Despite its title, section XI.3.2 describes how to
generate bivariate uniforms. The book is out of print but Devroye
himself urges you to print it from his scanned PDFs(!):

http://cgm.cs.mcgill.ca/~luc/rnbookindex.html

Hope this helps,

alejandro


[R] as.Date today ?

2005-07-01 Thread Omar Lakkis

I have a Date variable that I constructed with as.Date().
How can I compare it to today (<, >, ==)?


Re: [R] as.Date today ?

2005-07-01 Thread Spencer Graves

  The following is a minor modification of examples in the help for 
as.Date:

> x <- c("1jan1960", "2jan1960", "31mar1960", "30jul2006")
> z <- as.Date(x, "%d%b%Y")
> z < Sys.Date()
[1]  TRUE  TRUE  TRUE FALSE
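
The same comparison operators work between any two Date objects. The sketch below uses a fixed reference date (an assumption made so that the result does not depend on when it is run) and ISO-format strings, which avoids locale-dependent month names:

```r
d   <- as.Date(c("1960-01-01", "2006-07-30"))
ref <- as.Date("2005-07-01")   # fixed stand-in for "today"

d < ref    # TRUE for dates before the reference
d == ref
d > ref

# In real use, compare against the current date instead
today <- Sys.Date()
d < today
```

Sys.Date() already returns an object of class "Date", so no conversion is needed before comparing.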

  How's this?
  spencer graves


-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915


[R] Is it possible to use glm() with 30 observations?

2005-07-01 Thread Kerry Bush

I have a very simple problem. When using glm to fit
binary logistic regression model, sometimes I receive
the following warning:

Warning messages:
1: fitted probabilities numerically 0 or 1 occurred
in: glm.fit(x = X, y = Y, weights = weights, start =
start, etastart = etastart,  
2: fitted probabilities numerically 0 or 1 occurred
in: glm.fit(x = X, y = Y, weights = weights, start =
start, etastart = etastart,  

What does this output tell me? Since I only have 30
observations, I assume this is a small-sample problem.
Is it possible to fit this model in R with only 30
observations? Could any expert provide suggestions to
avoid the warning?


Re: [R] Lines for plot (Sweave)

2005-07-01 Thread Gabor Grothendieck

A variation on your idea might be:

fo <- stu.vector ~ x
lines(fo, model.frame(fo), lty=1, col='blue')


On 7/1/05, Don MacQueen [EMAIL PROTECTED] wrote:
> You can use:
>
>    lines(x[!is.na(stu.vector)], stu.vector[!is.na(stu.vector)], lty=1,
> col='blue')
>
> -Don
 

Re: [R] Is it possible to use glm() with 30 observations?

2005-07-01 Thread Spencer Graves

	  The issue is not the 30 observations but whether it is possible to 
perfectly separate the two possible outcomes.  Consider the following:

tst.glm <- data.frame(x=1:3, y=c(0, 1, 0))
glm(y~x, family=binomial, data=tst.glm)

tst2.glm <- data.frame(x=1:1000,
                       y=rep(0:1, each=500))
glm(y~x, family=binomial, data=tst2.glm)

	  The algorithm fits y~x to tst.glm without complaining, but issues 
warnings for tst2.glm, where x separates the two outcomes perfectly.  
This is related to the Hauck-Donner effect, and 
RSiteSearch("Hauck-Donner") just now produced 8 hits.  For 
more information, look for "Hauck-Donner" in the index of Venables, W. 
N. and Ripley, B. D. (2002) _Modern Applied Statistics with S._ New 
York: Springer.  (If you don't already have this book, I recommend you 
give serious consideration to purchasing a copy.  It is excellent on 
many issues relating to statistical analysis and R.)
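
To see which fit triggers the warning without reading the console, the messages can be collected programmatically. The helper name fit_warnings below is hypothetical, introduced only for this sketch:

```r
# Fit a binomial glm and return any warning messages it raised
fit_warnings <- function(formula, data) {
  msgs <- character(0)
  withCallingHandlers(
    glm(formula, family = binomial, data = data),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  msgs
}

tst.glm  <- data.frame(x = 1:3, y = c(0, 1, 0))
tst2.glm <- data.frame(x = 1:1000, y = rep(0:1, each = 500))

w1 <- fit_warnings(y ~ x, tst.glm)    # outcomes overlap: no warnings expected
w2 <- fit_warnings(y ~ x, tst2.glm)   # x separates the outcomes perfectly
w1
w2
```

With separated data the fitted probabilities are pushed to 0 and 1 and the slope estimate grows without bound, so the warning points at the data layout rather than at the sample size.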

  Spencer Graves


-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915


[R] Can you help?

2005-07-01 Thread Perfect Harmony

Please take the time to read this email. What you might view as spam could be 
an animals last chance.

Animals don't stand a chance without the everyone's help. Some people don't 
care, some don't think they can make a difference. There is always something 
YOU can do to help.

We desperately need your support to continue to assist HORSES in need. Even $5 
goes a long way towards helping.

http://www.perfectharmony-ms.org

You can visit our ebay auctions at 
http://cgi6.ebay.com/ws/eBayISAPI.dll?ViewSellersOtherItemsuserid=perfectharmony-ms

Horses like Tinkerbelle would never have had a chance for a decent life without 
our help.
Tinker is a 10 year old pony mare that is an extremely hard keeper. She was 
abused and untrained and sold to the person that donated her to our facility 
because she didn't feel that she could dedicate what Tinker needed properly. 
She has been here now for a year with us. She is slowly learning to trust. She 
is now coming up to strangers when they visit us, although she is a little 
standoffish.
Tinker has been with us this long because it seems no one wants her. Because 
she isn't broke to ride, she will most likely never find a home of her very own.
From what we know of Tinker, she would like a small boy of her very own that 
is gentle and experienced. She has never kicked or bitten, she is just afraid 
for herself.
Tinker needs her teeth floated now, she needs special feed to keep weight on 
and she needs training under saddle. Without these things she will have to live 
out her life with us never knowing what it is like to be 'the' cherished pony 
of some loving child.
We cannot give her these things without help from the public.
Pictures of Tinker can be seen at http://tinker.perfectharmony-ms.org.

Perfect Harmony Animal Rescue and Sanctuary
http://www.perfectharmony-ms.org
PayPal address [EMAIL PROTECTED]

unsub: [EMAIL PROTECTED]
