Re: [R] combination which limited

2005-06-12 Thread Muhammad Subianto
Dear All,
Many thanks to Marc Schwartz and Gabor Grothendieck, who explained how
to use the expand.grid function and clearly explained how to use JGR.

 dd <- expand.grid(interface = interface, screen = screen,
                   computer = computer, available = available)
 
 There are several possibilities now:
 
 1. you could list out dd on the console and note the number of the
 rows you want to keep:
 
 idx <- c(1,5,7)
 dd2 <- dd[idx, ]
 

I like possibility no. 1, because I can explore the rows by hand:
  idx <- c(1:5,9,17,25)
  dd2 <- dd[idx,]
  dd2
   interface screen computer available
1        usb    lcd       pc       yes
2   fireware    lcd       pc       yes
3      infra    lcd       pc       yes
4  bluetooth    lcd       pc       yes
5        usb   cube       pc       yes
9        usb    lcd   server       yes
17       usb    lcd   laptop       yes
25       usb    lcd       pc        no
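
For readers who prefer not to pick row numbers by hand, the same rows can
also be selected programmatically. The following is only a sketch, assuming
that the first row of dd (usb, lcd, pc, yes) is the baseline and that the
wanted rows are those differing from it in at most one variable. The names
baseline and n_diff are illustrative and do not come from the thread.

```r
interface <- c("usb", "fireware", "infra", "bluetooth")
screen    <- c("lcd", "cube")
computer  <- c("pc", "server", "laptop")
available <- c("yes", "no")

dd <- expand.grid(interface = interface, screen = screen,
                  computer = computer, available = available,
                  stringsAsFactors = FALSE)

# Assumption: the first combination is the reference configuration.
baseline <- dd[1, ]

# For every row, count in how many columns it differs from the baseline.
n_diff <- rowSums(sapply(names(dd), function(v) dd[[v]] != baseline[[v]]))

# Keep the baseline itself plus all rows that change exactly one variable.
dd2 <- dd[n_diff <= 1, ]
dd2
```

With the example data this keeps rows 1 to 5, 9, 17 and 25, the same rows
as the hand-picked idx.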
 

Regards,
Muhammad Subianto
Notepad, Copy and Paste are my best friend to use R.2.1.0 on windows 2000

On 6/11/05, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 On 6/11/05, Marc Schwartz [EMAIL PROTECTED] wrote:
  On Sat, 2005-06-11 at 20:44 +0200, Muhammad Subianto wrote:
   Dear R-helpers,
   I am learning about combinations in R.
   I want to combine all of the possible values of several variables,
   but keep only a limited subset of the combinations.
   I am sorry that I cannot explain it exactly.
   To be useful, I give an example:
 interface <- c("usb","fireware","infra","bluetooth")
 screen    <- c("lcd","cube")
 computer  <- c("pc","server","laptop")
 available <- c("yes","no")
  
   The result I need is something like this:
 usb       lcd  pc     yes
 fireware  lcd  pc     yes
 infra     lcd  pc     yes
 bluetooth lcd  pc     yes
 usb       cube pc     yes
 usb       lcd  server yes
 usb       lcd  laptop yes
 usb       lcd  pc     no
  
   How can I do that?
   I was wondering if someone can help me.
   Thank you for your time and best regards,
   Muhammad Subianto
 
  Use:
 
   expand.grid(interface, screen, computer, available)
          Var1 Var2   Var3 Var4
  1        usb  lcd     pc  yes
  2   fireware  lcd     pc  yes
  3      infra  lcd     pc  yes
  4  bluetooth  lcd     pc  yes
  5        usb cube     pc  yes
  6   fireware cube     pc  yes
  7      infra cube     pc  yes
  8  bluetooth cube     pc  yes
  9        usb  lcd server  yes
  10  fireware  lcd server  yes
  11     infra  lcd server  yes
  12 bluetooth  lcd server  yes
  13       usb cube server  yes
  14  fireware cube server  yes
  15     infra cube server  yes
  16 bluetooth cube server  yes
  17       usb  lcd laptop  yes
  18  fireware  lcd laptop  yes
  19     infra  lcd laptop  yes
  20 bluetooth  lcd laptop  yes
  21       usb cube laptop  yes
  22  fireware cube laptop  yes
  23     infra cube laptop  yes
  24 bluetooth cube laptop  yes
  25       usb  lcd     pc   no
  26  fireware  lcd     pc   no
  27     infra  lcd     pc   no
  28 bluetooth  lcd     pc   no
  29       usb cube     pc   no
  30  fireware cube     pc   no
  31     infra cube     pc   no
  32 bluetooth cube     pc   no
  33       usb  lcd server   no
  34  fireware  lcd server   no
  35     infra  lcd server   no
  36 bluetooth  lcd server   no
  37       usb cube server   no
  38  fireware cube server   no
  39     infra cube server   no
  40 bluetooth cube server   no
  41       usb  lcd laptop   no
  42  fireware  lcd laptop   no
  43     infra  lcd laptop   no
  44 bluetooth  lcd laptop   no
  45       usb cube laptop   no
  46  fireware cube laptop   no
  47     infra cube laptop   no
  48 bluetooth cube laptop   no
 
 
  See ?expand.grid for more information.
 
 
 
 After you do the above you will still want to cut it down to just
 the rows you need.
 
 As explained, use expand.grid.  Let's assume you used this statement:
 
 dd <- expand.grid(interface = interface, screen = screen,
                   computer = computer, available = available)
 
 There are several possibilities now:
 
 1. you could list out dd on the console and note the number of the
 rows you want to keep:
 
 idx <- c(1,5,7)
 dd2 <- dd[idx, ]
 
 or if you want most of them it may be easier to record which ones
 you do not want:
 
 ndix <- c(2,4,7)
 dd2 <- dd[-ndix, ]
 
 2. Another possibility is to export it to a spreadsheet and visually
 delete the rows you don't want.
 
 3. A third possibility is to install JGR (which is a free Java GUI
 front end to R).
 First download and install JGR from: http://stats.math.uni-augsburg.de/JGR/
 In JGR (I am using Windows and it's possible that the instructions vary
 slightly on other platforms):
 
 1. create dd as explained
 2. bring up the object browser using the menu Tools | Object Browser
 or just ctrl-B
 3. Select dd from the object browser
 4. This will put you into a spreadsheet in which you can select the
 rows you want
  to delete (hold down ctrl for the 2nd and subsequent selection to have a
 non-contiguous multi-row selection).
 5. 

[R] y-axis and resizing window

2005-06-12 Thread Søren Merser
Hi,
when using plot(..., las=1), i.e. horizontal axis labels, the labels on the
y-axis jam if the height of the graphics window becomes too low,
whereas with las=0 (the default) both the x-axis and the y-axis drop
superfluous labels.
Is there a way to make plot behave the same way with horizontal labels?
Regards, Søren

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] y-axis and resizing window

2005-06-12 Thread Prof Brian Ripley

If I understand you correctly (what does `jams' mean?), this is nothing to 
do with resizing. The axis labelling code checks for enough width-wise 
space for labels, but not for enough height-wise space. Specifically, 
do_axis for the y axis contains


/* Check room for perpendicular labels. */
if (Rf_gpptr(dd)->las == 1 ||
    Rf_gpptr(dd)->las == 2 ||
    tnew - tlast >= gap) {

so y-axis labels are always plotted for las %in% c(1,2) and hence may 
overlap.  (Similar code exists for an x-axis.)
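
A possible workaround, which is not taken from the thread, is to draw the
y-axis yourself and choose a smaller number of tick positions, so that
horizontal labels have room even in a low window. This is a minimal sketch,
assuming simulated data and an arbitrary target of about four ticks; it does
not adapt to the window height automatically.

```r
set.seed(42)
x <- 1:50
y <- cumsum(rnorm(50))

# Suppress the default y-axis, then add one with fewer, explicit ticks.
plot(x, y, yaxt = "n")
ticks <- pretty(range(y), n = 4)   # about four "nice" tick positions
axis(2, at = ticks, las = 1)       # las = 1: horizontal labels
```

The value passed to n is a judgment call: fewer ticks leave more vertical
room per label.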


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

[R] delete "-character from strings in matrix

2005-06-12 Thread Werner Wernersen
Hi!

I have strings where occasionally some "-chars occur.
How can I delete these chars?

I tried it with gsub but using  as replace does not
work.

Thanks a lot for any hint!
Regards,
   Werner



[R] glm with variance = mu+theta*mu^2?

2005-06-12 Thread Mwalili, S. M.
You can fit a negative binomial model using the 'zicounts' package:

library(zicounts)
data(teeth)
names(teeth)

## c) fit negative binomial regression model
nb.zc <- zicounts(resp = dmft~., x = ~gender + age, data = teeth, distr = "NB")
nb.zc

Or even:

library(zicounts)
library(Fahrmeir) # use cells data
data(cells)
nb.cells <- zicounts(parm = c(2,0,0,0,1), resp = y~.,
                     x = ~TNF+IFN+TNF:IFN, data = cells, distr = "NB")
nb.cells
 
 
Samuel.


Kjetil Brinchmann Halvorsen [EMAIL PROTECTED] wrote:
Spencer Graves wrote:

 How might you fit a generalized linear model (glm) with variance 
 = mu+theta*mu^2 (where mu = mean of the exponential family random 
 variable and theta is a parameter to be estimated)?

 This appears in Table 2.7 of Fahrmeir and Tutz (2001) 
 Multivariate Statistical Modeling Based on Generalized Linear Models, 
 2nd ed. (Springer, p. 60), where they compare log-linear model fits 
 to cellular differentiation data based on quasi-likelihoods between 
 variance = phi*mu (quasi-Poisson), variance = phi*mu^2 
 (quasi-exponential), and variance = mu+theta*mu^2. The "quasi" 
 function accepted for the "family" argument in "glm" generates functions 
 "variance", "validmu", and "dev.resids". I can probably write 
 functions to mimic the "quasi" function. However, I have two 
 questions in regard to this:

 (1) I don't know what to use for dev.resids. This may not 
 matter for fitting. I can try a couple of different things to see if 
 it matters.

 (2) Might someone else suggest something different, e.g., using 
 something like optim to solve an appropriate quasi-score function?

 Thanks,
 spencer graves

Since nobody has answered this, I will try. The variance function
mu+theta*mu^2 is the variance function of the negative binomial family.
If this variance function is used to construct a quasi-likelihood, the
resulting quasi-likelihood is identical to the negative binomial
likelihood, so for fitting we can simply use glm.nb from MASS, which
will give the correct estimated values. However, in a quasi-likelihood
setting the (co)variance estimation from glm.nb is not appropriate, and
from the book (Fahrmeir and Tutz) it seems that the estimation method
used is a sandwich estimator, so we can try the sandwich package. This
works, but the numerical results are somewhat different from the book.
Any comments on this?
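
To make the parameterisation explicit: MASS writes the negative binomial
variance as mu + mu^2/theta, so the theta in mu + theta*mu^2 corresponds to
1/theta in MASS. The sketch below checks this numerically and shows the two
fitting routes on simulated counts. It assumes that 0.215 (the value used in
the posted code) is the theta on the book's scale; the data, and the names
theta_book, fit_fixed and fit_est, are illustrative only.

```r
library(MASS)

theta_book <- 0.215                        # theta in variance = mu + theta*mu^2
fam <- negative.binomial(theta = 1 / theta_book)

mu <- c(1, 5, 20)
v_family <- fam$variance(mu)               # MASS: mu + mu^2/theta
v_book   <- mu + theta_book * mu^2         # the form used in the question

# Simulated counts (NOT the cells data), only to show the two calls.
set.seed(1)
x <- runif(200)
y <- rnbinom(200, mu = exp(1 + x), size = 1 / theta_book)

fit_fixed <- glm(y ~ x, family = negative.binomial(1 / theta_book))  # theta fixed
fit_est   <- glm.nb(y ~ x)                                           # theta estimated
theta_hat_book <- 1 / fit_est$theta        # estimate on the mu + theta*mu^2 scale
```

Here v_family and v_book agree, which is why negative.binomial(1/0.215)
appears in the glm call.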

my code follows:

 library(Fahrmeir)
 library(help=Fahrmeir)
 library(MASS)
 cells.negbin <- glm(y~TNF+IFN+TNF:IFN, data=cells,
                     family=negative.binomial(1/0.215))
 summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min 1Q Median 3Q Max 
-1.6714 -0.8301 -0.2153 0.4802 1.4282 

Coefficients:
               Estimate  Std. Error t value Pr(>|t|)
(Intercept)  3.39874495  0.18791125  18.087  4.5e-10 ***
TNF          0.01616136  0.00360569   4.482  0.00075 ***
IFN          0.00935690  0.00359010   2.606  0.02296 *
TNF:IFN     -0.00005910  0.00007002  -0.844  0.41515
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156 on 15 degrees of freedom
Residual deviance: 12.661 on 12 degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

 confint(cells.negbin)
Waiting for profiling to be done...
                    2.5 %       97.5 %
(Intercept)  3.0383197319 3.7890206510
TNF          0.0091335087 0.0238915483
IFN          0.0023292566 0.0170195707
TNF:IFN     -0.0001996824 0.0000960427
 library(sandwich)
Loading required package: zoo
 vcovHC( cells.negbin )
(Intercept) TNF IFN 
TNF:IFN
(Intercept) 0.01176249372 -0.0001279740135 -0.0001488223001 
0.0212541999
TNF -0.00012797401 0.039017282 0.021242875 
-0.0019793137
IFN -0.00014882230 0.021242875 0.054314079 
-0.0013277626
TNF:IFN 0.0212542 -0.001979314 -0.001327763 
0.0002370104
 cov2cor(vcovHC( cells.negbin ))
(Intercept) TNF IFN TNF:IFN
(Intercept) 1.000 -0.5973702 -0.5887923 0.1272950
TNF -0.5973702 1.000 0.4614542 -0.6508822
IFN -0.5887923 0.4614542 1.000 -0.3700671
TNF:IFN 0.1272950 -0.6508822 -0.3700671 1.000
 cells.negbin2 <- glm.nb( y~TNF+IFN+TNF:IFN, data=cells)
 summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min 1Q Median 3Q Max 
-1.6714 -0.8301 -0.2153 0.4802 1.4282 

Coefficients:
               Estimate  Std. Error t value Pr(>|t|)
(Intercept)  3.39874495  0.18791125  18.087  4.5e-10 ***
TNF          0.01616136  0.00360569   4.482  0.00075 ***
IFN          0.00935690  0.00359010   2.606  0.02296 *
TNF:IFN     -0.00005910  0.00007002  -0.844  0.41515
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156 on 15 degrees of freedom
Residual deviance: 12.661 on 12 degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

 confint( cells.negbin2 )
Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) 3.0864669072 

Re: [R] y-axis and resizing window

2005-06-12 Thread Søren Merser
thanks

with 'jams' i meant messes up, but your term overlap is exactly what i
actually had in mind

though a minor problem, do you think that the code will change to enable
checking for enough height-wise space?

regards søren



Re: [R] delete "-character from strings in matrix

2005-06-12 Thread Liaw, Andy
Please define "does not work".  Here's what I get:

 m <- matrix(paste(letters[1:4], "does not work."), 2, 2)
 m
     [,1]               [,2]
[1,] "a does not work." "c does not work."
[2,] "b does not work." "d does not work."
 gsub("does not work.", "", m)
[1] "a " "b " "c " "d "
 structure(gsub("does not work.", "", m), dim=dim(m))
     [,1] [,2]
[1,] "a " "c "
[2,] "b " "d "

R-2.1.0 on WinXPPro.

Andy 
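
If the character to delete is a literal double quote, as the follow-up in
this thread indicates, one way to write the pattern is with single quotes
around it. This is a sketch with made-up strings; the name cleaned is
illustrative. Assigning into cleaned[] keeps the matrix dimensions whether
or not gsub preserves them, and fixed = TRUE treats the pattern as plain
text rather than a regular expression.

```r
m <- matrix(c('say "hi"', 'no quotes', 'a"b', '"'), 2, 2)

cleaned <- m
# Replace every double quote with the empty string, keeping dim(m).
cleaned[] <- gsub('"', "", m, fixed = TRUE)
cleaned
```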



Re: [R] delete "-character from strings in matrix

2005-06-12 Thread Werner Wernersen
Thanks for the reply, Andy!

My problem was that I could not get rid of a double
quote character within the 
string. I don't know what I have done before, but now
it works...?!?!
Sorry for bothering you.

Best,
   Werner




[R] memory allocation problem under linux

2005-06-12 Thread [EMAIL PROTECTED]
I have some compiled code that works under WinXP but not under Linux (kernel
2.6.10-5). I'm also using R 2.1.0.
After debugging, I've discovered that this code:
  #define NMAX 256
  long **box;
  ...
  box   = (long **)R_alloc(NMAX,   sizeof(long *));

gives a null pointer, so the subsequent line:
  for (i=0; i<NMAX; i++) box[i] = (long *) R_alloc(NMAX, sizeof(long));
gives a SIGSEGV signal.
In the same shared library, I have a function with this code:
  partitions=16;
  ...
  h2=(long **)R_alloc(partitions,sizeof(long *));
  for (i=0;i<partitions;i++) 
  h2[i]=(long *)R_alloc(partitions,sizeof(long));
that works! Naturally, I've tried to change NMAX from 256 to 16, without any
success.

Any idea where the problem might reside? (Note that this does not happen under 
WinXP.)
And just another question: when R_alloc fails, should it terminate the function
with an error, without returning control to the function?



Re: [R] Replacing for loop with tapply!?

2005-06-12 Thread Sander Oom
Dear Adaikalavan,

Your solution (the second function) is definitely the most elegant and 
generic solution of all replies in this discussion. Robust for missing 
values and flexible to allow as many calculations as desired! It is so 
clear, I even managed to hack it (of course also thanks to the new 
insight from all the other posts)!

As the data consists of weather stations in rows and days in columns, I 
have adapted the function to work on rows instead of columns. Did not 
manage to get the results directly into the right rows/cols layout, so a 
transpose (t) is still required. However this seems instant, so does not 
mean a reduction in speed! Calculating proportions is now a snip!!

Thanks for your help,

Sander.

### simulate data
set.seed(1)   # for reproducibility
mat <- matrix(sample(-15:50, 15 * 10, TRUE), 15, 10)
mat[ mat > 45 ] <- NA  # create some missing values
mat[ 9, ]       <- NA  # station 9's data is completely missing
mat

find.stats <- function( data, threshold ){

   n      <- length(threshold)
   excess <- numeric( n )
   out    <- matrix( ncol=nrow(data), nrow=(n + 2) ) # initialise
   good   <- which( apply( data, 1, function(x) !all(is.na(x)) ) )
   # rows that are not completely missing

   out[ ,good ] <- apply( data[ good, ], 1, function(x){
     m   <- max( x, na.rm=TRUE )
     # determine maximum value per row
     cnt <- sum( !is.na(x) )
     # determine number of non-missing values
     for(i in 1:n){ excess[i] <- sum( x > threshold[i], na.rm=TRUE )/cnt }
     # calc proportion of non-missing values over multiple thresholds
     return( c(m, cnt, excess) )
   } )

   rownames(out) <- c( "TmpMax", "Count", paste("Over", threshold, sep="") )
   colnames(out) <- rownames(data)  # name of the stations
   return( t(out) )
}

lstTemps <- c(37,39,41,43)
tmp <- find.stats( mat, lstTemps )
tmp
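
The proportion columns alone can also be computed without an explicit loop.
The following is a sketch rather than a replacement for find.stats: it
relies on rowMeans() of a logical matrix with na.rm=TRUE giving the share of
non-missing values above a threshold, and it assumes that stations with no
data should be reported as NA. The name prop_over is illustrative.

```r
set.seed(1)
mat <- matrix(sample(-15:50, 15 * 10, TRUE), 15, 10)
mat[mat > 45] <- NA   # create some missing values
mat[9, ]      <- NA   # station 9 has no data at all

thresholds <- c(37, 39, 41, 43)

# One column per threshold: share of non-missing days above that threshold.
prop_over <- sapply(thresholds,
                    function(th) rowMeans(mat > th, na.rm = TRUE))
colnames(prop_over) <- paste("Over", thresholds, sep = "")

# rowMeans() returns NaN for rows without any data; report them as NA.
prop_over[is.nan(prop_over)] <- NA
prop_over
```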




Adaikalavan Ramasamy wrote:
 OK, so you want to find some summary statistics for each column, where
 some columns could be completely missing. 
 
 Writing a small wrapper should help. When you use apply(), you are
 actually applying a function to every column (or row). First, let us
 simulate a dataset with 15 days/rows and 10 stations/columns 
 
 ### simulate data
 set.seed(1)   # for reproducibility 
 mat <- matrix(sample(-15:50, 15 * 10, TRUE), 15, 10)  
 mat[ mat > 45 ] <- NA  # create some missing values
 mat[ ,9 ]       <- NA  # station 9's data is completely missing
 
 
 Here are two examples of such wrappers:
 
 find.stats1 <- function( data, threshold=c(37,39,41) ){
   
   n   <- length(threshold)
   out <- matrix(  nrow=(n + 1), ncol=ncol(data) ) # initialise
 
   out[1, ] <- apply(data, 2, function(x) 
      ifelse( all(is.na(x)), NA, max(x, na.rm=TRUE) ))
 
   for(i in 1:n) out[ i+1, ] <- colSums( data > threshold[i], na.rm=TRUE )
   
   rownames(out) <- c( "daily_max", paste("above", threshold, sep="_") )
   colnames(out) <- colnames(data)  # name of the stations
   return( out )
 }
   
 find.stats2 <- function( data, threshold=c(37,39,41) ){
   
   n      <- length(threshold)
   excess <- numeric( n )
   out    <- matrix(  nrow=(n + 1), ncol=ncol(data) ) # initialise
   good   <- which( apply( data, 2, function(x) !all(is.na(x)) ) )
   # columns that are not completely missing
  
   out[ , good] <- apply( data[ , good], 2, function(x){
     m <- max( x, na.rm=TRUE )
     for(i in 1:n){ excess[i] <- sum( x > threshold[i], na.rm=TRUE ) }
     return( c(m, excess) )
   } ) 
   
   rownames(out) <- c( "daily_max", paste("above", threshold, sep="_") )
   colnames(out) <- colnames(data)  # name of the stations
   return( out )
 }
 
 find.stats1( mat )
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 daily_max   44   42   39   41   45   43   42   45   NA    42
 above_37     2    1    2    1    3    2    2    1    0     1
 above_39     2    1    0    1    3    2    1    1    0     1
 above_41     2    1    0    0    2    2    1    1    0     1
 
 find.stats2( mat )
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 daily_max   44   42   39   41   45   43   42   45   NA    42
 above_37     2    1    2    1    3    2    2    1   NA     1
 above_39     2    1    0    1    3    2    1    1   NA     1
 above_41     2    1    0    0    2    2    1    1   NA     1
 
 
 On my laptop 'find.stats1' and 'find.stats2' (which is more flexible)
 takes 7 and 6 seconds respectively to execute on a dataset with 1
 stations and 365 days.
 
 Regards, Adai
 
 
 
 On Fri, 2005-06-10 at 20:05 +0200, Sander Oom wrote:
Dear all,

Dimitris and Andy, thanks for your great help. I have progressed to the 
following code which runs very fast and effective:

mat <- matrix(sample(-15:50, 15 * 10, TRUE), 15, 10)
mat[mat > 45] <- NA
mat-NA
mat
temps <- c(35, 37, 39)
ind <- rbind(
 t(sapply(temps, function(temp)
   rowSums(mat > temp, na.rm=TRUE) )),
 rowSums(!is.na(mat), na.rm=FALSE),
 apply(mat, 1, max, na.rm=TRUE))
ind <- t(ind)
ind

However, some weather stations have missing values for the whole 

[R] linking R to goto blas

2005-06-12 Thread Stefan Sobernig
Dear all,

I am currently trying to link R 2.1.0 to the GOTO BLAS 0.99.3 library on
a box running Fedora Core 3 , basically following the steps indicated in
the R-Admin document:

1: I downloaded the current libgoto.xxx.so from
http://www.cs.utexas.edu/users/kgoto/libraries/libgoto_prescott-32-r0.99-3.so.gz,
a version suitable for our XEON machine (Nocona core), unpacked it to
/usr/lib and created a symlink libgoto.so pointing to the library.

2: Then, I got ready to re-configure and re-compile R (2.1.0) using the
following configure flags: ./configure --prefix=/usr --enable-R-shlib
--enable-shared --with-tcltk --with-blas="-lgoto -lpthread -lm"

I did read the R-Admin doc and therefore I am aware of the fact that
passing -lgoto is supposed to be sufficient, but as a matter of fact
configuring with --with-blas=-lgoto alone ends up in a libR.so being
linked to the standard libblas.so. config.log reports in this setting
that libgoto.xxx.so is missing links to libpthread etc. Therefore, I
added the two flags -lpthread -lm as indicated at GOTO's website and I
got a clean configure run. (Am I concluding correctly that I am using a
threaded version of goto blas?)

3: Running make, however, froze when trying to build grDevices,
without throwing any warning or error messages:

[...]
../../../../library/grDevices/libs/grDevices.so is unchanged
make[5]: Leaving directory
`/home/ssoberni/R-2.1.0/src/library/grDevices/src'
make[4]: Leaving directory
`/home/ssoberni/R-2.1.0/src/library/grDevices/src'
[freeze]

4: I then rummaged the R mailing list archives and stumbled over a
thread dating from May this year pointing to a similar issue, concerning
gcc-3.4 and broken lapack libraries provided by FC3 (see
https://stat.ethz.ch/pipermail/r-devel/2005-May/033117.html).

Following these opinions/findings, I did the following (though I knew
that, in principle, R is supposed to handle this issue by passing a
-ffloat-store flag to the Fortran compiler, doesn't it?):

* I wanted to remove the FC3 native lapack libraries, and to my
surprise, they were not installed at all (no liblapack.so.xxx in /usr/lib).
* I set up an older gcc environment, i.e. the last release from the
3.3.x family (3.3.6) and tried to recompile R ending up with the same
hang-up.

As a last step, I tried to exclude R's internal package explicitly by
setting --wihtout-lapack, which did not have a visible effect on the
building process and did not provide a workaround for the hang-up.

Please, I highly appreciate any thoughts or hints as my colleagues and I
are eager to get into GOTO's universe.

//stefan

-- 

Stefan Sobernig
Department of Information Systems and New Media
Vienna University of Economics  
Augasse 2-6
A - 1090 Vienna
 
Phone: +43 - 1 - 31336 - 4878
Fax: +43 - 1 - 31336 - 746 
Email: [EMAIL PROTECTED]
PubKey: http://julia.wu-wien.ac.at/~ssoberni/0x5FC2D3FA.asc


Re: [R] linking R to goto blas

2005-06-12 Thread Prof Brian Ripley
On Sun, 12 Jun 2005, Stefan Sobernig wrote:

 I am currently trying to link R 2.1.0 to the GOTO BLAS 0.99.3 library on
 a box running Fedora Core 3 , basically following the steps indicated in
 the R-Admin document:

 1: I downloaded the current libgoto.xxx.so from
 http://www.cs.utexas.edu/users/kgoto/libraries/libgoto_prescott-32-r0.99-3.so.gz,
 a version suitable for our XEON machine (Nocona core), unpacked it to
 /usr/lib and created a symlink libgoto.so pointing to the library.

 2: Then, I got ready to re-configure and re-compile R (2.1.0) using the
 following configure flags: ./configure --prefix=/usr --enable-R-shlib
 --enable-shared --with-tcltk --with-blas=-lgoto -lpthread -lm

 I did read the R-Admin doc and therefore I am aware of the fact that
 passing -lgoto is supposed to be sufficient, but as a matter of fact

Only for single-threaded versions.  For others you need 
--with-blas="-lgoto -lpthread".

 configuring with --with-blas=-lgoto only ends up in a libR.so being
 linked to the standard libblas.so. config.log reports in this settings
 that libgoto.xxx.so is missing links to libpthread etc. Therefore, I
 added the two flags -lpthread -lm as indicated at GOTO's website and I
 got a clean configure run. (Am I concluding correctly that I am using a
 threaded version of goto blas?)

Dunno: the organization of the Goto site has changed since that section 
was written.  Looks like only multi-threaded (2 threads) versions are 
currently available.

 3: Running make, however, freezed when trying to build grDevices,
 without throwing any warning or error messages:

 [...]
 ../../../../library/grDevices/libs/grDevices.so is unchanged
 make[5]: Leaving directory
 `/home/ssoberni/R-2.1.0/src/library/grDevices/src'
 make[4]: Leaving directory
 `/home/ssoberni/R-2.1.0/src/library/grDevices/src'
 [freeze]

 4: I then rummaged the R mailing list archives and stumbled over a
 thread dating from May this year pointing to a similar issue, concerning
 gcc-3.4 and broken lapack libraries provided by FC3 (see
 https://stat.ethz.ch/pipermail/r-devel/2005-May/033117.html).

 Following these opinions/ findings, I did the following (though I knew
 that -- in principle -- R is supposed to handle this issue by passing a
 --ffloat-store flag to the fortran compiler, doesn't it?):

It does.  For me this works with the internal BLAS and with Goto's 
blas versions 0.96-2 and 0.99-3, on an Opteron.  (It also works on i686 
with several other BLASes.)

 * I wanted to remove the FC3 native lapack libraries, and to my
 surprise, they were not installed at all (no liblapack.so.xxx in /usr/lib).
 * I set up an older gcc environment, i.e. the last release from the
 3.3.x family (3.3.6) and tried to recompile R ending up with the same
 hang-up.

They are not used unless you explicitly asked for them.

 As a last step, I tried to exclude R's internal package explicitly by
 setting --wihtout-lapack, which did not hava a visible effect on the
 building process and did not provide a workaround for the hang-up.

Assuming that is a typo for --without-lapack, it does nothing (it is the 
default and excludes an external LAPACK).

 Please, I highly appreciate any thoughts or hints as my colleagues and I
 are eager to get into GOTO's universe.

First get a version with the internal BLAS working.  That will rule out
any issues about LAPACK.  Then change the BLAS: it looks as if this might 
be a problem with the particular Goto BLAS.

Please note: the R-devel list would be a much better choice for such 
issues -- see the posting guide.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] linking R to goto blas

2005-06-12 Thread Peter Dalgaard
Stefan Sobernig [EMAIL PROTECTED] writes:

 Dear all,
 
 I am currently trying to link R 2.1.0 to the GOTO BLAS 0.99.3 library on
 a box running Fedora Core 3 , basically following the steps indicated in
 the R-Admin document:
 
 1: I downloaded the current libgoto.xxx.so from
 http://www.cs.utexas.edu/users/kgoto/libraries/libgoto_prescott-32-r0.99-3.so.gz,
 a version suitable for our XEON machine (Nocona core), unpacked it to
 /usr/lib and created a symlink libgoto.so pointing to the library.
 
 2: Then, I got ready to re-configure and re-compile R (2.1.0) using the
 following configure flags: ./configure --prefix=/usr --enable-R-shlib
 --enable-shared --with-tcltk --with-blas=-lgoto -lpthread -lm
... 
 Please, I highly appreciate any thoughts or hints as my colleagues and I
 are eager to get into GOTO's universe.

Hmm. Looks over-complicated to me. What works for me on AMD64 is to
have a config.site file in my BUILD-GOTO directory, containing

 cat config.site
BLAS_LIBS="-L/home/pd/GOTO -lgoto_opt64p-r0.96 -lpthread"
CFLAGS="-O3 -g"
#CFLAGS=-g
FFLAGS=$CFLAGS
CXXFLAGS=$CFLAGS

(the .*FLAGS business is optional, of course). With this in place, a
simple ../R/configure followed by make seems to do the trick.

I'll give it a try on my FC3 system, but it's a 500 MHz PIII, so it
takes a while...
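
Once a build succeeds, a quick check from within R can show whether the
optimized BLAS is actually in use. This is only a sketch: the matrix size is
arbitrary, timings vary by machine, and extSoftVersion() exists only in
recent R versions (it is not available in R 2.1.0), so the call is wrapped
in tryCatch().

```r
n <- 200
a <- matrix(rnorm(n * n), n, n)

# crossprod() is carried out by the BLAS, so its timing reflects the
# library R was linked against; compare against a build with internal BLAS.
elapsed <- system.time(b <- crossprod(a))[["elapsed"]]

# Recent R versions report the BLAS in use; older ones fall back to NA here.
blas_info <- tryCatch(extSoftVersion()[["BLAS"]],
                      error = function(e) NA_character_)
elapsed
blas_info
```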

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] 0 * NA

2005-06-12 Thread BORGULYA Gábor
Hi list!

While debugging one of my R programs I found:

  0 * NA
 [1] NA

Is this a bug, or intentional? I would expect 0 or 0.0 depending on the type 
of the NA.

Gabor



Re: [R] 0 * NA

2005-06-12 Thread Liaw, Andy
I believe that's intentional.  NA means "we don't know what the value is", so
just about any operation with NA will result in NA.  You might think
"anything times 0 is 0", but:

 0*Inf
[1] NaN

and there's no guarantee that the true value not observed is not Inf...
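
A few lines at the prompt illustrate the point. This is a sketch assuming a reasonably current version of R; the variable names are only for illustration. The last three lines show cases where the answer is the same whatever the missing value might be, and there R does return a non-missing result.

```r
# Operations whose result depends on the unknown value stay NA.
zero_times_na  <- 0 * NA      # NA: the missing value could be Inf
zero_times_inf <- 0 * Inf     # NaN: this is why 0 * NA cannot safely be 0

# Operations whose result does not depend on the unknown value are resolved.
na_power_zero <- NA^0         # 1, because x^0 is 1 for every numeric x
na_or_true    <- NA || TRUE   # TRUE, whatever the missing logical is
na_and_false  <- NA && FALSE  # FALSE, for the same reason
```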

Andy



Re: [R] linking R to goto blas

2005-06-12 Thread Peter Dalgaard
Peter Dalgaard [EMAIL PROTECTED] writes:

 Stefan Sobernig [EMAIL PROTECTED] writes:
 
  Dear all,
  
  I am currently trying to link R 2.1.0 to the GOTO BLAS 0.99.3 library on
  a box running Fedora Core 3 , basically following the steps indicated in
  the R-Admin document:
  
  1: I downloaded the current libgoto.xxx.so from
  http://www.cs.utexas.edu/users/kgoto/libraries/libgoto_prescott-32-r0.99-3.so.gz,
  a version suitable for our XEON machine (Nocona core), unpacked it to
  /usr/lib and created a symlink libgoto.so pointing to the library.
  
  2: Then, I got ready to re-configure and re-compile R (2.1.0) using the
  following configure flags: ./configure --prefix=/usr --enable-R-shlib
  --enable-shared --with-tcltk --with-blas="-lgoto -lpthread -lm"
 ... 
  Please, I highly appreciate any thoughts or hints as my colleagues and I
  are eager to get into GOTO's universe.
 
 Hmm. Looks over-complicated to me. What works for me on AMD64 is to
 have a config.site file in my BUILD-GOTO directory, containing
 
  cat config.site
 BLAS_LIBS="-L/home/pd/GOTO -lgoto_opt64p-r0.96 -lpthread"
 CFLAGS="-O3 -g"
 #CFLAGS=-g
 FFLAGS=$CFLAGS
 CXXFLAGS=$CFLAGS
 
 (the .*FLAGS business is optional, of course). With this in place, a
 simple ../R/configure followed by make seems to do the trick.
 
 I'll give it a try on my FC3 system, but it's a 500 MHz PIII, so it
 takes a while...

Hmm... That gives me the grDevices issue, which boils down to an R
that segfaults immediately upon startup, in

#0  0x05c0aea7 in tilde_expand () from /usr/lib/libreadline.so.4
#1  0x08170254 in R_ExpandFileName_readline (
    s=0x8bb4020 "/home/pd/r-patched/BUILD-GOTO/library/grDevices/R/sysdata.rdb",
    buff=0x8295300 "/home/pd/r-patched/BUILD-GOTO/library/grDevices/R/grDevices")
    at ../../../R/src/unix/sys-std.c:406
#2  0x0816f5da in R_ExpandFileName (
    s=0x8bb4020 "/home/pd/r-patched/BUILD-GOTO/library/grDevices/R/sysdata.rdb")
    at ../../../R/src/unix/sys-unix.c:129
#3  0x08167352 in R_FileExists (
    path=0x8bb4020 "/home/pd/r-patched/BUILD-GOTO/library/grDevices/R/sysdata.rdb")
    at stat.h:365
#4  0x08105da3 in do_fileexists (call=0x84fd544, op=0x82c5f78, args=0x0,
    rho=0x8c514a4) at ../../../R/src/main/platform.c:857
#5  0x080edc85 in do_internal (call=0x0, op=0x82ba5d4, args=0x3920,
    env=0x8c514a4) at ../../../R/src/main/names.c:1078
#6  0x080c0daa in Rf_eval (e=0x84fd57c, rho=0x8c514a4)
    at ../../../R/src/main/eval.c:382
#7  0x080c3695 in Rf_applyClosure (call=0x8668c28, op=0x84fd5b4,
    arglist=0x8c50564, rho=0x8ad5cc4, suppliedenv=0x82aa5f0)

running --no-readline gives me another crash

(gdb) bt
#0  0x003fb0da in strcmp () from /lib/ld-linux.so.2
#1  0x003f009a in _dl_map_object () from /lib/ld-linux.so.2
#2  0x004fdb58 in dl_open_worker () from /lib/tls/libc.so.6
#3  0x in ?? ()

...which suggests that something is up with dynamic linking. 

I'll give it another spin...

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] memory allocation problem under linux

2005-06-12 Thread [EMAIL PROTECTED]
I've written:

  #define NMAX 256
  long **box;
  ...
  box   = (long **)R_alloc(NMAX,   sizeof(long *));
gives a null pointer, so subsequent line:
  for (i=0; i<NMAX; i++) box[i] = (long *) R_alloc(NMAX, sizeof(long));
gives a SIGSEGV signal.

Sorry, that's not exact: I have a segmentation fault just *inside* R_alloc!
Substituting R_alloc with malloc and Calloc gives the same error.



[R] [R-pkgs] New versions of Matrix and lme4 packages

2005-06-12 Thread Douglas Bates
I have uploaded version 0.96-1 of both Matrix and lme4 to CRAN.  The
source package should migrate to CRAN over the weekend and binary
packages should be available some time next week.

As for previous releases, the versions of these two packages are
interdependent.  The lme4 package requires Matrix_0.96-1 or later but
we cannot enforce the other dependency.  Please remember that if you
upgrade the Matrix package you should also upgrade the lme4 package.

The method for fitting generalized linear mixed models using the
Laplacian approximation is considerably faster in this version.  Also,
the packages have been reorganized so the interdependence will not be
as strong in the future.

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] Essay identification

2005-06-12 Thread Werner Bier
Hi R-help,
 
I have a database of 10 students who have written a total of 78 essays. 
The challenge? I would like to identify who wrote the 79th essay.
 
Has anybody used R in this context? 
 
Even if not, could you suggest which pattern recognition technique I might 
apply?
 
Thanks a lot and regards,
Tom 





Re: [R] Essay identification

2005-06-12 Thread Berton Gunter
I assume that you know the usual procedure is to 'score' each essay by a
vector that gives the frequency of occurrence of commonly used (sometimes
adding subject matter specific) words and phrases. This multivariate
response is then fed in as a training set into your favorite supervised
learning/classification procedure. R has many of these -- trees, logistic
regression, boosting, Random Forests, svm's, LDA, SOM's (whoops -- that's an
Unsupervised one),  ... . Try
RSiteSearch('Classification', restrict='functions').

The devil is in the details as to what works best, I believe. With only 78
exemplars in 10 groups, unless there is a lot of separation (disparate
styles that you could probably detect manually) it may be difficult. It also
depends on how large each group is (balance is generally better).
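
As a minimal sketch of the scoring step described above, the lines below turn each essay into a vector of relative word frequencies and assign a new essay to the author with the nearest mean vector. Everything here is an assumption made for illustration: the toy essays, the list of function words, the helper names tokenize and score, and the nearest-centroid rule, which merely stands in for whichever classifier from the list above you prefer.

```r
# Split a text into lower-case words (hypothetical helper).
tokenize <- function(txt) {
  w <- strsplit(tolower(txt), "[^a-z']+")[[1]]
  w[nzchar(w)]
}

# Words whose relative frequencies form the score vector (illustrative choice).
function_words <- c("the", "and", "a", "is", "but")

# Relative frequency of each function word in one essay (hypothetical helper).
score <- function(txt) {
  w <- tokenize(txt)
  counts <- table(factor(w, levels = function_words))
  setNames(as.numeric(counts) / length(w), function_words)
}

# Toy training set: two essays each from two authors.
training <- c(A1 = "the cat and the dog and the bird",
              A2 = "the sun and the moon and the sky",
              B1 = "a cat is a pet but a dog is a friend",
              B2 = "a sun is a star but a moon is a rock")
author <- c("A", "A", "B", "B")

# One row per essay, one column per function word.
X <- t(sapply(training, score))

# Mean score vector per author: group sums divided by group sizes.
centroids <- rowsum(X, author) / as.vector(table(author))

# Classify a new essay by Euclidean distance to each author's centroid.
new_essay <- "the tree and the river and the hill"
new_score <- score(new_essay)
distances <- apply(centroids, 1, function(ctr) sqrt(sum((ctr - new_score)^2)))
predicted <- names(which.min(distances))
predicted
```

With real essays the matrix X, together with the author labels, is what would be passed to the classification procedure of your choice.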

Cheers,
Bert



Re: [R] Essay identification

2005-06-12 Thread Gabor Grothendieck
On 6/12/05, Werner Bier [EMAIL PROTECTED] wrote:
 Hi R-help,
 
 I have a database of 10 students who have written an overall of 78 essays.
 The challenge? I would like to identify who wrote the 79th essay.
 
 Has anybody used R in this context?
 
 Even if not, would you suggest me which pattern recognition technique I might 
 possibly apply?

Check out

http://xxx.uni-augsburg.de/PS_cache/cond-mat/pdf/0108/0108530.pdf

for a simple method.



Re: [R] ANOVA vs REML approach to variance component estimation

2005-06-12 Thread Adaikalavan Ramasamy
Thank you for confirming this and introducing me to varcomp().

I have another question that I hope you or someone else can help me
with. I was trying to generalise my codes for variable measurement
levels and discovered that lme() was estimating the within group
variance even with a single measure per subject for all subjects !

Here is an example where we have 12 animals but with single measurement.

  y  <- c(2.2, -1.4, -0.5, -0.3, -2.1, 1.5, 
          1.3, -0.3, 0.5, -1.4, -0.2, 1.8) 
  ID <- factor( 1:12 )


The analysis of variance method correctly shows that there is no residual
variance: the ID term accounts for the total variance.

summary(aov(y ~ ID))
            Df  Sum Sq Mean Sq
ID          11 20.9692  1.9063


However, the REML method gives me a within-animal variance even though
there is no replication at the animal level. It seems that I can get
variance components for factors that are not replicated.

library(ape)
varcomp(lme(y ~ 1, random = ~ 1 | ID))
       ID    Within 
1.6712661 0.2350218 

Am I reading this correctly, and can someone kindly explain this to me?

Thank you again.

Regards, Adai



On Fri, 2005-06-10 at 15:10 -0400, Chuck Cleland wrote:
They look fine to me.  Also, note varcomp() in the ape package and 
 VarCorr() in the nlme package.  I think in this case the ANOVA estimate 
 of the intercept variance component is negative because the true value 
 is close to zero.
 
   y <- c( 2.2, -1.4, -0.5,  # animal 1
          -0.3, -2.1,  1.5,  # animal 2
           1.3, -0.3,  0.5,  # animal 3
          -1.4, -0.2,  1.8)  # animal 4
 
   ID <- factor( rep(1:4, each=3) )
 
   library(nlme)
   library(ape)
 
   summary(aov(y ~ ID))
              Df  Sum Sq Mean Sq F value Pr(>F)
  ID           3  0.9625  0.3208  0.1283 0.9406
  Residuals    8 20.0067  2.5008
 
   (0.3208 - 2.5008) / 3
  [1] -0.727
 
   varcomp(lme(y ~ 1, random = ~ 1 | ID))
            ID       Within
  0.0002709644 1.9062505816
  attr(,"class")
  [1] "varcomp"
 
   VarCorr(lme(y ~ 1, random = ~ 1 | ID))
  ID = pdLogChol(1)
              Variance     StdDev
  (Intercept) 0.0002709644 0.01646100
  Residual    1.9062505816 1.38067034
 
 Adaikalavan Ramasamy wrote:
  Can anyone verify my calculations below or explain why they are wrong ?
  
  I have several animals that were measured thrice. The only blocking
  variable is the animal itself. I am interested in calculating the 
  between and within object variations in R. An artificial example :
  
  y <- c( 2.2, -1.4, -0.5,  # animal 1
         -0.3, -2.1,  1.5,  # animal 2
          1.3, -0.3,  0.5,  # animal 3
         -1.4, -0.2,  1.8)  # animal 4
  ID <- factor( rep(1:4, each=3) )
  
  
  1) Using the ANOVA method
  
    summary(aov( y ~ ID ))
                Df Sum Sq Mean Sq F value Pr(>F)
    ID           3  0.900   0.300  0.1207 0.9453
    Residuals    8 19.880   2.485   
  
    => within animal  variation  = 2.485
    => between animal variation  = (0.300 - 2.485)/3 = -0.7283
  
  I am aware that ANOVA can give negative estimates for variances. Is this
  such a case or have I coded wrongly ?
  
  
  2) Using the REML approach 
  
library(nlme)
lme( y ~ 1, rand = ~ 1 | ID)
 
Random effects:
Formula: ~1 | ID
(Intercept) Residual
StdDev:  0.01629769 1.374438
  
    => within animal variation  = 1.374438^2 = 1.88908
    => between animal variation = 0.01629769^2 = 0.0002656147
  
  Is this the correct way of coding for this problem ? I do not have
  access to a copy of Pinheiro & Bates at the moment.
  
  Thank you very much in advance.
  
  Regards, Adai
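
The ANOVA (method of moments) calculation discussed in this thread can be reproduced directly from the mean squares rather than by typing in rounded values. This is a sketch under the balanced-design assumption of the example (three measurements per animal); the variable names are illustrative only.

```r
# Data from the thread: 4 animals, 3 measurements each.
y  <- c( 2.2, -1.4, -0.5,   # animal 1
        -0.3, -2.1,  1.5,   # animal 2
         1.3, -0.3,  0.5,   # animal 3
        -1.4, -0.2,  1.8)   # animal 4
ID <- factor(rep(1:4, each = 3))

# One-way ANOVA table; rows are "ID" and "Residuals".
tab <- anova(lm(y ~ ID))
ms_between <- tab["ID", "Mean Sq"]
ms_within  <- tab["Residuals", "Mean Sq"]

# Method-of-moments estimates for a balanced design with n_per_group replicates.
n_per_group    <- 3
sigma2_within  <- ms_within
sigma2_between <- (ms_between - ms_within) / n_per_group

c(within = sigma2_within, between = sigma2_between)
```

A negative between-animal estimate, as obtained here, is possible with this estimator and usually indicates a true value close to zero, which is consistent with the tiny REML estimate from lme().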
  


Re: [R] delete -character from strings in matrix

2005-06-12 Thread Adaikalavan Ramasamy
You will need to escape special characters. Here is an example :

 my.string <- "Here is a quote \" in a string"
 my.string
  [1] "Here is a quote \" in a string"

 gsub("\"", "", my.string)
  [1] "Here is a quote  in a string"
 
See help(regexp) for more details.
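
For the matrix case in the original question, the same idea can be applied without losing the dimensions. The matrix m below is made-up illustrative data, and fixed = TRUE is just one way to avoid regular-expression escaping altogether.

```r
# Toy matrix with stray double quotes (illustrative data only).
m <- matrix(c("a\"b", "c", "d\"", "\"e"), nrow = 2, ncol = 2)

# fixed = TRUE treats the pattern as a literal string, not a regular expression.
# Assigning into m[] replaces the contents and keeps the dim attribute,
# whether or not gsub() itself preserves it.
m[] <- gsub("\"", "", m, fixed = TRUE)
m
```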

Regards, Adai



On Sun, 2005-06-12 at 14:10 +0200, Werner Wernersen wrote:
 Thanks for the reply, Andy!
 
 My problem was that I could not get rid of a double
 quote character within the 
 string. I don't know what I have done before, but now
 it works...?!?!
 Sorry for bothering you.
 
 Best,
Werner
 
 
 Liaw, Andy wrote:
  Please define "does not work".  Here's what I get:
  
  
 m <- matrix(paste(letters[1:4], "does not work."), 2, 2)
 m
  
       [,1]               [,2]              
  [1,] "a does not work." "c does not work."
  [2,] "b does not work." "d does not work."
  
 gsub("does not work.", "", m)
  
  [1] "a " "b " "c " "d "
  
 structure(gsub("does not work.", "", m), dim=dim(m))
  
       [,1] [,2]
  [1,] "a " "c "
  [2,] "b " "d "
  
  R-2.1.0 on WinXPPro.
  
  Andy 
  
  
 From: Werner Wernersen
 
 Hi!
 
 I have strings where occasionally some "-chars occur.
 How can I delete these chars?
 
 I tried it with gsub but using  as replace does
 not
 work.
 
 Thanks a lot for any hint!
 Regards,
Werner
 
 
 
 
  
  
  
  
 


Re: [R] ANOVA vs REML approach to variance component estimation

2005-06-12 Thread Douglas Bates
On 6/12/05, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
 Thank you for confirming this and introducing me to varcomp().
 
 I have another question that I hope you or someone else can help me
 with. I was trying to generalise my codes for variable measurement
 levels and discovered that lme() was estimating the within group
 variance even with a single measure per subject for all subjects !
 
 Here is an example where we have 12 animals but with single measurement.
 
   y  <- c(2.2, -1.4, -0.5, -0.3, -2.1, 1.5,
           1.3, -0.3, 0.5, -1.4, -0.2, 1.8)
   ID <- factor( 1:12 )
 
 
 The analysis of variance method correctly shows that there is no residual
 variance: the ID term accounts for the total variance.
 
 summary(aov(y ~ ID))
             Df  Sum Sq Mean Sq
 ID          11 20.9692  1.9063
 
 
 However, the REML method gives me a within-animal variance even though
 there is no replication at the animal level. It seems that I can get
 variance components for factors that are not replicated.
 
 library(ape)
 varcomp(lme(y ~ 1, random = ~ 1 | ID))
        ID    Within
 1.6712661 0.2350218
 
 Am I reading this correct and can someone kindly explain this to me ?

It's a spurious convergence in lme.  There is no check in lme for the
number of observations exceeding the number of groups.  There should
be.  I'll add this to the bug reports list.
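
Until such a check exists, a guard along the following lines can be run before fitting. It is only a sketch: has_replication is a hypothetical helper written for this example, not a function in nlme, and it tests just the condition described above (more observations than groups).

```r
# TRUE when there are more observations than groups, i.e. at least one
# group has a replicate, so a within-group variance is estimable.
# Hypothetical helper; not part of nlme.
has_replication <- function(group) {
  group <- factor(group)
  length(group) > nlevels(group)
}

ID_single <- factor(1:12)                 # one measurement per animal
ID_triple <- factor(rep(1:4, each = 3))   # three measurements per animal

has_replication(ID_single)   # FALSE: do not trust a "Within" estimate
has_replication(ID_triple)   # TRUE
```

In a script one might wrap the model fit in if (has_replication(ID)) so that the unreplicated case is skipped or reported instead of silently producing a variance split.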



Re: [R] Essay identification

2005-06-12 Thread Ted Harding

On 12-Jun-05 Berton Gunter wrote:
 I assume that you know the usual procedure is to 'score'
 each essay by a vector that gives the frequency of occurrence
 of commonly used (sometimes adding subject matter specific)
 words and phrases. This multivariate response is then fed in
 as a training set into your favorite supervised
 learning/classification procedure. R has many of these -- trees,
 logistic regression, boosting, Random Forests, svm's, LDA, SOM's
 (whoops -- that's an Unsupervised one),  ... . Try
 RSiteSearch('Classification', restrict='functions').
 
 The devil is in the details as to what works best, I believe.
 With only 78 exemplars in 10 groups, unless there is a lot of
 separation (disparate styles that you could probably detect
 manually) it may be difficult. It also depends on how large
 each group is (balance is generally better).
 
 Cheers,
 Bert

I would add to Berton's list such scores as numbers of different
words used, sentence lengths, relative frequencies of verbs,
nouns, adjectives, adverbs, and so on, perhaps scaled by overall
length. Length of Essay might even be a discriminant!

You could also look at more subtle characteristics such as
"Zipf bins"[*] -- the relative numbers of different
words which occur once only, twice, three times, ... (though
I'm not sure how you would score such a thing for classification
purposes).
[*] A term I've just invented, inspired by the original instance
of this by the linguist Zipf, later giving rise to the
logarithmic distribution in the historic paper by Fisher,
Corbet & Williams on the numbers of species and numbers
of individuals in butterfly traps.

If you really want to go to town you can try things related to
grammatical complexity, e.g. numbers of subordinate clauses
per sentence, relative clauses, the reach of relative pronouns
(how far from the referring pronoun is the thing referred to)
and so on.
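
Several of the simpler scores mentioned above (sentence lengths, number of different words, and the "Zipf bins") take only a few lines of base R. The sketch below uses a made-up three-sentence essay and deliberately crude splitting rules for sentences and words; all names are illustrative.

```r
# Toy essay (illustrative only).
essay <- "The cat sat. The cat ran away. A dog barked at the cat."

# Crude sentence split on ., ! or ? followed by optional whitespace.
sentences <- strsplit(essay, "[.!?]+\\s*")[[1]]
words_per_sentence <- sapply(strsplit(tolower(sentences), "[^a-z']+"),
                             function(w) sum(nzchar(w)))

# Crude word split: anything that is not a letter or apostrophe separates words.
words <- strsplit(tolower(essay), "[^a-z']+")[[1]]
words <- words[nzchar(words)]

n_words    <- length(words)           # essay length in words
n_distinct <- length(unique(words))   # number of different words used
type_token <- n_distinct / n_words    # vocabulary size scaled by length

# "Zipf bins": how many different words occur once, twice, three times, ...
word_counts <- table(words)
zipf_bins   <- table(word_counts)
zipf_bins
```

Each of these quantities could be appended to the per-essay score vector; whether they help to discriminate between authors is an empirical question.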

There's quite an extensive literature on this sort of thing,
though it's not as fashionable as it used to be.

The real problem is that you can get carried away by good
ideas of things to try!

The other factor to bear in mind is that if the Essays
can be grouped by subject this is likely to influence many
of the scores (such as the above).

Hoping this helps and does not distract!
Ted.



E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 13-Jun-05   Time: 00:43:10
-- XFMail --



Re: [R] ANOVA vs REML approach to variance component estimation

2005-06-12 Thread Adaikalavan Ramasamy
Thank you.



[R] slow loading with lme4

2005-06-12 Thread ronggui
It takes a long time to load the lme4 package. Has anyone else encountered 
this problem?

 system.time(library(lme4))
Matrix
lattice
[1] 19.90  0.30 25.56    NA    NA


 version
         _              
platform i386-pc-mingw32
arch     i386           
os       mingw32        
system   i386, mingw32  
status   Patched        
major    2              
minor    1.0            
year     2005           
month    05             
day      29             
language R              

OS: Windows 2000



2005-06-13

--
Department of Sociology
Fudan University

Blog:www.sociology.yculblog.com


Re: [R] us zipcode data map

2005-06-12 Thread Francisco J. Zagmutt
Not that I am aware of.  Try library(help=maps) for a list of all the 
functions in the library.  Anyhow, I am not sure that a US map with zipcodes 
will look very good/readable, unless you focus on a very small area (e.g. a 
county).

Cheers

Francisco

From: Mike R [EMAIL PROTECTED]
Reply-To: r-help@stat.math.ethz.ch
To: r-help@stat.math.ethz.ch
Subject: Re: [R] us zipcode data map
Date: Fri, 10 Jun 2005 18:06:39 -0700

On 6/10/05, Francisco J. Zagmutt [EMAIL PROTECTED] wrote:
  library(maps)
  example(match.map) #for coloring
 
  If you want to annotate the map look at ?map.text

thanks Francisco,  correct me if i am wrong, but maps_2.0-27.tar.gz
does many many maps, but not any zipcode maps ?
