[R] fitting mixed models to censored data?

2007-04-23 Thread Douglas Grove
Hi,

I'm trying to figure out if there are any packages allowing
one to fit mixed models (or non-linear mixed models) to data
that includes censoring.

I've done some searching already on CRAN and through the mailing
list archives, but haven't discovered anything.  Since I may well
have done a poor job searching I thought I'd ask here prior to
giving up.

I understand that SAS's proc nlmixed can accommodate censoring
(though proc mixed apparently can't), so if I can't find 
something available in R, I'll have to break down and use
that.  Please, save me from having to use SAS!

Thanks much,
Doug

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting mixed models to censored data?

2007-04-23 Thread Douglas Grove
Hi Bert,

Yes, I am always wary when one software package offers something that
others do not.

The censoring I'm faced with (at present) isn't as complicated
as with much 'survival' data.  I'm trying to analyze assay data
and have a lower limit of detection (LLD) to contend with. 
Once the level of the analyte gets low enough it can't be 
accurately quantitated, hence all that is reported is that 
the level is less than some value (the LLD).

So I'm not worried about all the complex assumptions that go along
with censoring in clinical trials, etc.

Thanks,
Doug


On Mon, 23 Apr 2007, Bert Gunter wrote:

 Douglas:

 AFAIK, this is a subject area of active current research. Diggle, Heagerty,
 Liang, and Zeger, 2002, (ANALYSIS OF LONGITUDINAL DATA) say on p.316: "An
 emerging consensus is that analysis of data with potentially informative
 dropouts necessarily involves assumptions which are difficult, or even
 impossible, to check from the observed data."  This was ca 1994, I believe,
 so I don't know whether this view is still held among experts (which I am
 not). But if it is, you may do well to be careful of whatever SAS does even
 if you do have to go running off to it.

 Cheers,

 Bert Gunter
 Genentech Nonclinical Statistics




Re: [R] fitting mixed models to censored data?

2007-04-23 Thread Douglas Grove
Hi Bill,

Thanks for your reply.  The first place I looked was in
the survival package since it can obviously handle 
censored data.  However, I don't have any particular desire
to restrict myself to standard survival models just because
I have some censoring.  Frailties appear to fit in nicely
with the types of models typically used with survival data,
but that's not the only kind of model I'd like to look at.

Thanks,
Doug


On Mon, 23 Apr 2007, Pikounis, Bill [CNTUS] wrote:

 Doug,
 In perhaps similar situations where there are clusters of measurements
 due to repeated time or space on an individual subject or experimental
 unit, I have used the survreg() function from the survival library.

 You can specify left, right, and/or interval censoring within a data set
 through Surv(), and so I have used left censoring for the LOD
 observations. I was just focused on marginal or population-averaged
 estimation, so the use of cluster() in the argument for survreg() and
 the robust option in survreg() to get sandwich error estimates was
 sufficient for me. Depending on your needs to evaluate random effects,
 frailty() in the survival package -- which can be used with survreg() or
 coxph() --- is another alternative to explore, I believe.

 Hope that helps,
 Bill
 Nonclinical Statistics, Centocor R&D
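
The approach Bill describes can be sketched roughly as follows. This is only an illustration on simulated data: the variable names (id, dose, y_obs, detected) and the detection limit lld are made up for the example, and the Gaussian distribution is an assumption. It gives population-averaged estimates with cluster-robust standard errors, not a mixed model with explicit random effects.

```r
library(survival)

set.seed(1)
n_subj <- 20
n_rep <- 4

# Simulated repeated assay measurements (hypothetical data, not from the thread)
d <- data.frame(id = rep(seq_len(n_subj), each = n_rep),
                dose = rep(1:4, times = n_subj))
subj_effect <- rnorm(n_subj, sd = 0.5)
d$y <- 1 + 0.5 * d$dose + subj_effect[d$id] + rnorm(nrow(d), sd = 0.5)

# Values below the lower limit of detection are only known to be < lld
lld <- 2
d$detected <- d$y >= lld      # TRUE = observed, FALSE = left-censored
d$y_obs <- pmax(d$y, lld)     # censored values are recorded at the limit

# Left-censored Gaussian regression; cluster(id) requests sandwich
# (robust) standard errors that allow for within-subject correlation
fit <- survreg(Surv(y_obs, detected, type = "left") ~ dose + cluster(id),
               data = d, dist = "gaussian")
summary(fit)
```

In Surv() with type = "left", the status variable is TRUE (or 1) for an observed value and FALSE (or 0) for a left-censored one.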



Re: [R] cannot turn some columns in a data frame into factors

2006-05-11 Thread Douglas Grove
You need to create a new object and assign it to 'df'

so you'd do something like this:

df <- sapply(factors, function (name) {
 pos <- match(name,df.names)
 factor(df[[pos]])
 })
 

Doug
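
The underlying issue is that an assignment inside the anonymous function changes only a local copy of df, so the result has to be assigned back at the top level. A minimal sketch of one way to do that while keeping df a data frame is below; the example data are invented, and it uses lapply() rather than sapply() on the assumption that sapply() may simplify its result to a matrix.

```r
# Hypothetical data frame and list of column names to convert
df <- data.frame(Month = c(1, 2, 1),
                 Year = c(2005, 2006, 2006),
                 value = c(0.1, 0.2, 0.3))
factors <- c("Month", "Year")

# Fail early if a requested column does not exist
missing_cols <- setdiff(factors, names(df))
if (length(missing_cols) > 0) {
  stop("no such column(s): ", paste(missing_cols, collapse = ", "))
}

# Convert the selected columns and assign them back into df
df[factors] <- lapply(df[factors], factor)

cat("factors:", sapply(df, is.factor), "\n")
```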






On Thu, 11 May 2006, Sam Steingold wrote:

  * jim holtman [EMAIL PROTECTED] [2006-05-11 12:27:39 -0400]:
 
  try '<<-' as the assignment to make it global.
 
   df[[pos]] <<- factor(df[[pos]])
 
 nothing changed -- I observe the exact same behaviour:
 
 Month ( 1 ): TRUE 
 factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
 
 
  On 5/11/06, Sam Steingold [EMAIL PROTECTED] wrote:
 
  Hi,
  I have a data frame df and a list of names of columns that I want to
  turn into factors:
 
  df.names <- attr(df,"names")
  sapply(factors, function (name) {
 pos <- match(name,df.names)
 if (is.na(pos)) stop(paste(name,": no such column\n"))
 df[[pos]] <- factor(df[[pos]])
 cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
  })
  cat("factors:",sapply(df,is.factor),"\n")
 
  the output is:
 
 
  Month ( 1 ): TRUE
  factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 
 
  i.e., there is a column named Month (the 1st column), and it is indeed
  turned into a factor inside sapply(), but after that it is numerical
  again!
 
  what am I doing wrong?
 
 -- 
 Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
 http://pmw.org.il http://ffii.org http://memri.org http://palestinefacts.org
 http://truepeace.org http://mideasttruth.com http://dhimmi.com
 If you're being passed on the right, you're in the wrong lane.
 


Re: [R] is there a formatted output in R?

2006-03-10 Thread Douglas Grove
You really need to learn how to do some searching, as you seem to
be constantly asking questions you can answer yourself

help.search("sprintf")
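
For what it's worth, a minimal sketch of formatted output with sprintf() in R. The value of myresult is made up for the example; cat() is used instead of print() so that the string is written without quotes or an index prefix.

```r
myresult <- pi   # hypothetical value to format

# sprintf() returns a character string; it does not print by itself
msg <- sprintf("the correct result is %3.4f", myresult)

# cat() writes the string as-is; "\n" ends the line
cat(msg, "\n")
```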


On Fri, 10 Mar 2006, Michael wrote:

 something like sprintf in C?
 
 so I can do:
 
 print(sprintf("the correct result is %3.4f\n", myresult));
 
 ---
 
 Also, I am desperately looking for a "clear console screen" function in
 R...
 
 thanks a lot!
 


[R] can I do this with read.table??

2006-01-26 Thread Douglas Grove
Hi,

I'm trying to figure out if there's an automated way to get
read.table to read in my data and *not* convert the character
columns into anything, just leave them alone.  What I'm referring
to as 'character columns' are columns in the data that are quoted.
For columns of alphabetic strings (that aren't TRUE or FALSE) I can
suppress conversion to factor with as.is=TRUE, but what I'd like to
stop is the conversion of quoted numbers of the form "01","02",..., 
into numeric form.
 
By an 'automated way', I mean one that does not involve me having
to know which columns in the data are the ones I want kept as
they are.

This doesn't seem like an unreasonable thing to want to do.
After all, say I've got the data.frame:

  A <- data.frame(a=1:3, b=I(c("01","02","03")))

I can export this to a text file with the simple command

  write.table(A, "A.txt", sep="\t", row.names=FALSE, quote=TRUE)

but I cannot find an equally simple mechanism for reading this
data back in from A.txt that allows me to reconstruct my
data.frame 'A'.  Is this an unreasonable thing to expect?

Thanks,
Doug



Re: [R] can I do this with read.table??

2006-01-26 Thread Douglas Grove
I did read the help page, very carefully.   

The colClasses argument can be used if I want
to stop and look through every data set to see
which column I need to protect.  But that's what I 
said that I don't want to do.

As for 'as.is', I wish it did what you suggest, but
it doesn't.  If one reads carefully, as.is protects
a character vector from conversion to a *factor*,
but not from conversion to numeric/logical.

Doug
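
One possible workaround, offered only as a sketch: read the file with quote="" so that the double quotes are treated as ordinary characters. A quoted field such as "01" then no longer looks like a number and stays a string, and the quotes can be stripped afterwards. The helper strip_quotes() is made up for this example, and the approach assumes that quoted fields contain neither the separator nor embedded quotes.

```r
A <- data.frame(a = 1:3, b = I(c("01", "02", "03")))
path <- tempfile(fileext = ".txt")
write.table(A, path, sep = "\t", row.names = FALSE, quote = TRUE)

# quote = "" keeps the double quotes as part of each field, so quoted
# columns are read as character and unquoted ones are converted as usual
B <- read.table(path, header = TRUE, sep = "\t", quote = "",
                as.is = TRUE, check.names = FALSE)

# Hypothetical helper: drop one leading and one trailing double quote
strip_quotes <- function(x) gsub('^"|"$', "", x)

names(B) <- strip_quotes(names(B))
is_chr <- vapply(B, is.character, logical(1))
B[is_chr] <- lapply(B[is_chr], strip_quotes)

str(B)
```

No column has to be named in advance: every column that was quoted in the file comes back as character, and every unquoted column goes through the usual type conversion.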




On Sun, 26 Feb 2006, Kjetil Brinchmann Halvorsen wrote:

 Douglas Grove wrote:
  Hi,
  
  I'm trying to figure out if there's an automated way to get
  read.table to read in my data and *not* convert the character
  columns into anything, just leave them alone.  What I'm referring
 
 ?Did you read the help page?
 What about argument as.is=TRUE?
 See also argument colClasses
 
 Kjetil
 


Re: [R] Selecting data frame components by name - do you know a shorter way?

2006-01-20 Thread Douglas Grove
So you want to create a subset of a data frame
with components "name1", "name2", "name3", ...?

dframe[, c("name1","name2","name3",...)]   

will do that

Doug
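
A small self-contained illustration of the same idea. The data frame and the column names are invented for the example; drop = FALSE is added so that the result stays a data frame even when only one name is selected.

```r
# Hypothetical data frame with more columns than we want
dframe <- data.frame(name1 = 1:3,
                     name2 = c("a", "b", "c"),
                     name3 = c(TRUE, FALSE, TRUE),
                     other = 4:6)

# Character vector of the components to keep
wanted <- c("name1", "name3")

# Index the columns by name
sub <- dframe[, wanted, drop = FALSE]
print(sub)
```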



On Fri, 20 Jan 2006, Michael Reinecke wrote:

 Hi! I suspect there must be an easy way to access components of a data frame 
 by name, i.e. the input should look like "name1" "name2" "name3" ... and the 
 output be a data frame of those components with the corresponding names. I've 
 been trying for hours, but only found the long way to do it (which is not 
 feasible, since I have lots of components to select):
 
 dframe[names(dframe)=="name1" | dframe=="name2" | dframe=="name3"]
 
  
 
 Do you know a shortcut?
 
  
 
 Michael
 
 

Re: [R] 'x' must be numeric

2006-01-20 Thread Douglas Grove
It's much more helpful if you show the actual command you used.

Presumably you have a data frame 'd' and you've done

hist(d), and 'hist' has complained because d is not numeric,
d is a data frame that *contains* a numeric vector.

You need to give hist() that numeric vector, which you can do
in many ways, including: d$V1, d[,"V1"] and d[,1]

Doug
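
A minimal sketch of the point, using the first few values shown in the original post as stand-in data. plot = FALSE is used here only so that the example runs without opening a graphics device.

```r
# One-column data frame, as read.table() would produce without headers
d <- data.frame(V1 = c(0.6344, 0.4516, 0.0968, 0.7634, 0.7957))

# The data frame itself is a list, not a numeric vector
mode(d)
is.numeric(d)
is.numeric(d$V1)

# Pass the numeric column, not the data frame, to hist()
h <- hist(d$V1, plot = FALSE)
h$counts
```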


On Fri, 20 Jan 2006, Naiara S. Pinto wrote:

 Hello all,
 
 I am importing data from a txt file and try to get a histogram, I get the
 message: Error in hist: 'x' must be numeric.
 When I use mode() R returns "list".
 However when I use str() I get:
 `data.frame':   456 obs. of  1 variable:
  $ V1: num  0.6344 0.4516 0.0968 0.7634 0.7957 ...
 My file consists of one column only (no headers) and I can't figure out
 why I am getting this error message. Why does this happen?
 
 Thanks!
 
 Naiara.
 
 
 Naiara S. Pinto
 Ecology, Evolution and Behavior
 1 University Station A6700
 Austin, TX, 78712
 


Re: [R] discovery (was: data.frame to character)

2005-06-10 Thread Douglas Grove
Help pages are useful, you should try them

e.g. ?pi or ?LETTERS
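
As a rough sketch of how one might list such objects: ls() accepts an environment, so the contents of the base package can be listed and filtered down to the entries that are not functions. The variable names are made up for the example.

```r
# Names of all (non-hidden) objects in the base package
base_names <- ls(baseenv())

# Keep only the entries that are not functions (constants and the like)
is_fun <- vapply(base_names,
                 function(nm) is.function(get(nm, envir = baseenv())),
                 logical(1))
non_functions <- base_names[!is_fun]

print(non_functions)

# search() shows the attached packages that could be listed the same way
search()
```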


 How can one discover or list all available built-in objects?

 On Jun 10, 2005, at 7:23 AM, Muhammad Subianto wrote:
  L3 <- LETTERS[1:3]
   L10 <- LETTERS[1:10]

 LETTERS is apparently a built-in character vector.  ls() and objects()
 only list the ones I've created.  Is there a function that lists
 all available built-in objects?

 For example, "pi" is another built-in, but "e" is not.  A means to
 list them would be nice.

 Regards,
 - Robert



Re: [R] problem with dir() in R-2.1.0?

2005-04-25 Thread Douglas Grove
The new version of R has begun enforcing rules on regular expressions.
Your pattern is not a valid regular expression, hence it no longer works.
The meaning of '*' is with respect to a preceding character, hence it is
ill-defined without one.  
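
For illustration, two ways of writing a valid pattern for files ending in .RData, sketched on a throwaway directory created just for the example. glob2rx() from the utils package converts a shell-style wildcard into a regular expression.

```r
# Temporary directory with a few empty files (example data only)
demo_dir <- tempfile("rdata_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir,
                                c("a.RData", "b.RData", "notes.txt"))))

# Regular expression: a literal dot, then RData, at the end of the name
by_regex <- dir(demo_dir, pattern = "\\.RData$")

# Same selection, starting from the wildcard form
by_glob <- dir(demo_dir, pattern = glob2rx("*.RData"))

print(by_regex)
```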



On Mon, 25 Apr 2005, Ye, Bin wrote:

 Hi,
 
 I always use dir(pattern="*.RData") in all the earlier versions of R (1.8, 
 1.9, 2.0.1).
 
 Error message is as below:
 Error in list.files(path, pattern, all.files, full.names, recursive) :
 invalid 'pattern' regular expression
 
 Does anyone have an idea what's going on? How should I define the pattern I 
 need in R-2.1.0?
 
 Thanks!
 
 
 Bin
 


Re: [R] How about a mascot for R?

2004-12-04 Thread Douglas Grove
When I think of New Zealand I think Rabbit :)

How 'bout something like the Monty Python rabbit from 
the Holy Grail (nasty pointy teeth..., look at the bones!)

Doug



Re: [R] inverse function of order()

2004-10-04 Thread Douglas Grove
An alternate method that saves having to use order() again is

r[o] <- r

Doug
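
A short check of both variants on the example from the question. The names d1 and d2 are introduced here only so that the two results can be compared with d; the seed is arbitrary.

```r
set.seed(1)
d <- sample(10:100, 9)
o <- order(d)
r <- d[o]

# Variant 1: invert the permutation with a second call to order()
d1 <- r[order(o)]

# Variant 2: assign the sorted values back to their original positions
d2 <- r
d2[o] <- r

identical(d1, d)
identical(d2, d)
```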




On Mon, 2004-10-04 at 15:21, Wolfram Fischer wrote:
 I have:
 
  d <- sample(10:100, 9)
  o <- order(d)
  r <- d[o]
 
 How I can get d (in the original order), knowing only r and o?
 
 Thanks - Wolfram
 


Re: [R] alternate rank method

2004-06-29 Thread Douglas Grove
I agree.  These are obvious extensions to the options provided
now by rank.  I didn't suggest this as I am not a contributor and
don't feel comfortable asking others to do more work :)

Thanks,
Doug


On Tue, 29 Jun 2004, Martin Maechler wrote:

  Torsten == Torsten Hothorn [EMAIL PROTECTED]
  on Mon, 28 Jun 2004 10:59:26 +0200 (CEST) writes:
 
 Torsten On Fri, 25 Jun 2004, Douglas Grove wrote:
 
  I should have specified an additional constraint:
  
  I'm going to need to use this repeatedly on large vectors
  (length 10^6), so something efficient is needed.
  
 
 Torsten give function `irank' in package `exactRankTests' a
 Torsten try.
 
 As an answer to Torsten (who got it already orally) and Gabor's
 original tricky suggestions:
 
 I strongly believe this should happen in the same C code on
 which R's base rank() function works and already implements the
 *averaging* of ties.
 Doing the analog of changing average(..) to min(..) or max(..)
 shouldn't be hard and certainly will be more efficient than the
 workarounds posted here.
 
 Patches welcome...
 since otherwise I'm not sure I'll get there in time.
 
 Martin




Re: [R] ties in runif() output

2004-06-26 Thread Douglas Grove
On Sat, 26 Jun 2004, Prof Brian Ripley wrote:

 On Fri, 25 Jun 2004, Douglas Grove wrote:
 
  I get ties in output from runif() when I generate as few as 10^5
  variates and get quite a lot when I generate 10^6.  Is this 
  expected??  
 
 It should have been.
 
  I haven't seen any duplication with rnorm(10^6), but
  see varying amounts of duplication using rexp(), rbeta() and
  rgamma().  I would have thought that there'd be enough precision
  that one wouldn't get ties until generating samples larger than this..
 
 Did you do the calculations?  Please do so. There are about 2e9 possible
 values of the standard generators.

I know little about the limitations of random number generation 
and didn't realize that only 2e9 values were obtainable.
I could have done the math myself had I known.

Thanks very much for your help,
Doug


  qbirthday(classes=2e9)
 [1] 52655
 
 Statisticians ought to know about the birthday problem!
 
 (rnorm is different because the default generator uses two uniforms, 
 deliberately to increase the precision.)
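
 The calculation being asked for can be sketched as below. It assumes, purely for illustration, that each draw is equally likely to be any one of "classes" distinct values (the 2e9 figure above); the function name expected_dups is made up. The quantity computed is the expected value of sum(duplicated(x)), i.e. n minus the expected number of distinct values, which is roughly n^2 / (2 * classes) when n is small relative to classes.

```r
# Expected number of duplicated values among n draws from 'classes'
# equally likely values: n minus the expected number of distinct values.
# expm1() and log1p() keep the calculation accurate when 1/classes is tiny.
expected_dups <- function(n, classes) {
  p_seen <- -expm1(n * log1p(-1 / classes))  # P(a given value occurs at least once)
  n - classes * p_seen
}

expected_dups(1e5, 2e9)
expected_dups(1e6, 2e9)

# Simple approximation for comparison
(1e6)^2 / (2 * 2e9)
```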
 
   set.seed(222)
   sum(duplicated(runif(10^5)))
  [1] 4
 
 That's unusually high, BTW.
 
   sum(duplicated(runif(10^6)))
  [1] 140
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 


[R] alternate rank method

2004-06-25 Thread Douglas Grove
Hi,

I'm wondering if anyone can point me to a function that will
allow me to do a ranking that treats ties differently than
rank() provides for?

I'd like a method that will assign to the elements of each 
tie group the largest rank. 

An example:  

For the vector 'v', I'd like the method to return 'rv'

 v:  1 2 3 3 3 4 5 5 6 7
rv:  1 2 5 5 5 6 8 8 9 10


Thanks,
Doug Grove
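
For readers on a current version of R: rank() now has a ties.method argument, and its "max" option appears to give exactly the ranking described above. A minimal sketch on the example vector follows; whether this option exists in a given older release is something to check in ?rank.

```r
v <- c(1, 2, 3, 3, 3, 4, 5, 5, 6, 7)

# Each tie group receives the largest rank in the group
rv <- rank(v, ties.method = "max")
rv
```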

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] alternate rank method

2004-06-25 Thread Douglas Grove
I should have specified an additional constraint:

I'm going to need to use this repeatedly on large
vectors (length 10^6), so something efficient is
needed.


On Fri, 25 Jun 2004, Sundar Dorai-Raj wrote:

 How about
 
 rv <- rowSums(outer(v, v, ">="))
 
 Adapted from Prof. Ripley's reply in the following thread:
 
 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31993.html
 
 HTH,
 
 --sundar
 


[R] ties in runif() output

2004-06-25 Thread Douglas Grove
I get ties in output from runif() when I generate as few as 10^5
variates and get quite a lot when I generate 10^6.  Is this 
expected??  I haven't seen any duplication with rnorm(10^6), but
see varying amounts of duplication using rexp(), rbeta() and
rgamma().  I would have thought that there'd be enough precision
that one wouldn't get ties until generating samples larger than this..


 set.seed(222)
 sum(duplicated(runif(10^5)))
[1] 4

 sum(duplicated(runif(10^6)))
[1] 140


platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status   Patched
major    1
minor    9.0
year     2004
month    04
day      13
language R


Thanks,
Doug Grove



Re: [R] predict function

2004-02-13 Thread Douglas Grove
You can't use this anymore.  The function predict() has a method
for loess objects, but there is no longer an available function
called predict.loess.   So just replace predict.loess
with predict.
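
A minimal sketch of the corrected last step, on simulated data since the original file is not available. The data frame spots and its contents are invented for the example; in current versions of R, loess() lives in the stats package, so library(modreg) is not needed there.

```r
set.seed(1)

# Simulated stand-in for the array data: coordinates and a log ratio
spots <- data.frame(x = runif(200), y = runif(200))
spots$logratio <- 0.5 * spots$x - 0.3 * spots$y + rnorm(200, sd = 0.1)

# Fit the 2d loess surface
loess2d <- loess(logratio ~ x + y, data = spots)

# Call the generic predict(); it dispatches to the loess method
spots$logratio2DLoeNorm <- spots$logratio - predict(loess2d, spots)

summary(spots$logratio2DLoeNorm)
```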


On Fri, 13 Feb 2004, Thomas Jagoe wrote:

 I am using R to do a loess normalisation procedure.
 In 1.5.1 I used the following commands to normalise the variable logratio,
 over a 2d surface (defined by coordinates x and y):
 
  array <- read.table("121203B_QCnew.txt", header=T, sep="\t")
  array$logs555<-log(array$s555)/log(2)
  array$logs647<-log(array$s647)/log(2)
  array$logratio<-array$logs555-array$logs647
  array$logav<-(array$logs555+array$logs647)/2
  library(modreg)
  loess2d<-loess(logratio~x+y,data=array)
  array$logratio2DLoeNorm <-array$logratio - predict.loess(loess2d, array)
 
 However in 1.8.1 all goes well until the last step when I get an error:
 
 Error: couldn't find function "predict.loess"
 
 Can anyone help ?
 
 Thomas
 


Re: [R] Windows Memory Issues

2003-12-08 Thread Douglas Grove
On Sat, 6 Dec 2003, Prof Brian Ripley wrote:

 I think you misunderstand how R uses memory.  gc() does not free up all 
 the memory used for the objects it frees, and repeated calls will free 
 more.  Don't speculate about how memory management works: do your 
 homework!

Are you saying that consecutive calls to gc() will free more memory than
a single call, or am I misunderstanding?   Reading ?gc and ?Memory I don't
see anything about this mentioned.  Where should I be looking to find 
more comprehensive info on R's memory management??  I'm not writing any
packages, just would like to have a better handle on efficiently using
memory as it is usually the limiting factor with R.  FYI, I'm running
R1.8.1 and RedHat9 on a P4 with 2GB of RAM in case there is any platform
specific info that may be applicable.
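
As a small aid for experimenting with this, a sketch of how one might watch R's own accounting around rm() and gc(). It reports only what R considers in use (the "used" columns of the gc() result); it says nothing about when memory is returned to the operating system, which is the platform-dependent part of the question. The matrix size is arbitrary.

```r
# Allocate a reasonably large object and note its size
x <- matrix(rnorm(1e6), 1000)
size_mb <- as.numeric(object.size(x)) / 2^20

before <- gc()   # collection while x still exists
rm(x)
after <- gc()    # collection after x has been removed

# Vector cells (8 bytes each) reported as in use, before and after
before["Vcells", "used"]
after["Vcells", "used"]
```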

Thanks,

Doug Grove
Statistical Research Associate
Fred Hutchinson Cancer Research Center




 
 In any case, you are using an outdated version of R, and your first
 course of action should be to compile up R-devel and try that, as there 
 has been improvements to memory management under Windows.  You could also 
 try compiling using the native malloc (and that *is* described in the 
 INSTALL file) as that has different compromises.
 
 
 On Sat, 6 Dec 2003, Richard Pugh wrote:
 
  Hi all,
   
  I am currently building an application based on R 1.7.1 (+ compiled
  C/C++ code + MySql + VB).  I am building this application to work on 2
  different platforms (Windows XP Professional (500mb memory) and Windows
  NT 4.0 with service pack 6 (1gb memory)).  This is a very memory
  intensive application performing sophisticated operations on large
  matrices (typically 5000x1500 matrices).
   
  I have run into some issues regarding the way R handles its memory,
  especially on NT.  In particular, R does not seem able to recollect some
  of the memory used following the creation and manipulation of large data
  objects.  For example, I have a function which receives a (large)
  numeric matrix, matches against more data (maybe imported from MySql)
  and returns a large list structure for further analysis.  A typical call
  may look like this .
   
   myInputData <- matrix(sample(1:100, 750, T), nrow=5000)
   myPortfolio <- createPortfolio(myInputData)
   
  It seems I can only repeat this code process 2/3 times before I have to
  restart R (to get the memory back).  I use the same object names
  (myInputData and myPortfolio) each time, so I am not create more large
  objects ..
   
  I think the problems I have are illustrated with the following example
  from a small R session .
   
   # Memory usage for Rgui process = 19,800
   testData <- matrix(rnorm(1000), 1000) # Create big matrix
   # Memory usage for Rgui process = 254,550k
   rm(testData)
   # Memory usage for Rgui process = 254,550k
   gc()
   used (Mb) gc trigger  (Mb)
  Ncells 369277  9.9 667722  17.9
  Vcells  87650  0.7   24286664 185.3
   # Memory usage for Rgui process = 20,200k
   
  In the above code, R cannot recollect all memory used, so the memory
  usage increases from 19.8k to 20.2.  However, the following example is
  more typical of the environments I use .
   
   # Memory 128,100k
   myTestData <- matrix(rnorm(1000), 1000)
   # Memory 357,272k
   rm(myTestData)
   # Memory 357,272k
   gc()
used (Mb) gc trigger  (Mb)
  Ncells  478197 12.8 818163  21.9
  Vcells 9309525 71.1   31670210 241.7
   # Memory 279,152k
   
  Here, the memory usage increases from 128.1k to 279.1k
   
  Could anyone point out what I could do to rectify this (if anything), or
  generally what strategy I could take to improve this?
   
  Many thanks,
  Rich.
   
  Mango Solutions
  Tel : (01628) 418134
  Mob : (07967) 808091
   
  
  
  
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 


Re: [R] Kmeans again

2003-06-06 Thread Douglas Grove
 I'm sorry to insist but I still think there is something wrong with the function 
 kmeans. For instance, let's try the same small example:
  
  dados<-matrix(c(-1,0,2,2.5,7,9,0,3,0,6,1,4),6,2)
 
 I will choose observations 3 and 4 for initial centers and just one iteration. The 
 results are
  
  A<-kmeans(dados,dados[c(3,4),],1)
  A
 $cluster
 [1] 1 1 1 1 2 2
 $centers
[,1] [,2]
 1 0.875 2.75
 2 8.000 2.50
 $withinss
 [1] 38.9375  6.5000
 $size
 [1] 4 2
  
 If I do it by hand, after one iteration, the results are
  
 $cluster
 [1] 1 2 1 2 1 2
  
 So I think that something is wrong with the function kmeans; probably the initial 
 centers given
  by the user are not being taken into account.


Andy Liaw already gave an example where he specified two different starting 
values and kmeans gave different results after 1 iteration, so clearly 
your hypothesis is incorrect.

Either your calculations are wrong or you are using the wrong
formulae.  It is very doubtful that anything is wrong with kmeans.
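
To make the comparison concrete, here is a sketch of one plain assign-then-update step done by hand from the same starting centers. This is the simple batch (Lloyd-style) step, which is an assumption about what "by hand" means; the default algorithm in kmeans() is Hartigan-Wong, which moves points one at a time, so its result after one iteration need not match this.

```r
dados <- matrix(c(-1, 0, 2, 2.5, 7, 9, 0, 3, 0, 6, 1, 4), 6, 2)
centers <- dados[c(3, 4), ]   # observations 3 and 4 as initial centers

# Squared Euclidean distance from every point to each center
d2 <- sapply(1:2, function(k) rowSums(sweep(dados, 2, centers[k, ])^2))

# Assignment step: nearest center for each point
assign1 <- apply(d2, 1, which.min)

# Update step: new centers are the means of the assigned points
new_centers <- rbind(colMeans(dados[assign1 == 1, , drop = FALSE]),
                     colMeans(dados[assign1 == 2, , drop = FALSE]))

assign1
new_centers
```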

Doug Grove



[R] removing leading/trailing blanks

2003-02-19 Thread Douglas Grove
Hi,

What's the best way of dropping leading or trailing
blanks from a character string?  

The only thing I can think of is using sub() to replace
blanks with null strings, but I don't know if there is
a better way (I also don't know how to represent the
trailing blank in a regular expression).

Thanks,
Doug Grove
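
One way to write it, as a sketch: in a regular expression "^" anchors the start of the string and "$" the end, so leading and trailing blanks can be matched as alternatives. gsub() is used rather than sub() because sub() replaces only the first match and would leave one end untrimmed. trimws() is available in more recent versions of R (3.2.0 and later) and does the same job.

```r
x <- c("  leading", "trailing  ", "  both  ", "none")

# Remove whitespace at the start (^) or at the end ($) of each string
trimmed <- gsub("^[[:space:]]+|[[:space:]]+$", "", x)
trimmed

# Built-in equivalent in newer versions of R
trimws(x)
```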




Re: [R] dataframe subsetting behaviour

2003-01-22 Thread Douglas Grove
 Douglas Grove [EMAIL PROTECTED] writes:
 
  Hi,
  
  I'm trying to understand a behaviour that I have encountered
  and can't fathom.
  
  
  Here's some code I will use to illustrate the behaviour:
  
  # start with some data frame a having some named columns
  a <- data.frame(a=rep(1,3),c=rep(2,3),d=rep(3,3),e=rep(4,3))
  
  # create a subset of the original data frame, but include a
  # name "b" that is not present in my original data frame
  b <- a[,c("a","b","c")]
  
  
  ## Up until now no errors are issued, but the following commands
  ## will give the error shown:
  
  b[1,] ## Error in x[[j]] : subscript out of bounds
  b[1,2]## Error in "names<-.default"("*tmp*", value = cols) : 
##  names attribute must be the same length as the vector
  
  
  Can anyone explain to me the meaning of these error messages in terms
  of what R is actually doing?  These error messages had me baffled and 
  it took me hours to track down that the source of the error was an 
  incorrect column name in my data frame subsetting.
 
 Looks like a (semi-)bug. Indexing outside of the data frame creates a
 column which is really the single value NULL, e.g. 
 
  dput(a[,4:5])
 structure(list(e = c(4, 4, 4), "NA" = NULL), .Names = c("e",
 "NA"), row.names = c("1", "2", "3"), class = "data.frame")
 
 This will print because the format.data.frame called inside
 print.data.frame will recycle the NULL and give you
 
  a[,4:5]
   e   NA
 1 4 NULL
 2 4 NULL
 3 4 NULL
 
 However, it confuses the h*ck out of [.data.frame
 
  (a[,4:5])[2]
 Error in "[.data.frame"((a[, 4:5]), 2) : undefined columns selected
  (a[,4:5])[,2]
 NULL
  (a[,4:5])[,1]
 [1] 4 4 4
 
 and also the examples you found. However, the main issue is that you
 have managed to construct a corrupt data frame. So indexing outside
 the array should probably either give an error or return a column of
 NA.


Yes, it would be nice if trying to index outside the data frame generated
an error, that is what happens in Splus (at least the version I have
access to: 6.0 Release 1 for Linux 2.2.12)
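
For comparison, a sketch of how the same mistake can be caught up front. It checks the requested names against names(a) before subsetting; the names wanted, missing_cols and res are made up for the example. In current versions of R the subsetting itself stops with an "undefined columns selected" error, which the tryCatch() call below simply captures as text.

```r
a <- data.frame(a = rep(1, 3), c = rep(2, 3), d = rep(3, 3), e = rep(4, 3))
wanted <- c("a", "b", "c")

# Which requested columns are not in the data frame?
missing_cols <- setdiff(wanted, names(a))
missing_cols

# Attempt the subset and capture the error message, if any
res <- tryCatch(a[, wanted], error = function(e) conditionMessage(e))
res
```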


 
 -- 
O__   Peter Dalgaard Blegdamsvej 3  
   c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
  (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

