Re: [R] Need Help with robustbase package: fitnorm2 and plotnorm2

2007-06-08 Thread Prof Brian Ripley
On Fri, 8 Jun 2007, M. Jankowski wrote:

 This is my first post requesting help to this mailing list. I am new
 to R. My apologies for any breach in posting etiquette.

For future reference, telling us your version of R and exact OS would have 
helped here.  The R posting guide suggests showing the output of 
sessionInfo().

Also, to help the readers, fitNorm2 (R is case-sensitive) is in 'prada', 
and the missing package is rrcov not robustbase.

 I am new to
 this language and just learning my way around. I am attempting to run
 some sample code and am confused by the error message:
 Loading required package: rrcov
 Error in fitNorm2(fdat[, "FSC-H"], fdat[, "SSC-H"], scalefac = ScaleFactor) :
 Required package rrcov could not be found.
 In addition: Warning message:
 there is no package called 'rrcov' in: library(package, lib.loc =
 lib.loc, character.only = TRUE, logical = TRUE,

 that I get when I attempt to run the following sample snippet of code.
 The error above is taken from the code below. I am running Ubuntu
 Linux with all the R packages listed in the Synaptic package manager
 (universe). I loaded the prada Bioconductor package as instructed in
 the comments, and robustbase was downloaded and installed with the
 command: sudo R CMD INSTALL robustbase_0.2-7.tar.gz; the robustbase
 folder is in /usr/local/lib/R/site-library/. When I type in
 'library(robustbase)' no error appears; I believe robustbase is
 installed correctly. The sample code was taken from FCS-prada.pdf. The
 sample code was written in 2005, I understand that rrcov was made part
 of the robustbase package sometime in the past year. This may be the
 cause of the problem, but, if it is, I have no idea how to fix it.

That is not the case: rrcov is a separate package, and one prada depends 
on.  So somehow you have managed to install prada without an essential 
dependency 'rrcov'.  That looks like a problem in the Debian/Ubuntu 
packaging of prada.  (There is a list R-sig-debian for such issues.)

Running install.packages("rrcov") inside R should fix this for you: if 
your R is not current (i.e. < 2.5.0) you may need to run R as root for 
that session.  (There may be a Debian package for rrcov for your OS and R 
version, but without further details I cannot check.)

In the current version of prada (1.12.0 for BioC-2.0 for R 2.5.0) rrcov is 
in Imports, so probably your version of BioC is not current either.
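
A minimal sketch of the checks suggested above: print the session details the posting guide asks for, then see which of the packages named in this thread are missing. The helper `missing_pkgs` is hypothetical (it is not part of R or prada), and the install line is left commented out because it needs network access and a writable library.

```r
# Report R version, OS and attached packages, as the posting guide asks.
print(sessionInfo())

# `missing_pkgs` is a hypothetical helper, not part of R or prada.
# It returns the subset of `pkgs` that cannot be loaded.
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

to_install <- missing_pkgs(c("rrcov", "robustbase"))

# Uncomment to install from CRAN (needs network access and a writable library):
# if (length(to_install) > 0) install.packages(to_install)
```

If `to_install` still lists rrcov after installation, the package probably went into a library that is not on `.libPaths()`.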

[...]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] evaluating variables in the context of a data frame

2007-06-08 Thread Prof Brian Ripley
On Thu, 7 Jun 2007, Zack Weinberg wrote:

 Given

 D = data.frame(o=gl(2,1,4))

 this works as I expected:

 evalq(o, D)
 [1] 1 2 1 2
 Levels: 1 2

 but neither of these does:

 f <- function(x, dat) evalq(x, dat)
 f(o, D)
 Error in eval(expr, envir, enclos) : object "o" not found
 g <- function(x, dat) eval(x, dat)
 g(o, D)
 Error in eval(x, dat) : object "o" not found

 What am I doing wrong?  This seems to be what the helpfiles say you do
 to evaluate arguments in the context of a passed-in data frame...

When you call f(o, D), the argument 'o' is evaluated in the current 
environment ('context' in R means something different).  Because of lazy 
evaluation, it is not evaluated until evalq is called, but it is evaluated as 
if it had been evaluated greedily.

g(quote(o), D) will work.
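
A minimal sketch of the two usual ways around this, reusing D and g from the question. The function `h` is introduced here for illustration only; it captures the unevaluated argument with substitute() so the caller does not have to quote it.

```r
D <- data.frame(o = gl(2, 1, 4))

# Fix 1: the caller quotes the expression, so eval() receives the symbol `o`
# and looks it up in the data frame.
g <- function(x, dat) eval(x, dat)
g(quote(o), D)

# Fix 2: the function captures its own unevaluated argument.
# `h` is a hypothetical helper, not part of the original post.
h <- function(x, dat) eval(substitute(x), dat, parent.frame())
h(o, D)
```

Both calls return the factor D$o. Passing parent.frame() as the enclosure means names not found in the data frame are looked up where h was called, not inside h.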


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] match rows of data frame

2007-06-08 Thread Alfonso Sammassimo
Hi R-experts,

I have a data frame (A), and a subset (B) of this data frame. I am trying 
to create a new data frame which gives me all the rows of B, plus the 5th 
next row (occurring in A) after each.  I have used the code below, but it 
gives me all 5 rows after the matching row. I only want the 5th.

FiveDaysLater <- A[c(sapply(match(rownames(B), rownames(A)), seq, 
length=6)), ]
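
One possible approach, sketched on toy data: find the positions of B's rows in A, add 5 to each, and index A with both sets of positions. The data frames below are stand-ins (the real A and B were not posted), and dropping duplicates and out-of-range targets is an assumption about what is wanted.

```r
# Toy data standing in for A and B (assumed; not from the original post).
A <- data.frame(x = 1:20, row.names = paste0("r", 1:20))
B <- A[c(2, 7), , drop = FALSE]

i <- match(rownames(B), rownames(A))   # positions of B's rows in A
j <- i + 5                             # the 5th row after each match
j <- j[j <= nrow(A)]                   # drop targets that run past the end of A

# Rows of B plus their 5th-next rows, in the order they occur in A.
FiveDaysLater <- A[sort(unique(c(i, j))), , drop = FALSE]
rownames(FiveDaysLater)
```

To keep each row of B next to its partner instead, A[c(rbind(i, i + 5)), ] interleaves the two index vectors, provided every i + 5 is within nrow(A).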

Any guidance much appreciated,
Thankyou.

Alfonso Sammassimo
Melbourne, Australia.



[R] Barplots: Editing the frequency x-axis names

2007-06-08 Thread Tom.O

Hi
I have a timeSeries object (X) with monthly returns. I want to display the
returns with a barplot, which I can do easily. But my problem is labelling
the x-axis: if I use the positions from the time series it gets very messy. I
have tried rotating and changing the font size but it doesn't do the trick.
I think the optimal solution for my purpose is to display only every second
or third date, or perhaps only every 12th month. But how do I do that?

Thanks Tom
-- 
View this message in context: 
http://www.nabble.com/Barplots%3A-Editing-the-frequency-x-axis-names-tf3888029.html#a11021315
Sent from the R help mailing list archive at Nabble.com.



[R] How to partition sample space

2007-06-08 Thread spime

Hi R-users,

I need your help in the following problem. Suppose we have a regression
problem containing 25 predictor variables of 1000 individuals. I want to
divide the data matrix ( 1000 x 25 ) into two partitions for training (70%)
and testing (30%). To do this, I sample 70% of the data into a
training matrix and the remaining 30% into a testing matrix using pseudorandom
numbers (for future analysis).

I need an efficient solution so that we can generate both matrices in
minimal time. 

Thanks in advance.

Sabyasachi


Re: [R] How to partition sample space

2007-06-08 Thread Matthias Kirchner

Hi, 

you could use the sample function:

sample <- sample(1:1000)
m.training <- m[sample[1:700], ]
m.test <- m[sample[701:1000], ]
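
A self-contained variant of the same idea, sketched with a simulated matrix in place of the real data. The seed value and the name `idx` are assumptions made here; setting a seed makes the split reproducible, and using `idx` avoids reusing the name of the sample() function.

```r
set.seed(1)                                   # assumed seed, for reproducibility
m <- matrix(rnorm(1000 * 25), nrow = 1000)    # stand-in for the 1000 x 25 data

# Draw 70% of the row indices without replacement.
idx <- sample(nrow(m), size = round(0.7 * nrow(m)))

m.training <- m[idx, , drop = FALSE]          # the sampled 70%
m.test     <- m[-idx, , drop = FALSE]         # the remaining 30%

dim(m.training)
dim(m.test)
```

Both subsets come from a single call to sample() and two indexing operations, so this should be fast for a matrix of this size.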

Matthias





[R] Sorting dataframe by different columns

2007-06-08 Thread Gunther Höning
Dear list,

I have a very short question.
Suppose a data frame of four columns:

df <- data.frame(w,x,y,z)

I want this ordered the following way:
first by :x, decreasing = FALSE
and 
secondly by: z, decreasing =TRUE

How can this be done ?

Thanks

Gunther



Re: [R] Sorting dataframe by different columns

2007-06-08 Thread Dimitris Rizopoulos
Probably the function sort.data.frame() posted on R-help some time ago 
can be useful; check:

RSiteSearch("sort.data.frame")
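
For reference, base R's order() can also do this directly. A minimal sketch on made-up data, assuming z is numeric so that negating it reverses its sort direction:

```r
# Toy data; the column values are made up for illustration.
df <- data.frame(w = 1:6,
                 x = c(2, 1, 2, 1, 2, 1),
                 y = letters[1:6],
                 z = c(10, 30, 20, 5, 40, 15))

# Ascending by x, then descending by z within ties of x.
# Negating works because z is numeric; for other types use -xtfrm(z).
sorted <- df[order(df$x, -df$z), ]
sorted
```

The rows come out in the order w = 2, 6, 4, 5, 3, 1: first the x = 1 group with z falling from 30 to 5, then the x = 2 group with z falling from 40 to 10.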


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm




Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



Re: [R] Barplots: Editing the frequency x-axis names

2007-06-08 Thread Knut Krueger
Tom.O schrieb:
 Hi
 I have a timeSeries object (X) with monthly returns. I want to display the
 returns with a barplot, which I can do easily. But my problem is labelling
 the x-axis: if I use the positions from the time series it gets very messy. I
 have tried rotating and changing the font size but it doesn't do the trick.
 I think the optimal solution for my purpose is to display only every second
 or third date, or perhaps only every 12th month. But how do I do that?

 Thanks Tom
   
I think you could use the chron package, e.g.:

library(chron)
x <- c(dates("02/27/92"), dates("02/27/95"))
y <- c(10, 50)
plot(x, y)

Regards Knut



Re: [R] Barplots: Editing the frequency x-axis names

2007-06-08 Thread Tom.O

Hi, thanks for the response, but can't you be more specific with your example? I
can't see that this will do the trick. What I'm looking for is a function that
remembers each position but only displays every n'th date.

For example:

Position    Returns  Display date
2003-01-31  1        No
2003-02-28  2        No
2003-03-31  3        Yes
2003-04-30  4        No
2003-05-31  5        No
2003-06-30  6        Yes
2003-07-31  7        No
2003-08-31  8        No
2003-09-30  9        Yes
 and so on until the present 

I want to display all the returns in a barplot, but only want
to display every quarterly date in the plot.
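
One way to get this with base graphics, sketched on made-up data: barplot() returns the x positions of the bar midpoints, so axis() can then label any subset of them. The dates and returns below are stand-ins for the timeSeries object X, which was not posted.

```r
# Made-up month-end dates and returns standing in for the timeSeries object X.
# First-of-month dates minus one day give the month ends.
dates   <- seq(as.Date("2003-02-01"), by = "month", length.out = 24) - 1
returns <- sin(seq_along(dates))

# barplot() returns the x positions of the bar midpoints.
mids <- barplot(returns, ylab = "Return")

# Label only every third bar (quarter ends); las = 2 turns the labels vertical.
show <- seq(3, length(dates), by = 3)
axis(1, at = mids[show], labels = format(dates[show], "%Y-%m-%d"),
     las = 2, cex.axis = 0.7)
```

Changing the `by` argument of `show` to 12 would label one bar per year instead.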

Tom



Re: [R] Barplots: Editing the frequency x-axis names

2007-06-08 Thread Knut Krueger
Sorry, I forgot the quotes around the dates:
x <- c(dates("01/31/03"), dates("06/30/07"))

But I think your problem is the plot area.
You must first define the plot area with type = "n" for no plotting; 
afterwards you can fill in the data.
I did this with times(), but I am afraid the displayed dates/times will 
depend on your plot area and the settings made with par().

Did you read the documentation for plot and par already?

Regards Knut

Have a look at ?plot and ?par



[R] choose.dir

2007-06-08 Thread Antje
Hi all,

I have written an R script under Windows using choose.dir. Now I have 
seen that this function is missing on Mac OS. Does anybody know an 
alternative?

Antje



Re: [R] Barplots: Editing the frequency x-axis names - double post

2007-06-08 Thread Knut Krueger
Sorry for the double posting: it was the wrong e-mail address, and I thought 
this one would run into the spam filter.



Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Christophe Pallier
Hi,

Can you provide examples of data formats that are problematic to read and
clean with R ?

The only problematic cases I have encountered were cases with multiline
and/or  varying length records (optional information). Then, it is sometimes
a good idea to preprocess the data to present in a tabular format (one
record per line).

For this purpose, I use awk (e.g. http://www.vectorsite.net/tsawk.html),
which is very adept at processing ascii data files  (awk is much simpler to
learn than perl, spss, sas, ...).

I have never encountered a data file in ascii format that I could not
reformat with Awk.  With binary formats, it is another story...

But, again, this is my limited experience; I would like to know if there are
situations where using SAS/SPSS is really a better approach.

Christophe Pallier


On 6/8/07, Robert Wilkins [EMAIL PROTECTED] wrote:

 As noted on the R-project web site itself ( www.r-project.org -
 Manuals - R Data Import/Export ), it can be cumbersome to prepare
 messy and dirty data for analysis with the R tool itself. I've also
 seen at least one S programming book (one of the yellow Springer ones)
 that says, more briefly, the same thing.
 The R Data Import/Export page recommends examples using SAS, Perl,
 Python, and Java. It takes a bit of courage to say that ( when you go
 to a corporate software web site, you'll never see a page saying "This
 is the type of problem that our product is not the best at, here's
 what we suggest instead" ). I'd like to provide a few more
 suggestions, especially for volunteers who are willing to evaluate new
 candidates.

 SAS is fine if you're not paying for the license out of your own
 pocket. But maybe one reason you're using R is you don't have
 thousands of spare dollars.
 Using Java for data cleaning is an exercise in sado-masochism, Java
 has a learning curve (almost) as difficult as C++.

 There are different types of data transformation, and for some data
 preparation problems an all-purpose programming language is a good
 choice ( i.e. Perl , or maybe Python/Ruby ). Perl, for example, has
 excellent regular expression facilities.

 However, for some types of complex demanding data preparation
 problems, an all-purpose programming language is a poor choice. For
 example: cleaning up and preparing clinical lab data and adverse event
 data - you could do it in Perl, but it would take way, way too much
 time. A specialized programming language is needed. And since data
 transformation is quite different from data query, SQL is not the
 ideal solution either.

 There are only three statistical programming languages that are
 well-known, all dating from the 1970s: SPSS, SAS, and S. SAS is more
 popular than S for data cleaning.

 If you're an R user with difficult data preparation problems, frankly
 you are out of luck, because the products I'm about to mention are
 new, unknown, and therefore regarded as immature. And while the
 founders of these products would be very happy if you kicked the
 tires, most people don't like to look at brand new products. Most
 innovators and inventors don't realize this; I've learned it the hard
 way.

 But if you are a volunteer who likes to help out by evaluating,
 comparing, and reporting upon new candidates, well you could certainly
 help out R users and the developers of the products by kicking the
 tires of these products. And there is a huge need for such volunteers.

 1. DAP
 This is an open source implementation of SAS.
 The founder: Susan Bassein
 Find it at: directory.fsf.org/math/stats (GNU GPL)

 2. PSPP
 This is an open source implementation of SPSS.
 The relatively early version number might not give a good idea of how
 mature the
 data transformation features are, it reflects the fact that he has
 only started doing the statistical tests.
 The founder: Ben Pfaff, either a grad student or professor at Stanford CS
 dept.
 Also at : directory.fsf.org/math/stats (GNU GPL)

 3. Vilno
 This uses a programming language similar to SPSS and SAS, but quite unlike
 S.
 Essentially, it's a substitute for the SAS datastep, and also
 transposes data and calculates averages and such. (No t-tests or
 regressions in this version). I created this, during the years
 2001-2006 mainly. It's version 0.85, and has a fairly low bug rate, in
 my opinion. The tarball includes about 100 or so test cases used for
 debugging - for logical calculation errors, but not for extremely high
 volumes of data.
 The maintenance of Vilno has slowed down, because I am currently
 (desperately) looking for employment. But once I've found new
 employment and living quarters and settled in, I will continue to
 enhance Vilno in my spare time.
 The founder: that would be me, Robert Wilkins
 Find it at: code.google.com/p/vilno ( GNU GPL )
 ( In particular, the tarball at code.google.com/p/vilno/downloads/list
 , since I have yet to figure out how to use Subversion ).


 4. Who knows?
 It was not easy to find out about the existence of DAP and 

Re: [R] Barplots: Editing the frequency x-axis names

2007-06-08 Thread hadley wickham
On 6/8/07, Tom.O [EMAIL PROTECTED] wrote:

 Hi
 I have a timeSeries object (X) with monthly returns. I want to display the
 returns with a barplot, which I can do easily. But my problem is labelling
 the x-axis: if I use the positions from the time series it gets very messy. I
 have tried rotating and changing the font size but it doesn't do the trick.
 I think the optimal solution for my purpose is to display only every second
 or third date, or perhaps only every 12th month. But how do I do that?

It's quite easy to do that with ggplot2, see below, or
http://had.co.nz/ggplot2/scale_date.html for examples.

# ggplot2 API as of 2007; current releases use date_breaks= and date_labels=
# in scale_x_date() instead of major= and format=.
library(ggplot2)

df <- data.frame(
 date = seq(Sys.Date(), len=100, by="1 day")[sample(100, 50)],
 price = runif(50)
)

qplot(date, price, data=df, geom="line")
qplot(date, price, data=df, geom="bar", stat="identity")
qplot(date, price, data=df, geom="bar", stat="identity") +
  scale_x_date(major="2 months")
qplot(date, price, data=df, geom="bar", stat="identity") +
  scale_x_date(major="10 day", format="%d-%m")
qplot(date, price, data=df, geom="bar", stat="identity") +
  scale_x_date(major="5 day", format="%d-%m")

Hadley



Re: [R] Conditional Sequential Gaussian Simulation

2007-06-08 Thread ONKELINX, Thierry
Steve,

You can do this with the package gstat. Look at ?krige or
?predict.gstat.

Post further questions on this topic to the R-sig-geo list. You'll get
more responses.

Cheers,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

 -----Original Message-----
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Friedman, Steven
 Sent: Thursday, 7 June 2007 14:46
 To: r-help@stat.math.ethz.ch
 Subject: [R] Conditional Sequential Gaussian Simulation
 
 Hello, 
 
 I'm wondering if there are any packages/functions that can 
 perform conditional sequential Gaussian simulation.  
 
 I'm following an article written by Grunwald, Reddy, Prenger 
 and Fisher 2007. Modeling of the spatial variability of 
 biogeochemical soil properties in a freshwater ecosystem. 
 Ecological Modelling 201: 521 - 535, and would like to 
 explore this methodology.
 
 Thanks
 
 Steve
 
 Steve Friedman, PhD
 Everglades Division
 Senior Environmental Scientist, Landscape Ecology
 South Florida Water Management District
 3301 Gun Club Road
 West Palm Beach, Florida 33406
 email:  [EMAIL PROTECTED]
 Office:  561 - 682 - 6312
 Fax:  561 - 682 - 5980
 
 If you are not doing what you truly enjoy it's your obligation 
 to yourself to change.
 


Re: [R] update packages with R on Vista: error

2007-06-08 Thread Stefan Grosse
Thanks for pointing this out. But you know, the user directory is writable. R
installs packages in /Documents/R/win-library, which works fine, so I
find it absolutely natural that update should work as well. Especially
since, when I install a package, it gets the latest version and library()
loads this latest version, but update still wants to replace this
latest package with the one I have already installed, and fails ...

In my opinion the update on Windows is simply buggy.

I think one should definitely not turn UAC off (it's a good security
feature). By the way, MiKTeX 2.6 is able to deal with UAC: I can update my
LaTeX packages without any problems even though they are in the Program
Files directory (and on-the-fly installation also works) ...

Stefan

 Original Message  
Subject: Re:[R] update packages with R on Vista: error
From: R. Villegas [EMAIL PROTECTED]
To: Stefan Grosse [EMAIL PROTECTED]
Date: 07.06.2007 23:13
 If R is installed within Program Files, one of Vista's security
 settings may interfere with the -update- process.

 The setting may be disabled globally by choosing:
 Windows (Start) menu, Control Panels, User Accounts and Family
 Safety (green title), User Accounts (green title), and
 Turn User Account Control on or off (very bottom).  You will be
 prompted for permission to continue; click continue.  On the
 screen you will see a checkbox titled Use User Account Control
 (UAC) to help protect your computer.  Uncheck this and click
 the OK button to save the changes.  Windows Vista will now allow
 programs, including R, to update files in Program Files.

 Rod.



[R] overplots - fixing scientific vs normal notation in output

2007-06-08 Thread Peter Lercher
Moving from S-plus to R I encountered many great features and a much
more stable system.
Currently, I am left with 2 problems that are handled differently:

1) I did lots of overplots in S-Plus using
par(new=T,xaxs='d',yaxs='d') to fix the axes
-What is the workaround in R ?

2) In S-Plus I could fix scientific notation or normal notation in output
-How can I handle this in R ?
I found no fix in the documentation

I am using R version 2.4.1 (2006-12-18) on Windows XP SR2
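
A minimal sketch of how both points are usually handled in R, on made-up values. As far as ?par documents, R implements only the "r" and "i" axis styles (not S-Plus's "d"), so the usual substitute is to pass the same xlim and ylim to every plot; notation is controlled per call with format() or per session with options(scipen = ).

```r
x <- c(0.00001234, 123456789)

# Per-call control: force one style regardless of the global options.
fixed <- format(x, scientific = FALSE)
sci   <- format(x, scientific = TRUE)

# Session-wide control: a large positive scipen biases printing toward
# fixed notation, a large negative one toward scientific notation.
old <- options(scipen = 100)
print(x)
options(old)   # restore the previous setting

# Overplotting with identical axes: give both plots the same xlim/ylim.
# xaxs = "i" / yaxs = "i" stop R from padding the ranges by 4%.
plot(1:10, (1:10)^2, xlim = c(0, 10), ylim = c(0, 100),
     xaxs = "i", yaxs = "i", xlab = "x", ylab = "y")
par(new = TRUE)
plot(1:10, 10 * (1:10), xlim = c(0, 10), ylim = c(0, 100),
     xaxs = "i", yaxs = "i", axes = FALSE, xlab = "", ylab = "", col = "red")
```

For simple overlays, points() or lines() on the first plot avoids par(new = TRUE) altogether.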


Peter Lercher, M.D., M.P.H., Assoc Prof



Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Ted Harding
On 08-Jun-07 08:27:21, Christophe Pallier wrote:
 Hi,
 
 Can you provide examples of data formats that are problematic
 to read and clean with R ?
 
 The only problematic cases I have encountered were cases with
 multiline and/or  varying length records (optional information).
 Then, it is sometimes a good idea to preprocess the data to
 present in a tabular format (one record per line).
 
 For this purpose, I use awk (e.g.
 http://www.vectorsite.net/tsawk.html),
 which is very adept at processing ascii data files  (awk is
 much simpler to learn than perl, spss, sas, ...).

I want to join in with an enthusiastic "Me too!!". For anything
which has to do with basic checking for the kind of messes that
people can get data into when they put it on the computer,
I think awk is ideal. It is very flexible (far more so than
many, even long-time, awk users suspect), very transparent
in its programming language (as opposed to say perl), fast,
and with light impact on system resources (rare delight in
these days, when upgrading your software may require upgrading
your hardware).

Although it may seem on the surface that awk is two-dimensional
in its view of data (line by line, and per field in a line),
it has some flexible internal data structures and recursive
function capability, which allows a lot more to be done with
the data that have been read in.

For example, I've used awk to trace ancestry through a genealogy,
given a data file where each line includes the identifier of an
individual and the identifiers of its male and female parents
(where known). And that was for pedigree dogs, where what happens
in real life makes Oedipus look trivial.

 I have never encountered a data file in ascii format that I
 could not reformat with Awk.  With binary formats, it is
 another story...

But then it is a good idea to process the binary file using an
instance of the creating software, to produce an ASCII file (say
in CSV format).

 But, again, this is my limited experience; I would like to
 know if there are situations where using SAS/SPSS is really
 a better approach.

The main thing often useful for data cleaning that awk does
not have is any associated graphics. It is -- by design -- a
line-by-line text-file processor. While, for instance, you
could use awk to accumulate numerical histogram counts, you
would have to use something else to display the histogram.
And for scatter-plots there's probably not much point in
bringing awk into the picture at all (unless a preliminary
filtration of mess is needed anyway).

That being said, though, there can still be a use to extract
data fields from a file for submission to other software.

Another kind of area where awk would not have much to offer
is where, as a part of your preliminary data inspection,
you want to inspect the results of some standard statistical
analyses.

As a final comment, utilities like awk can be used far more
fruitfully on operating systems (the unixoid family) which
incorporate at ground level the infrastructure for plumbing
together streams of data output from different programs.

Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 08-Jun-07   Time: 10:43:05
-- XFMail --



[R] Re : Sorting dataframe by different columns

2007-06-08 Thread justin bem
see sort_df() in the reshape package
 
Justin BEM
Student Statistician-Economist Engineer
P.O. Box 294, Yaoundé.
Tel. (00237)9597295.



[R] icc from GLMM?

2007-06-08 Thread Shinichi Nakagawa
Dear R users

I would like to ask a question regarding icc (intraclass correlation), which many
biologists refer to as repeatability. It is very useful to get icc for many
reasons and it is easy to do so from linear mixed-effects models, and many
packages like psy, psychometric, aod and irr have functions to calculate icc. 

icc = between-group variance/(between-group variance + residual variance) 

*residual variance = within-group variance 

However, I have yet to find a convincing reference of some sort on how to
calculate icc from GLMM. I have found the following:

icc = between-group variance/(between-group variance + 1)
*between-group variance = scaled between-group variance,
or variance obtained from the random intercept of the GLMM
icc = between-group variance/(between-group variance + pi^2/3)
icc = between-group variance/(between-group variance + pi^2/3*(dispersion
parameter))
for binomial GLMM:
icc = between-group variance/(between-group variance + 1/(p(1-p)))


I am a little confused which one to trust and use. Or there are no easy formulas
to do this? I am guessing formula would change depending on what distribution
you use and what link function as well? I want to calculate icc from GLMM with
Poisson with log link function and also binomial with logit function. Could
anybody help me please?
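
To make the arithmetic in the formulas above concrete, here is a minimal sketch. The variance values are invented, and icc() is a hypothetical helper, not a function from any of the packages mentioned. The only thing that changes between the variants is the term standing in for the residual variance; as far as I know the "+ 1" form is usually associated with a probit link and the "+ pi^2/3" form with a logit link, but please treat that pairing as an assumption to check against a reference.

```r
# Hypothetical helper: ratio of between-group variance to total variance.
icc <- function(var_between, var_residual) {
  var_between / (var_between + var_residual)
}

# Linear mixed model: residual variance is estimated from the data.
icc_lmm <- icc(var_between = 2, var_residual = 6)

# GLMM on the latent scale: the residual term is fixed by the link.
icc_plus_one <- icc(var_between = 1.5, var_residual = 1)         # "+ 1" form
icc_logit    <- icc(var_between = 1.5, var_residual = pi^2 / 3)  # "+ pi^2/3" form

c(lmm = icc_lmm, plus_one = icc_plus_one, logit = icc_logit)
```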

Many thanks for your help

Shinichi

-- 
Shinichi Nakagawa
Dept of Animal  Plant Sciences
University of Sheffield
Tel: 0114-222-0113
Fax: 0114-222-0002



[R] Re : How to partition sample space

2007-06-08 Thread justin bem
also try

active.sample <- sample(1:1000, size=700)
active.df <- thedf[active.sample, ]
test.df <- thedf[-active.sample, ]
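
A self-contained version of the same idea, as a sketch only: the matrix m is simulated here, the seed is arbitrary, and the 70/30 proportion is taken from the original question. Sampling row indices once and using negative indexing for the complement guarantees that the two partitions do not overlap.

```r
set.seed(42)  # arbitrary seed, only to make the split reproducible

# Simulated stand-in for the 1000 x 25 data matrix in the question.
m <- matrix(rnorm(1000 * 25), nrow = 1000, ncol = 25)

# Draw 70% of the row indices without replacement.
train_idx <- sample(nrow(m), size = round(0.7 * nrow(m)))

m.training <- m[train_idx, , drop = FALSE]   # sampled rows
m.test     <- m[-train_idx, , drop = FALSE]  # all remaining rows

dim(m.training)
dim(m.test)
```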

 
Justin BEM
Student Statistician-Economist Engineer
P.O. Box 294, Yaoundé.
Tel. (00237)9597295.

----- Original Message -----
From: Matthias Kirchner [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Friday, 8 June 2007, 8:06:41
Subject: Re: [R] How to partition sample space


Hi, 

you could use the sample function:

sample <- sample(1:1000)
m.training <- m[sample[1:700],]
m.test <- m[sample[701:1000],]

Matthias



spime wrote:
 
 Hi R-users,
 
 I need your help in the following problem. Suppose we have a regression
 problem containing 25 predictor variables of 1000 individuals. I want to
 divide the data matrix ( 1000 x 25 ) into two partitions for training
 (70%) and testing(30%). For this reason, i sample 70% of data into another
 training matrix and remaining 30% into testing matrix using pseudorandom
 numbers (for future analysis).
 
 I need some efficient solution so that we can generate both matrix with
 minimal time. 
 
 Thanks in advance.
 
 Sabyasachi
 

-- 
View this message in context: 
http://www.nabble.com/How-to-partition-sample-space-tf3888059.html#a11021527
Sent from the R help mailing list archive at Nabble.com.



[R] help.search and Baysian regression

2007-06-08 Thread Christian Hennig
Hi there,

two questions.
1) Is there any possibility to look up the help pages within R for more 
complex combinations of character strings, for example "Bayesian" AND 
"regression" but not necessarily "Bayesian regression"?

2) Is there a package/command that does fully Bayesian linear regression
(if possible with variable selection)?

Thanks,
Christian

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche



[R] Dependency 'Design' is not available

2007-06-08 Thread Ruixin ZHU
Dear R-users,
 
When I installed rattle package with the command:
install.packages("rattle", dependencies=TRUE), I got 
Warning message:
Dependency 'Design' is not available
 
Is this warning serious? How to avoid this warning?
 
Thanks
_
Dr.Ruixin ZHU
Shanghai Center for Bioinformation Technology
[EMAIL PROTECTED]
[EMAIL PROTECTED]
86-21-13040647832
 



Re: [R] Sorting dataframe by different columns

2007-06-08 Thread Knut Krueger
maybe this page could give you some hints:
http://www.ats.ucla.edu/STAT/r/faq/sort.htm
Regards Knut



Re: [R] update packages with R on Vista: error

2007-06-08 Thread Stefan Grosse
It was pointed out to me that my message might be considered impolite. It was
not intended that way. I was just trying to say that there should be
some improvement, since the solutions offered were either not optimal for
me (disabling security features) or were not working (FAQ).

I apologize for any possible inconvenience caused by my frustration,
possibly spiced up with my inabilities.

Stefan

 Original Message  
Subject: Re:[R] update packages with R on Vista: error
From: Stefan Grosse [EMAIL PROTECTED]
To: R. Villegas [EMAIL PROTECTED]
Date: 08.06.2007 11:07
 Thanks for pointing at this. But you know, the user is writable. R is
 installing Packages in /Documents/R/win-library which works fine so I
 find it absolutely naturally that update should work as well. Especially
 since when I install the packages it gets the latest version, library
 loads this latest version but update still does want to update this
 latest package with the package I already installed and fails ...

 In my opinion the update on windows is simply buggy.

 I think one should definitely not turn UAC off. ( its a good security
 feature). Btw. MikTeX 2.6 is able to deal with UAC - I can update my
 latex packages without any problems even though they are in the Program
 File directory (and also on-the-fly installation does work) ...

 Stefan




[R] world map matrix

2007-06-08 Thread Antonio Rodríguez
Hi,

Is it possible to make a world map matrix where land values are set to 0 and
sea values to 1?

Cheers,

Antonio



Re: [R] Dependency 'Design' is not available

2007-06-08 Thread Uwe Ligges


Ruixin ZHU wrote:
 Dear R-users,
  
 When I installed rattle package with the command:
 install.packages("rattle", dependencies=TRUE), I got 
 Warning message:
 Dependency 'Design' is not available

Version of R? OS? Please do read the posting guide!

If R-2.5.0 under Windows: Design did not pass the checks under Windows 
and is not available for download. In this case please contact the 
package maintainer and convince him to fix the package.

Uwe Ligges


  
 Is this warning serious? How to avoid this warning?
  
 Thanks
 _
 Dr.Ruixin ZHU
 Shanghai Center for Bioinformation Technology
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 86-21-13040647832
  
 


Re: [R] match rows of data frame

2007-06-08 Thread jim holtman
try:

FiveDaysLater <- A[match(rownames(B),rownames(A))+5,]
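
A small worked sketch of this, under stated assumptions: A and B below are invented, with row names day1..day20 standing in for whatever row names the real data carry. It also shows one way to keep the rows of B together with their 5th-next rows, and guards against indices that would run past the end of A.

```r
# Invented data: 20 rows with row names day1 .. day20.
A <- data.frame(value = seq(10, 200, by = 10),
                row.names = paste0("day", 1:20))
B <- A[c(2, 7, 12), , drop = FALSE]        # a subset of A

idx   <- match(rownames(B), rownames(A))   # positions of B's rows within A
later <- idx + 5                           # the 5th row after each match
later <- later[later <= nrow(A)]           # drop positions beyond the last row

FiveDaysLater <- A[later, , drop = FALSE]  # only the 5th-next rows

# Rows of B plus their 5th-next rows, in original order, without duplicates.
Both <- A[sort(unique(c(idx, later))), , drop = FALSE]
```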

On 6/7/07, Alfonso Sammassimo [EMAIL PROTECTED] wrote:

 Hi R-experts,

 I have a data frame (A) , and a subset (B) of this data frame. I am trying
 to create a new data frame which gives me all the rows of B, plus the 5th
 next row(occuring in A).  I have used the below code, but it gives me all
 5
 rows after the matching row. I only want the 5th.

 FiveDaysLater <- A[c(sapply(match(rownames(B),rownames(A)), seq,
 length=6)),]

 Any guidance much appreciated,
 Thankyou.

 Alfonso Sammassimo
 Melbourne, Australia.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] world map matrix

2007-06-08 Thread Duncan Murdoch
On 6/8/2007 6:52 AM, Antonio Rodríguez wrote:
 Hi,
 
 Is it possible to make a world map matrix where land values are set to 0 and
 sea values to 1?

It's not hard to produce a bitmap of a world map with the maps package, 
and then some image manipulation functions could convert it to 0's and 
1's.  I don't know if there's a more direct way.

One minor problem you may encounter is that the default world map 
display isn't really rectangular:  e.g. bits of Siberia that cross 180 
degrees east are still displayed attached to Siberia rather than 
wrapping around and being displayed on the other side of the map.  The 
display also doesn't go all the way to the south pole.  I produced a 
couple of rectangular bitmaps covering 90 south to 90 north and 180 west 
to 180 east; they're included in the rgl package (and used to display 
globes in the persp3d example).

Duncan Murdoch



Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Douglas Bates
On 6/7/07, Robert Wilkins [EMAIL PROTECTED] wrote:
 As noted on the R-project web site itself ( www.r-project.org -
 Manuals - R Data Import/Export ), it can be cumbersome to prepare
 messy and dirty data for analysis with the R tool itself. I've also
 seen at least one S programming book (one of the yellow Springer ones)
 that says, more briefly, the same thing.
 The R Data Import/Export page recommends examples using SAS, Perl,
 Python, and Java. It takes a bit of courage to say that ( when you go
 to a corporate software web site, you'll never see a page saying This
 is the type of problem that our product is not the best at, here's
 what we suggest instead ). I'd like to provide a few more
 suggestions, especially for volunteers who are willing to evaluate new
 candidates.

 SAS is fine if you're not paying for the license out of your own
 pocket. But maybe one reason you're using R is you don't have
 thousands of spare dollars.
 Using Java for data cleaning is an exercise in sado-masochism, Java
 has a learning curve (almost) as difficult as C++.

 There are different types of data transformation, and for some data
 preparation problems an all-purpose programming language is a good
 choice ( i.e. Perl , or maybe Python/Ruby ). Perl, for example, has
 excellent regular expression facilities.

 However, for some types of complex demanding data preparation
 problems, an all-purpose programming language is a poor choice. For
 example: cleaning up and preparing clinical lab data and adverse event
 data - you could do it in Perl, but it would take way, way too much
 time. A specialized programming language is needed. And since data
 transformation is quite different from data query, SQL is not the
 ideal solution either.

 There are only three statistical programming languages that are
 well-known, all dating from the 1970s: SPSS, SAS, and S. SAS is more
 popular than S for data cleaning.

 If you're an R user with difficult data preparation problems, frankly
 you are out of luck, because the products I'm about to mention are
 new, unknown, and therefore regarded as immature. And while the
 founders of these products would be very happy if you kicked the
 tires, most people don't like to look at brand new products. Most
 innovators and inventors don't realize this; I've learned it the hard
 way.

 But if you are a volunteer who likes to help out by evaluating,
 comparing, and reporting upon new candidates, well you could certainly
 help out R users and the developers of the products by kicking the
 tires of these products. And there is a huge need for such volunteers.

 1. DAP
 This is an open source implementation of SAS.
 The founder: Susan Bassein
 Find it at: directory.fsf.org/math/stats (GNU GPL)

 2. PSPP
 This is an open source implementation of SPSS.
 The relatively early version number might not give a good idea of how
 mature the
 data transformation features are, it reflects the fact that he has
 only started doing the statistical tests.
 The founder: Ben Pfaff, either a grad student or professor at Stanford CS 
 dept.
 Also at : directory.fsf.org/math/stats (GNU GPL)

 3. Vilno
 This uses a programming language similar to SPSS and SAS, but quite unlike S.
 Essentially, it's a substitute for the SAS datastep, and also
 transposes data and calculates averages and such. (No t-tests or
 regressions in this version). I created this, during the years
 2001-2006 mainly. It's version 0.85, and has a fairly low bug rate, in
 my opinion. The tarball includes about 100 or so test cases used for
 debugging - for logical calculation errors, but not for extremely high
 volumes of data.
 The maintenance of Vilno has slowed down, because I am currently
 (desperately) looking for employment. But once I've found new
 employment and living quarters and settled in, I will continue to
 enhance Vilno in my spare time.
 The founder: that would be me, Robert Wilkins
 Find it at: code.google.com/p/vilno ( GNU GPL )
 ( In particular, the tarball at code.google.com/p/vilno/downloads/list
 , since I have yet to figure out how to use Subversion ).

 4. Who knows?
 It was not easy to find out about the existence of DAP and PSPP. So
 who knows what else is out there. However, I think you'll find a lot
 more statistics software ( regression , etc ) out there, and not so
 much data transformation software. Not many people work on data
 preparation software. In fact, the category is so obscure that there
 isn't one agreed term: data cleaning , data munging , data crunching ,
 or just getting the data ready for analysis.

Thanks for bringing up this topic.  I think there is definitely a
place for such languages, which I would regard as data-filtering
languages, but I also think that trying to reproduce the facilities in
SAS or SPSS for data analysis is redundant.

Other responses in this thread have mentioned 'little language'
filters like awk, which is fine for those who were raised in the Bell
Labs tradition of programming 

[R] data mining/text mining?

2007-06-08 Thread Ruixin ZHU
Dear R-user,
 
Could anybody tell me of the key difference between data mining and text
mining?
Please make a list for packages about data/text mining.
And give me an example of text mining with R (any relating materials
will be highly appreciated), because a vignette written by Ingo Feinerer
seems too concise for me.
 
Thanks 
_
Dr.Ruixin ZHU
Shanghai Center for Bioinformation Technology
[EMAIL PROTECTED]
[EMAIL PROTECTED]
86-21-13040647832
 



Re: [R] logical 'or' on list of vectors

2007-06-08 Thread Sundar Dorai-Raj


Tim Bergsma said the following on 6/8/2007 5:57 AM:
 Suppose I have a list of logicals, such as returned by lapply:
 
 Theoph$Dose[1] <- NA
 Theoph$Time[2] <- NA
 Theoph$conc[3] <- NA
 lapply(Theoph,is.na)
 
 Is there a direct way to execute logical or across all vectors?  The 
 following gives the desired result, but seems unnecessarily complex.
 
 as.logical(apply(do.call(rbind,lapply(Theoph,is.na)),2,sum))
 
 Regards,
 
 Tim
 

How about:

apply(sapply(Theoph, is.na), 1, any)
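
Since the question asks about a list of logical vectors in general, another option worth sketching is Reduce() with the element-wise `|` operator, which works on any list of equal-length logical vectors without first building a matrix. The copy `th` and the names `na_list` and `any_na` are just illustrative.

```r
# Theoph ships with R (package 'datasets'); work on a copy.
th <- Theoph
th$Dose[1] <- NA
th$Time[2] <- NA
th$conc[3] <- NA

na_list <- lapply(th, is.na)      # list of logical vectors, one per column
any_na  <- Reduce(`|`, na_list)   # element-wise OR folded across the list

which(any_na)                     # rows with at least one NA
```

For a data frame this should agree with !complete.cases(th).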

HTH,

--sundar



Re: [R] rlm results on trellis plot

2007-06-08 Thread Chuck Cleland
Alan S Barnett wrote:
 How do I add to a trellis plot the best fit line from a robust fit? I
 can use panel.lm to add a least squares fit, but there is no panel.rlm
 function.

  How about using panel.abline() instead of panel.lmline()?

library(MASS)     # rlm()
library(lattice)  # xyplot(), panel.abline()

fit1 <- coef(lm(stack.loss ~ Air.Flow, data = stackloss))
fit2 <- coef(rlm(stack.loss ~ Air.Flow, data = stackloss))

xyplot(stack.loss ~ Air.Flow, data=stackloss,
   panel = function(x, y, ...){
     panel.xyplot(x, y, ...)
     panel.abline(fit1, type="l", col="blue")   # least-squares line
     panel.abline(fit2, type="l", col="red")    # robust (rlm) line
   }, aspect=1)
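
If the plot has several panels, the fit has to be redone inside each panel rather than computed once up front. A sketch of that, with the caveat that panel.rlm below is a hypothetical helper written for this message and is not a lattice function:

```r
library(MASS)     # rlm()
library(lattice)  # xyplot(), panel.xyplot(), panel.abline()

# Hypothetical helper: fit a robust line to the data of the current panel
# and draw it. Extra arguments (col, lty, ...) are passed to panel.abline().
panel.rlm <- function(x, y, ...) {
  cf <- unname(coef(rlm(y ~ x)))        # intercept and slope
  panel.abline(a = cf[1], b = cf[2], ...)
}

p <- xyplot(stack.loss ~ Air.Flow, data = stackloss,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.lmline(x, y, col = "blue")  # least-squares line
              panel.rlm(x, y, col = "red")      # robust line
            }, aspect = 1)
print(p)
```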

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



Re: [R] logical 'or' on list of vectors

2007-06-08 Thread Dimitris Rizopoulos
try the following:

as.logical(rowSums(is.na(Theoph)))
## or
!complete.cases(Theoph)


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Tim Bergsma [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Friday, June 08, 2007 2:57 PM
Subject: [R] logical 'or' on list of vectors


 Suppose I have a list of logicals, such as returned by lapply:

 Theoph$Dose[1] <- NA
 Theoph$Time[2] <- NA
 Theoph$conc[3] <- NA
 lapply(Theoph,is.na)

 Is there a direct way to execute logical or across all vectors? 
 The
 following gives the desired result, but seems unnecessarily complex.

 as.logical(apply(do.call(rbind,lapply(Theoph,is.na)),2,sum))

 Regards,

 Tim

 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



Re: [R] Formating the data

2007-06-08 Thread Chuck Cleland
A Ezhil wrote:
 Hi All,
 
 I have a vector of length 48, something like:
 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 
 I would like to print (reformat) this vector as:
 00110011
 
 by simply removing the spaces between them. I have
 been trying with many option but not able to do this
 task.
 I would greatly appreciate your suggestion on fixing
 this simple task.

X <- rbinom(n=48, size=1, prob=.3)

paste(X, collapse="")
[1] 1001001000100011010010010111

print(paste(X, collapse=""), quote=FALSE)
[1] 1001001000100011010010010111
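
If the leading [1] is unwanted as well, cat() writes the string with neither the index nor quotes. A short sketch with a fixed, made-up vector so that the result is predictable:

```r
x <- c(0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1)  # made-up example vector

s <- paste(x, collapse = "")  # one string, no separators
cat(s, "\n")                  # prints the digits only, without [1] or quotes
```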

 Thanks in advance.
 
 Kind regards,
 Ezhil
  
 
 
  
 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



[R] Formating the data

2007-06-08 Thread A Ezhil
Hi All,

I have a vector of length 48, something like:
0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

I would like to print (reformat) this vector as:
00110011

by simply removing the spaces between them. I have
been trying with many option but not able to do this
task.
I would greatly appreciate your suggestion on fixing
this simple task.

Thanks in advance.

Kind regards,
Ezhil
 


 



Re: [R] logical 'or' on list of vectors

2007-06-08 Thread Chuck Cleland
Tim Bergsma wrote:
 Suppose I have a list of logicals, such as returned by lapply:
 
 Theoph$Dose[1] <- NA
 Theoph$Time[2] <- NA
 Theoph$conc[3] <- NA
 lapply(Theoph,is.na)
 
 Is there a direct way to execute logical or across all vectors?  The 
 following gives the desired result, but seems unnecessarily complex.
 
 as.logical(apply(do.call(rbind,lapply(Theoph,is.na)),2,sum))

  Is this what you want?

apply(is.na(Theoph), 1, any)

 Regards,
 
 Tim
 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



[R] logical 'or' on list of vectors

2007-06-08 Thread Tim Bergsma
Suppose I have a list of logicals, such as returned by lapply:

Theoph$Dose[1] <- NA
Theoph$Time[2] <- NA
Theoph$conc[3] <- NA
lapply(Theoph,is.na)

Is there a direct way to execute logical or across all vectors?  The 
following gives the desired result, but seems unnecessarily complex.

as.logical(apply(do.call(rbind,lapply(Theoph,is.na)),2,sum))

Regards,

Tim



[R] rlm results on trellis plot

2007-06-08 Thread Alan S Barnett
How do I add to a trellis plot the best fit line from a robust fit? I
can use panel.lm to add a least squares fit, but there is no panel.rlm
function.
-- 
Alan S Barnett [EMAIL PROTECTED]
NIMH/CBDB



Re: [R] logical 'or' on list of vectors

2007-06-08 Thread jim holtman
a little simpler:

apply(do.call(rbind,lapply(Theoph,is.na)),2,any)

or

!complete.cases(Theoph)

On 6/8/07, Tim Bergsma [EMAIL PROTECTED] wrote:

 Suppose I have a list of logicals, such as returned by lapply:

 Theoph$Dose[1] <- NA
 Theoph$Time[2] <- NA
 Theoph$conc[3] <- NA
 lapply(Theoph,is.na)

 Is there a direct way to execute logical or across all vectors?  The
 following gives the desired result, but seems unnecessarily complex.

 as.logical(apply(do.call(rbind,lapply(Theoph,is.na)),2,sum))

 Regards,

 Tim





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



[R] R is not a validated software package..

2007-06-08 Thread Giovanni Parrinello
Dear All,
while discussing with a statistician at a pharmaceutical company, I received 
this answer about the statistical package that I plan to use:

"As R is not a validated software package, we would like to ask if it 
would rather be possible for you to use SAS, SPSS or another approved 
statistical software system."

Could someone suggest a 'polite' answer?
TIA
Giovanni

-- 
dr. Giovanni Parrinello
External Lecturer
Medical Statistics Unit
Department of Biomedical Sciences
Viale Europa, 11 - 25123 Brescia Italy
Tel: +390303717528
Fax: +390303717488
email: [EMAIL PROTECTED]




Re: [R] Formating the data

2007-06-08 Thread Gavin Simpson
On Fri, 2007-06-08 at 06:13 -0700, A Ezhil wrote:
 Hi All,
 
 I have a vector of length 48, something like:
 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 
 I would like to print (reformat) this vector as:
 00110011
 
 by simply removing the spaces between them. I have
 been trying with many option but not able to do this
 task.
 I would greatly appreciate your suggestion on fixing
 this simple task.
 
 Thanks in advance.

> dat <- scan()
1: 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
28: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
49:
Read 48 items
> dat
 [1] 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[39] 1 1 1 1 1 1 1 1 1 1
> print(dat, print.gap = 0)
 [1]00110011

Is that what you want? It is just altering how the data are printed. You
still get the [1] at the start though.

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Fwd: Using odesolve to produce non-negative solutions

2007-06-08 Thread Spencer Graves
  On the 'lsoda' help page, I did not see any option to force some 
or all parameters to be nonnegative. 

  Have you considered replacing the parameters that must be 
nonnegative with their logarithms?  This effectively moves the 0 lower 
limit to (-Inf) and seems to have worked well for me in the past.  
Often, it can even make the log likelihood or sum of squares surface 
more elliptical, which means that the standard normal approximation for 
the sampling distribution of parameter estimates will likely be more 
accurate. 
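
A toy illustration of the log transformation, applied here to a state variable rather than to a parameter (that reading is an assumption on my part). It uses a hand-rolled Euler step on the simple decay equation dy/dt = -k*y instead of lsoda, purely to keep the sketch self-contained; k, h and the number of steps are invented. On the original scale a large step overshoots below zero, whereas integrating u = log(y), for which du/dt = -k, and transforming back with exp() cannot produce a negative value.

```r
k <- 3          # invented decay rate
h <- 0.5        # deliberately coarse step size
n_steps <- 10
y0 <- 1

# Euler on the original scale: y[i+1] = y[i] + h * (-k * y[i])
y_naive <- numeric(n_steps + 1)
y_naive[1] <- y0
for (i in seq_len(n_steps)) {
  y_naive[i + 1] <- y_naive[i] + h * (-k * y_naive[i])
}

# Euler on the log scale: u = log(y), so du/dt = (dy/dt) / y = -k
u <- numeric(n_steps + 1)
u[1] <- log(y0)
for (i in seq_len(n_steps)) {
  u[i + 1] <- u[i] + h * (-k)
}
y_log <- exp(u)   # back-transformed solution, positive by construction

range(y_naive)
range(y_log)
```

The same substitution only works for states that stay strictly positive, since log(0) is -Inf.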

  Hope this helps. 
  Spencer Graves
p.s.  Your example seems not to be self contained.  If I could have 
easily copied it from your email and run it myself, I might have been 
able to offer more useful suggestions. 

Jeremy Goldhaber-Fiebert wrote:
 Hello,

 I am using odesolve to simulate a group of people moving through time and 
 transmitting infections to one another. 

 In Matlab, there is a NonNegative option which tells the Matlab solver to 
 keep the vector elements of the ODE solution non-negative at all times. What 
 is the right way to do this in R?

 Thanks,
 Jeremy

 P.S., Below is a simplified version of the code I use to try to do this, but 
 I am not sure that it is theoretically right 

 dynmodel <- function(t,y,p) 
 { 
 ## Initialize parameter values

   birth <- p$mybirth(t)
   death <- p$mydeath(t)
   recover <- p$myrecover
   beta <- p$mybeta
   vaxeff <- p$myvaxeff
   vaccinated <- p$myvax(t)

   vax <- vaxeff*vaccinated/100

 ## If the state currently has negative quantities (shouldn't have), then 
 ## reset to reasonable values for computing meaningful derivatives

   for (i in 1:length(y)) {
     if (y[i] < 0) {
       y[i] <- 0
     }
   }

   S <- y[1]
   I <- y[2]
   R <- y[3]
   N <- y[4]

   shat <- (birth*(1-vax)) - (death*S) - (beta*S*I/N)
   ihat <- (beta*S*I/N) - (death*I) - (recover*I)
   rhat <- (birth*(vax)) + (recover*I) - (death*R)

 ## Do we overshoot into negative space, if so shrink derivative to bring 
 ## state to 0 
 ## then rescale the components that take the derivative negative

   if (shat+S < 0) {
     shat_old <- shat
     shat <- -1*S
     scaled_transmission <- (shat/shat_old)*(beta*S*I/N)
     ihat <- scaled_transmission - (death*I) - (recover*I)
   }   
   if (ihat+I < 0) {
     ihat_old <- ihat
     ihat <- -1*I
     scaled_recovery <- (ihat/ihat_old)*(recover*I)
     rhat <- scaled_recovery +(birth*(vax)) - (death*R)
   }   
   if (rhat+R < 0) {
     rhat <- -1*R
   }   

   nhat <- shat + ihat + rhat

   if (nhat+N < 0) {
     nhat <- -1*N
   }   

 ## return derivatives

   list(c(shat,ihat,rhat,nhat),c(0))

 }





Re: [R] character to time problem

2007-06-08 Thread John Kane

--- Jason Barnhart [EMAIL PROTECTED] wrote:

 Hi John,
 
 a) The NA appears because '30/02/1995' is not a
 valid date.
 
 > strptime('30/02/1995' , "%d/%m/%Y")
 [1] NA


I knew we should never have moved to the Gregorian
Calendar!  

Thanks.  I accidentally made up the date, but this means
that I have some invalid dates in the file. Not a
problem now that I know what's happening. And our contract
says someone else gets to fix them :)

 b) dates, which has the following classes, uses
 sort.POSIXlt, which in turn sets na.last to NA.
 ?order details how NAs are handled in
 ordering data via na.last.
 
 > class(dates)
 [1] "POSIXt"  "POSIXlt"
 
 > methods(sort)
 [1] sort.default sort.POSIXlt
 
 > sort.POSIXlt
 function (x, decreasing = FALSE, na.last = NA, ...)
 x[order(as.POSIXct(x), na.last = na.last,
 decreasing = decreasing)]
 <environment: namespace:base>
 
 After resetting the Feb. date the code works.
 
 HTH,
 -jason
 

So it does.  

I had not thought to look at the sort.POSIXlt
function.  I don't quite understand what na.last is
doing and don't seem to see the documentation.  Is it
sorting the NA's to the last place(s) in the vector
and then dropping them? 

Thanks again
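
A small sketch of the three na.last settings described in ?sort and ?order, using a made-up numeric vector rather than the dates from this thread:

```r
v <- c(3, NA, 1)

dropped <- sort(v)                    # na.last = NA (the default): NAs are removed
at_end  <- sort(v, na.last = TRUE)    # NAs are kept and placed last
at_top  <- sort(v, na.last = FALSE)   # NAs are kept and placed first

dropped
at_end
at_top
```

So with na.last = NA nothing is sorted to the end first; the missing values are simply left out of the result.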

 - Original Message - 
 From: John Kane [EMAIL PROTECTED]
 To: R R-help r-help@stat.math.ethz.ch
 Sent: Thursday, June 07, 2007 2:17 PM
 Subject: [R] character to time problem
 
 
 I am trying to clean up some dates and I am clearly
  doing something wrong.  I have laid out an example
  that seems to show what is happening with the
 real
  data.  The  coding is lousy but it looks like it
  should have worked.
 
  Can anyone suggest a) why I am getting that NA
  appearing after the strptime() command and b) why
 the
  NA is disappearing in the sort()? It happens with
  na.rm=TRUE  and na.rm=FALSE
  -
  aa  <- data.frame( c("12/05/2001", " ", "30/02/1995",
  NA, "14/02/2007", "M" ) )
  names(aa)  <- "times"
  aa[is.na(aa)] <- "M"
  aa[aa==" "]  <- "M"
  bb <- unlist(subset(aa, aa[,1] !="M"))
  dates <- strptime(bb, "%d/%m/%Y")
  dates
  sort(dates)
  --
 
  Session Info
  R version 2.4.1 (2006-12-18)
  i386-pc-mingw32
 
  locale:
  LC_COLLATE=English_Canada.1252;
  LC_CTYPE=English_Canada.1252;
  LC_MONETARY=English_Canada.1252;
  LC_NUMERIC=C;LC_TIME=English_Canada.1252
 
  attached base packages:
  [1] stats graphics  grDevices utils
  datasets  methods   base
 
  other attached packages:
   gdata   Hmisc
  2.3.1 3.3-2
 
  (Yes I know I'm out of date but I don't like
  upgrading just as I am finishing a project)
 
  Thanks
 
  
 




[R] pointwise confidence bands or interval values for a non parametric sm.regression

2007-06-08 Thread M. P. Papadatos

Dear all,

Is there a way to plot / calculate pointwise confidence bands or  
interval values for a non parametric regression like sm.regression?


Thank you in advance.

Regards,

Martin




Re: [R] R is not a validated software package..

2007-06-08 Thread Sicotte, Hugues Ph.D.
People, don't get angry at the pharma statistician; he is just trying to
abide by an FDA requirement that is designed to ensure that tests perform
reliably the same. There is no point in getting into which product is
better. As far as the FDA rules are concerned, a validated system beats a
better system any day of the week.

Here is your polite answer.
You can develop and try your software in R.
Should they need to use those results in a report that will matter to
the FDA, then you can work together with him to set up a validated
environment for S-plus. You then have to commit to port your code to
S-plus.

As I assume that you do not work in a regulated environment, you
probably wouldn't have access to a validated SAS environment anyways. It
is not usually enough to install a piece of software, you have to
validate every step of the installation. Since AFAIK the FDA uses
S-plus, it would be to your pharma person's advantage to speed-up
submissions if they also had a validated S-plus environment.

http://www.msmiami.com/custom/downloads/S-PLUSValidationdatasheet_Final.pdf


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Wensui Liu
Sent: Friday, June 08, 2007 9:24 AM
To: Giovanni Parrinello
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] R is not a validated software package..

I'd like to know the answer as well.
To be honest, I have a really hard time understanding the mentality of
clinical trial guys, and I rather believe it is something related to job
security.

On 6/8/07, Giovanni Parrinello [EMAIL PROTECTED] wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received
 this answer about the statistical package that I have planned to use:

 As R is not a validated software package, we would like to ask if it
 would rather be possible for you to use SAS, SPSS or another approved
 statistical software system.

 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni

 --
 dr. Giovanni Parrinello
 External Lecturer
 Medical Statistics Unit
 Department of Biomedical Sciences
 Viale Europa, 11 - 25123 Brescia Italy
 Tel: +390303717528
 Fax: +390303717488
 email: [EMAIL PROTECTED]





-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] logical 'or' on list of vectors

2007-06-08 Thread Tim Bergsma
Thanks all for the many excellent suggestions!

!complete.cases(Theoph) is probably the most succinct form for the 
current problem, while the examples with 'any' seem readily adaptable to 
similar situations.
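
A minimal sketch of the element-wise "or" across a list of logical vectors, assuming a small made-up data frame in place of Theoph. Reduce() and apply(..., any) are shown as two possible forms alongside complete.cases():

```r
# Hypothetical data with one NA in each of the first two rows
df <- data.frame(a = c(NA, 1, 2), b = c(1, NA, 2), c = c(1, 2, 3))

# Fold `|` over the list of logical vectors returned by lapply()
row_has_na <- Reduce(`|`, lapply(df, is.na))

# The same test written with 'any' over the rows of the logical matrix
row_has_na_apply <- apply(is.na(df), 1, any)

row_has_na
```

The Reduce() form works for any list of equal-length logical vectors, not only for tests of missingness.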

Kind regards,

Tim.

Dimitris Rizopoulos wrote:
 try the following:
 
 as.logical(rowSums(is.na(Theoph)))
 ## or
 !complete.cases(Theoph)
 
 
 I hope it helps.
 
 Best,
 Dimitris
 
 
 Dimitris Rizopoulos
 Ph.D. Student
 Biostatistical Centre
 School of Public Health
 Catholic University of Leuven
 
 Address: Kapucijnenvoer 35, Leuven, Belgium
 Tel: +32/(0)16/336899
 Fax: +32/(0)16/337015
 Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm
 
 
 - Original Message - From: Tim Bergsma [EMAIL PROTECTED]
 To: r-help@stat.math.ethz.ch
 Sent: Friday, June 08, 2007 2:57 PM
 Subject: [R] logical 'or' on list of vectors
 
 
 Suppose I have a list of logicals, such as returned by lapply:

 Theoph$Dose[1] <- NA
 Theoph$Time[2] <- NA
 Theoph$conc[3] <- NA
 lapply(Theoph,is.na)

 Is there a direct way to execute logical or across all vectors? The
 following gives the desired result, but seems unnecessarily complex.

 as.logical(apply(do.call(rbind,lapply(Theoph,is.na)),2,sum))

 Regards,

 Tim


 
 
 
 




Re: [R] Ubu edgy + latest CRAN R + Rmpi = no go

2007-06-08 Thread Dirk Eddelbuettel

On 7 June 2007 at 17:22, Tim Keitt wrote:
| I'm just curious if anyone else has had problems with this
| configuration. I added the CRAN repository to apt and installed 2.5.0
| with apt-get. I then did an install.packages("Rmpi") on cluster nodes.
| Rmpi loads and lamhosts() shows the nodes, but mpi.spawn.Rslaves()
| fails (something to do with temp files?). Rmpi works fine with the

I have had similar issues at work. If you fix the lam packages at version
7.1.1, it works.  It does not seem to work with 7.1.2 in the current Ubuntu,
nor does it work with 7.1.4 (current upstream version).

As other MPI tools seem to work, I would put the error on Rmpi, but I have
not had time to pin this down.

For what it's worth, a few of us are trying to revive the OpenMPI packages in
Debian, and I have started on a port of Rmpi to ROpenMPI.  No ETA for that.

| Edgy-native version of R (2.3.x) and installing Edgy's r-cran-rmpi
| with apt. (But I need some other packages that only work in 2.4+!)
| Could this be a problem with the latest Ubu debs on CRAN? The Rmpi

R itself is just fine on Ubuntu, thank you.

Dirk

| author says his R 2.5 setup works fine. CC me please as I'm not
| subscribed.
| 
| THK
| 
| -- 
| Timothy H. Keitt, University of Texas at Austin
| Contact info and schedule at http://www.keittlab.org/tkeitt/
| Reprints at http://www.keittlab.org/tkeitt/papers/
| ODF attachment? See http://www.openoffice.org/
| 

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison



Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Wensui Liu
I had mentioned exactly the same thing to others and the feedback I got is -
'when you have a hammer, everything will look like a nail'
^_^.

On 6/7/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 Robert Wilkins wrote:
  As noted on the R-project web site itself ( www.r-project.org -
  Manuals - R Data Import/Export ), it can be cumbersome to prepare
  messy and dirty data for analysis with the R tool itself. I've also
  seen at least one S programming book (one of the yellow Springer ones)
  that says, more briefly, the same thing.
  The R Data Import/Export page recommends examples using SAS, Perl,
  Python, and Java. It takes a bit of courage to say that ( when you go
  to a corporate software web site, you'll never see a page saying This
  is the type of problem that our product is not the best at, here's
  what we suggest instead ). I'd like to provide a few more
  suggestions, especially for volunteers who are willing to evaluate new
  candidates.
 
  SAS is fine if you're not paying for the license out of your own
  pocket. But maybe one reason you're using R is you don't have
  thousands of spare dollars.
  Using Java for data cleaning is an exercise in sado-masochism, Java
  has a learning curve (almost) as difficult as C++.
 
  There are different types of data transformation, and for some data
  preparation problems an all-purpose programming language is a good
  choice ( e.g. Perl , or maybe Python/Ruby ). Perl, for example, has
  excellent regular expression facilities.
 
  However, for some types of complex demanding data preparation
  problems, an all-purpose programming language is a poor choice. For
  example: cleaning up and preparing clinical lab data and adverse event
  data - you could do it in Perl, but it would take way, way too much
  time. A specialized programming language is needed. And since data
  transformation is quite different from data query, SQL is not the
  ideal solution either.

 We deal with exactly those kinds of data solely using R.  R is
 exceptionally powerful for data manipulation, just a bit hard to learn.
   Many examples are at
 http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf

 Frank

 
  There are only three statistical programming languages that are
  well-known, all dating from the 1970s: SPSS, SAS, and S. SAS is more
  popular than S for data cleaning.
 
  If you're an R user with difficult data preparation problems, frankly
  you are out of luck, because the products I'm about to mention are
  new, unknown, and therefore regarded as immature. And while the
  founders of these products would be very happy if you kicked the
  tires, most people don't like to look at brand new products. Most
  innovators and inventors don't realize this, I've learned it the hard
  way.
 
  But if you are a volunteer who likes to help out by evaluating,
  comparing, and reporting upon new candidates, well you could certainly
  help out R users and the developers of the products by kicking the
  tires of these products. And there is a huge need for such volunteers.
 
  1. DAP
  This is an open source implementation of SAS.
  The founder: Susan Bassein
  Find it at: directory.fsf.org/math/stats (GNU GPL)
 
  2. PSPP
  This is an open source implementation of SPSS.
  The relatively early version number might not give a good idea of how
  mature the
  data transformation features are, it reflects the fact that he has
  only started doing the statistical tests.
  The founder: Ben Pfaff, either a grad student or professor at Stanford CS 
  dept.
  Also at : directory.fsf.org/math/stats (GNU GPL)
 
  3. Vilno
  This uses a programming language similar to SPSS and SAS, but quite unlike 
  S.
  Essentially, it's a substitute for the SAS datastep, and also
  transposes data and calculates averages and such. (No t-tests or
  regressions in this version). I created this, during the years
  2001-2006 mainly. It's version 0.85, and has a fairly low bug rate, in
  my opinion. The tarball includes about 100 or so test cases used for
  debugging - for logical calculation errors, but not for extremely high
  volumes of data.
  The maintenance of Vilno has slowed down, because I am currently
  (desperately) looking for employment. But once I've found new
  employment and living quarters and settled in, I will continue to
  enhance Vilno in my spare time.
  The founder: that would be me, Robert Wilkins
  Find it at: code.google.com/p/vilno ( GNU GPL )
  ( In particular, the tarball at code.google.com/p/vilno/downloads/list
  , since I have yet to figure out how to use Subversion ).
 
 
  4. Who knows?
  It was not easy to find out about the existence of DAP and PSPP. So
  who knows what else is out there. However, I think you'll find a lot
  more statistics software ( regression , etc ) out there, and not so
  much data transformation software. Not many people work on data
  preparation software. In fact, the category is so obscure that there
  isn't one agreed term: data 

Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Martin Henry H. Stevens
Is there an example available of this sort of problematic data that  
requires this kind of data screening and filtering? For many of us,  
this issue would be nice to learn about, and deal with within R. If a  
package could be created, that would be optimal for some of us. I  
would like to learn a tad more, if it were not too much effort for  
someone else to point me in the right direction?
Cheers,
Hank
On Jun 8, 2007, at 8:47 AM, Douglas Bates wrote:

 On 6/7/07, Robert Wilkins [EMAIL PROTECTED] wrote:
 As noted on the R-project web site itself ( www.r-project.org -
 Manuals - R Data Import/Export ), it can be cumbersome to prepare
 messy and dirty data for analysis with the R tool itself. I've also
 seen at least one S programming book (one of the yellow Springer  
 ones)
 that says, more briefly, the same thing.
 The R Data Import/Export page recommends examples using SAS, Perl,
 Python, and Java. It takes a bit of courage to say that ( when you go
 to a corporate software web site, you'll never see a page saying  
 This
 is the type of problem that our product is not the best at, here's
 what we suggest instead ). I'd like to provide a few more
 suggestions, especially for volunteers who are willing to evaluate  
 new
 candidates.

 SAS is fine if you're not paying for the license out of your own
 pocket. But maybe one reason you're using R is you don't have
 thousands of spare dollars.
 Using Java for data cleaning is an exercise in sado-masochism, Java
 has a learning curve (almost) as difficult as C++.

 There are different types of data transformation, and for some data
 preparation problems an all-purpose programming language is a good
 choice ( e.g. Perl , or maybe Python/Ruby ). Perl, for example, has
 excellent regular expression facilities.

 However, for some types of complex demanding data preparation
 problems, an all-purpose programming language is a poor choice. For
 example: cleaning up and preparing clinical lab data and adverse  
 event
 data - you could do it in Perl, but it would take way, way too much
 time. A specialized programming language is needed. And since data
 transformation is quite different from data query, SQL is not the
 ideal solution either.

 There are only three statistical programming languages that are
 well-known, all dating from the 1970s: SPSS, SAS, and S. SAS is more
 popular than S for data cleaning.

 If you're an R user with difficult data preparation problems, frankly
 you are out of luck, because the products I'm about to mention are
 new, unknown, and therefore regarded as immature. And while the
 founders of these products would be very happy if you kicked the
 tires, most people don't like to look at brand new products. Most
 innovators and inventors don't realize this, I've learned it the hard
 way.

 But if you are a volunteer who likes to help out by evaluating,
 comparing, and reporting upon new candidates, well you could  
 certainly
 help out R users and the developers of the products by kicking the
 tires of these products. And there is a huge need for such  
 volunteers.

 1. DAP
 This is an open source implementation of SAS.
 The founder: Susan Bassein
 Find it at: directory.fsf.org/math/stats (GNU GPL)

 2. PSPP
 This is an open source implementation of SPSS.
 The relatively early version number might not give a good idea of how
 mature the
 data transformation features are, it reflects the fact that he has
 only started doing the statistical tests.
 The founder: Ben Pfaff, either a grad student or professor at  
 Stanford CS dept.
 Also at : directory.fsf.org/math/stats (GNU GPL)

 3. Vilno
 This uses a programming language similar to SPSS and SAS, but  
 quite unlike S.
 Essentially, it's a substitute for the SAS datastep, and also
 transposes data and calculates averages and such. (No t-tests or
 regressions in this version). I created this, during the years
 2001-2006 mainly. It's version 0.85, and has a fairly low bug  
 rate, in
 my opinion. The tarball includes about 100 or so test cases used for
 debugging - for logical calculation errors, but not for extremely  
 high
 volumes of data.
 The maintenance of Vilno has slowed down, because I am currently
 (desperately) looking for employment. But once I've found new
 employment and living quarters and settled in, I will continue to
 enhance Vilno in my spare time.
 The founder: that would be me, Robert Wilkins
 Find it at: code.google.com/p/vilno ( GNU GPL )
 ( In particular, the tarball at code.google.com/p/vilno/downloads/ 
 list
 , since I have yet to figure out how to use Subversion ).

 4. Who knows?
 It was not easy to find out about the existence of DAP and PSPP. So
 who knows what else is out there. However, I think you'll find a lot
 more statistics software ( regression , etc ) out there, and not so
 much data transformation software. Not many people work on data
 preparation software. In fact, the category is so obscure that there
 isn't one agreed term: data cleaning , 

Re: [R] R is not a validated software package..

2007-06-08 Thread Wensui Liu
I'd like to know the answer as well.
To be honest, I have a really hard time understanding the mentality of
clinical trial guys, and I rather believe it is something related to job
security.

On 6/8/07, Giovanni Parrinello [EMAIL PROTECTED] wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received
 this answer about the statistical package that I have planned to use:

 As R is not a validated software package, we would like to ask if it
 would rather be possible for you to use SAS, SPSS or another approved
 statistical software system.

 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni

 --
 dr. Giovanni Parrinello
 External Lecturer
 Medical Statistics Unit
 Department of Biomedical Sciences
 Viale Europa, 11 - 25123 Brescia Italy
 Tel: +390303717528
 Fax: +390303717488
 email: [EMAIL PROTECTED]





-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] character to time problem

2007-06-08 Thread John Kane
Looks much better. I seldom use dates for much and
didn't think to look at the sort.POSIXlt function.

If I understand this correctly the sort.POSIXlt with
na.last = FALSE is dropping all the NAs.  Very nice.


--- Gabor Grothendieck [EMAIL PROTECTED]
wrote:

 Perhaps you want one of these:
 
  sort(as.Date(aa$times, "%d/%m/%Y"))
 [1] "1995-03-02" "2001-05-12" "2007-02-14"
 
  sort(as.Date(aa$times, "%d/%m/%Y"), na.last = TRUE)
 [1] "1995-03-02" "2001-05-12" "2007-02-14" NA           NA
 [6] NA
 
 
 On 6/7/07, John Kane [EMAIL PROTECTED] wrote:
  I am trying to clean up some dates and I am
 clearly
  doing something wrong.  I have laid out an example
  that seems to show what is happening with the
 real
  data.  The  coding is lousy but it looks like it
  should have worked.
 
  Can anyone suggest a) why I am getting that NA
  appearing after the strptime() command and b) why
 the
  NA is disappearing in the sort()? It happens with
  na.rm=TRUE  and na.rm=FALSE
  -
  aa  <- data.frame( c("12/05/2001", " ", "30/02/1995",
  NA, "14/02/2007", "M" ) )
  names(aa)  <- "times"
  aa[is.na(aa)] <- "M"
  aa[aa==" "]  <- "M"
  bb <- unlist(subset(aa, aa[,1] !="M"))
  dates <- strptime(bb, "%d/%m/%Y")
  dates
  sort(dates)
  --
 
  Session Info
  R version 2.4.1 (2006-12-18)
  i386-pc-mingw32
 
  locale:
  LC_COLLATE=English_Canada.1252;
  LC_CTYPE=English_Canada.1252;
  LC_MONETARY=English_Canada.1252;
  LC_NUMERIC=C;LC_TIME=English_Canada.1252
 
  attached base packages:
  [1] stats graphics  grDevices utils
  datasets  methods   base
 
  other attached packages:
   gdata   Hmisc
  2.3.1 3.3-2
 
   (Yes I know I'm out of date but I don't like
  upgrading just as I am finishing a project)
 
  Thanks
 
 




Re: [R] rlm results on trellis plot

2007-06-08 Thread hadley wickham
On 6/7/07, Alan S Barnett [EMAIL PROTECTED] wrote:
 How do I add to a trellis plot the best fit line from a robust fit? I
 can use panel.lm to add a least squares fit, but there is no panel.rlm
 function.

It's not trellis, but it's really easy to do this with ggplot2:

install.packages("ggplot2", dep=T)
library(ggplot2)

p <- qplot(x, y, data=diamonds)
p + geom_smooth(method=lm)
p + geom_smooth(method=rlm)
p + geom_smooth(method=lm, formula=y ~ poly(x,3))

see http://had.co.nz/ggplot2/stat_smooth.html for more examples.
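
For the original lattice (trellis) question, a minimal sketch of a custom panel function, assuming the lattice and MASS packages are available. The helper name panel.rlmline and the data frame d are made up for illustration; lattice itself provides panel.lmline for the least squares line:

```r
library(lattice)
library(MASS)   # provides rlm()

# Hypothetical data: two groups with a linear trend and one outlier each
set.seed(1)
d <- data.frame(x = rep(1:20, 2),
                g = factor(rep(c("A", "B"), each = 20)))
d$y <- 2 + 0.5 * d$x + rnorm(40, sd = 0.5)
d$y[c(20, 40)] <- d$y[c(20, 40)] + 15

# Hypothetical helper: fit rlm() within the panel and draw its line
panel.rlmline <- function(x, y, ...) {
  cf <- unname(coef(rlm(y ~ x)))
  panel.abline(a = cf[1], b = cf[2], ...)
}

p <- xyplot(y ~ x | g, data = d,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.lmline(x, y, lty = 2)   # least squares fit, dashed
              panel.rlmline(x, y)           # robust fit, solid
            })
print(p)
```

Drawing both lines in each panel makes it easy to see how far the outliers pull the least squares fit away from the robust one.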

Hadley



Re: [R] Sorting dataframe by different columns

2007-06-08 Thread Kevin Wright
On the R wiki site there is a general-purpose function
(sort.data.frame) that allows you to do this:

sort(df, by=~ x-z)

See: http://wiki.r-project.org/rwiki/doku.php?id=tips:data-frames:sort
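
A base R sketch of the same ordering with order(), assuming z is numeric so that negating it reverses its direction. The data frame below is made up for illustration:

```r
# Hypothetical data frame with the four columns from the question
df <- data.frame(w = 1:5,
                 x = c(2, 1, 2, 1, 2),
                 y = letters[1:5],
                 z = c(10, 30, 50, 20, 40))

# x ascending first, then z descending within ties of x
sorted <- df[order(df$x, -df$z), ]
sorted
```

For a non-numeric z, negation is not available; one option is to negate its rank, for example order(df$x, -rank(df$z)).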

Regards,

Kevin

On 6/8/07, Gunther Höning [EMAIL PROTECTED] wrote:
 Dear list,

 I have a very short question,
 Suppose a data frame of four columns.

 df <- data.frame(w,x,y,z)

 I want this ordered the following way:
 first by :x, decreasing = FALSE
 and
 secondly by: z, decreasing =TRUE

 How can this be done ?

 Thanks

 Gunther





Re: [R] overplots - fixing scientific vs normal notation in output

2007-06-08 Thread John Kane

--- Peter Lercher [EMAIL PROTECTED] wrote:

 Moving from S-plus to R I encountered many great
 features and a much
 more stable system.
 Currently, I am left with 2 problems that are
 handled differently:
 
 1) I did lots of overplots in S-Plus using
 par(new=T,xaxs='d',yaxs='d') to fix the axes
 -What is the workaround in R ?

What does S-Plus do here?  
 
 2) In S-Plus I could fix scientific notation or
 normal notation in output
 -How can I handle this in R ?
 I found no fix in the documentation

?format() maybe?
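
A minimal sketch of what format() offers here, with made-up numbers. The scientific argument fixes the notation for one call, while options(scipen = ...) shifts the preference for printing in general:

```r
x <- c(0.00001234, 123456789)

format(x, scientific = TRUE)    # force exponent notation
format(x, scientific = FALSE)   # force fixed notation

big   <- 1e5
fixed <- format(big, scientific = FALSE)
sci   <- format(big, scientific = TRUE)

# A large positive scipen makes print() prefer fixed notation
old <- options(scipen = 100)
print(big)
options(old)                    # restore the previous setting
```

Saving and restoring the old options keeps the change from leaking into the rest of the session.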

 
 I am using R version 2.4.1 (2006-12-18) on Windows
 XP SR2
 
 
 Peter Lercher, M.D., M.P.H., Assoc Prof
 




Re: [R] R is not a validated software package..

2007-06-08 Thread Frank E Harrell Jr
Giovanni Parrinello wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received 
 this answer about the statistical package that I have planned to use:
 
 As R is not a validated software package, we would like to ask if it 
 would rather be possible for you to use SAS, SPSS or another approved 
 statistical software system.
 
 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni
 

Search the archives and you'll find a LOT of responses.

Briefly, in my view there are no requirements, just some pharma 
companies that think there are.  FDA is required to accept all 
submissions, and they get some where only Excel was used, or Minitab, 
and lots more.  There is a session on this at the upcoming R 
International Users Meeting in Iowa in August.  The session will include 
discussions of federal regulation compliance for R, for those users who 
feel that such compliance is actually needed.

Frank

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



[R] matrix and data frame

2007-06-08 Thread elyakhlifi mustapha
hello,
Just one question before the weekend: I don't know how to join matrices. 
The matrices share one common column, and I'd like to join them on that 
column, placing them side by side rather than stacking one below the other.
Thanks, and have a good weekend.
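
A minimal sketch of one way to do this, assuming the shared column is called id (a made-up name) and that the rows may come in a different order in each matrix. merge() converts the matrices to data frames and matches rows on that column:

```r
# Hypothetical matrices sharing the column "id"
m1 <- cbind(id = 1:3, a = c(10, 20, 30))
m2 <- cbind(id = c(3, 1, 2), b = c(300, 100, 200))

# Join side by side, matching rows on the common column
merged <- merge(m1, m2, by = "id")
merged

# Back to a matrix if that is what is needed
as.matrix(merged)
```

If the rows are already in the same order in both matrices, cbind(m1, m2[, "b", drop = FALSE]) would be enough.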


  


[R] How to make a table of a desired dimension

2007-06-08 Thread Rubén Roa-Ureta
Hi ComRades,

I want to make a matrix of frequencies from vectors of a continuous 
variable spanning different values. For example this code
x<-c(runif(100,10,40),runif(100,43,55))
y<-c(runif(100,7,35),runif(100,37,50))
z<-c(runif(100,10,42),runif(100,45,52))
a<-table(ceiling(x))
b<-table(ceiling(y))
c<-table(ceiling(z))
a
b
c

will give me three tables that start and end at different integer 
values, and besides, they have 'holes' in between different integer 
values. Is it possible to use 'table' to make these three tables have 
the same dimensions, filling in the absent labels with zeroes? In the 
example above, the desired tables should all start at 8 and tables 'a' 
and 'c' should put a zero at labels '8' to '10', should all put zeros in 
the frequencies of the labels corresponding to the holes, and should all 
end at label '55'. The final purpose is to make a matrix and use 
'matplot' to plot all the frequencies in one plot, such as

#code valid only when 'a', 'b', and 'c' have the proper dimension
p<-mat.or.vec(48,4)
p[,1]<-8:55
p[,2]<-c(matrix(a)[1:48])
p[,3]<-c(matrix(b)[1:48])
p[,4]<-c(matrix(c)[1:48])
matplot(p)

I read the help about 'table' but I couldn't figure out if dnn, 
deparse.level, or the other arguments could serve my purpose. Thanks for 
your help
Rubén
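
One possible sketch of the tables described above: convert each vector to a factor with a fixed set of levels before calling table(), so that absent labels are counted as zero. The range 8:55 is taken from the question; set.seed() and the name lev are additions for illustration:

```r
set.seed(42)
x <- c(runif(100, 10, 40), runif(100, 43, 55))
y <- c(runif(100, 7, 35), runif(100, 37, 50))
z <- c(runif(100, 10, 42), runif(100, 45, 52))

# Common set of labels; values never observed get a count of zero
lev <- 8:55
count_fixed <- function(v) {
  as.vector(table(factor(ceiling(v), levels = lev)))
}

# One column of counts per variable, one row per label in 'lev'
p <- sapply(list(a = x, b = y, c = z), count_fixed)

matplot(lev, p, type = "l", xlab = "label", ylab = "frequency")
```

Because every column is built against the same levels, the three sets of counts line up row by row without any padding by hand.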



Re: [R] Use R in a pipeline as a filter

2007-06-08 Thread Dirk Eddelbuettel

On 7 June 2007 at 14:27, [EMAIL PROTECTED] wrote:
| how can I use R in a pipeline like this
| 
|  $ ./generate-data | R --script-file=Script.R | ./further-analyse-data > result.dat

The 'r' in our 'littler' package can do that. One example we show on the
littler webpage is

$ ls -l /boot | awk '!/^total/ {print $5}' | \
 r -e 'fsizes <- as.integer(readLines());
print(summary(fsizes)); stem(fsizes)'

We use R's readLines to read from stdin, and you can of course also have r
'in the middle' if you take care of the output generated -- which our example
doesn't do as it prints straight to screen.

Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison



Re: [R] character to time problem

2007-06-08 Thread Gabor Grothendieck
The code in my post uses Date class, not POSIX.
sort.POSIXlt is never invoked.  Suggest you read the
help desk article in R News 4/1 for more.

On 6/8/07, John Kane [EMAIL PROTECTED] wrote:
 Looks much better. I seldom use dates for much and
 didn't think to look at the sort.POSIXlt function.

 If I understand this correctly the sort.POSIXlt with
 na.last = FALSE is dropping all the NAs.  Very nice.


 --- Gabor Grothendieck [EMAIL PROTECTED]
 wrote:

  Perhaps you want one of these:
 
   sort(as.Date(aa$times, "%d/%m/%Y"))
  [1] "1995-03-02" "2001-05-12" "2007-02-14"
 
   sort(as.Date(aa$times, "%d/%m/%Y"), na.last = TRUE)
  [1] "1995-03-02" "2001-05-12" "2007-02-14" NA           NA
  [6] NA
 
 
  On 6/7/07, John Kane [EMAIL PROTECTED] wrote:
   I am trying to clean up some dates and I am
  clearly
   doing something wrong.  I have laid out an example
   that seems to show what is happening with the
  real
   data.  The  coding is lousy but it looks like it
   should have worked.
  
   Can anyone suggest a) why I am getting that NA
   appearing after the strptime() command and b) why
  the
   NA is disappearing in the sort()? It happens with
   na.rm=TRUE  and na.rm=FALSE
   -
   aa <- data.frame( c("12/05/2001", " ", "30/02/1995",
                       NA, "14/02/2007", "M" ) )
   names(aa) <- "times"
   aa[is.na(aa)] <- "M"
   aa[aa==" "] <- "M"
   bb <- unlist(subset(aa, aa[,1] != "M"))
   dates <- strptime(bb, "%d/%m/%Y")
   dates
   sort(dates)
   --
  
   Session Info
   R version 2.4.1 (2006-12-18)
   i386-pc-mingw32
  
   locale:
   LC_COLLATE=English_Canada.1252;
   LC_CTYPE=English_Canada.1252;
   LC_MONETARY=English_Canada.1252;
   LC_NUMERIC=C;LC_TIME=English_Canada.1252
  
   attached base packages:
   [1] stats graphics  grDevices utils
   datasets  methods   base
  
   other attached packages:
gdata   Hmisc
   2.3.1 3.3-2
  
(Yes I know I'm out of date but I don't like
   upgrading just as I am finishing a project)
  
   Thanks
  
  
 








Re: [R] evaluating variables in the context of a data frame

2007-06-08 Thread Zack Weinberg
On 6/7/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
  f <- function(x, dat) evalq(x, dat)
  f(o, D)
  Error in eval(expr, envir, enclos) : object "o" not found
  g <- function(x, dat) eval(x, dat)
  g(o, D)
  Error in eval(x, dat) : object "o" not found
 
  What am I doing wrong?  This seems to be what the helpfiles say you do
  to evaluate arguments in the context of a passed-in data frame...

 When you call f(o, D), the argument 'o' is evaluated in the current
 environment ('context' in R means something different).  Because of lazy
 evaluation, it is not evaluated until evalq is called, but it is then
 evaluated as if it had been evaluated greedily.

 g(quote(o), D) will work.

Thanks.

After a bit more experimentation I figured out that this does what I want:

 h <- function(x, d) eval(substitute(x), d, parent.frame())

but I don't understand why the substitute() helps, or indeed why it
has any effect at all...
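
A short sketch of what substitute() contributes, as I understand it. The data frame D, the helper show_expr and the variable k are made up for illustration. substitute(x) hands back the expression the caller typed for x without evaluating it, and eval() then evaluates that expression with the data frame's columns visible.

```r
D <- data.frame(o = c(1, 2, 3))

# substitute(x) returns the expression the caller typed for x, unevaluated.
show_expr <- function(x) substitute(x)
e <- show_expr(o * 2)          # a call object; 'o' has not been looked up

# eval() evaluates that expression with the columns of d visible;
# parent.frame() is where any other names are looked up.
h <- function(x, d) eval(substitute(x), d, parent.frame())

k <- 10
res1 <- h(o * 2, D)
res2 <- h(o + k, D)            # o comes from D, k from the calling environment

# Without substitute(), the caller has to quote the expression explicitly.
g <- function(x, dat) eval(x, dat)
res3 <- g(quote(o * 2), D)
```

So in h the quoting that g leaves to the caller is done inside the function, which is why h(o, D) works while g(o, D) fails.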

zw



Re: [R] matrix and data frame

2007-06-08 Thread Sarah Goslee
I'm not at all certain I understand your question, but try
?cbind
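
If I have guessed the question correctly, something along these lines may be a starting point. The matrices m1 and m2 and the shared column 'id' are made up for illustration. cbind() places matrices side by side when the rows are already in the same order; merge() matches rows on the shared column when they are not.

```r
m1 <- cbind(id = 1:3, a = c(10, 20, 30))
m2 <- cbind(id = 1:3, b = c(0.1, 0.2, 0.3))

# Rows already in the same order: put m2's extra column to the right of m1.
side_by_side <- cbind(m1, m2[, "b", drop = FALSE])

# Rows possibly in a different order: match on the shared column instead.
# merge() converts the matrices to data frames and returns a data frame.
m2_shuffled <- m2[c(3, 1, 2), ]
joined <- merge(m1, m2_shuffled, by = "id")
```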

Sarah

On 6/8/07, elyakhlifi mustapha [EMAIL PROTECTED] wrote:
 Hello,
 I have a quick question before the weekend. I don't know how to join
 matrices: they share one common column, and I'd like to join them on that
 column, placing them side by side rather than one below the other.
 Thanks, and have a good weekend.


-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] data mining/text mining?

2007-06-08 Thread Weiwei Shi
Dear Ruixin:
Among other differences, text mining deals with unstructured data, while
data mining mainly focuses on structured data. Many algorithms can be
shared between them; however, text mining requires some additional data
preprocessing. There are many online resources on this.

As to packages used for text mining in R, especially for preprocessing,
please check the following link:
http://wwwpeople.unil.ch/jean-pierre.mueller/

I used that package a very long time ago and am not sure whether it has
been updated for the current version of R; if not, you might need to go
back to an old version such as R 1.1.

If you want to do text mining on Chinese text (I guess :), there is
additional work (i.e. word splitting) needed. I remember there is a
researcher from Taiwan who does pretty good work, and you can google
that. I cannot remember the details.

HTH,

Weiwei


On 6/8/07, Ruixin ZHU [EMAIL PROTECTED] wrote:
 Dear R-user,

 Could anybody tell me of the key difference between data mining and text
 mining?
 Please make a list for packages about data/text mining.
 And give me an example of text mining with R (any relating materials
 will be highly appreciated), because a vignette written by Ingo Feinerer
 seems too concise for me.

 Thanks
 _
 Dr.Ruixin ZHU
 Shanghai Center for Bioinformation Technology
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 86-21-13040647832


 [[alternative HTML version deleted]]




-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] R is not a validated software package..

2007-06-08 Thread Wensui Liu
I agree with Frank.
As far as I know, the FDA neither encourages nor discourages the use of
any software.

On 6/8/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 Giovanni Parrinello wrote:
  Dear All,
  discussing with a statistician of a pharmaceutical company I received
  this answer about the statistical package that I have planned to use:
 
  As R is not a validated software package, we would like to ask if it
  would rather be possible for you to use SAS, SPSS or another approved
  statistical software system.
 
  Could someone suggest me a 'polite' answer?
  TIA
  Giovanni
 

 Search the archives and you'll find a LOT of responses.

 Briefly, in my view there are no requirements, just some pharma
 companies that think there are.  FDA is required to accept all
 submissions, and they get some where only Excel was used, or Minitab,
 and lots more.  There is a session on this at the upcoming R
 International Users Meeting in Iowa in August.  The session will include
 discussions of federal regulation compliance for R, for those users who
 feel that such compliance is actually needed.

 Frank

 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University




-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] Ubu edgy + latest CRAN R + Rmpi = no go

2007-06-08 Thread Tim Keitt
On 6/8/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On 7 June 2007 at 17:22, Tim Keitt wrote:
 | I'm just curious if anyone else has had problems with this
 | configuration. I added the CRAN repository to apt and installed 2.5.0
 | with apt-get. I then did an install.packages("Rmpi") on cluster nodes.
 | Rmpi loads and lamhosts() shows the nodes, but mpi.spawn.Rslaves()
 | fails (something to do with temp files?). Rmpi works fine with the

 I have had similar issues at work. If you fix the lam packages at version
 7.1.1, it works.  It does not seem to work with 7.1.2 in the current Ubuntu,
 nor does it work with 7.1.4 (current upstream version).

 As other MPI tools seem to work, I would put the error on Rmpi, but I have
 not had time to pin this down.

 For what it's worth, a few of us are trying to revive the OpenMPI packages in
 Debian, and I have started on a port of Rmpi to ROpenMPI.  No ETA for that.

 | Edgy-native version of R (2.3.x) and installing Edgy's r-cran-rmpi
 | with apt. (But I need some other packages that only work in 2.4+!)
 | Could this be a problem with the latest Ubu debs on CRAN? The Rmpi

 R itself is just fine on Ubuntu, thank you.

And very much appreciated. ;-)

THK


 Dirk

 | author says his R 2.5 setup works fine. CC me please as I'm not
 | subscribed.
 |
 | THK
 |
 | --
 | Timothy H. Keitt, University of Texas at Austin
 | Contact info and schedule at http://www.keittlab.org/tkeitt/
 | Reprints at http://www.keittlab.org/tkeitt/papers/
 | ODF attachment? See http://www.openoffice.org/
 |

 --
 Hell, there are no rules here - we're trying to accomplish something.
   -- Thomas A. Edison



-- 
Timothy H. Keitt, University of Texas at Austin
Contact info and schedule at http://www.keittlab.org/tkeitt/
Reprints at http://www.keittlab.org/tkeitt/papers/
ODF attachment? See http://www.openoffice.org/



Re: [R] overplots - fixing scientific vs normal notation in output

2007-06-08 Thread Greg Snow

Peter Lercher wrote:

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Peter Lercher
 Sent: Friday, June 08, 2007 3:07 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] overplots - fixing scientific vs normal notation 
 in output
 
 Moving from S-plus to R I encountered many great features and 
 a much more stable system.
 Currently, I am left with 2 problems that are handled differently:
 
 1) I did lots of overplots in S-Plus using
 par(new=T,xaxs='d',yaxs='d') to fix the axes
 -What is the workaround in R ?

Since you are using the same axes, do you really need to do the
overplotting instead of just using lines/points to add to the plot?

R has not implemented xaxs='d', so on your additional plots, just
specify xlim and/or ylim directly.  There are a couple of ways to do
this.  First, find the range of values from all of your plots then use
this as the argument to xlim and ylim for each plot.  Second, create the
first plot then use par('usr') to find what the limits of the
coordinates are, then use these values for xlim/ylim in further plots
(using xaxs/yaxs='i' so the extra 4% is not added). Third, there are
probably other ways, but the above should get you started.
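
A minimal sketch of the second approach, with made-up data (x1, y1, x2, y2 are assumptions for illustration). The first plot sets the coordinate system, par('usr') reads it back, and the overplot reuses those limits with xaxs/yaxs = 'i' so that no extra 4% is added. It draws on a null pdf device only so that it runs without opening a window.

```r
pdf(NULL)  # null graphics device, so nothing is written to disk

x1 <- 1:10; y1 <- x1^2
x2 <- 3:8;  y2 <- 50 - 5 * x2

plot(x1, y1, type = "b")
usr_first <- par("usr")        # c(xmin, xmax, ymin, ymax) of the first plot

par(new = TRUE)                # overplot on the same page
plot(x2, y2, type = "l", col = "red",
     xlim = usr_first[1:2], ylim = usr_first[3:4],
     xaxs = "i", yaxs = "i",   # use the limits exactly as given
     axes = FALSE, xlab = "", ylab = "")
usr_second <- par("usr")       # should match usr_first

invisible(dev.off())
```

When the axes really are shared, lines(x2, y2, col = "red") after the first plot() gives the same picture with less work.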



 
 2) In S-Plus I could fix scientific notation or normal 
 notation in output
 -How can I handle this in R ?
 I found no fix in the documentation

Look at options('scipen'). This does not fix the notation in exactly the
way S-PLUS does, but it could solve most of your problems.
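
A small sketch of how scipen shifts the choice between the two notations (the values 10 and -5 are arbitrary picks for illustration). Positive values bias printing towards fixed notation, negative values towards scientific notation.

```r
old <- options(scipen = 0)      # the default setting
default_small <- format(1e-5)   # scientific notation is shorter, so it wins

options(scipen = 10)            # penalise scientific notation
fixed_small <- format(1e-5)

options(scipen = -5)            # favour scientific notation
sci_thousand <- format(1000)

options(old)                    # restore whatever was set before

print(c(default_small, fixed_small, sci_thousand))
```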

 
 I am using R version 2.4.1 (2006-12-18) on Windows XP SR2
 
 
 Peter Lercher, M.D., M.P.H., Assoc Prof
 


Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



Re: [R] R is not a validated software package..

2007-06-08 Thread Frank E Harrell Jr
Sicotte, Hugues Ph.D. wrote:
 People, don't get angry at the pharma statistician, he is just trying to
 abide by an FDA requirement that is designed to ensure that tests perform
 reliably the same. There is no point in getting into which product is
 better. As far as the FDA rules are concerned a validated system beats a
 better system any day of the week.

There is no such requirement.

 
 Here is your polite answer.
 You can develop and try your software in R.
 Should they need to use those results in a report that will matter to
 the FDA, then you can work together with him to set up a validated
 environment for S-plus. You then have to commit to port your code to
 S-plus.

That doesn't follow.  What matters is good statistical analysis practice 
no matter which environment you use.  Note that more errors are made in 
the data preparation / derived variables stage than are made by 
statistical software.

Frank

 
 As I assume that you do not work in a regulated environment, you
 probably wouldn't have access to a validated SAS environment anyways. It
 is not usually enough to install a piece of software, you have to
 validate every step of the installation. Since AFAIK the FDA uses
 S-plus, it would be to your pharma person's advantage to speed-up
 submissions if they also had a validated S-plus environment.
 
 http://www.msmiami.com/custom/downloads/S-PLUSValidationdatasheet_Final.pdf
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Wensui Liu
 Sent: Friday, June 08, 2007 9:24 AM
 To: Giovanni Parrinello
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] R is not a validated software package..
 
 I like to know the answer as well.
 To be honest, I really have hard time to understand the mentality of
 clinical trial guys and rather believe it is something related to job
 security.
 
 On 6/8/07, Giovanni Parrinello [EMAIL PROTECTED] wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received
 this answer about the statistical package that I have planned to use:

 As R is not a validated software package, we would like to ask if it
 would rather be possible for you to use SAS, SPSS or another approved
 statistical software system.

 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni

 --
 dr. Giovanni Parrinello
 External Lecturer
 Medical Statistics Unit
 Department of Biomedical Sciences
 Viale Europa, 11 - 25123 Brescia Italy
 Tel: +390303717528
 Fax: +390303717488
 email: [EMAIL PROTECTED]


 [[alternative HTML version deleted]]


 
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] rlm results on trellis plot

2007-06-08 Thread Bert Gunter
I don't think the code below does what's requested, as it assumes a single
overall fit for all panels, and I think the requester wanted separate fits
by panel. This can be easily done, of course, by a minor modification:

xyplot( y ~ x | z,
 panel = function(x,y,...){
   panel.xyplot(x,y,...)
   panel.abline(lm(y~x),col="blue",lwd=2)
   panel.abline(rlm(y~x),col = "red",lwd=2)
})

Note that the coefficients do not need to be explicitly extracted by coef(),
as panel.abline will do this automatically.

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374



Alan S Barnett wrote:
 How do I add to a trellis plot the best fit line from a robust fit? I
 can use panel.lm to add a least squares fit, but there is no panel.rlm
 function.

  How about using panel.abline() instead of panel.lmline()?

fit1 <- coef(lm(stack.loss ~ Air.Flow, data = stackloss))
fit2 <- coef(rlm(stack.loss ~ Air.Flow, data = stackloss))

xyplot(stack.loss ~ Air.Flow, data=stackloss,
   panel = function(x, y, ...){
 panel.xyplot(x, y, ...)
 panel.abline(fit1, type="l", col="blue")
 panel.abline(fit2, type="l", col="red")
   }, aspect=1)

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



Re: [R] R is not a validated software package..

2007-06-08 Thread Bert Gunter
Frank et al.:

I believe this is a bit too facile. 21 CFR Part 11 does necessitate a
software validation **process** -- but this process does not require any
particular software. Rather, it requires that those using whatever software
demonstrate to the FDA's satisfaction that the software does what it's
supposed to do appropriately. This includes a lot more than assuring, say,
the numerical accuracy of computations; I think it also requires
demonstration that the data are secure, that it is properly transferred
from one source to another, etc. I assume that the statistical validation of
R would be relatively simple, as R already has an extensive test suite, and
it would simply be a matter of providing that test suite info. A bit more
might be required, but I don't think it's such a big deal. 

I think Wensui Liu's characterization of clinical statisticians as having a
mentality related to job security is a canard. Although I work in
nonclinical, my observation is that clinical statistics is complex and
difficult, not only because of many challenging statistical issues, but also
because of the labyrinthian complexities of the regulated and extremely
costly environment in which they work. It is certainly a job that I could
not do.

That said, probably the greatest obstacle to change from SAS is neither
obstinacy nor ignorance, but rather inertia: pharmaceutical companies have
over the decades made a huge investment in SAS infrastructure to support the
collection, organization, analysis, and submission of data for clinical
trials. To convert this to anything else would be a herculean task involving
huge expense, risk, and resources. R, S-Plus (and much else -- e.g. numerous
unvalidated data mining software packages) are routinely used by clinical
statisticians to better understand their data and for exploratory analyses
that are used to supplement official analyses (e.g. for trying to justify
collection of tissue samples or a pivotal study in a patient subpopulation).
But it is difficult for me to see how one could make a business case to
change clinical trial analysis software infrastructure from SAS to S-Plus,
SPSS, or anything else.

**DISCLAIMER** 
My opinions only. They do not in any way represent the view of my company or
its employees.


Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
Sent: Friday, June 08, 2007 7:45 AM
To: Giovanni Parrinello
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] R is not a validated software package..

Giovanni Parrinello wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received 
 this answer about the statistical package that I have planned to use:
 
 As R is not a validated software package, we would like to ask if it 
 would rather be possible for you to use SAS, SPSS or another approved 
 statistical software system.
 
 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni
 

Search the archives and you'll find a LOT of responses.

Briefly, in my view there are no requirements, just some pharma 
companies that think there are.  FDA is required to accept all 
submissions, and they get some where only Excel was used, or Minitab, 
and lots more.  There is a session on this at the upcoming R 
International Users Meeting in Iowa in August.  The session will include 
discussions of federal regulation compliance for R, for those users who 
feel that such compliance is actually needed.

Frank

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] rlm results on trellis plot

2007-06-08 Thread deepayan . sarkar
On 6/7/07, Alan S Barnett [EMAIL PROTECTED] wrote:
 How do I add to a trellis plot the best fit line from a robust fit? I
 can use panel.lm to add a least squares fit, but there is no panel.rlm
 function.

Well, panel.lmline (not panel.lm, BTW) is defined as:

> panel.lmline
function (x, y, ...)
{
    if (length(x) > 0)
        panel.abline(lm(as.numeric(y) ~ as.numeric(x)), ...)
}

So it's not much of a stretch to define

panel.rlmline <- function(x, y, ...)
    if (require(MASS) && length(x) > 0)
        panel.abline(rlm(as.numeric(y) ~ as.numeric(x)), ...)

The other replies have already shown you how you might use this in a call.
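
For completeness, a self-contained sketch of such a call with separate fits in each panel. The data frame d and its columns x, y, z are simulated for illustration, and panel.rlmline is redefined here (without the require() check, since MASS is loaded up front) so that the block runs on its own. It assumes the lattice and MASS packages are installed and draws on a null pdf device so that no window is opened.

```r
library(lattice)
library(MASS)

# Robust counterpart of panel.lmline, as defined above.
panel.rlmline <- function(x, y, ...)
    if (length(x) > 0)
        panel.abline(rlm(as.numeric(y) ~ as.numeric(x)), ...)

# Simulated data: three groups on a common line, with one gross
# outlier placed in each of the first two groups.
set.seed(1)
d <- data.frame(x = rep(1:20, times = 3),
                z = gl(3, 20, labels = c("A", "B", "C")))
d$y <- 2 * d$x + rnorm(nrow(d))
d$y[c(10, 30)] <- 200

# Each panel gets its own least squares (blue) and robust (red) line.
p <- xyplot(y ~ x | z, data = d,
            panel = function(x, y, ...) {
                panel.xyplot(x, y, ...)
                panel.lmline(x, y, col = "blue", lwd = 2)
                panel.rlmline(x, y, col = "red", lwd = 2)
            })

pdf(NULL)             # null device: the panel functions run, nothing is saved
print(p)
invisible(dev.off())
```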

-Deepayan



Re: [R] R is not a validated software package..

2007-06-08 Thread Sicotte, Hugues Ph.D.
I may have overstated  things a bit.

See section VIII
http://www.fda.gov/CDER/GUIDANCE/2396dft.htm

If you are analyzing data, your statistical package does not necessarily
have to be validated. You may have to show that the statistical methods
are adequate/appropriate or that the results are reproduced with
different software if you are using non-standard packages. By all
tests, S-plus appears acceptable; I do not know about R.

However, if your statistical method is an integral part of a test, then
you do have to validate the system.
This is becoming increasingly relevant for theragnostics.

.. Which is why I said
Should they need to use those results in a report [where] that will
matter to the FDA..
(I added the where .. It makes more sense)



-Original Message-
From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED] 
Sent: Friday, June 08, 2007 11:08 AM
To: Sicotte, Hugues Ph.D.
Cc: Wensui Liu; Giovanni Parrinello; r-help@stat.math.ethz.ch
Subject: Re: [R] R is not a validated software package..

Sicotte, Hugues Ph.D. wrote:
 People, don't get angry at the pharma statistician, he is just trying to
 abide by an FDA requirement that is designed to ensure that tests perform
 reliably the same. There is no point in getting into which product is
 better. As far as the FDA rules are concerned a validated system beats a
 better system any day of the week.

There is no such requirement.

 
 Here is your polite answer.
 You can develop and try your software in R.
 Should they need to use those results in a report that will matter to
 the FDA, then you can work together with him to set up a validated
 environment for S-plus. You then have to commit to port your code to
 S-plus.

That doesn't follow.  What matters is good statistical analysis practice 
no matter which environment you use.  Note that more errors are made in 
the data preparation / derived variables stage than are made by 
statistical software.

Frank

 
 As I assume that you do not work in a regulated environment, you
 probably wouldn't have access to a validated SAS environment anyways. It
 is not usually enough to install a piece of software, you have to
 validate every step of the installation. Since AFAIK the FDA uses
 S-plus, it would be to your pharma person's advantage to speed-up
 submissions if they also had a validated S-plus environment.
 
 http://www.msmiami.com/custom/downloads/S-PLUSValidationdatasheet_Final.pdf
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Wensui Liu
 Sent: Friday, June 08, 2007 9:24 AM
 To: Giovanni Parrinello
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] R is not a validated software package..
 
 I like to know the answer as well.
 To be honest, I really have hard time to understand the mentality of
 clinical trial guys and rather believe it is something related to job
 security.
 
 On 6/8/07, Giovanni Parrinello [EMAIL PROTECTED] wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received
 this answer about the statistical package that I have planned to use:

 As R is not a validated software package, we would like to ask if it
 would rather be possible for you to use SAS, SPSS or another approved
 statistical software system.

 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni

 --
 dr. Giovanni Parrinello
 External Lecturer
 Medical Statistics Unit
 Department of Biomedical Sciences
 Viale Europa, 11 - 25123 Brescia Italy
 Tel: +390303717528
 Fax: +390303717488
 email: [EMAIL PROTECTED]


 [[alternative HTML version deleted]]


 
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R is not a validated software package..

2007-06-08 Thread Cody_Hamilton

As I read 21 CFR 11, the regulation deals more with ensuring the security
of the electronic health record itself. Thus, it seemed to me that so long
as the software (SAS, R, Splus, etc.) could not alter the data base in any
way then you're fine (this may be naive, but that's how I understood it).
The 'software validation' referred to in the document seems to be concerned
with software that is directly related to the device (i.e. software that is
part of the functionality of the device). How on earth could you really
validate every process that a stat software package is capable of anyway?
It seems to me that we would be better off focusing our attention on the
quality of the programming/data manipulation rather than on the 'validation
of the software' (which in terms of verifying its complete functionality
isn't possible anyway). I'm also curious that the open source (i.e.
non-black-box) nature of R hasn't struck more of a chord with regulatory
bodies.

I tried to make the case above to our software quality people, but they
insisted on a full OQ/IQ/PQ plan (which I'm in the process of drafting -
possibly my most boring task ever). In addition, my boss insisted on using
Splus instead of R because I couldn't convince him that R would be
acceptable (even though the OQ/IQ/PQ requirements would have been the same
and R would have saved money). That said, I would have had to write new
OQ/IQ/PQ plans every time there was a new version of R available (ouch). I
have no complaints with regards to Splus, but R would have been free.

One last thought to add to Bert's - I think one other thing that is holding
up the spread of S to the pharma/device companies is the availability of S
programmers (SAS programmers are plentiful). I tried to get 'has experience
in S' added to our next job opening, but my boss insisted that we would
never find a person with experience in SAS and S. I countered by asking if
we could just ask for S experience (and drop the SAS requirement), but he
gave me a dirty look. :)

Cody Hamilton, PhD
Edwards Lifesciences



[R] compute new variable

2007-06-08 Thread Matthias von Rad
Hello,
maybe my question is stupid, but I would like to calculate a new
variable for all cases in my dataset. Inspired by the dialog in Rcmdr
I tried
Datenmatrix$cohigha <- with(Datenmatrix, mean(c(M2ORG, M5ORG, M8ORG,
M11ORG), na.rm = TRUE))
As output I got the same number for all my cases (possibly the
overall mean of all cases), instead of a mean for each case.
Can you help me with this problem?
regards
Matthias
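
One possible way to get a mean per case, sketched on a made-up three-row data frame (the values are invented; only the column names come from the post). mean(c(...)) pools all four columns into one vector and returns a single number, which is then recycled for every row; rowMeans() works row by row instead.

```r
# Toy stand-in for the real data, with a few missing values.
Datenmatrix <- data.frame(M2ORG  = c(1, 5, NA),
                          M5ORG  = c(3, 4, 5),
                          M8ORG  = c(5, 6, 7),
                          M11ORG = c(7, NA, 9))
cols <- c("M2ORG", "M5ORG", "M8ORG", "M11ORG")

# What the original call computes: one overall mean across all cases.
overall <- with(Datenmatrix,
                mean(c(M2ORG, M5ORG, M8ORG, M11ORG), na.rm = TRUE))

# A mean for each case (row), ignoring missing values within the row.
Datenmatrix$cohigha <- rowMeans(Datenmatrix[, cols], na.rm = TRUE)
print(Datenmatrix)
```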



[R] ievent.wait

2007-06-08 Thread ryestone

I am working on a plot and would like to click on a few points and then
have a line connect them. Could anyone help me with this, or point me in a
suitable direction? I believe I would need ievent.wait in iplots,
but I am not sure about this.

thank you.



Re: [R] R is not a validated software package..

2007-06-08 Thread Frank E Harrell Jr
Bert Gunter wrote:
 Frank et. al:
 
 I believe this is a bit too facile. 21 CFR Part 11 does necessitate a
 software validation **process** -- but this process does not require any

For database software and for medical devices -

 particular software. Rather, it requires that those using whatever software
 demonstrate to the FDA's satisfaction that the software does what it's
 supposed to do appropriately. This includes a lot more than assuring, say,
 the numerical accuracy of computations; I think it also requires
 demonstration that the data are secure, that it is properly transferred
 from one source to another, etc. I assume that the statistical validation of
 R would be relatively simple, as R already has an extensive test suite, and
 it would simply be a matter of providing that test suite info. A bit more
 might be required, but I don't think it's such a big deal. 
 
 I think Wensui Liu's characterization of clinical statisticians as having a
 mentality related to job security is a canard. Although I work in
 nonclinical, my observation is that clinical statistics is complex and
 difficult, not only because of many challenging statistical issues, but also
 because of the labyrinthian complexities of the regulated and extremely
 costly environment in which they work. It is certainly a job that I could
 not do.
 
 That said, probably the greatest obstacle to change from SAS is neither
 obstinacy nor ignorance, but rather inertia: pharmaceutical companies have
 over the decades made a huge investment in SAS infrastructure to support the
 collection, organization, analysis, and submission of data for clinical
 trials. To convert this to anything else would be a herculean task involving
 huge expense, risk, and resources. R, S-Plus (and much else -- e.g. numerous
 unvalidated data mining software packages) are routinely used by clinical
 statisticians to better understand their data and for exploratory analyses
 that are used to supplement official analyses (e.g. for trying to justify
 collection of tissue samples or a pivotal study in a patient subpopulation).
 But it is difficult for me to see how one could make a business case to
 change clinical trial analysis software infrastructure from SAS to S-Plus,
 SPSS, or anything else.

What I would love to have is some efficiency estimates for SAS macro 
programming as done in pharma vs. using a high-level language.  My bias 
is that SAS macro programming, which costs companies more than SAS 
licenses, is incredibly inefficient.

Frank

 
 **DISCLAIMER**
 My opinions only. They do not in any way represent the view of my company or
 its employees.
 
 
 Bert Gunter
 Genentech Nonclinical Statistics
 South San Francisco, CA 94404
 650-467-7374
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
 Sent: Friday, June 08, 2007 7:45 AM
 To: Giovanni Parrinello
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] R is not a validated software package..
 
 Giovanni Parrinello wrote:
 Dear All,
 discussing with a statistician of a pharmaceutical company I received 
 this answer about the statistical package that I have planned to use:

 As R is not a validated software package, we would like to ask if it 
 would rather be possible for you to use SAS, SPSS or another approved 
 statistical software system.

 Could someone suggest me a 'polite' answer?
 TIA
 Giovanni

 
 Search the archives and you'll find a LOT of responses.
 
 Briefly, in my view there are no requirements, just some pharma
 companies that think there are.  FDA is required to accept all
 submissions, and they get some where only Excel was used, or Minitab,
 and lots more.  There is a session on this at the upcoming R
 International Users Meeting in Iowa in August.  The session will include
 discussions of federal regulation compliance for R, for those users who
 feel that such compliance is actually needed.
 
 Frank
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] compute new variable

2007-06-08 Thread Chuck Cleland
Matthias von Rad wrote:
 Hello,
 maybe my question is stupid, but I would like to calculate a new
 variable for all cases in my dataset. Inspired by the dialog in Rcmdr
 I tried
 Datenmatrix$cohigha <- with(Datenmatrix, mean(c(M2ORG, M5ORG, M8ORG,
 M11ORG), na.rm = TRUE))
 As output I got the same number for all my cases (possibly the
 overall mean of all cases), instead of a mean for each case.
 Can you help me with this problem?

Datenmatrix$cohigha <- rowMeans(Datenmatrix[, c("M2ORG", "M5ORG",
"M8ORG", "M11ORG")], na.rm=TRUE)
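
A minimal sketch of why the two calls differ, using a small made-up data
frame (the column names come from the question; the values are invented for
illustration only):

```r
# Hypothetical data: column names follow the thread, values are invented.
Datenmatrix <- data.frame(
  M2ORG  = c(1, 2, NA),
  M5ORG  = c(3, 4, 5),
  M8ORG  = c(5, 6, 7),
  M11ORG = c(7, NA, 9)
)
cols <- c("M2ORG", "M5ORG", "M8ORG", "M11ORG")

# c() pools all four columns into one long vector, so mean() returns a
# single number that would be recycled to every row on assignment.
overall <- with(Datenmatrix,
                mean(c(M2ORG, M5ORG, M8ORG, M11ORG), na.rm = TRUE))

# rowMeans() averages across the selected columns within each row,
# skipping missing values row by row.
per_case <- rowMeans(Datenmatrix[, cols], na.rm = TRUE)

print(overall)
print(per_case)
```

With na.rm = TRUE each case is averaged over the items it actually has, so
cases with missing items are based on fewer values.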

 regards
 Matthias
 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



Re: [R] R is not a validated software package..

2007-06-08 Thread Wensui Liu
Bert,
I just want to make sure that what I said is not overstated in a way that
offends statisticians who use SAS. Actually, I use SAS daily and am able to
use it pretty well. ^_^
What I meant was:
1) I don't understand the mentality;
2) using SAS instead of R might be related to job security.
That is very different from saying that their mentality is related to job security.

On 6/8/07, Bert Gunter [EMAIL PROTECTED] wrote:
 Frank et al.:

 I believe this is a bit too facile. 21 CFR Part 11 does necessitate a
 software validation **process** -- but this process does not require any
 particular software. Rather, it requires that those using whatever software
 demonstrate to the FDA's satisfaction that the software does what it's
 supposed to do appropriately. This includes a lot more than assuring, say,
 the numerical accuracy of computations; I think it also requires
 demonstration that the data are secure, that it is properly transferred
 from one source to another, etc. I assume that the statistical validation of
 R would be relatively simple, as R already has an extensive test suite, and
 it would simply be a matter of providing that test suite info. A bit more
 might be required, but I don't think it's such a big deal.

 I think Wensui Liu's characterization of clinical statisticians as having a
 mentality related to job security is a canard. Although I work in
 nonclinical, my observation is that clinical statistics is complex and
 difficult, not only because of many challenging statistical issues, but also
 because of the labyrinthian complexities of the regulated and extremely
 costly environment in which they work. It is certainly a job that I could
 not do.

 That said, probably the greatest obstacle to change from SAS is neither
 obstinacy nor ignorance, but rather inertia: pharmaceutical companies have
 over the decades made a huge investment in SAS infrastructure to support the
 collection, organization, analysis, and submission of data for clinical
 trials. To convert this to anything else would be a herculean task involving
 huge expense, risk, and resources. R, S-Plus (and much else -- e.g. numerous
 unvalidated data mining software packages) are routinely used by clinical
 statisticians to better understand their data and for exploratory analyses
 that are used to supplement official analyses (e.g. for trying to justify
 collection of tissue samples or a pivotal study in a patient subpopulation).
 But it is difficult for me to see how one could make a business case to
 change clinical trial analysis software infrastructure from SAS to S-Plus,
 SPSS, or anything else.

 **DISCLAIMER**
 My opinions only. They do not in any way represent the view of my company or
 its employees.


 Bert Gunter
 Genentech Nonclinical Statistics
 South San Francisco, CA 94404
 650-467-7374


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
 Sent: Friday, June 08, 2007 7:45 AM
 To: Giovanni Parrinello
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] R is not a validated software package..

 Giovanni Parrinello wrote:
  Dear All,
  discussing with a statistician of a pharmaceutical company I received
  this answer about the statistical package that I have planned to use:
 
  As R is not a validated software package, we would like to ask if it
  would rather be possible for you to use SAS, SPSS or another approved
  statistical software system.
 
  Could someone suggest me a 'polite' answer?
  TIA
  Giovanni
 

 Search the archives and you'll find a LOT of responses.

 Briefly, in my view there are no requirements, just some pharma
 companies that think there are.  FDA is required to accept all
 submissions, and they get some where only Excel was used, or Minitab,
 and lots more.  There is a session on this at the upcoming R
 International Users Meeting in Iowa in August.  The session will include
 discussions of federal regulation compliance for R, for those users who
 feel that such compliance is actually needed.

 Frank

 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University




-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)


[R] pnorm how to decide lower-tail true or false

2007-06-08 Thread Carmen Meier
Hi to all,
maybe the last question was not clear enough.
I did not find any hints on how to decide whether I should use lower.tail
or not.
As it is an extra R feature (as written in
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/66250.html),
I cannot find anything about it in any of my statistics books.
Regards Carmen



[R] Batch processing in Windows

2007-06-08 Thread Sébastien Bihorel
Hi,

I am a complete newbie to R, so the following problem will probably be
trivial for most of you guys: I get an error message every time I try
to run an R file directly from the DOS shell.

My R file (test.R) is intended to create a basic graph and contains very
simple code:

x <- rep(1:10, 1)
y <- rep(1:10, 1)
plot(x, y)

I am using the following command to call this file directly from the c:/
root:
C:/> R CMD BATCH "e:/Documents Seb/3_/test.R"

And here is the error message (Translated from french to english):
'R' is not recognized as an internal or external command, an executable 
script or a command file

My OS is a french Windows XP sp2 and I am using R version 2.5.0. I 
wonder if the problem comes from an installation problem...

Thank you in advance for your help.

Sebastien



Re: [R] How to make a table of a desired dimension

2007-06-08 Thread Adaikalavan Ramasamy
You need to basically use table on factors with fixed pre-specified 
levels. For example:

  x <- c(runif(100,10,40), runif(100,43,55))
  y <- c(runif(100,7,35),  runif(100,37,50))
  z <- c(runif(100,10,42), runif(100,45,52))
  xx <- ceiling(x);  yy <- ceiling(y);  zz <- ceiling(z)

  mylevels <- min( c(xx, yy, zz) ) : max( c(xx, yy, zz) )

  out <- cbind( table( factor(xx, levels=mylevels) ),
                table( factor(yy, levels=mylevels) ),
                table( factor(zz, levels=mylevels) ) )

You could replace the last command with simply

  sapply( list(xx, yy, zz),
          function(vec) table( factor(vec, levels=mylevels) ) )
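
As a small illustration of the same idea, here is a sketch with two short
made-up vectors instead of the runif() draws, so the zero-filled cells are
easy to see (the names counts, xx and yy are only for this example):

```r
# Made-up integer data with gaps; not the data from the thread.
xx <- c(2, 2, 5)
yy <- c(1, 3, 3, 3)

# One common set of levels spanning the range of all vectors.
mylevels <- min(c(xx, yy)):max(c(xx, yy))

# Converting to a factor with fixed levels makes table() report every
# level, including those that never occur (count 0).
counts <- sapply(list(xx = xx, yy = yy),
                 function(vec) table(factor(vec, levels = mylevels)))

print(counts)
```

Each column of counts has one row per level, so the result can be passed
straight to matplot() or combined with the level values.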

Regards, Adai



Rubén Roa-Ureta wrote:
 Hi ComRades,
 
 I want to make a matrix of frequencies from vectors of a continuous 
 variable spanning different values. For example this code
 x <- c(runif(100,10,40), runif(100,43,55))
 y <- c(runif(100,7,35), runif(100,37,50))
 z <- c(runif(100,10,42), runif(100,45,52))
 a <- table(ceiling(x))
 b <- table(ceiling(y))
 c <- table(ceiling(z))
 a
 b
 c
 
 will give me three tables that start and end at different integer 
 values, and besides, they have 'holes' in between different integer 
 values. Is it possible to use 'table' to make these three tables have 
 the same dimensions, filling in the absent labels with zeroes? In the 
 example above, the desired tables should all start at 8 and tables 'a' 
 and 'c' should put a zero at labels '8' to '10', should all put zeros in 
 the frequencies of the labels corresponding to the holes, and should all 
 end at label '55'. The final purpose is to make a matrix and use
 'matplot' to plot all the frequencies in one plot, such as
 
 #code valid only when 'a', 'b', and 'c' have the proper dimension
 p <- mat.or.vec(48,4)
 p[,1] <- 8:55
 p[,2] <- c(matrix(a)[1:48])
 p[,3] <- c(matrix(b)[1:48])
 p[,4] <- c(matrix(c)[1:48])
 matplot(p)
 
 I read the help about 'table' but I couldn't figure out if dnn, 
 deparse.level, or the other arguments could serve my purpose. Thanks for 
 your help
 Rubén
 
 
 




Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Chris Evans

Martin Henry H. Stevens sent the following  at 08/06/2007 15:11:
 Is there an example available of this sort of problematic data that  
 requires this kind of data screening and filtering? For many of us,  
 this issue would be nice to learn about, and deal with within R. If a  
 package could be created, that would be optimal for some of us. I  
 would like to learn a tad more, if it were not too much effort for  
 someone else to point me in the right direction?
 Cheers,
 Hank
 On Jun 8, 2007, at 8:47 AM, Douglas Bates wrote:
 
 On 6/7/07, Robert Wilkins [EMAIL PROTECTED] wrote:
 As noted on the R-project web site itself ( www.r-project.org -

... rest snipped ...

OK, I can't resist that invitation.  I think there are many kinds of
problematic data.  I handle some nasty textish things in perl (and I
loved the purgatory quote) and I'm afraid I do some things in Excel and
some cleaning I can handle in R, but I never enter data directly into R.

However, one very common scenario I have faced all my working life is
psych data from questionnaires or interviews in low budget work, mostly
student research or routine entry of therapists' data.  Typically you
have an identifier, a date, some demographics and then a lot of item
data.  There's little money (usually zero) involved for data entry and
cleaning but I've produced a lot of good(ish) papers out of this sort of
very low budget work over the last 20 years.  (Right at the other end of
a financial spectrum from the FDA/validated s'ware thread but this is
about validation again!)

The problem I often face is that people are lousy data entry machines
(well, actually, they vary ... enormously) and if they mess up the data
entry we all know how horrible this can be.

SPSS (boo hiss) used to have an excellent module, actually a
standalone PC/Windoze program, that allowed you to define variables so
they had allowed values and it would refuse to accept out of range or
out of acceptable entries, it also allowed you to create checking rules
and rules that would, in the light of earlier entries, set later values
and not ask about them.  In a rudimentary way you could also lay things
out on the screen so that it paginated where the q'aire or paper data
record did etc.  The final nice touch was that you could define some
variables as invariant and then set the thing so an independent data
entry person could re-enter the other data (i.e. pick up q'aire, see if
ID fits the one showing on screen, if so, enter the rest of the data).
It would bleep and not move on if you entered a value other than that
entered by the first person and you had to confirm that one of you was
right.

That saved me wasted weeks I'm sure on analysing data that turned out to
be awful and I'd love to see someone build something to replace that.

Currently I tend to use (boo hiss) Excel for this as everyone I work
with seems to have it (and not all can install open office and anyway I
haven't had time to learn that properly yet either ...) and I set up
spreadsheets with validation rules set.  That doesn't get the branching
rules and checks (e.g. if male, skip questions about periods, PMT and
pregnancies), or at least, with my poor Excel skills it doesn't.  I just
skip a column to indicate page breaks in the q'aire, and I get, when I
can, two people to enter the data separately and then use R to compare
the two spreadsheets having yanked them into data frames.
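
A minimal sketch of that double-entry comparison, assuming both
spreadsheets have already been read into data frames with the same columns
and the same row order (the names entry1, entry2 and all values are
hypothetical):

```r
# Hypothetical double-entry data: the same questionnaires keyed twice.
entry1 <- data.frame(id = 1:4, age = c(23, 35, 41, 29), item1 = c(1, 3, 2, 5))
entry2 <- data.frame(id = 1:4, age = c(23, 53, 41, 29), item1 = c(1, 3, 2, 4))

# The comparison below is only meaningful if the layouts agree.
stopifnot(identical(dim(entry1), dim(entry2)),
          identical(entry1$id, entry2$id))

m1 <- as.matrix(entry1)
m2 <- as.matrix(entry2)

# Row/column positions of every cell where the two entries disagree.
# Cells that are NA in either entry are not flagged by this comparison.
diffs <- which(m1 != m2, arr.ind = TRUE)

# One line per disagreement, ready to check against the paper record.
mismatches <- data.frame(
  id       = entry1$id[diffs[, "row"]],
  variable = names(entry1)[diffs[, "col"]],
  first    = m1[diffs],
  second   = m2[diffs]
)

print(mismatches)
```

This only reports disagreements; deciding which entry is right still means
going back to the questionnaire.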

I would really, really love someone to develop (and perhaps replace) the
rather buggy edit() and fix() routines (seem to hang on big data frames
in Rcmdr which is what I'm trying to get students onto) with something
that did some or all of what SPSS/DE used to do for me or I bodge now in
Excel.  If any generous coding whiz were willing to do this, I'll try to
alpha and beta test and write help etc.

There _may_ be good open source things out there that do what I need but
something that really integrated into R would be another huge step
forward in being able to phase out SPSS in my work settings and phase in R.

Very best all,

Chris



-- 
Chris Evans [EMAIL PROTECTED] Skype: chris-psyctc
Professor of Psychotherapy, Nottingham University;
Consultant Psychiatrist in Psychotherapy, Notts PDD network;
Research Programmes Director, Nottinghamshire NHS Trust;
*If I am writing from one of those roles, it will be clear. Otherwise*
*my views are my own and not representative of those institutions*



Re: [R] pointwise confidence bands or interval values for a non parametric sm.regression

2007-06-08 Thread Mark Difford

Hi Martin,

Do please, at least, read the documentation for the package you are using!:

?sm.options   ## sub: display

## Example
with(iris, sm.regression(Sepal.Length, Sepal.Width, display="se"))

Regards,
Mark Difford.


M. P. Papadatos wrote:
 
 Dear all,
 
 Is there a way to plot / calculate pointwise confidence bands or  
 interval values for a non parametric regression like sm.regression?
 
 Thank you in advance.
 
 Regards,
 
 Martin
 
 
 
 
 



Re: [R] Batch processing in Windows

2007-06-08 Thread Gabor Grothendieck
R isn't in your path.  Either change your path to include it or place
Rcmd.bat from batchfiles anywhere in your existing path:

   http://code.google.com/p/batchfiles/

and then:

   Rcmd BATCH ...whatever...


On 6/8/07, Sébastien Bihorel [EMAIL PROTECTED] wrote:
 Hi,

 I am a complete newbie to R, so the following problem will probably be
 trivial for most of you guys: I get an error message every time I try
 to run an R file directly from the DOS shell.

 My R file (test.R) is intended to create a basic graph and contains very
 simple code:

 x <- rep(1:10, 1)
 y <- rep(1:10, 1)
 plot(x, y)

 I am using the following command to call this file directly from the c:/
 root:
 C:/> R CMD BATCH "e:/Documents Seb/3_/test.R"

 And here is the error message (Translated from french to english):
 'R' is not recognized as an internal or external command, an executable
 script or a command file

 My OS is a french Windows XP sp2 and I am using R version 2.5.0. I
 wonder if the problem comes from an installation problem...

 Thank you in advance for your help.

 Sebastien



Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Dale Steele
For windows users, EpiData Entry http://www.epidata.dk/ is an
excellent (free) tool for data entry and documentation.--Dale


On 6/8/07, Chris Evans [EMAIL PROTECTED] wrote:

 Martin Henry H. Stevens sent the following  at 08/06/2007 15:11:
  Is there an example available of this sort of problematic data that
  requires this kind of data screening and filtering? For many of us,
  this issue would be nice to learn about, and deal with within R. If a
  package could be created, that would be optimal for some of us. I
  would like to learn a tad more, if it were not too much effort for
  someone else to point me in the right direction?
  Cheers,
  Hank
  On Jun 8, 2007, at 8:47 AM, Douglas Bates wrote:
 
  On 6/7/07, Robert Wilkins [EMAIL PROTECTED] wrote:
  As noted on the R-project web site itself ( www.r-project.org -

 ... rest snipped ...

 OK, I can't resist that invitation.  I think there are many kinds of
 problematic data.  I handle some nasty textish things in perl (and I
 loved the purgatory quote) and I'm afraid I do some things in Excel and
 some cleaning I can handle in R, but I never enter data directly into R.

 However, one very common scenario I have faced all my working life is
 psych data from questionnaires or interviews in low budget work, mostly
 student research or routine entry of therapists' data.  Typically you
 have an identifier, a date, some demographics and then a lot of item
 data.  There's little money (usually zero) involved for data entry and
 cleaning but I've produced a lot of good(ish) papers out of this sort of
 very low budget work over the last 20 years.  (Right at the other end of
 a financial spectrum from the FDA/validated s'ware thread but this is
 about validation again!)

 The problem I often face is that people are lousy data entry machines
 (well, actually, they vary ... enormously) and if they mess up the data
 entry we all know how horrible this can be.

 SPSS (boo hiss) used to have an excellent module, actually a
 standalone PC/Windoze program, that allowed you to define variables so
 they had allowed values and it would refuse to accept out of range or
 out of acceptable entries, it also allowed you to create checking rules
 and rules that would, in the light of earlier entries, set later values
 and not ask about them.  In a rudimentary way you could also lay things
 out on the screen so that it paginated where the q'aire or paper data
 record did etc.  The final nice touch was that you could define some
 variables as invariant and then set the thing so an independent data
 entry person could re-enter the other data (i.e. pick up q'aire, see if
 ID fits the one showing on screen, if so, enter the rest of the data).
 It would bleep and not move on if you entered a value other than that
 entered by the first person and you had to confirm that one of you was
 right.

 That saved me wasted weeks I'm sure on analysing data that turned out to
 be awful and I'd love to see someone build something to replace that.

 Currently I tend to use (boo hiss) Excel for this as everyone I work
 with seems to have it (and not all can install open office and anyway I
 haven't had time to learn that properly yet either ...) and I set up
 spreadsheets with validation rules set.  That doesn't get the branching
 rules and checks (e.g. if male, skip questions about periods, PMT and
 pregnancies), or at least, with my poor Excel skills it doesn't.  I just
 skip a column to indicate page breaks in the q'aire, and I get, when I
 can, two people to enter the data separately and then use R to compare
 the two spreadsheets having yanked them into data frames.

 I would really, really love someone to develop (and perhaps replace) the
 rather buggy edit() and fix() routines (seem to hang on big data frames
 in Rcmdr which is what I'm trying to get students onto) with something
 that did some or all of what SPSS/DE used to do for me or I bodge now in
 Excel.  If any generous coding whiz were willing to do this, I'll try to
 alpha and beta test and write help etc.

 There _may_ be good open source things out there that do what I need but
 something that really integrated into R would be another huge step
 forward in being able to phase out SPSS in my work settings and phase in R.

 Very best all,

 Chris



 --
 Chris Evans [EMAIL PROTECTED] Skype: chris-psyctc
 Professor of Psychotherapy, Nottingham University;
 Consultant Psychiatrist in Psychotherapy, Notts PDD network;
 Research Programmes Director, Nottinghamshire NHS Trust;
 *If I am writing from one of those roles, it will be clear. Otherwise*
 *my views are my own and not representative of those institutions*




Re: [R] pnorm how to decide lower-tail true or false

2007-06-08 Thread Robert A LaBudde
At 01:31 PM 6/8/2007, Carmen wrote:
Hi to all,
maybe the last question was not clear enough.
I did not find any hints on how to decide whether I should use lower.tail
or not.
As it is an extra R feature (as written in
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/66250.html),
I cannot find anything about it in any of my statistics books.
Regards Carmen

pnorm(z, lower.tail=TRUE) (the R default) gives the probability of a 
normal variate being at or below z. This is the value commonly called 
the cumulative distribution function at the point z, or the integral 
from -Inf to z of the gaussian density.

pnorm(z, lower.tail=FALSE) gives the complement of the above, or 1 - 
cdf(z), and is the integral from z to Inf of the gaussian density.

E.g.,

> pnorm(1.96, lower.tail=TRUE)
[1] 0.9750021
> pnorm(1.96, lower.tail=FALSE)
[1] 0.02499790
 

Use lower.tail=TRUE if you are, e.g., finding the probability at the
lower tail of a confidence interval or if you want the probability
of values no larger than z.

Use lower.tail=FALSE if you are, e.g., trying to calculate test value 
significance or at the upper confidence limit, or you want the 
probability of values z or larger.

You should use pnorm(z, lower.tail=FALSE) instead of 1-pnorm(z) 
because the former returns a more accurate answer for large z.
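
The accuracy point can be checked directly. A small sketch, where z = 10 is
just an arbitrary large value chosen for illustration:

```r
z <- 10

# pnorm(z) is so close to 1 that it is stored as exactly 1 in double
# precision, so the subtraction loses the whole upper-tail probability.
by_subtraction <- 1 - pnorm(z)

# Asking for the upper tail directly keeps the tiny probability
# (roughly 7.6e-24 for z = 10).
direct <- pnorm(z, lower.tail = FALSE)

print(by_subtraction)
print(direct)

# For moderate z the two tails still add up to 1 as expected.
print(pnorm(1.96) + pnorm(1.96, lower.tail = FALSE))
```

For moderate z both forms agree to many digits; the difference only matters
far out in the tail, for example with very small p-values.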

This is really a simple issue, and has no inherent complexity
associated with it.

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire



Re: [R] Batch processing in Windows

2007-06-08 Thread Bos, Roger
Alternatively, use the full path in your call to R as I do below:

"F:\Program Files\R\R-2.4.1pat\bin\R.exe" CMD BATCH --vanilla --slave whatever.R
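
If you are unsure where that full path is on your machine, a running R
session can tell you. This is only a sketch: whatever.R is the placeholder
script name from the command above, and the printed path depends on your
installation.

```r
# Directory that holds the R executables of the running installation.
bin_dir <- R.home("bin")

# Full path to the front end; on Windows the file is R.exe, elsewhere R.
r_exe <- file.path(bin_dir,
                   if (.Platform$OS.type == "windows") "R.exe" else "R")

# Quote the path so that spaces (as in "Program Files") survive the shell.
cmd <- paste(shQuote(r_exe), "CMD BATCH --vanilla --slave whatever.R")
cat(cmd, "\n")
```

The directory printed by R.home("bin") is also the one to add to the PATH
if you would rather type a plain R at the command prompt.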

HTH,

Roger

 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
Grothendieck
Sent: Friday, June 08, 2007 1:51 PM
To: Sébastien Bihorel
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Batch processing in Windows

R isn't in your path.  Either change your path to include it or place Rcmd.bat 
from batchfiles anywhere in your existing path:

   http://code.google.com/p/batchfiles/

and then:

   Rcmd BATCH ...whatever...


On 6/8/07, Sébastien Bihorel [EMAIL PROTECTED] wrote:
 Hi,

 I am a complete newbie to R, so the following problem will probably be
 trivial for most of you guys: I get an error message every time I try
 to run an R file directly from the DOS shell.

 My R file (test.R) is intended to create a basic graph and contains very
 simple code:

 x <- rep(1:10, 1)
 y <- rep(1:10, 1)
 plot(x, y)

 I am using the following command to call this file directly from the c:/
 root:
 C:/> R CMD BATCH "e:/Documents Seb/3_/test.R"

 And here is the error message (Translated from french to english):
 'R' is not recognized as an internal or external command, an 
 executable script or a command file

 My OS is a french Windows XP sp2 and I am using R version 2.5.0. I 
 wonder if the problem comes from an installation problem...

 Thank you in advance for your help.

 Sebastien



[R] glm() for log link and Weibull family

2007-06-08 Thread Robert A. LaBudde
I need to be able to run a generalized linear model with a log() link 
and a Weibull family, or something similar to deal with an extreme 
value distribution.

I actually have a large dataset where this is apparently necessary. 
It has to do with recovery of forensic samples from surfaces, where 
as much powder as possible is collected. This apparently causes the 
results to conform to some type of extreme value distribution, so 
Weibull is a reasonable starting point for exploration.

I have tried ('surface' and 'team' are factors)

glm(surfcount ~ surface*team, data=powderd, family=Gamma(link='log'))

but this doesn't quite do the trick. The standardized deviance 
residuals are still curved away from normal at the tails.

Thanks for any info you can give on this nonstandard model.

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

Vere scire est per causas scire


Re: [R] evaluating variables in the context of a data frame

2007-06-08 Thread Duncan Murdoch
On 6/8/2007 11:33 AM, Zack Weinberg wrote:
> On 6/7/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
>> > f <- function(x, dat) evalq(x, dat)
>> > f(o, D)
>> > Error in eval(expr, envir, enclos) : object o not found
>> > g <- function(x, dat) eval(x, dat)
>> > g(o, D)
>> > Error in eval(x, dat) : object o not found
>> >
>> > What am I doing wrong?  This seems to be what the helpfiles say you do
>> > to evaluate arguments in the context of a passed-in data frame...
>>
>> When you call f(o, D), the argument 'o' is evaluated in the current
>> environment ('context' in R means something different).  Because of lazy
>> evaluation, it is not evaluated until evalq is called, but it is evaluated
>> as if it was evaluated greedily.
>>
>> g(quote(o), D) will work.
>
> Thanks.
>
> After a bit more experimentation I figured out that this does what I want:
>
> h <- function(x, d) eval(substitute(x), d, parent.frame())
>
> but I don't understand why the substitute() helps, or indeed why it
> has any effect at all...

Within the evaluation frame of h, x is a promise to evaluate an 
expression.  substitute(x) extracts the expression.  If you just use x, 
it gets evaluated in the frame from which h was called, rather than in a 
frame created from d.
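
To see the difference concretely, here is a minimal sketch. The data
frame D, its column o, and the variable k are made up for illustration,
and it assumes no object named o exists in the workspace.

```r
# Hypothetical data frame with a column named 'o' (not from the thread).
D <- data.frame(o = c(1, 2, 3))
k <- 10

# eval(x, dat) forces the promise 'x' in the caller's frame first,
# so a bare 'o' is looked up there and not in 'dat'.
g <- function(x, dat) eval(x, dat)

# substitute(x) returns the unevaluated expression behind the promise;
# eval() then evaluates it with the columns of 'd' visible, falling back
# to the caller's frame for anything not found in 'd'.
h <- function(x, d) eval(substitute(x), d, parent.frame())

g_fails   <- inherits(try(g(o, D), silent = TRUE), "try-error")
res_quote <- g(quote(o), D)  # the expression is passed explicitly
res_subst <- h(o, D)         # substitute() recovers the symbol 'o'
res_expr  <- h(o * k, D)     # 'o' comes from D, 'k' from the caller
```

The third argument to eval() matters only for names that are not
columns of d, such as k above.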

Duncan Murdoch


Re: [R] R is not a validated software package..

2007-06-08 Thread Cody_Hamilton

Not to mention all the work that goes into PROC TEMPLATE and ANNOTATE to
make SAS graphs presentable! I suspect that a lot of companies don't use
SAS graphs or tables at all - they just export the data from SAS to Excel.

-Cody

Cody Hamilton, PhD
Edwards Lifesciences

What I would love to have is some efficiency estimates for SAS macro
programming as done in pharma vs. using a high-level language.  My bias
is that SAS macro programming, which costs companies more than SAS
licenses, is incredibly inefficient.

Frank


[R] wrapping lattice xyplot

2007-06-08 Thread Zack Weinberg
This is an expanded version of the question I tried to ask last night
- I thought I had it this morning, but it's still not working and I
just do not understand what is going wrong.

What I am trying to do is write a wrapper for lattice xyplot() that
passes a whole bunch of its secondary arguments, so that I can produce
similarly formatted graphs for several different data sets.  This is
what I've got:

graph <- function (x, data, groups, xlab) {
  g <- eval(substitute(groups), data, parent.frame())

  pg <- function(x, y, group.number, ...) {
    panel.xyplot(x, y, ..., group.number=group.number)
    panel.text(2, unique(y[x==2]),
               levels(g)[group.number],
               pos=4, cex=0.5)
  }

  xyplot(x, data=data, groups=substitute(g),
         type='l',
         ylab=list(cex=1.1, label='Mean RT (ms)'),
         xlab=list(cex=1.1, label=xlab),
         scales=list(
           x=list(alternating=c(1,1), tck=c(1,0)),
           y=list(alternating=c(1,0))
         ),
         panel=panel.superpose,
         panel.groups=pg
  )
}

pg is supposed to pick g up from the lexical enclosure. I have no
idea whether that actually works, because it never gets that far.  A
typical call to this function looks like so:

  graph(est ~ pro | hemi, sm, obs, "Probe type")

(where 'sm' is a data frame that really does contain all four columns
'est', 'pro', 'hemi', and 'obs', pinky swear) and, as it stands above,
invariably gives me this error:

Error in eval(expr, envir, enclos) : object est not found

I tried substitute(x) (as that seems to have cured a similar problem
with g) but then x is not a formula and method dispatch fails.

Help?
zw


Re: [R] evaluating variables in the context of a data frame

2007-06-08 Thread Zack Weinberg
On 6/8/07, Duncan Murdoch [EMAIL PROTECTED] wrote:
> > After a bit more experimentation I figured out that this does what I want:
> >
> > h <- function(x, d) eval(substitute(x), d, parent.frame())
> >
> > but I don't understand why the substitute() helps, or indeed why it
> > has any effect at all...
>
> Within the evaluation frame of h, x is a promise to evaluate an
> expression.  substitute(x) extracts the expression.  If you just use x,
> it gets evaluated in the frame from which h was called, rather than in a
> frame created from d.

Thanks, that's helpful.  Could you comment on substitute() use in the
message I just posted which contains the actual code I'm trying to get
to work?  In addition to the question asked there, after your
explanation I still do not understand why

  g <- ...
  xyplot ( ..., groups=g, ... )

should refuse to find g, and the same thing with groups=substitute(g)
works (well, gets farther before blowing up).

zw


[R] how to find how many modes in 2 dimensions case

2007-06-08 Thread Patrick Wang
Hi,

Does anyone know how to count the number of modes in 2 dimensions using
kde2d function?

Thanks
Pat


Re: [R] R is not a validated software package..

2007-06-08 Thread Marc Schwartz
On Fri, 2007-06-08 at 16:02 +0200, Giovanni Parrinello wrote:
> Dear All,
> discussing with a statistician of a pharmaceutical company I received
> this answer about the statistical package that I have planned to use:
>
> As R is not a validated software package, we would like to ask if it
> would rather be possible for you to use SAS, SPSS or another approved
> statistical software system.
>
> Could someone suggest me a 'polite' answer?
> TIA
> Giovanni
>

The polite answer is that there is no such thing as 'FDA approved'
software for conducting clinical trials. The FDA does not approve,
validate or otherwise endorse software.

If the pharma company in question has developed their own list of
acceptable software applications that you must comply with, that is
different, but is independent of any FDA requirements.

As the saying went several decades ago, "Nobody ever got fired for
buying IBM."  In the clinical trials realm today, the same could be said
for SAS or Oracle Clinical.

That is a political issue, and perhaps one driven by risk aversion on
the part of corporate legal counsel, not a scientific one.  It is also a
human behavioral issue, as Bert noted, relative to fighting inertia,
training or re-training issues and the pre-existing investment in internal
processes and infrastructure.  This will change over time as more
statisticians, who have been trained in the use of R during their
academic years, enter into industry positions.

As others have noted, there is a PERCEPTION that somehow SAS is endorsed
by the FDA or that it constitutes a 'gold standard' of sorts. This is a
perception and not reality.

That being said:

There are a variety of relevant Guidance and Guideline documents that
the FDA has put forth to address these issues. Most recently, the FDA
approved final guidance for the use of computerized systems in clinical
investigations (May 2007):

http://www.fda.gov/OHRMS/DOCKETS/98fr/04d-0440-gdl0002.pdf

In addition, there is a General Principles of Software Validation
document:

http://www.fda.gov/cdrh/comp/guidance/938.html

The majority of the 21 CFR Part 11 requirements (audit trails,
electronic signatures, etc.) are relevant to systems that manage source
medical records. These would typically be database applications and
medical devices, not statistical applications. In our shop for example,
our Oracle 10g server has been implemented in accordance with these
requirements.

There is a 21 CFR Part 11 guidance document here:

http://www.fda.gov/ohrms/dockets/98fr/5667fnl.pdf

There are also all of the so-called FDA and ICH GxP (Good x Practice)
documents:

   http://www.fda.gov/oc/gcp/guidance.html
   http://www.ich.org/cache/compo/475-272-1.html

that provide a framework for operations in a regulated environment and
for relevant statistical practice guidance. The 'x' above is replaced by
words such as Clinical, Manufacturing, Laboratory, etc.

There is even a draft guidance document on the use of Bayesian
techniques for medical device trials:

http://www.fda.gov/cdrh/osb/guidance/1601.html


Some of the references in other posts have to do with software embedded
in medical devices, which could be anything such as bedside ECG
monitoring stations, diagnostic imaging systems, radiation therapy
instrumentation and pacemakers. These are generally not relevant to this
discussion.

The bottom line is that while there is a burden on the part of the
'software publisher' to utilize and document reasonable manufacturing,
version control, software maintenance and quality processes, the
overwhelming burden is on the END USER to determine that their
statistical package is suitable for the application intended and to have
written SOPs (Standard Operating Procedures) to define how they will
validate their installation and use of the statistical software. 

This goes to some of the comments that Cody had relative to IQ/OQ/PQ
documentation, which refers to Installation Qualification, Operational
Qualification and Performance Qualification.

For example, in the context of R, the use of 'make check-all' and the
retention of the output subsequent to compiling R from source code can
be part of that documentation process. Bert referred to this in his
comments.

Beyond that, the details of such documentation will be driven by a
variety of characteristics that are relevant to the nature of the
environment (academic, commercial, clinical, pre-clinical, etc.) in
which one is operating and related considerations.

As Frank noted, there will be a session at useR!2007:

  http://user2007.org/

entitled "The Use of R in Clinical Trials and Industry-Sponsored Medical
Research".  This session will take place on Friday, August 10 and I
would invite any interested parties to attend the meetings. I think that
you will find the subject matter quite enlightening.

One closing comment:  There is increasing use of R within the FDA itself
and this will only further help to assuage the fears of prospective
users over time.

Best regards,

Marc 

Re: [R] glm() for log link and Weibull family

2007-06-08 Thread Prof Brian Ripley
On Fri, 8 Jun 2007, Robert A. LaBudde wrote:

> I need to be able to run a generalized linear model with a log() link
> and a Weibull family, or something similar to deal with an extreme
> value distribution.

The Weibull with log link is not a GLM, but survreg() in package survival
can fit it, as well as other extreme-value distributions.

> I actually have a large dataset where this is apparently necessary.
> It has to do with recovery of forensic samples from surfaces, where
> as much powder as possible is collected. This apparently causes the
> results to conform to some type of extreme value distribution, so
> Weibull is a reasonable starting point for exploration.
>
> I have tried ('surface' and 'team' are factors)
>
> glm(surfcount ~ surface*team, data=powderd, family=Gamma(link='log'))
>
> but this doesn't quite do the trick. The standardized deviance
> residuals are still curved away from normal at the tails.
>
> Thanks for any info you can give on this nonstandard model.

It's perfectly standard, just not a GLM.
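
As a rough sketch of what such a survreg() fit could look like: the
data below are simulated stand-ins (the real 'powderd' is not
available), the factor levels are invented, and the 'observed' column
is an assumption that simply marks every recovery as uncensored.

```r
library(survival)

# Simulated stand-in for the poster's data; level names are made up.
set.seed(1)
powderd <- expand.grid(surface = factor(c("glass", "steel", "wood")),
                       team    = factor(c("A", "B")),
                       rep     = 1:30)
powderd$surfcount <- rweibull(nrow(powderd), shape = 2, scale = 100)
powderd$observed  <- 1  # assumption: no censoring, every value observed

# Weibull regression; the linear predictor is on the log scale of the
# response, which plays the role the log link had in the glm() attempt.
fit <- survreg(Surv(surfcount, observed) ~ surface * team,
               data = powderd, dist = "weibull")
summary(fit)
```

The 'dist' argument also accepts other families, so the same call can
be used to compare alternatives to the Weibull.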

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] still trying to wrap xyplot - ignore previous

2007-06-08 Thread Zack Weinberg
As you may not be surprised to hear, no sooner did I post the previous
message than I realized I had a really dumb mistake.  I've now gotten
a bit farther but am still stuck.  New code:

graph <- function (x, data, groups, xlab) {
  pg <- function(x, y, group.number, ...) fnord
  body(pg) <- substitute({
    panel.xyplot(x, y, ..., group.number=group.number)
    panel.text(2, unique(y[x==2]),
               levels(G)[group.number],
               pos=4, cex=0.5)
  }, list(G=eval(substitute(groups), data, parent.frame())))

  print(xyplot(x, data=data, groups=substitute(groups),
               type='l',
               ylab=list(cex=1.1, label='Mean RT (ms)'),
               xlab=list(cex=1.1, label=xlab),
               scales=list(
                 x=list(alternating=c(1,1), tck=c(1,0)),
                 y=list(alternating=c(1,0))
               ),
               panel=panel.superpose,
               panel.groups=pg
  ))
}

Questions:
1) The groups=substitute(groups) bit (in the call to xyplot) still
doesn't work.  As far as I can tell, xyplot wants the *symbol* which
is the name of the factor (in the data frame) to group by.
The above seems to wind up passing it the symbol 'groups', which
causes the prepanel function to barf.  I have not been able to find
any way to evaluate one layer of 'groups' to get me the symbol passed
in, rather than the value of that symbol.  Am I right?  How do I give
it what it wants?

2) Why do I have to do that stupid dance with replacing the body of
pg?  The documentation leads me to believe this is a lexically scoped
language, shouldn't it be able to pick G out of the enclosing frame?
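
One possible way to get both pieces working, offered only as a sketch
and not as the list's answer: build the xyplot() call with the caller's
symbol spliced in via bquote(), and let the panel function find the
grouping factor through ordinary lexical scoping. The data frame 'sm'
below is simulated, and the label position (last point of each line) is
a simplification of the original panel.text() call.

```r
library(lattice)

graph <- function(x, data, groups, xlab) {
  # The grouping factor itself, evaluated inside 'data'; the panel
  # function below sees 'g' because it is defined in this frame.
  g <- eval(substitute(groups), data, parent.frame())

  pg <- function(x, y, group.number, ...) {
    panel.xyplot(x, y, ..., group.number = group.number)
    # Label each line at its last point with the group's level name.
    panel.text(x[length(x)], y[length(y)], levels(g)[group.number],
               pos = 4, cex = 0.5)
  }

  # Splice the formula and the caller's symbol for 'groups' into the
  # call, so xyplot's own non-standard evaluation looks it up in 'data'.
  cl <- bquote(xyplot(.(x), data = data, groups = .(substitute(groups)),
                      type = "l", xlab = xlab,
                      panel = panel.superpose, panel.groups = pg))
  eval(cl)
}

# Simulated stand-in for the poster's data frame.
set.seed(1)
sm <- expand.grid(pro = 1:4, hemi = factor(c("L", "R")),
                  obs = factor(c("a", "b")))
sm$est <- rnorm(nrow(sm), mean = 500, sd = 20)

p <- graph(est ~ pro | hemi, sm, obs, "Probe type")
```

The function returns the trellis object; print(p) draws it, which
keeps the wrapper usable both interactively and inside scripts.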

zw


Re: [R] Tools For Preparing Data For Analysis

2007-06-08 Thread Frank E Harrell Jr
Dale Steele wrote:
> For windows users, EpiData Entry http://www.epidata.dk/ is an
> excellent (free) tool for data entry and documentation.--Dale

Note that EpiData seems to work well under linux using wine.
Frank


Re: [R] R is not a validated software package..

2007-06-08 Thread Cody_Hamilton


The fact that FDA statisticians are using R also assuages one of the main
concerns that I have heard voiced about using R for FDA submissions - that
there would be no statisticians available at FDA to review R code which
would seriously delay the review of a submission.

Marc also brings up a good point by mentioning the FDA guidance on Bayesian
submissions.  If SAS were the only approved product, Bayesian trials would
be in real trouble.

Cody Hamilton, PhD
Staff Biostatistician
Edwards Lifesciences

Disclaimer:  As always, I am speaking for myself and not necessarily for
Edwards Lifesciences.

One closing comment:  There is increasing use of R within the FDA itself
and this will only further help to assuage the fears of prospective
users over time.

Best regards,

Marc Schwartz


Re: [R] how to find how many modes in 2 dimensions case

2007-06-08 Thread Bert Gunter
Note that the number of modes (local maxima??)  is a function of the
bandwidth, so I'm not sure your question is even meaningful. 
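
To illustrate the dependence on bandwidth, here is a small sketch that
counts grid cells of a kde2d() estimate that are strictly higher than
all eight neighbours. The helper name count_modes, the simulated data
and the two bandwidth values are all assumptions chosen for the
example; this is a crude grid-based count, not an established
mode-detection method.

```r
library(MASS)

# Count strict local maxima of the kde2d() surface on an n x n grid.
count_modes <- function(x, y, h, n = 50) {
  z <- kde2d(x, y, h = h, n = n)$z
  # Pad with -Inf so that border cells can be compared like any other.
  p <- matrix(-Inf, n + 2, n + 2)
  inner <- 2:(n + 1)
  p[inner, inner] <- z
  is_max <- matrix(TRUE, n, n)
  for (di in -1:1) {
    for (dj in -1:1) {
      if (di == 0 && dj == 0) next
      # Keep only cells strictly above the neighbour at offset (di, dj).
      is_max <- is_max & (z > p[inner + di, inner + dj])
    }
  }
  sum(is_max)
}

# Simulated data: two clusters, centred at (0, 0) and (6, 6).
set.seed(42)
x <- c(rnorm(200, 0), rnorm(200, 6))
y <- c(rnorm(200, 0), rnorm(200, 6))

modes_narrow <- count_modes(x, y, h = 0.5)  # small bandwidth
modes_wide   <- count_modes(x, y, h = 20)   # very large bandwidth
```

Running it with several values of h shows how the count changes with
the amount of smoothing, which is why a bandwidth has to be fixed
before "the number of modes" means anything.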

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Patrick Wang
Sent: Friday, June 08, 2007 11:54 AM
To: R-help@stat.math.ethz.ch
Subject: [R] how to find how many modes in 2 dimensions case

Hi,

Does anyone know how to count the number of modes in 2 dimensions using
kde2d function?

Thanks
Pat


Re: [R] how to find how many modes in 2 dimensions case

2007-06-08 Thread Patrick Wang
Thanks for the reply,

Maybe I should say "bumps". I can use persp() to show the density over
the X and Y dimensions, and I think each peak is one mode. I am trying
to find an automatic way to detect how many peaks the density has.

Pat
> Note that the number of modes (local maxima??)  is a function of the
> bandwidth, so I'm not sure your question is even meaningful.
>
> Bert Gunter
> Genentech Nonclinical Statistics
> South San Francisco, CA 94404
> 650-467-7374

