[R] Help!!

2004-06-30 Thread Sankalp Chaturvedi
Hello,

I am a second-year PhD student in the Dept. of Management and Organization
at NUS Business School, Singapore. Presently I am doing a multilevel
analysis for a small research project with my supervisor.

 

While surfing the web, I found your website on multilevel analysis.
It has been very useful for my understanding of the concepts. I was
also trying to learn more by completing the lab exercises. For that
I have installed R 1.6.2 (and also R 1.9.1) from the web.

 

Actually, I need to calculate inter-member agreement (rwg) and
inter-member reliability (ICC). I was told that the R software developed
by you can help me in this regard. But I am not able to load the
multilevel package from the library using the following
command.

 

  library(multilevel)

It says the package doesn't exist. 

 

 

I really don't know how to proceed further, as I want to calculate rwg
and ICC. I will be very thankful to you if you can help me in this
regard.

 

Thanking you

 

Best regards,

Sankalp

 



__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] nls fitting problems (singularity)

2004-06-30 Thread Gabor Grothendieck

Have a look at optim (which supports a number of different algorithms via
the method= argument) and segmented() in package segmented, which does
segmented regression.

For example,

# sl is list(d=0, b=10, c1=90, c2=20), so unpack par in that order
ss <- function(par) {
b <- par[2]; c1 <- par[3]; c2 <- par[4]; d <- par[1]
x <- df1$x; y <- df1$y
sum((y - (d + (x-b)*c1*(x-b > 0) + (x-b)*c2*(x-b <= 0)))^2)
}
optim(unlist(sl), ss)



Karl Knoblick karlknoblich at yahoo.de writes:

: 
: Hallo!
: 
: I have a problem with fitting data with nls. The first
: example with y1 (data frame df1) shows an error, the
: second works fine.
: 
: Is there a possibility to get a fit (e.g. JMP can fit
: also data I can not manage to fit with R). Sometimes I
: also got an error singularity with starting
: parameters.
: 
: # x-values
: x<-c(-1,5,8,11,13,15,16,17,18,19,21,22)
: # y1-values (first data set)
: y1=c(-55,-22,-13,-11,-9.7,-1.4,-0.22,5.3,8.5,10,14,20)
: # y2-values (second data set)
: y2=c(-92,-42,-15,1.3,2.7,8.7,9.7,13,11,19,18,22)
: 
: # data frames
: df1<-data.frame(x=x, y=y1)
: df2<-data.frame(x=x, y=y2)
: 
: # start list for parameters
: sl<-list(d=0, b=10, c1=90, c2=20)
: 
: # y1-Analysis - Result: Error in ...  singular
: gradient
: nls(y~d+(x-b)*c1*(x-b>0)+(x-b)*c2*(x-b<=0), data=df1,
: start=sl)
: # y2-Analysis - Result: working...
: nls(y~d+(x-b)*c1*(x-b>0)+(x-b)*c2*(x-b<=0), data=df2,
: start=sl)
: 
: # plots to look at data
: par(mfrow=c(1,2))
: plot(df1$x,df1$y)
: plot(df2$x,df2$y)
: 
: Perhaps there is another fitting routine? Can anybody
: help?
: 
: Best wishes,
: Karl
: 
: 



Re: [R] naive question

2004-06-30 Thread Gabor Grothendieck
Douglas Bates bates at stat.wisc.edu writes:

: If you are routinely working with very large data sets it would be 
: worthwhile learning to use a relational database (PostgreSQL, MySQL, 
: even Access) to store the data and then access it from R with RODBC or 
: one of the specialized database packages.

Along this line, if you already have and know another stat package, then
you could read the data into that, write it out in that package's format,
and then read the written-out file into R.  The R foreign package
and the R Hmisc package support a number of such foreign formats.
I don't know about speed -- it probably varies by format, so you will
have to experiment.

I'm not sure what the current status of HDF5, CDF, netCDF, etc. is for R, but
those may be formats to consider as well.



Re: [R] anti-R vitriol

2004-06-30 Thread TEMPL Matthias
Hi,

I wonder why SAS should be faster at reading data into the system.
I have an example showing that R is (sometimes? always?) faster.

-
Data with 14432 observations and 120 variables.
Time for reading the data:

SAS 8e:
data testt;
set l1.lse01;run;

real time   1.46 seconds
  cpu time    0.18 seconds

R 1.9.0:
system.time(read.table("lse01.txt", header=TRUE))
[1] 0.63 0.06 6.22   NA   NA


And this is 2.5 times faster than SAS.
(SAS reads the .sas7bdat file and R the .txt file.)

I'm working with SAS (I am supposed to work with SAS) and R (I am going to work with R) on
the same computer. In my examples with time series, and in simple but also
time-consuming procedures like summaries, etc., R is always 2 times faster and sometimes
30 times faster (with the same results).
I think R is a great piece of software, and you can do more things than in SAS.
Some new developments in SAS 9, like the COM server for Excel, some new procedures, better
graphs, etc., were developed and implemented in R many years ago.
Thanks to the R Development Team!!!

Matthias

 -----Original Message-----
 From: Liaw, Andy [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, 29 June 2004 20:21
 To: 'Barry Rowlingson'; R-help
 Subject: RE: [R] anti-R vitriol
 
 
  From: Barry Rowlingson
  
  A colleague is receiving some data from another person. That person
  reads the data in SAS and it takes 30s and uses 64k RAM. 
 That person 
  then tries to read the data in R and it takes 10 minutes and uses a 
  gigabyte of RAM. Person then goes on to say:
  
 It's not that I think SAS is such great software,
 it's not.  But I really hate badly designed
 software.  R is designed by committee.  Worse,
 it's designed by a committee of statisticians.
 They tend to confuse numerical analysis with
 computer science and don't have any idea about
 software development at all.  The result is R.
  
 I do hope [your colleague] won't have to waste time doing
 [this analysis] in an outdated and poorly designed piece
 of software like R.
  
  Would any of the committee like to respond to this? Or
  shall we just 
  slap our collective forehead and wonder how someone could get 
  such a view?
  
  Barry
  
 
 My $0.02:
 
 R, being a flexible programming language, has an amazing
 ability to cope with people's laziness/ignorance/inelegance,
 but it comes at a (sometimes
 hefty) price.  While there are no specifics on the situation
 leading to the person's comments, here's one (not as extreme)
 example that I happened to come across today:
 
 > system.time(spam <- read.table("data_dmc2003_train.txt",
 + header=TRUE,
 + colClasses=c(rep("numeric", 833),
 +  "character")))
 [1] 15.92  0.09 16.80    NA    NA
 > system.time(spam <- read.table("data_dmc2003_train.txt", header=TRUE))
 [1] 187.29   0.60 200.19     NA     NA
 
 My SAS ability is rather severely limited, but AFAIK one
 needs to specify _all_ variables to be read into a dataset in
 order to read in the data in SAS.  If one has that
 information, R can be very efficient as well.  Without that
 information, one gets nothing in SAS, or one just lets R do the
 hard work.
 
 Best,
 Andy
 




[R] Large addresses (Rgui.exe editing)

2004-06-30 Thread Franck Siclon
I'm running R under 32-bit Windows Server 2003 with 24 GB RAM ...
I read the FAQ and found that I have to edit my boot.ini file with the
/PAE switch; done.
I noticed also that I have to set the R image header with the flag
/LARGEADDRESSAWARE, but to do this I need Microsoft Visual Studio 6.0,
which I do not have ... Because Prof. Ripley clearly mentioned the
steps to follow in the rw-FAQ, I'm wondering whether someone has already
edited the R executable in that way, and thus could share this new exe
file with others?
Many thanks!



[R] GLM problem

2004-06-30 Thread Varrin muriel
Hi, I am a student, so don't be surprised if my question seems simple to you...
I have a data frame with 6 qualitative variables divided into 33 modalities (levels), 2
quantitative variables, and 78 rows. I use a glm to find out which variables have
interactions... I would like to know if it is normal that one of the modalities (the first
in alphabetical order) of each qualitative variable doesn't appear in the results?
Here is what I did:
glm<-glm(data$IP~data$temp*data$fvent*data$dirvent*data$Ensol+data$mois*data$mil,
family=gaussian)
summary(glm)
anova(glm, test="Chisq")
Thank you a lot, and excuse me again for my little and perhaps simple question.
muriel varrin (French student in biostatistics)








Re: [R] Large addresses (Rgui.exe editing)

2004-06-30 Thread Prof Brian Ripley
It's undesirable to need Visual Studio (I suspect later versions than 6
also work), and Duncan Murdoch wrote some standalone code to do this which
he may be willing to share (but he is away until next week).

Meanwhile I have put edited .exe's for rw1091 at

http://www.stats.ox.ac.uk/pub/RWin/EditedEXEs.zip

but will not leave them there for ever.

On Wed, 30 Jun 2004, Franck Siclon wrote:

 I'm running R under 32-bit Windows Server 2003 with 24 GB RAM ...
 I read the FAQ and found that I have to edit my boot.ini file with the
 /PAE switch; done.
 I noticed also that I have to set the R image header with the flag
 /LARGEADDRESSAWARE, but to do this I need Microsoft Visual Studio 6.0,
 which I do not have ... Because Prof. Ripley clearly mentioned the
 steps to follow in the rw-FAQ, I'm wondering whether someone has already
 edited the R executable in that way, and thus could share this new exe
 file with others?
 Many thanks!

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595



Re: [R] GLM problem

2004-06-30 Thread Prof Brian Ripley
On Wed, 30 Jun 2004, Varrin muriel wrote:

 Hi, I am a student, so don't be surprised if my question seems simple
 to you...

 I have a data frame with 6 qualitative variables divided into 33
 modalities (levels), 2 quantitative variables, and 78 rows. I use a glm to
 find out which variables have interactions... I would like to know if it is
 normal that one of the modalities (the first in alphabetical order) of each
 qualitative variable doesn't appear in the results?

Yes.  That's the way R codes factor variables.  I'm not going to rewrite the
literature here, so please consult a good book (and naturally I will
recommend Venables & Ripley, 2002).  Others may be able to tell you if any
of the French guides on CRAN or elsewhere address this point (Emmanuel
Paradis's seems not to get that far).
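As a sketch of the behaviour being asked about: with R's default treatment contrasts, the first level of each factor is the baseline, absorbed into the intercept, so it gets no coefficient of its own (a toy example, not Muriel's data):

```r
# Three-level factor; level "a" (first alphabetically) is the baseline
f <- factor(c("a", "a", "b", "b", "c", "c"))
y <- c(1, 2, 3, 4, 5, 6)

fit <- lm(y ~ f)
coef(fit)        # only "(Intercept)", "fb" and "fc" appear; "a" is the intercept

contrasts(f)     # the contrast matrix has no column for the first level
```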

 Here is what I did:
 glm<-glm(data$IP~data$temp*data$fvent*data$dirvent*data$Ensol+data$mois*data$mil,
 family=gaussian)

Please use something like

glm(IP ~ temp + fvent, data=data, family=gaussian)

and don't call the result `glm' (or your data `data').  Also, this
is not a question about GLMs with that family but about linear models, so
why not use lm?

 summary(glm)
 anova(glm, test="Chisq")
 Thank you a lot, and excuse me again for my little and perhaps simple question.
 muriel varrin (French student in biostatistics)

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595



Re: [R] GLM problem

2004-06-30 Thread Peter Dalgaard
Prof Brian Ripley [EMAIL PROTECTED] writes:

 and don't call the result `glm' (or call your data, `data').  Also, this
 is not a question about GLMs with that family but about linear models, so
 why not use lm.
 
  summary(glm)
  anova(glm, test="Chisq")
 *

And shouldn't this be "F"? Do you really know that your observations
have variance 1?
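A sketch of the difference on a built-in data set (cars is used here purely for illustration; with a gaussian family the dispersion is estimated, which is what makes the F test the natural choice):

```r
# Gaussian GLM (equivalent to lm) on the built-in cars data
fit <- glm(dist ~ speed, data = cars, family = gaussian)

# F test: appropriate when the dispersion is estimated from the data
anova(fit, test = "F")

# Chisq would instead behave as if the dispersion were known
anova(fit, test = "Chisq")
```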

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



[R] Question about measuring time

2004-06-30 Thread zze-PELAY Nicolas FTRD/DMR/BEL
Hello,
Is there any function to measure the duration of a procedure (like tic
and toc in MATLAB)?

tic
source("procedure.R")
toc

(toc is the duration between the execution of tic and the execution of
toc)

Thank you

nicolas




Re: [R] Question about measuring time

2004-06-30 Thread Wolski
?system.time



*** REPLY SEPARATOR  ***

On 30.06.2004 at 10:59 zze-PELAY Nicolas FTRD/DMR/BEL wrote:

Hello,
Is there any function to measure the duration of a procedure (like tic
and toc in MATLAB)?

tic
source("procedure.R")
toc

(toc is the duration between the execution of tic and the execution of
toc)

Thank you

nicolas




Re: [R] Question about measuring time

2004-06-30 Thread Prof Brian Ripley
help.search("time")

gives

proc.time(base) Running Time of R
system.time(base)   CPU Time Used

both of which could be used.
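If MATLAB-style names are wanted, a two-line wrapper over proc.time() will do; this is only a sketch, storing the start time in the global workspace:

```r
# Minimal tic/toc built on proc.time()
tic <- function() assign(".tic.time", proc.time(), envir = globalenv())
toc <- function() print(proc.time() - get(".tic.time", envir = globalenv()))

tic()
invisible(sum(sqrt(1:1e5)))   # some work to time
toc()                         # user/system/elapsed seconds since tic()
```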

On Wed, 30 Jun 2004, zze-PELAY Nicolas FTRD/DMR/BEL wrote:

 Is there any function to measure the duration of a procedure (like tic
 and toc in MATLAB)?

 tic
 source("procedure.R")
 toc

 (toc is the duration between the execution of tic and the execution of
 toc)

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595



Re: [R] Help!!

2004-06-30 Thread Karl Knoblick
Hi!

Have you downloaded the package multilevel first?

Best wishes,
Karl
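For completeness: the package must be installed before library() can load it, e.g. with install.packages("multilevel") from CRAN (check the package's own help index for the exact names of its rwg and ICC functions, which are not confirmed in this thread). The ICC(1) part can also be sketched in base R from a one-way ANOVA; simulated data, equal group size k assumed, using the usual mean-squares formula:

```r
set.seed(42)
# Simulated grouped data: 20 groups of k = 5 members each
k <- 5; ngrp <- 20
grp <- factor(rep(1:ngrp, each = k))
y <- rnorm(ngrp, sd = 1)[grp] + rnorm(ngrp * k, sd = 2)

# One-way ANOVA, then ICC(1) from the between/within mean squares:
# ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW)
ms <- summary(aov(y ~ grp))[[1]][["Mean Sq"]]
icc1 <- (ms[1] - ms[2]) / (ms[1] + (k - 1) * ms[2])
icc1
```

The multilevel package wraps this kind of computation (and the agreement indices); its documentation is the place to check for the exact function names and arguments.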









[R] Installation of R-1.9.1.tgz

2004-06-30 Thread Frank.Muehlau



Re: [R] rgl installation problems

2004-06-30 Thread Martyn Plummer
On Wed, 2004-06-30 at 00:52, E GCP wrote:
 Thanks for your replies. I do not HTML-ize my mail, but free email accounts
 do that and there is no switch to turn it off.  I apologize in advance.

 I installed R from the Red Hat package provided by Martyn Plummer. It
 installed fine and without problems. I can use R and have installed and used
 other packages within R without any problems whatsoever. I do not think the
 problem is with R or its installation.  I do think there is a problem with
 the installation of rgl_0.64-13.tar.gz on Red Hat 9 (Linux).  So, if there is
 anybody out there who has successfully installed rgl_0.64-13.tar.gz on Red Hat
 9, I would like to know how.

This is a little strange.  I'm now building RPMs for older Red Hat
versions on FC2 using a tool called Mach.  There is a possibility that
there is some configuration problem, but I can't see it. As Brian has
pointed out, you are missing the crucial -shared flag when building
the shared library for rgl. This comes from the line

SHLIB_CXXLDFLAGS = -shared

in the file /usr/lib/R/etc/Makeconf.  This is present in my latest RPM
for Red Hat ( R-1.9.1-0.fdr.2.rh90.i386.rpm ) so I don't know why it
isn't working for you.

Check that you do have the latest RPM and that you don't have a locally
built version of R in /usr/local since this will have precedence on your
PATH.

Martyn



Re: [R] rgl installation problems

2004-06-30 Thread Peter Dalgaard
Martyn Plummer [EMAIL PROTECTED] writes:

 On Wed, 2004-06-30 at 00:52, E GCP wrote:
  Thanks for your replies. I do not HTML-ize my mail, but free email accounts
  do that and there is no switch to turn it off.  I apologize in advance.

  I installed R from the Red Hat package provided by Martyn Plummer. It
  installed fine and without problems. I can use R and have installed and used
  other packages within R without any problems whatsoever. I do not think the
  problem is with R or its installation.  I do think there is a problem with
  the installation of rgl_0.64-13.tar.gz on Red Hat 9 (Linux).  So, if there is
  anybody out there who has successfully installed rgl_0.64-13.tar.gz on Red Hat
  9, I would like to know how.
 
 This is a little strange.  I'm now building RPMS for older Red Hat
 versions on FC2 using a tool called Mach.  There is a possibility that
 there is some configuration problem, but I can't see it. As Brian has
 pointed out, you are missing the crucial -shared flag when building
 the shared library for rgl. This comes from the line
 
 SHLIB_CXXLDFLAGS = -shared
 
 in the file /usr/lib/R/etc/Makeconf.  This is present in my latest RPM
 for Red Hat ( R-1.9.1-0.fdr.2.rh90.i386.rpm ) so I don't know why it
 isn't working for you.
 
 Check that you do have the latest RPM and that you don't have a locally
 built version of R in /usr/local since this will have precedence on your
 PATH.

One thing to notice is that rgl does its own configure step (hadn't
noticed on my first build since g++ is so noisy about a number of
other things), and it is possible that something is getting messed up
on E's RH9 system.

Redirecting the output as in

R CMD INSTALL rgl_0.64-13.tar.gz 2>&1 | tee rgl.log

and poring over the result might prove informative.

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



Re: [R] anti-R vitriol

2004-06-30 Thread John Maindonald
I am curious.  What were the dimensions of this data set?  Did this
person use read.table() or scan()?  Did they know about the
possibility of reading the data one part at a time?

The way that SAS processes the data row by row limits what can be done. 
 It is often possible with scant loss of information, and more 
satisfactory, to work with a subset of the large data set or with 
multiple subsets.  Neither SAS (in my somewhat dated experience of it) 
nor R is entirely satisfactory for this purpose.  But at least in R, 
given a subset that fits so easily into memory that the graphs are not 
masses of black, there are few logistic problems in doing, rapidly and 
interactively, a variety of manipulations and plots, with each new task 
taking advantage of the learning that has gone before.  To do that well 
in the SAS world, it is necessary to use something like JMP or its 
equivalent in one of the newer modules, which process data in a way 
that is not all that different from R.

I have wondered about possibilities for a suite of functions that would 
make it easy to process through R data that is stored in one large data 
set, with a mix of adding a new variable or variables, repeating a 
calculation on successive subsets of the data, producing predictions or 
suchlike for separate subsets, etc. Database connections may be the way
to go (cf. the Ripley and Fei Chen paper at ISI 2003), but it might
also be useful to have a simple set of functions that would handle some 
standard requirements.
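A crude version of such chunked processing can already be written with read.table()'s nrows argument on an open connection; the sketch below only counts rows (the file is a made-up temporary one so the example is self-contained), but per-chunk model fitting would go in the loop:

```r
# Create a small example file so the sketch is self-contained
path <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:25, y = rnorm(25)), path, row.names = FALSE)

# Process the file in chunks of 10 rows via an open connection
con <- file(path, "r")
header <- scan(con, what = "", nlines = 1, quiet = TRUE)
total <- 0
repeat {
  chunk <- read.table(con, nrows = 10, col.names = header)
  total <- total + nrow(chunk)    # replace with the per-chunk analysis
  if (nrow(chunk) < 10) break     # last (short) chunk has been read
}
close(con)
total
```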

John Maindonald.
On 30 Jun 2004, at 8:02 PM, Barry Rowlingson 
[EMAIL PROTECTED] wrote:

A colleague is receiving some data from another person. That person 
reads the data in SAS and it takes 30s and uses 64k RAM. That person 
then tries to read the data in R and it takes 10 minutes and uses a 
gigabyte of RAM. Person then goes on to say:

  It's not that I think SAS is such great software,
  it's not.  But I really hate badly designed
  software.  R is designed by committee.  Worse,
  it's designed by a committee of statisticians.
  They tend to confuse numerical analysis with
  computer science and don't have any idea about
  software development at all.  The result is R.
  I do hope [your colleague] won't have to waste time doing
  [this analysis] in an outdated and poorly designed piece
  of software like R.
Would any of the committee like to respond to this? Or shall we just 
slap our collective forehead and wonder how someone could get such a 
view?

John Maindonald email: [EMAIL PROTECTED]
phone: +61 2 6125 3473    fax: +61 2 6125 5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


[R] pca with missing values

2004-06-30 Thread Angel Lopez
I need to perform a principal components analysis on a matrix with 
missing values.
I've searched the R web site but didn't manage to find anything.
Any pointers/guidelines are much appreciated.
Angel



Re: [R] pca with missing values

2004-06-30 Thread Prof Brian Ripley
On Wed, 30 Jun 2004, Angel Lopez wrote:

 I need to perform a principal components analysis on a matrix with 
 missing values.
 I've searched the R web site but didn't manage to find anything.

?princomp

has a description of an na.action argument, and 

help.search("missing values")

comes up with several relevant entries.
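For example, via the formula interface of princomp(), where na.action applies (note this simply drops incomplete rows; it is not an imputation method):

```r
# Built-in data with a couple of values knocked out for illustration
d <- USArrests
d[1, 2] <- NA
d[5, 4] <- NA

# Rows containing missing values are omitted before the PCA
p <- princomp(~ ., data = d, na.action = na.omit, cor = TRUE)
summary(p)
```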

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595



Re: [R] nls fitting problems (singularity)

2004-06-30 Thread Douglas Bates
Often when nls doesn't converge there is a good reason for it.
I'm on a very slow internet connection these days and will not be able 
to look at the data myself but I ask you to bear in mind that, when 
dealing with nonlinear models, there are model/data set combinations for 
which there are no parameter estimates.

Gabor Grothendieck wrote:
Have a look at optim (which supports a number of different algorithms via
the method= argument) and segmented() in package segmented, which does
segmented regression.
For example,
# sl is list(d=0, b=10, c1=90, c2=20), so unpack par in that order
ss <- function(par) {
b <- par[2]; c1 <- par[3]; c2 <- par[4]; d <- par[1]
x <- df1$x; y <- df1$y
sum((y - (d + (x-b)*c1*(x-b > 0) + (x-b)*c2*(x-b <= 0)))^2)
}
optim(unlist(sl), ss)

Karl Knoblick karlknoblich at yahoo.de writes:
: 
: Hallo!
: 
: I have a problem with fitting data with nls. The first
: example with y1 (data frame df1) shows an error, the
: second works fine.
: 
: Is there a possibility to get a fit (e.g. JMP can fit
: also data I can not manage to fit with R). Sometimes I
: also got an error singularity with starting
: parameters.
: 
: # x-values
: x<-c(-1,5,8,11,13,15,16,17,18,19,21,22)
: # y1-values (first data set)
: y1=c(-55,-22,-13,-11,-9.7,-1.4,-0.22,5.3,8.5,10,14,20)
: # y2-values (second data set)
: y2=c(-92,-42,-15,1.3,2.7,8.7,9.7,13,11,19,18,22)
: 
: # data frames
: df1<-data.frame(x=x, y=y1)
: df2<-data.frame(x=x, y=y2)
: 
: # start list for parameters
: sl<-list(d=0, b=10, c1=90, c2=20)
: 
: # y1-Analysis - Result: Error in ...  singular
: gradient
: nls(y~d+(x-b)*c1*(x-b>0)+(x-b)*c2*(x-b<=0), data=df1,
: start=sl)
: # y2-Analysis - Result: working...
: nls(y~d+(x-b)*c1*(x-b>0)+(x-b)*c2*(x-b<=0), data=df2,
: start=sl)
: 
: # plots to look at data
: par(mfrow=c(1,2))
: plot(df1$x,df1$y)
: plot(df2$x,df2$y)
: 
: Perhaps there is another fitting routine? Can anybody
: help?
: 
: Best wishes,
: Karl
: 
: 
: 



Re: [R] MacOS X binaries won't install

2004-06-30 Thread Ulises Mora Alvarez
Hi!

I'm afraid we need a few more bits of information. Do you have
administrative privileges? Which version of R are you trying to install?
Have you read the R MacOS X FAQ?


On Tue, 29 Jun 2004, Ruben Solis wrote:

 I've tried installing the MacOS X binaries for R available at:
 
 http://www.bioconductor.org/CRAN/
 
 I'm running MacOS X version 10.2.8.
 
 I get a message indicating the installation is successful, but when I 
 double-click on the R icon that shows up in my Applications folder, the 
 application seems to try to open but closes immediately.
 
 I looked for  /Library/Frameworks/R.framework (by typing ls 
 /Library/Frameworks) and it does not appear.  A global search for 
 R.framework yields no results, so it seems that the installation is not 
 working. (I was going to try command line execution.)
 
 Any help would be appreciated.  Thanks! - RSS
 
 Ruben S. Solis
 
 

-- 
Ulises M. Alvarez
LAB. DE ONDAS DE CHOQUE
FISICA APLICADA Y TECNOLOGIA AVANZADA
UNAM
[EMAIL PROTECTED]



Re: [R] nls fitting problems (singularity)

2004-06-30 Thread Peter Dalgaard
Douglas Bates [EMAIL PROTECTED] writes:

 Often when nls doesn't converge there is a good reason for it.
 
 I'm on a very slow internet connection these days and will not be able
 to look at the data myself but I ask you to bear in mind that, when
 dealing with nonlinear models, there are model/data set combinations
 for which there are no parameter estimates.


In this particular case, the model describes a curve consisting of two
line segments that meet at the point (b,d)

  : nls(y~d+(x-b)*c1*(x-b>0)+(x-b)*c2*(x-b<=0), data=df2,

Now if b is between the two smallest x values, you can diddle b, c1, and d in
such a way that the value at x1 is constant, i.e. the model becomes
unidentifiable. Setting trace=TRUE suggests that this is what happens in
this example.


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



[R] formatting

2004-06-30 Thread rivin


I could not figure this out from the documentation: is there a way to send
formatted non-graphical data to a fancy output device (e.g., LaTeX, PDF, ...)?
For example, if I want to include the summary of a linear model in a
document, I might want to have it automatically TeXified.

 Thanks!

Igor



Re: [R] formatting

2004-06-30 Thread tobias . verbeke




[EMAIL PROTECTED] wrote on 30/06/2004 16:05:15:



 I could not figure this out from the documentation: is there a way to send
 formatted non-graphical data to a fancy output device (e.g., LaTeX, PDF, ...)?
 For example, if I want to include the summary of a linear model in a
 document, I might want to have it automatically TeXified.

Have a look at the latex() function in package Hmisc,
and/or at package xtable.

If you want to use R code in a LaTeX document, you can
use Sweave; see

http://www.ci.tuwien.ac.at/~leisch/Sweave/
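If neither package is at hand, the coefficient table of a model summary can also be turned into a bare LaTeX tabular by hand; a minimal sketch (with no escaping of LaTeX special characters, which latex() and xtable() handle for you):

```r
fit <- lm(dist ~ speed, data = cars)
tab <- coef(summary(fit))        # estimates, std. errors, t and p values

# One " & "-separated row per coefficient, then wrap in a tabular
rows <- apply(format(round(tab, 4)), 1, paste, collapse = " & ")
tex <- c("\\begin{tabular}{lrrrr}",
         paste("Term &", paste(colnames(tab), collapse = " & "), "\\\\"),
         paste(rownames(tab), "&", rows, "\\\\"),
         "\\end{tabular}")
cat(tex, sep = "\n")
```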

HTH,
Tobias



[R] Principal Surface function help

2004-06-30 Thread Fred
Dear All

Do you know of any functions that can perform
principal surface estimation?

Please give me a hint.
Thanks for your help in advance.


Fred



[R] outlier tests

2004-06-30 Thread Greg Tarpinian
I have been learning about some outlier tests -- Dixon
and Grubbs, specifically -- for small data sets.  When
I try help.start() and search for outlier tests, the
only response I manage to find is the Bonferroni test
available from the car package...  Are there any other
packages that offer outlier tests?  Are the Dixon and
Grubbs tests good for small samples, or are others
more recommended?

Much thanks in advance,

 Greg



Re: [R] outlier tests

2004-06-30 Thread Prof Brian Ripley
I thought outlier tests were mainly superseded two decades ago by the use
of robust methods -- they certainly were in analytical chemistry, for
example.  All outlier tests are bad in the sense that outliers will
damage the results long before they are detected.  See e.g.

@Article{AMC.89a,
  author  = {Analytical Methods Committee},
  title   = {Robust statistics --- how not to reject outliers. {Part} 1. {Basic} concepts},
  journal = {The Analyst},
  volume  = {114},
  pages   = {1693--1697},
  year    = {1989},
}
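A small base-R illustration of the point about robust methods (my example, not from the article cited above):

```r
set.seed(1)
x <- c(rnorm(20), 50)   # twenty well-behaved values plus one gross outlier
mean(x); sd(x)          # classical estimates, badly distorted by the outlier
median(x); mad(x)       # robust counterparts, nearly unaffected
```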


On Wed, 30 Jun 2004, Greg Tarpinian wrote:

 I have been learning about some outlier tests -- Dixon and Grubb,
 specifically -- for small data sets.  When I try help.start() and search
 for outlier tests, the only response I manage to find is the Bonferroni
 test avaiable from the CAR package...  are there any other packages the
 offer outlier tests?

That's not an outlier test in the sense used by Dixon and Grubb, but is an 
illustration of the point about robust methods being better, in this case 
protecting better against multiple outliers.

 Are the Dixon and Grubb tests good for small samples or are others
 more recommended?

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] camberra distance?

2004-06-30 Thread Angel Lopez
From Legendre & Legendre, Numerical Ecology, I read: "The Australians
Lance & Williams (1967a) give several variants of the Manhattan metric,
including their Canberra metric (Lance & Williams 1966c)."
Lance & Williams (1967a) Mixed-data classificatory programs
I. Agglomerative systems. Aust. Comput. J. 1:15-20
Lance & Williams (1966c) Computer programs for classification. Proc.
ANCCAC Conference, Canberra, May 1966, Paper 12/3

HTH,
Angel
Wolski wrote:
Hi!
It's not an R-specific question, but I had no idea where to ask elsewhere.
Does anyone know the original reference for the CANBERRA DISTANCE?
Eryk.
Ps.:
I know it's an off-topic question (sorry).
Can anyone recommend a mailing list where such questions are on topic?


Re: [R] naive question

2004-06-30 Thread Tony Plate
As far as I know, read.table() in S-plus performs similarly to read.table() 
in R with respect to speed.  So, I wouldn't put high hopes in finding much 
satisfaction there.

I do frequently read large tables in S-plus, and with a considerable amount 
of work was able to speed things up significantly, mainly by using scan() 
with appropriate arguments.  It's possible that some of the add-on modules 
for S-plus (e.g., the data-mining module) have faster I/O, but I haven't 
investigated those.  I get the best read performance out of S-plus by using 
a homegrown binary file format with each column stored in a contiguous 
block of memory and meta data (i.e., column types and dimensions) stored at 
the start of the file.  The S-plus read function reads the columns one at a 
time using readRaw(). One would be able to do something similar in R.  If 
you have to read from a text file, then, as others have suggested, writing 
a C program wouldn't be that hard, as long as you make the format inflexible.

-- Tony Plate
At Tuesday 06:19 PM 6/29/2004, Igor Rivin wrote:
I was not particularly annoyed, just disappointed, since R seems like
a much better thing than SAS in general, and doing everything with a 
combination
of hand-rolled tools is too much work. However, I do need to work with 
very large data sets, and if it takes 20 minutes to read them in, I have 
to explore other
options (one of which might be S-PLUS, which claims scalability as a major
, er, PLUS over R).



Re: [R] naive question

2004-06-30 Thread Gabor Grothendieck
Tony Plate tplate at blackmesacapital.com writes:

 I get the best read performance out of S-plus by using 
 a homegrown binary file format with each column stored in a contiguous 
 block of memory and meta data (i.e., column types and dimensions) stored at 
 the start of the file.  The S-plus read function reads the columns one at a 
 time using readRaw(). One would be able to do something similar in R.  If 
 you have to read from a text file, then, as others have suggested, writing 
 a C program wouldn't be that hard, as long as you make the format inflexible.


At one time there was a program around (independent of R) that would read in
a file in numerous formats including text and write out an R .rda file
(among other formats).  The .rda file could then be rapidly read into R
using the R load command.  Unfortunately the author withdrew it and it is 
no longer available.

I suspect that if someone came up with such a program in C code but to keep 
it simple just restricted it to ASCII input files with a minimum number 
of data types and .rda or homegrown binary output it would be of 
general interest to the R community.
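The read-once, load-fast idea can already be sketched in base R with save() and load() (the file names here are hypothetical):

```r
## One-time conversion: parse the slow text file, cache it as binary .rda
df <- read.table("bigfile.txt", header = TRUE)   # slow text parse, done once
save(df, file = "bigfile.rda")                   # compact binary cache
## Later sessions: load() restores 'df' far faster than re-parsing the text
load("bigfile.rda")
```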



[R] Question about plotting related to roll-up

2004-06-30 Thread Coburn Watson
Hello R'ers,

I have a large set of data which has many y samples for each unit x.  The data 
might look like:

Seconds Response_time
--
0   0.150
0   0.202
0   0.065
1   0.110
1   0.280
2   0.230
2   0.156
3   0.070
3   0.185
3   0.255
3   0.311
3   0.120
4   
 and so on

When I do a basic plot with type="l" or the default of points, it obviously plots 
every point.  What I would like to do is generate a line plot where the 
samples for each second are rolled up, averaged, and plotted with a bar which 
represents either std dev or some other aspect of variance.  Can someone 
recommend a plotting mechanism to achieve this?  I have tried adding lines using 
some of the smoothing functions but seem unable to remove the original plot 
line which is drawn (is there a way to just plot the data as fed through the 
smoothing function without the original data?).

Please remove _nospam from the email address to reply directly to me.

Thanks,

Coburn Watson
Software Performance Engineering
DST Innovis



Re: [R] naive question

2004-06-30 Thread Igor Rivin

Thank you! It's interesting about S-Plus, since they apparently try to support
work with much larger data sets by writing everything out to disk (thus getting
around the, eg, address space limitations, I guess), so it is a little surprising
that they did not tweak the I/O more...

Thanks again,

Igor



[R] Mac OS X 10.2.8 versus 10.3.4 package load failure

2004-06-30 Thread Ingmar Visser

Hello All,

I have built a package depmix and built a source package file from it for
distribution. I have done all this on the following platform:

platform powerpc-apple-darwin6.8
arch powerpc   
os   darwin6.8 
system   powerpc, darwin6.8
status 
major1 
minor9.0   
year 2004  
month04
day  12
language R 

When I install the package using the menu Packages > Install from local file
> From source package file (tar.gz), the package installs without problems.

However, when I do the same on another Mac OS X machine (10.2.8), I get the
following error message:

Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to
load shared library /Users/564/Library/R/library/depmix/libs/depmix.so:
dlcompat: dyld: /Library/Frameworks/R.framework/Resources/bin/R.bin
Undefined symbols:
_btowc
_iswctype
_mbsrtowcs
_towlower
_towupper
_wcscoll
_wcsftime
_wcslen
_wcsrtombs
_wcsxfrm
_wctob
_wctype
_wmemchr
_wmemcmp
_wmemcpy
_wmemmove
_wmemset
Error in library(depmix) : .First.lib failed

Can anyone help me understand what the problem is here and how to solve it?



Re: [R] Question about plotting related to roll-up

2004-06-30 Thread Gabor Grothendieck

boxplot(Response_time ~ Seconds, data = my.data.frame)



Coburn Watson cpwww at comcast.net writes:

: 
: Hello R'ers,
: 
: I have a large set of data which has many y samples for each unit x.  The 
data 
: might look like:
: 
: Seconds   Response_time
: --
: 0 0.150
: 0 0.202
: 0 0.065
: 1 0.110
: 1 0.280
: 2 0.230
: 2 0.156
: 3 0.070
: 3 0.185
: 3 0.255
: 3 0.311
: 3 0.120
: 4 
:  and so on
: 
: When I do a basic plot with type=l or the default of points it obviously 
plots 
: every point.  What I would like to do is generate a line plot where the 
: samples for each second are rolled up, averged and plotted with a bar which 
: represents either std dev or some other aspect of variance.  Can someone 
: recommend a plotting mechanism to achieve this?  I have adding lines using 
: some of the smoothing functions but seem unable to remove the original plot 
: line which is drawn (is there a way to just plot the data as feed through 
the 
: smoothing function without the original data?).
: 
: Please remove _nospam from the email address to reply directly to me.
: 
: Thanks,
: 
: Coburn Watson
: Software Performance Engineering
: DST Innovis
: 


Re: [R] Mac OS X 10.2.8 versus 10.3.4 package load failure

2004-06-30 Thread Prof Brian Ripley
The problem is that you compiled the code with wide characters, and on 
your 10.2.8 the support functions are not compiled into the R binary 
(which is what I would expect).

As to why you are getting wide chars, I suggest you ask on the R-SIG-Mac
list (https://www.stat.math.ethz.ch/mailman/listinfo/r-sig-mac), as this 
is very specialized.

On Wed, 30 Jun 2004, Ingmar Visser wrote:

 
 Hello All,
 
 I have built a package depmix and build a source package file from it for
 distribution. I have done all this on the following platform:
 
 platform powerpc-apple-darwin6.8
 arch powerpc   
 os   darwin6.8 
 system   powerpc, darwin6.8
 status 
 major1 
 minor9.0   
 year 2004  
 month04
 day  12
 language R 
 
 When I install the package using the menu Packages  Install from local file
  From source package file (tar.gz) the package installs without problems.
 
 However when I do the same another MAC OS X (10.2.8), I get the following
 error message:
 
 Error in dyn.load(x, as.logical(local), as.logical(now)) :
 unable to
 load shared library /Users/564/Library/R/library/depmix/libs/depmix.so:
 
 dlcompat: dyld: /Library/Frameworks/R.framework/Resources/bin/R.bin
 Undefined 
 symbols:
 _btowc
 _iswctype
 _mbsrtowcs
 _towlower
 _towupper
 _wcscoll
 _wcsftime
 
 _wcslen
 _wcsrtombs
 _wcsxfrm
 _wctob
 _wctype
 _wmemchr
 _wmemcmp
 _wmemcpy
 _wmemmove
 _wmemset
 Error in library(depmix) : .First.lib failed
 
 Can anyone help me understand what the problem is here and how to solve it?

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] MacOS X binaries won't install

2004-06-30 Thread Paul Roebuck
On Tue, 29 Jun 2004, Ruben Solis wrote:

 I've tried installing the MacOS X binaries for R available at:

 http://www.bioconductor.org/CRAN/

 I'm running MacOS X version 10.2.8.

 I get a message indicating the installation is successful, but when I
 double-click on the R icon that shows up in my Applications folder, the
 application seems to try to open but closes immediately.

 I looked for  /Library/Frameworks/R.framework (by typing ls
 /Library/Frameworks) and it does not appear.  A global search for
 R.framework yields no results, so it seems that the installation is not
 working. (I was going to try command line execution.)

Check your log file for related error messages.
You can also inspect the package contents to ensure scripts
have execute permission as that would also cause the behavior
you describe.

--
SIGSIG -- signature too long (core dumped)



Re: [R] Question about plotting related to roll-up

2004-06-30 Thread Adaikalavan Ramasamy
grp  <- rep(1:5, each=3)
resp <- rnorm(15)

mu <- tapply(resp, grp, mean)
s  <- tapply(resp, grp, sd)
stopifnot( identical( names(mu), names(s) ) )

LCL <- mu - 2*s  # lower confidence limit
UCL <- mu + 2*s  # upper confidence limit
# Here I chose 2 because roughly 95% of the data is expected to fall
# within 2 sd of the mean.


# Type 1
plot(as.numeric(names(mu)), mu, type="l", ylim=c( min(LCL), max(UCL) ))
lines(as.numeric(names(mu)), UCL, lty=3)
lines(as.numeric(names(mu)), LCL, lty=3)

Your group must contain only numeric values. Otherwise, you will need to
use a numerical coding followed by mtext() with proper characters.


# Type 2
plot(as.numeric(names(mu)), mu, type="p", ylim=c( min(LCL), max(UCL) ))
arrows( as.numeric(names(mu)), LCL, as.numeric(names(mu)), UCL, code=3,
angle=90, length=0.1 )


On Wed, 2004-06-30 at 17:33, Coburn Watson wrote:
 Hello R'ers,
 
 I have a large set of data which has many y samples for each unit x.  The data 
 might look like:
 
 Seconds   Response_time
 --
 0 0.150
 0 0.202
 0 0.065
 1 0.110
 1 0.280
 2 0.230
 2 0.156
 3 0.070
 3 0.185
 3 0.255
 3 0.311
 3 0.120
 4 
  and so on
 
 When I do a basic plot with type=l or the default of points it obviously plots 
 every point.  What I would like to do is generate a line plot where the 
 samples for each second are rolled up, averged and plotted with a bar which 
 represents either std dev or some other aspect of variance.  Can someone 
 recommend a plotting mechanism to achieve this?  I have adding lines using 
 some of the smoothing functions but seem unable to remove the original plot 
 line which is drawn (is there a way to just plot the data as feed through the 
 smoothing function without the original data?).
 
 Please remove _nospam from the email address to reply directly to me.
 
 Thanks,
 
 Coburn Watson
 Software Performance Engineering
 DST Innovis
 


Re: [R] naive question

2004-06-30 Thread Tony Plate
To be clear, there's a lot more to I/O than the functions read.table() and
scan() -- I was only commenting on those, and no inference should be made
about other aspects of S-plus I/O based on those comments!

I suspect that what has happened is that memory, CPU speed, and I/O speed 
have evolved at different rates, so what used to be acceptable code in 
read.table() (in both R and S-plus) is now showing its limitations and has 
reached the point where it can take half an hour to read in, on a 
readily-available computer, the largest data table that can be comfortably 
handled.  I'm speculating, but 10 years ago,  on a readily available 
computer, did it take half an hour to read in the largest data table that 
could be comfortably handled in S-plus or R?  People who encounter this now 
are surprised and disappointed, and IMHO, somewhat justifiably so.  The 
fact that R is an open source volunteer project suggests that the time is 
ripe for one of those disappointed people to fix the matter and contribute 
the function read.table.fast()!

-- Tony Plate
At Wednesday 10:08 AM 6/30/2004, Igor Rivin wrote:
Thank you! It's interesting about S-Plus, since they apparently try to support
work with much larger data sets by writing everything out to disk (thus 
getting
around the, eg, address space limitations, I guess), so it is a little 
surprising
that they did not tweak the I/O more...

Thanks again,
Igor


Re: [R] naive question

2004-06-30 Thread rivin
 I suspect that what has happened is that memory, CPU speed, and I/O
 speed  have evolved at different rates, so what used to be acceptable
 code in  read.table() (in both R and S-plus) is now showing its
 limitations and has  reached the point where it can take half an hour to
 read in, on a  readily-available computer, the largest data table that
 can be comfortably  handled.  I'm speculating, but 10 years ago,  on a
 readily available  computer, did it take half an hour to read in the
 largest data table that  could be comfortably handled in S-plus or R?

I did not use R ten years ago, but reasonable RAM amounts have
multiplied by roughly a factor of 10 (from 128Mb to 1Gb), CPU speeds have
gone up by a factor of 30 (from 90MHz to 3GHz), and disk space availability
has gone up probably by a factor of 10. So, unless the I/O performance
scales nonlinearly with size (a bit strange but not inconsistent with my R
experiments), I would think that things should have gotten faster (by the
wall clock, not slower). Of course, it is possible that the other
components of the R system have been worked on more -- I am not equipped
to comment...

  Igor





[R] memory utilization question

2004-06-30 Thread Tae-Hoon Chung
Hi, all;
I have a question on memory utilization of R.
Does R use only RAM memory or can I make it use virtual memory?
Currently, I am using Mac OS X 10.3.3 with 1.5 GB RAM.
However, I need more memory for some large size problem right now.
Thanks in advance,
Tae-Hoon Chung, Ph.D
Post-doctoral Research Fellow
Molecular Diagnostics and Target Validation Division
Translational Genomics Research Institute
1275 W Washington St, Tempe AZ 85281 USA
Phone: 602-343-8724


[R] interval regression

2004-06-30 Thread Thomas Ribarits
Hi,
 
does anyone have a quick answer to the question of how to carry out
interval regression in R. I have found ordered logit and ordered
probit as well as multinomial logit etc. The thing is, though, that I
want to apply logit/probit to interval-coded data and I know the cell
limits which are used to turn the quantitative response into an ordered
factor. Hence, it does not make sense to estimate the cell limits (e.g.
in zeta for the function polr).
 
Thanks,
 
Thomas Ribarits
 



[R] Developing functions

2004-06-30 Thread daniel
Hi,
I'm new to R. I'm working with similarity coefficients for clustering
items. I created one function (coef) to calculate the coefficients from
two pairs of vectors and then, as an example, the function
simple_matching,
taking a data.frame (X) and using coef in a for loop.
It works, but I believe it is a bad way to do so (I believe the for loop
is not necessary). Can somebody suggest anything better?
Thanks
Daniel Rozengardt

coef <- function(x1, x2) {
  a <- sum(ifelse(x1==1 & x2==1, 1, 0))
  b <- sum(ifelse(x1==1 & x2==0, 1, 0))
  c <- sum(ifelse(x1==0 & x2==1, 1, 0))
  d <- sum(ifelse(x1==0 & x2==0, 1, 0))
  ret <- cbind(a, b, c, d)
  ret
}

simple_matching <- function(X) {
  ret <- matrix(ncol=dim(X)[1], nrow=dim(X)[1])
  diag(ret) <- 1
  for (i in 2:length(X[,1])) {
    for (j in i:length(X[,1])) {
      vec <- coef(X[i-1,], X[j,])
      result <- (vec[1]+vec[3])/sum(vec)
      ret[i-1,j] <- result
      ret[j,i-1] <- result
    }
  }
  ret
}



Re: [R] naive question

2004-06-30 Thread Peter Dalgaard
[EMAIL PROTECTED] writes:

 I did not use R ten years ago, but reasonable RAM amounts have
 multiplied by roughly a factor of 10 (from 128Mb to 1Gb), CPU speeds have
 gone up by a factor of 30 (from 90Mhz to 3Ghz), and disk space availabilty
 has gone up probably by a factor of 10. So, unless the I/O performance
 scales nonlinearly with size (a bit strange but not inconsistent with my R
 experiments), I would think that things should have gotten faster (by the
 wall clock, not slower). Of course, it is possible that the other
 components of the R system have been worked on more -- I am not equipped
 to comment...

I think your RAM calculation is a bit off. in late 1993, 4MB systems
were the standard PC, with 16 or 32 MB on high-end workstations.
Comparable figures today are probably  256MB for the entry-level PC
and a couple GB in the high end. So that's more like a factor of 64.
On the other hand, CPU's have changed by more than the clock speed; in
particular, the number of clock cycles per FP calculation has
decreased considerably and is currently less than one in some
circumstances. 

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



Re: [R] interval regression

2004-06-30 Thread Prof Brian Ripley
I believe from your description that this is interval-censored survival
and can be handled by survreg.

It would also be very easy to amend polr to do this.
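A minimal sketch of the survreg route (the data below are simulated purely for illustration; survival is a recommended package shipped with R):

```r
library(survival)
set.seed(1)
x  <- runif(30, 0, 3)
y  <- 1 + 0.5 * x + rnorm(30, sd = 0.4)   # latent continuous response
lo <- floor(y); hi <- lo + 1              # only the unit-width cell is observed
fit <- survreg(Surv(lo, hi, type = "interval2") ~ x, dist = "gaussian")
coef(fit)   # estimates of the latent-scale intercept and slope
```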

On Wed, 30 Jun 2004, Thomas Ribarits wrote:

 does anyone have a quick answer to the question of how to carry out
 interval regression in R. I have found ordered logit and ordered
 probit as well as multinomial logit etc. The thing is, though, that I
 want to apply logit/probit to interval-coded data and I know the cell
 limits which are used to turn the quantitative response into an ordered
 factor. Hence, it does not make sense to estimate the cell limits (e.g.
 in zeta for the function polr).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] pca with missing values

2004-06-30 Thread Angel Lopez
Thanks for the pointers.
I've read them with care but I don't seem capable of making it work.
For example, if I do:
data(USArrests)
USArrests2 <- USArrests
USArrests2[1,1] <- NA
princomp(USArrests2, cor = TRUE, na.action = na.omit)
I get the error message:
Error in cov.wt(z) : x must contain finite values only
I've tried changing the 'options' na.action and using other values than 
na.omit with no success.

The only way that I can make it work in some way was if I did:
USArrestsNA <- na.omit(USArrests)
princomp(USArrestsNA, cor = TRUE)
I've also obtained the same by giving the correlation matrix instead of 
the data frame:
princomp(covmat=cor(USArrestsNA))
Both solutions do the job by not using the row with the NA.
After more reading I thought I would get the same result by doing:
princomp(covmat=cor(USArrests2, use="complete.obs"))
but the result is slightly different. I can not manage to understand the 
difference.
Can someone give me some more light to keep going?
P.S.: Using the solution above with na.omit would not be very good in my
real-world problem because it is relatively common to have an NA in a
row. Maybe using
princomp(covmat=cor(USArrests2, use="pairwise.complete.obs"))
would be a solution, but I would like to understand the above before
doing this next step.
Thanks,
Angel

Prof Brian Ripley wrote:
On Wed, 30 Jun 2004, Angel Lopez wrote:

I need to perform a principal components analysis on a matrix with 
missing values.
I've searched the R web site but didn't manage to find anything.

?princomp
has a description of an na.action argument, and 

help.search("missing values")
comes up with several relevant entries.


Re: [R] MacOS X binaries won't install

2004-06-30 Thread Jeff Gentry

Sorry, I missed the original message, so piggybacking off of a reply.

 On Tue, 29 Jun 2004, Ruben Solis wrote:
  I've tried installing the MacOS X binaries for R available at:
  http://www.bioconductor.org/CRAN/
  I'm running MacOS X version 10.2.8.

Since this is coming out of our mirror (bioconductor), I'm curious if the
same problem occurs with the packages available at the other mirrors, or
if this is specific to the BioC mirror?

Thanks
-J



[R] spam warning

2004-06-30 Thread ivo_welch-Rstat
hi chaps:  I just found out that this R post list is actively harvested 
by some spammers.  I used this account exclusively for r-help posts, 
and promptly received a nice email with links to nigeria, india, and 
china.  this is not a good thing.  can we obfuscate the email addresses 
of posters, so that at least automatic harvesting won't work?

sincerely,  /ivo welch
---
ivo welch
professor of finance and economics
brown / nber / yale


RE: [R] Developing functions

2004-06-30 Thread Gabor Grothendieck


Without trying to understand your code in detail let me just 
assume you are trying to create a matrix, ret, whose i,j-th 
entry is some function, f, of row i of X and row j of X.

In that case this should do it:

apply(X,1,function(x)apply(X,1,function(y)f(x,y)))
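For strictly 0/1 data, the full matrix of agreement proportions can also be obtained with matrix algebra. This sketch (my addition, not from the thread) computes the textbook simple-matching coefficient (a+d)/(a+b+c+d) for every pair of rows at once:

```r
## X: one row per item, entries are 0/1
X <- matrix(c(1,0,1,1,
              0,0,1,0,
              1,1,1,0), nrow = 3, byrow = TRUE)
n <- ncol(X)
## 1-1 matches come from X %*% t(X); 0-0 matches from the complements
M <- (X %*% t(X) + (1 - X) %*% t(1 - X)) / n
M   # symmetric, with 1s on the diagonal
```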


Date:   Wed, 30 Jun 2004 15:28:47 -0300 (ART) 
From:   [EMAIL PROTECTED]
To:   [EMAIL PROTECTED] 
Subject:   [R] Developing functions 

 
Hi,
I´m new in R. I´m working with similarity coefficients for clustering
items. I created one function (coef), to calculate the coefficients from
two pairs of vectors and then, as an example, the function
simple_matching,
taking a data.frame(X) and using coef in a for cicle.
It works, but I believe it is a bad way to do so (I believe the for cicle
is not necessary). Somebody can suggest anything better.
Thanks
Daniel Rozengardt

coef <- function(x1, x2) {
  a <- sum(ifelse(x1==1 & x2==1, 1, 0))
  b <- sum(ifelse(x1==1 & x2==0, 1, 0))
  c <- sum(ifelse(x1==0 & x2==1, 1, 0))
  d <- sum(ifelse(x1==0 & x2==0, 1, 0))
  ret <- cbind(a, b, c, d)
  ret
}

simple_matching <- function(X) {
  ret <- matrix(ncol=dim(X)[1], nrow=dim(X)[1])
  diag(ret) <- 1
  for (i in 2:length(X[,1])) {
    for (j in i:length(X[,1])) {
      vec <- coef(X[i-1,], X[j,])
      result <- (vec[1]+vec[3])/sum(vec)
      ret[i-1,j] <- result
      ret[j,i-1] <- result
    }
  }
  ret
}



Re: [R] naive question

2004-06-30 Thread rivin
 [EMAIL PROTECTED] writes:

 I did not use R ten years ago, but reasonable RAM amounts have
 multiplied by roughly a factor of 10 (from 128Mb to 1Gb), CPU speeds
 have gone up by a factor of 30 (from 90Mhz to 3Ghz), and disk space
 availabilty has gone up probably by a factor of 10. So, unless the I/O
 performance scales nonlinearly with size (a bit strange but not
 inconsistent with my R experiments), I would think that things should
 have gotten faster (by the wall clock, not slower). Of course, it is
 possible that the other components of the R system have been worked on
 more -- I am not equipped to comment...

 I think your RAM calculation is a bit off. in late 1993, 4MB systems
 were the standard PC, with 16 or 32 MB on high-end workstations.

I beg to differ. In 1989, Mac II came standard with 8MB, NeXT came
standard with 16MB. By 1994, 16MB was pretty much standard on good quality
(= Pentium, of which the 90Mhz was the first example) PCs, with 32Mb
pretty common (though I suspect that most R/S-Plus users were on SUNs,
which were somewhat more plushly equipped).

 Comparable figures today are probably  256MB for the entry-level PC and
 a couple GB in the high end. So that's more like a factor of 64. On the
 other hand, CPU's have changed by more than the clock speed; in
 particular, the number of clock cycles per FP calculation has
 decreased considerably and is currently less than one in some
 circumstances.

I think that FP performance has increased more than integer performance,
which has pretty much kept pace with the clock speed. The compilers have
also improved a bit...

  Igor



Re: [R] naive question

2004-06-30 Thread james . holtman




It is amazing the amount of time that has been spent on this issue.  In
most cases, if you do some timing studies using 'scan', you will find that
you can read some quite large data structures in a reasonable time.  If you
initial concern was having to wait 10 minutes to have your data read in,
you could have read in quite a few data sets by now.

When comparing speeds/feeds of processors, you also have to consider what
is being done on them.  Back in the dark ages we had a 1 MIP computer
with 4M of memory handling input from 200 users on a transaction system.
Today I need a 1GHZ computer with 512M to just handle me.  Now true, I am
doing a lot different processing on it.

With respect to I/O, you have to consider what is being read in and how it
is converted.  Each system/program has different requirements.  I have some
applications (running on a laptop) that can read in approximately 100K rows
of data per second (of course they are already binary).  On the other hand,
I can easily slow that down to 1K rows per second if I do not specify the
correct parameters to 'read.table'.

So go back and take a look at what you are doing, and instrument your code
to see where time is being spent.  The nice thing about R is that there are
a number of ways of approaching a solution, and if you don't like the timing
of one way, try another.  That is half the fun of using R.
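James's point about 'read.table' parameters can be sketched as follows; the temporary file and two-column layout are invented for illustration (not from the thread):

```r
## Supplying colClasses and disabling comment scanning lets read.table()
## skip its type-guessing pass, which is where large reads lose time.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = rnorm(1000), y = rnorm(1000)),
            tmp, row.names = FALSE)
d <- read.table(tmp, header = TRUE,
                colClasses = c("numeric", "numeric"),
                nrows = 1000, comment.char = "")
unlink(tmp)
```

On a file of this size the difference is negligible; on files with millions of rows the explicit colClasses typically pays off substantially.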
__
James Holtman        "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929


   





 [EMAIL PROTECTED] writes:

 I did not use R ten years ago, but reasonable RAM amounts have
 multiplied by roughly a factor of 10 (from 128Mb to 1Gb), CPU speeds
 have gone up by a factor of 30 (from 90Mhz to 3Ghz), and disk space
 availability has gone up probably by a factor of 10. So, unless the I/O
 performance scales nonlinearly with size (a bit strange but not
 inconsistent with my R experiments), I would think that things should
 have gotten faster (by the wall clock, not slower). Of course, it is
 possible that the other components of the R system have been worked on
 more -- I am not equipped to comment...

 I think your RAM calculation is a bit off. in late 1993, 4MB systems
 were the standard PC, with 16 or 32 MB on high-end workstations.

I beg to differ. In 1989, Mac II came standard with 8MB, NeXT came
standard with 16MB. By 1994, 16MB was pretty much standard on good quality
(= Pentium, of which the 90Mhz was the first example) PCs, with 32Mb
pretty common (though I suspect that most R/S-Plus users were on SUNs,
which were somewhat more plushly equipped).

 Comparable figures today are probably  256MB for the entry-level PC and
 a couple GB in the high end. So that's more like a factor of 64. On the
 other hand, CPU's have changed by more than the clock speed; in
 particular, the number of clock cycles per FP calculation has
 decreased considerably and is currently less than one in some
 circumstances.

I think that FP performance has increased more than integer performance,
which has pretty much kept pace with the clock speed. The compilers have
also improved a bit...

  Igor

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] rgl installation problems

2004-06-30 Thread E GCP
Thanks. Adding "-shared" to the Makeconf file fixed the problem. I
installed R from R-1.9.1-0.fdr.2.rh90.i386.rpm , but for some reason the 
flag to share libraries was never set in SHLIB_CXXLDFLAGS.  
rgl_0.64-13.tar.gz now installed without problems.

Thanks again,
Enrique
On Wed, 2004-06-30 at 00:52, E GCP wrote:
 Thanks for your replies. I do not HTML-ize my mail, but free email
accounts do that and there is not a switch to turn it off. I apologize
in advance.
 I installed R from the redhat package provided by Martyn Plummer. It
installed fine and without problems. I can use R and have installed and
used other packages within R without any problems whatsoever. I do not
think the problem is with R or its installation.  I do think there is a
problem with the installation of rgl_0.64-13.tar.gz on RedHat 9 (Linux).
So, if there is anybody out there who has installed rgl_0.64-13.tar.gz
successfully on RedHat 9, I would like to know how.

This is a little strange.  I'm now building RPMS for older Red Hat
versions on FC2 using a tool called Mach.  There is a possibility that
there is some configuration problem, but I can't see it. As Brian has
pointed out, you are missing the crucial -shared flag when building
the shared library for rgl. This comes from the line
SHLIB_CXXLDFLAGS = -shared
in the file /usr/lib/R/etc/Makeconf.  This is present in my latest RPM
for Red Hat ( R-1.9.1-0.fdr.2.rh90.i386.rpm ) so I don't know why it
isn't working for you.
Check that you do have the latest RPM and that you don't have a locally
built version of R in /usr/local since this will have precedence on your
PATH.
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] linear models and colinear variables...

2004-06-30 Thread Peter Gaffney
Hi!

I'm having some issues on both conceptual and
technical levels for selecting the right combination
of variables for this model I'm working on. The basic,
all inclusive form looks like

lm(mic ~ B * D * S * U * V * ICU)

Where mic, U, V, and ICU are numeric values and B, D,
and S are factors with about 16, 16 and 2 levels
respectively. In short, there's a ton of actual
explanatory variables that look something like this:

Bstaph.aureus:Dvan:Sr:U:ICU

There are a good number of hits but there's also a
staggering number of complete misses, due to a
combination of scarce data in that particular niche and
actual lack of deviation from the categorical mean.
My suspicion is that there's a large degree of
colinearity in some of these variables that serves to
reduce the total effect of either of a nearly colinear
pair to an insignificant level; my hope is that
removing one of a mostly colinear group would allow
the other variables' possibly significant effects to
be measured.

Question 1) Is this legitimate at all? Can I do
regression using the entire data set over only
selected factors while ignoring others?
(Admittedly I only just got my Bachelor's in math; the
gaps in my knowledge here are profound and
aggravating.)

Question 2) How do I go about selecting possible
colinear explanatory variables?
I had originally thought I'd just make a matrix of
coefficients of colinearity for each pair of variables
and iteratively re-run the model until I got the
results I wanted, but I can't really figure out how to
do this.  In addition, I'm not sure how to do this in
the model syntax once I've actually decided on some
variables to exclude.
For instance, supposing I wanted to run the model as
above without the variable
Bstaph.aureus:Dvan:Sr:U:ICU.  What I tried was

lm(mic ~ B * D * S * U * V * ICU -
Bstaph.aureus:Dvan:Sr:U:ICU).

Obviously this doesn't work because the variable name
Bstaph.aureus:Dvan:Sr:U:ICU hasn't been recognized
yet.  How do I do this?  My best guess so far is to
build and define each of the variables like
Bstaph.aureus:Dvan:Sr:U:ICU by hand with some
imperative/iterative style programming using some kind
of string generation system.  This sounds like a royal
pain, and is something I'd rather avoid doing if at
all possible.
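For what it is worth, a formula's "-" removes whole terms (such as a complete three-way interaction), not individual coefficient cells like Bstaph.aureus:Dvan:Sr:U:ICU. A minimal sketch on a built-in data set (not the poster's data):

```r
## Dropping a whole interaction term from a fitted model with update().
d <- transform(mtcars, cyl = factor(cyl), am = factor(am))
full    <- lm(mpg ~ cyl * am * wt, data = d)
reduced <- update(full, . ~ . - cyl:am:wt)   # drops only the 3-way term
## the coefficients removed are exactly the cells of that term
setdiff(names(coef(full)), names(coef(reduced)))
```

Removing a single cell of a factor interaction, by contrast, requires recoding the factors (e.g. collapsing or relevelling) rather than formula arithmetic.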

Any suggestions? :-D

-petertgaffney

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R can't find some functions in assist package

2004-06-30 Thread Michael Axelrod
I am a new user to R. I installed and loaded the smoothing spline package 
called assist. The function ssr seems to work, but predict.ssr and 
some others don't seem to be available; I get a "can't find" message. But 
they appear in the extensions folder, c:\program 
files\R\rw1091\library\assist\R-ex. What's wrong?

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] linear models and colinear variables...

2004-06-30 Thread Jonathan Baron
On 06/30/04 16:32, Peter Gaffney wrote:
Hi!

I'm having some issues on both conceptual and
technical levels for selecting the right combination
of variables for this model I'm working on. The basic,
all inclusive form looks like

lm(mic ~ B * D * S * U * V * ICU)

When you do this, you are including all the interaction terms.
The * indicates an interaction, as opposed to +.  That might make
sense under some circumstances, for example if you are just
trying to get the best model and you plan to eliminate
higher-order interactions that are not significant, but usually
it does more to obscure the interesting effects than to display
them.
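Jonathan's point about "*" versus "+" can be illustrated on a built-in data set (a sketch, not the poster's model):

```r
## "+" fits main effects only; "*" adds all the interaction cells as well.
d <- transform(mtcars, cyl = factor(cyl), am = factor(am))
main_only <- lm(mpg ~ cyl + am, data = d)   # main effects only
with_int  <- lm(mpg ~ cyl * am, data = d)   # adds the cyl:am cells
length(coef(main_only))   # 4: intercept, cyl6, cyl8, am1
length(coef(with_int))    # 6: plus cyl6:am1 and cyl8:am1
```

With six factors of up to 16 levels, as in the original model, the "*" version multiplies out to an enormous number of interaction coefficients, which is exactly the explosion described above.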

My suspicion is that there's a large degree of
colinearity in some of these variables that serves to
reduce the total effect of either of a nearly colinear
pair to an insignificant level; my hope is that
removing one of a mostly colinear group would allow
the other variables' possibly significant effects to
be measured.

There may be colinearity, but the most likely problem is that you
are including too many interactions, at too high a level.
Inclusion of nonsignificant interaction terms often turns
significant main effects into nonsignificant effects.

Question 1) Is this legitimate at all? Can I do
regression using the entire data set over only
selected factors while ignoring others?
(Admittedly I only just got my Bachelor's in math; the
gaps in my knowlege here are profound and
aggravating.)

If you select predictors on the basis of which ones are
significant, then the final significance levels don't mean much,
usually.  Remember, 1 out of 20 will be significant at .05 even
if you are using random numbers.

Question 2) How do I go about selecting possible
colinear explanatory variables?

If there is colinearity, then what to do about it depends on the
substance of the questions you are asking.  Some options are to
combine variables, do some sort of factor analysis and use
factors rather than variables as predictors, use the most
meaningful of the variables that are colinear, or just live with
it, if the substantive issues rule out the other options.  (I'm
sure there are other solutions that others might point out.)
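One rough way to screen for the near-collinear pairs Peter suspects is to correlate the columns of the model matrix; the model, data set and 0.95 cutoff below are all illustrative assumptions, not anything from the thread:

```r
## Flag pairs of design-matrix columns with very high absolute correlation.
fit <- lm(mpg ~ wt * hp + disp, data = mtcars)
X   <- model.matrix(fit)[, -1]          # predictor columns, intercept dropped
cc  <- cor(X)
## report index pairs whose |correlation| exceeds an (arbitrary) threshold
high <- which(abs(cc) > 0.95 & upper.tri(cc), arr.ind = TRUE)
high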

I had originally thought I'd just make a matrix of
coefficients of colinearity for each pair of variables
and iteratively re-run the model until I got the
results I wanted, but I can't really figure out how to
do this.  In addition, I'm not sure how to do this in
the model syntax once I've actually decided on some
variables to exclude.
For instance, supposing I wanted to run the model as
above without the variable
Bstaph.aureus:Dvan:Sr:U:ICU.  What I tried was

lm(mic ~ B * D * S * U * V * ICU -
Bstaph.aureus:Dvan:Sr:U:ICU).

Obviously this doesn't work because the variable name
Bstaph.aureus:Dvan:Sr:U:ICU hasn't been recognized
yet.  How do I do this?  My best guess so far is to

Not clear what you mean here.

build and define each of the variables like
Bstaph.aureus:Dvan:Sr:U:ICU by hand with some
imperative/iterative style programming using some kind
of string generation system.  This sounds like a royal
pain, and is something I'd rather avoid doing if at
all possible.

Any suggestions? :-D

-petertgaffney

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page:http://www.sas.upenn.edu/~baron
R search page:http://finzi.psych.upenn.edu/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] help with tclVar

2004-06-30 Thread solares
Hi, I can't load a variable tcltk declared with tclVar -- why is this? The
example below shows it. Thanks, Ruben

> a <- tclVar(init = "")
> f <- function(){
+ a <<- "pipo"
+ }
> f()
> a
[1] "pipo"
> tclvalue(a)
Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class =
"tclObj") :
[tcl] can't read "pipo": no such variable.
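A guess at what was intended: the assignment inside the function rebinds the R name to a plain string, so the tclVar reference is lost. Assigning through tclvalue() instead modifies the underlying Tcl variable itself. A sketch (requires a working Tcl installation):

```r
## Change the value held by a Tcl variable from inside a function.
library(tcltk)
a <- tclVar("")
f <- function() tclvalue(a) <- "pipo"   # writes to the Tcl variable a refers to
f()
tclvalue(a)                              # now "pipo"
```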


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R can't find some functions in assist package

2004-06-30 Thread Michael Axelrod
Oh yes. The load package under the packages menu in the Windows version 
does that. To check I typed library(assist) after starting R. Same 
behavior, ssr is found, but others like predict.ssr, and plot.ssr, give a 
not found message.

Thanks for the suggestion.
Mike
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] MS OLAP -- RODBC to SQL Server Slice Server pass-through query to MS OLAP

2004-06-30 Thread James . Callahan
Olivier Collignon wrote:
 I have been doing data analysis/modeling in R, connecting to SQL 
databases 
 with RODBC (winXP client with R1.9.0 and win2k SQL server 2000).
 
 I am now trying to leverage some of the OLAP features to keep the data 
 intensive tasks on the DB server side and only keep the analytical tasks 

 within R (optimize use of memory). Is there any package that would allow 
to 
 connect to OLAP cubes (as a client to SQL Analysis Services PivotTable 
 Service) from an R session (similar to the RODBC package)?
 
 Can this be done directly with special connection parameters (from R) 
 through the RODBC package or do I have to setup an intermediary XL table 

 (pivottable linked to the OLAP cube) and then connect to the XL data 
source 
 from R?
 
 I would appreciate any reference / pointer to OLAP/R configuration 
 instructions. Thanks.
 
 Olivier Collignon

OLAP = On-line Analytical Processing == Using a cube / hypercube of data 
for decision support 
(usually extracted from a transaction processing system)

Direct connection to Microsoft OLAP server (Analysis Services) requires 
an OLE DB provider 
rather than an ODBC connection -- so RODBC cannot be used directly
(querying an OLAP cube rather than a SQL table requires MDX instead of SQL 
queries).

One way to use RODBC to query MS OLAP cube is to use SQL Server as a 
Slice Server.
You could slice three different planes (tables) out of a 3-D cube. For 
example, if you had 
a cube of Financial Data by Company by Year, the three planes 
(tables) would be:

1. Financial statements -- financial line items for one company  (Balance 
Sheet, Income Statement)
2. Time Series -- One or more line items over time (forecasting)
3. Cross-sectional -- One line item for all companies (useful for ranking)

Microsoft OLAP Analysis Services is bundled with (on the same CD-ROM as) 
MS SQL server, 
so licensing should not be an issue

Although I have figured out this is possible (and implemented a similar 
system many years ago 
in a long forgotten language),  I haven't built an MS OLAP cube yet -- so 
I haven't tested it.

The following is summarized from Mary Chipman's and Andy Baron's, 
Microsoft Access Guide to SQL Server, chapter 12, pages 644-654.

SQL Server is capable of sending pass through queries for execution on 
another data base 
(using the foreign data base's syntax). That way MDX is executed on the 
OLAP server, and not locally
in SQL Server (Chipman & Baron, p. 645) -- which would accomplish your 
objective of [keeping] the data 
intensive tasks on the DB server side.

Although you can use the MS SQL OPENROWSET() function to set up an ad hoc 
connection, the
recommended method would be two steps:

1. Set up a Linked Server using either MS SQL Enterprise Manager or 
 MS SQL system stored procedure EXEC sp_addlinkedserver.

2. Use an OPENQUERY() function (which works with any OLE DB data source) 
to pass the MDX query and
return a resultset.  The OPENQUERY function can be used in the FROM 
clause of a SQL query, 
a stored procedure or  in MS SQL 2000 a User Defined Function (UDF).
 
Also note,
"A view created using the OPENQUERY syntax cannot be used [without 
renaming (aliasing) the columns using 'AS'] 
as the RecordSource for a report" (Chipman & Baron, op. cit., page 647). MDX 
concatenates the dimension names with
periods -- and that does not conform to the column naming conventions of 
either MS SQL Server or MS Access.

Unfortunately, that's too much information for the R list and not enough 
for you, 
so I strongly recommend the full explanation and examples in Chapter 12 of 
Chipman & Baron.

Ideally, one would like an R interface to map a multidimensional array 
directly from OLAP to 
an R data structure such as an R matrix or an R data frame similar to 
RPgSQL -- until then, use a slice server...

Jim Callahan, MBA  MCDBA
Management, Budget  Accounting
City of Orlando
(407) 246-3039 office

[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R can't find some functions in assist package

2004-06-30 Thread Jason Turner
 Oh yes. The load package under the packages menu in the Windows
 version
 does that. To check I typed library(assist) after starting R. Same
 behavior, ssr is found, but others like predict.ssr, and plot.ssr, give a
 not found message.

Short answer:  Try using predict instead of predict.ssr.  I think
you're meant to quietly use the predict and plot methods provided, and not
mention their inner names.

Long answer:
Namespaces.  This means that a method for an object isn't visible to R as
a whole.  This avoids conflicts should another package pick the same names.

Does this work?

getAnywhere(predict.ssr)

Cheers

Jason

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] R can't find some functions in assist package

2004-06-30 Thread Liaw, Andy
That's because `assist' has a namespace:

 library(assist)
Loading required package: nlme 
 predict.ssr
Error: Object predict.ssr not found
 methods(predict)
 [1] predict.ar*predict.Arima*
 [3] predict.arima0*predict.glm   
 [5] predict.gls*   predict.gnls* 
 [7] predict.HoltWinters*   predict.lm
 [9] predict.lme*   predict.lmList*   
[11] predict.loess* predict.mlm   
[13] predict.nlme*  predict.nls*  
[15] predict.poly   predict.ppr*  
[17] predict.princomp*  predict.slm*  
[19] predict.smooth.spline* predict.smooth.spline.fit*
[21] predict.snm*   predict.snr*  
[23] predict.ssr*   predict.StructTS* 

Non-visible functions are asterisked

Use either assist:::predict.ssr or getAnywhere(predict.ssr).
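The same pattern works for any package with a namespace; for example, predict.loess is registered as a predict method in stats but not exported (a generic illustration, not specific to assist):

```r
## Reaching an unexported, registered S3 method.
exists("predict.loess")            # typically FALSE: the name is not exported
f <- stats:::predict.loess         # reach it through the namespace
g <- getAnywhere("predict.loess")  # or locate it wherever it is defined
is.function(f)
```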

Andy

 From: Michael Axelrod
 
 Oh yes. The load package under the packages menu in the 
 Windows version 
 does that. To check I typed library(assist) after starting R. Same 
 behavior, ssr is found, but others like predict.ssr, and 
 plot.ssr, give a 
 not found message.
 
 Thanks for the suggestion.
 
 Mike
 
 __
 [EMAIL PROTECTED] mailing list
 https://www.stat.math.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R can't find some functions in assist package

2004-06-30 Thread roger koenker
An R-News or J. of Statistical Software note titled "Getting to the Source
of the Problem", detailing the basic strategies for these adventures in the
new world of S4 methods and namespaces, would be very useful.

url:www.econ.uiuc.edu/~rogerRoger Koenker
email   [EMAIL PROTECTED]   Department of Economics
vox:217-333-4558University of Illinois
fax:217-244-6678Champaign, IL 61820
On Jun 30, 2004, at 8:03 PM, Liaw, Andy wrote:
That's because `assist' has a namespace:
library(assist)
Loading required package: nlme
predict.ssr
Error: Object predict.ssr not found
methods(predict)
 [1] predict.ar*predict.Arima*
 [3] predict.arima0*predict.glm
 [5] predict.gls*   predict.gnls*
 [7] predict.HoltWinters*   predict.lm
 [9] predict.lme*   predict.lmList*
[11] predict.loess* predict.mlm
[13] predict.nlme*  predict.nls*
[15] predict.poly   predict.ppr*
[17] predict.princomp*  predict.slm*
[19] predict.smooth.spline* predict.smooth.spline.fit*
[21] predict.snm*   predict.snr*
[23] predict.ssr*   predict.StructTS*
Non-visible functions are asterisked
Use either assist:::predict.ssr or getAnywhere(predict.ssr).
Andy
From: Michael Axelrod
Oh yes. The load package under the packages menu in the
Windows version
does that. To check I typed library(assist) after starting R. Same
behavior, ssr is found, but others like predict.ssr, and
plot.ssr, give a
not found message.
Thanks for the suggestion.
Mike
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Developing functions

2004-06-30 Thread Liaw, Andy
 From: [EMAIL PROTECTED]
 
 Hi,
 I'm new in R. I'm working with similarity coefficients for clustering
 items. I created one function (coef) to calculate the coefficients from
 two vectors and then, as an example, the function simple_matching,
 taking a data.frame (X) and using coef in a for loop.
 It works, but I believe it is a bad way to do so (I believe the for loop
 is not necessary). Can somebody suggest anything better?
 Thanks
 Daniel Rozengardt
 
 coef <- function(x1, x2) {
     a <- sum(ifelse(x1 == 1 & x2 == 1, 1, 0))
     b <- sum(ifelse(x1 == 1 & x2 == 0, 1, 0))
     c <- sum(ifelse(x1 == 0 & x2 == 1, 1, 0))
     d <- sum(ifelse(x1 == 0 & x2 == 0, 1, 0))
     ret <- cbind(a, b, c, d)
     ret
 }

 simple_matching <- function(X) {
     ret <- matrix(ncol = dim(X)[1], nrow = dim(X)[1])
     diag(ret) <- 1
     for (i in 2:length(X[, 1])) {
         for (j in i:length(X[, 1])) {
             vec <- coef(X[i - 1, ], X[j, ])
             result <- (vec[1] + vec[3]) / sum(vec)
             ret[i - 1, j] <- result
             ret[j, i - 1] <- result
         }
     }
     ret
 }

A few comments first:

1. Unless you are putting multiple statements on the same line, there's no
need to use ;.

2. In `coef' (which is a bad choice for a function name: There's a built-in
generic function by that name in R, for extracting coefficients from fitted
model objects), a, b, c and d are scalars.  You don't need to cbind() them;
c() works just fine.

3. One of the best strategies for efficiency is to vectorize.  Try to
formulate the problem in matrix/vector operations as much as possible.

4. The computation looks a bit odd to me.  Assuming the data are binary
(i.e., all 0s and 1s), you are computing (N11 + N01) / N, where N is the
length of the vectors, N11 is the number of 1-1 matches and N01 is the
number of 0-1 matches.  Are you sure that's what you want to compute?

Here's what I'd do (assuming the input matrix contains all 0s and 1s):

simple_matching <- function(X) {
    N11 <- crossprod(t(X))              # counts of 1-1 matches between rows
    N01 <- crossprod(t(1 - X), t(X))    # counts of 0-1 matches
    ans <- (N11 + N01) / ncol(X)
    diag(ans) <- 1
    ans
}
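A quick self-contained check of the vectorized idea against the element-wise definition (a sketch; it uses crossprod(t(1 - X), t(X)) for the 0-1 counts so the entry (i, j) matches the loop version's (N11 + N01)/n for the ordered pair of rows):

```r
## Vectorized (N11 + N01)/n similarity, verified against a direct count.
simple_matching <- function(X) {
    N11 <- crossprod(t(X))              # 1-1 matches between rows i and j
    N01 <- crossprod(t(1 - X), t(X))    # row i has 0 where row j has 1
    ans <- (N11 + N01) / ncol(X)
    diag(ans) <- 1
    ans
}
set.seed(1)
X <- matrix(rbinom(20, 1, 0.5), nrow = 4)
s <- simple_matching(X)
i <- 1; j <- 2                           # verify one off-diagonal entry
stopifnot(all.equal(
    s[i, j],
    (sum(X[i, ] == 1 & X[j, ] == 1) + sum(X[i, ] == 0 & X[j, ] == 1)) / ncol(X)))
```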

HTH,
Andy

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



[R] .Net Mono language news: C, C++, C#, Java, Python Perl

2004-06-30 Thread James . Callahan
For those interested in experimenting with (compiling / developing) 
a version of R for the Common Language Runtime (CLR) 
environment (Microsoft .Net, Novell Ximan Mono  DotGNU) 
-- a few links to some free compilers:

C
DotGNU (the official GNU project)
http://dotgnu.org/
http://www.southern-storm.com.au/pnet_faq.html#q1_7
FAQ 1.7. What is pnetC?
Since version 0.4.4 of DotGNU Portable.NET, the cscc compiler has had 
support for compiling C programs. 
The companion libc implementation for the C compiler is called 'pnetC'. 
The code is based on glibc.
===

C
lcc -- lcc is a retargetable compiler for Standard C.
lcc.NET, an lcc 4.2 backend for MSIL 
According to DotGNU [MSIL] is exactly the same as CIL, but some media 
reports have called it 'MSIL' for some reason.
http://www.cs.princeton.edu/software/lcc/
1. LCC 4.2 has been recently released. This release adds support for 
compiling ANSI C programs to CIL. 
Note that the CIL support only works on Win32 right now, but should be 
easy to convert to Mono/other architectures. 

2. LCC is not an open source compiler, but it is free as long as you do 
not profit from selling it. 


C# (C-sharp)
Mono C# (version 1.0 released today)
http://www.mono-project.com/using/relnotes/1.0.html
===

C++
Microsoft's C++ -- a free download (free as in beer)
http://www.microsoft.com/downloads/details.aspx?FamilyId=272BE09D-40BB-49FD-9CB0-4BFA122FA91Bdisplaylang=en

1. Visual C++ Toolkit 2003
The Microsoft Visual C++ Toolkit 2003 includes the core tools developers 
need to compile and link 
C++-based applications for Windows and the .NET Common Language Runtime
-- compiler, linker, libraries, and sample code.

2. The Visual C++ Toolkit is a free edition of Microsoft's professional 
Visual C++ 
optimizing compiler and standard libraries -- the same optimizing compiler 
and standard libraries that ship in Visual Studio .NET 2003 Professional!

3. Are there any restrictions on how I use the Visual C++ Toolkit?
In general, no. You may use the Toolkit to build C++ -based applications, 
and you may redistribute those applications. 
Please read the End User License Agreement (EULA), included with the 
Toolkit, for complete details.
===

FORTRAN
No free news -- so I guess it is F2C.
Lahey & Salford have commercial Fortran compilers.
===

JAVA (useful for Omegahat)
IKVM - a CLR implementation of a language compatible with the Java syntax 
that can be used with GNU Classpath and IBM Jikes
http://www.ikvm.net/
===

PERL
PerlSharp
http://taubz.for.net/code/perlsharp/
1. By Joshua Tauberer. PerlSharp is a .NET binding to the Perl 
interpreter. 
It allows .NET programs to embed Perl scripts and manipulate Perl data. 
Perl version 5.8.0 or later on a Unix-like system is required (thus you 
need Mono). 
2. The current release was built on 4/11/2004. 
This is the first release, so I wouldn't place bets on it actually working 
out of the box. 
It is released under the GPL.
===

PYTHON
Python.NET (from http://www.go-mono.com/languages.html#python)
Brian Lloyd is working on linking the Python runtime with the .NET 
runtime. More information on the PS.NET project 
http://www.zope.org/Members/Brian/PythonNet
http://zope.org/Members/Brian/PythonNet/README.html

1. This package does not implement Python as a first-class CLR language 
- it does not produce managed code (IL) from Python code. 
Rather, it is an integration of the C Python engine with the .NET runtime. 

This approach allows you to use CLR services and continue to use existing 
Python C extensions 
while maintaining native execution speeds for Python code.

2. While a solution like Jython provides 'two-way' interoperability, 
this package only provides 'one-way' integration. Meaning, while 
Python can use types and services implemented in .NET, 
managed code cannot generally use classes implemented in Python.

3. Python for .NET is based on Python 2.3.2. 
Note that this is a change from the preview releases, which were based on 
Python 2.2.
===

also 

PYTHON -- Iron Python
Jim Hugunin (author of JPython/Jython and Numeric Python/NumPy) 
is developing IronPython: A fast Python implementation for .NET and 
Mono.
http://ironpython.com/
===

Note: According to DotGNU -- IL, CIL & MSIL all refer to the bytecode 
format that is used to represent compiled programs
http://www.southern-storm.com.au/pnet_faq.html#q1_9

Why CLR?
1. Cross-language 
2. Cross-platform (Mono, DotGNU) -- Linux, Windows  OS X
3. Database access  -- ADO.Net  Provider Factory

Mono ships with support for Postgres, MySql, Sybase, DB2, SqlLite, Tds 
(SQL server protocol) and Oracle databases. 
- Mono.Data.SqliteClient.dll: Sqlite database provider. 
- Mono.Data.SybaseClient.dll: Sybase database provider. 
- Mono.Data.TdsClient.dll: MS SQL server provider. 
- Mono.Data.Tds.dll: MS SQL server provider. 
- ByteFX.Data.dll: MySQL client access library (LGPL) 
- Npgsql.dll: Postgresql client access library (LGPL) 
http://www.mono-project.com/about/faq.html

Jim Callahan
Management, 

[R] how to drop rows from a data.frame

2004-06-30 Thread Peter Wilkinson
here is a snippet of data where I would like to drop all rows that have 
zeros across them, and keep the rest of the rows while maintaining the row 
names (1, 2, 3, ... 10). The idea here is that a row of zeros is an indication 
that the row must be dropped. There will never be the case where there is a 
row (of n columns) with fewer than 5 zeros, in this case (n zeros

I am unsure how to manipulate the data frame to drop rows while keeping 
row names.

Peter
the data (imagine separated by tabs):
       SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
 [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
 [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
 [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
 [4,]   0.0000    0.0000   0.0000   0.0000   0.0000
 [5,]   0.0000    0.0000   0.0000   0.0000   0.0000
 [6,]   0.0000    0.0000   0.0000   0.0000   0.0000
 [7,]   0.0000    0.0000   0.0000   0.0000   0.0000
 [8,] 257.0000  257.0000 257.0000 257.0000 257.0000
 [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
[10,]   0.0000    0.0000   0.0000   0.0000   0.0000

what I want it to look like:

       SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
 [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
 [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
 [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
 [8,] 257.0000  257.0000 257.0000 257.0000 257.0000
 [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
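The usual idiom (sketched here on a toy matrix, not the real data) keeps every row with at least one non-zero entry; subsetting with drop = FALSE preserves the dimnames:

```r
## Drop all-zero rows while keeping the original row names.
m <- matrix(c(256.1, 283.1, 0, 257.0, 0,
              256.1, 695.1, 0, 257.0, 0), ncol = 2,
            dimnames = list(1:5, c("SEKH0001", "SEKH0002")))
kept <- m[rowSums(m != 0) > 0, , drop = FALSE]
rownames(kept)                           # "1" "2" "4"
```

The same subsetting expression works unchanged on a data.frame.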
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to drop rows from a data.frame

2004-06-30 Thread Patrick Connolly
On Wed, 30-Jun-2004 at 11:57PM -0400, Peter Wilkinson wrote:

| here is a snippet of data where I would like to drop all rows that have 
| zeros across them, and keep the rest of the rows while maintaining the row 
| names (1,2,3, ...10). The idea here is that a row of zeros is an indication 
| that the row must be dropped. There will never be the case where there is a 
| row(of n columns) with less than 5 zeros in this case(n zeros
| 
| I am unsure how to manipulate the data frame to drop rows whiles keeping 
| row names.
| 
| Peter
| 
| the data (imagine separated by tabs):
| 
|       SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
|  [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
|  [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
|  [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
|  [4,]   0.0000    0.0000   0.0000   0.0000   0.0000
|  [5,]   0.0000    0.0000   0.0000   0.0000   0.0000
|  [6,]   0.0000    0.0000   0.0000   0.0000   0.0000
|  [7,]   0.0000    0.0000   0.0000   0.0000   0.0000
|  [8,] 257.0000  257.0000 257.0000 257.0000 257.0000
|  [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
| [10,]   0.0000    0.0000   0.0000   0.0000   0.0000

I'm curious to know how you got those row names.  I suspect you really
have a matrix.

If it were a dataframe, it would behave exactly how you are requesting
-- depending on how you got rid of rows 4:7.

Try as.data.frame(whatever.your.data.is.now)

Then deleting the rows will look like:

  SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
1 256.1139  256.1139 256.1139 256.1139 256.1139
2 283.0741  695.1000 614.5117 453.0342 500.1436
3 257.3578  305.0818 257.3578 257.3578 257.3578
8 257.  257. 257. 257. 257.
9 305.7857 2450.0417 335.5428 305.7857 584.2485
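Subsetting a data frame preserves those row names automatically. A minimal sketch of the convert-then-drop workflow (the toy matrix `m` below is invented for illustration, not the poster's data):

```r
m <- matrix(c(1, 2, 0, 0, 3, 4,
              5, 6, 0, 0, 7, 8), ncol = 2)
df <- as.data.frame(m)             # rows are named "1" .. "6"
df2 <- df[rowSums(df != 0) > 0, ]  # drop the all-zero rows (3 and 4)
rownames(df2)                      # "1" "2" "5" "6" -- names survive the subset
```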


HTH


-- 
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand 
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world`s largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you`ve seen it.  ---Steven Wright 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~



Re: [R] how to drop rows from a data.frame

2004-06-30 Thread Peter Wilkinson
You're right, it's a matrix (I ran is.matrix() on my object).
Thanks,
Peter
At 12:13 AM 7/1/2004, Patrick Connolly wrote:
On Wed, 30-Jun-2004 at 11:57PM -0400, Peter Wilkinson wrote:
| here is a snippet of data where I would like to drop all rows that have
| zeros across them, and keep the rest of the rows while maintaining the row
| names (1,2,3, ...10). The idea here is that a row of zeros is an 
indication
| that the row must be dropped. There will never be the case where there 
is a
| row(of n columns) with less than 5 zeros in this case(n zeros
|
| I am unsure how to manipulate the data frame to drop rows whiles keeping
| row names.
|
| Peter
|
| the data (imagine separated by tabs):
|
|SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
|   [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
|   [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
|   [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
|   [4,]   0.0.   0.   0.   0.
|   [5,]   0.0.   0.   0.   0.
|   [6,]   0.0.   0.   0.   0.
|   [7,]   0.0.   0.   0.   0.
|   [8,] 257.  257. 257. 257. 257.
|   [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
| [10,]   0.0.   0.   0.   0.

I'm curious to know how you got those row names.  I suspect you really
have a matrix.
If it were a dataframe, it would behave exactly how you are requesting
-- depending on how you got rid of rows 4:7.
Try as.data.frame(whatever.your.data.is.now)
Then deleting the rows will look like:
  SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
1 256.1139  256.1139 256.1139 256.1139 256.1139
2 283.0741  695.1000 614.5117 453.0342 500.1436
3 257.3578  305.0818 257.3578 257.3578 257.3578
8 257.  257. 257. 257. 257.
9 305.7857 2450.0417 335.5428 305.7857 584.2485
HTH
--
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world`s largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you`ve seen it.  ---Steven Wright
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~


Re: [R] how to drop rows from a data.frame

2004-06-30 Thread Gabor Grothendieck

Assuming all the entries are non-negative and non-NA this will do it:

DF[rowSums(DF) > 0, ]


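If negative values could appear, counting the non-zero cells per row is a sketch of a more defensive variant of the same idea (the data frame `DF` here is a toy example, not the poster's data):

```r
DF <- data.frame(a = c(1, 0, -1), b = c(2, 0, 1))
DF[rowSums(DF != 0) > 0, ]  # keeps rows 1 and 3; row 2 is all zeros
```

Note that row 3 sums to zero, so a plain rowSums(DF) > 0 test would drop it even though it is not a zero row.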

Peter Wilkinson pwilkinson at videotron.ca writes:

: 
: here is a snippet of data where I would like to drop all rows that have 
: zeros across them, and keep the rest of the rows while maintaining the row 
: names (1,2,3, ...10). The idea here is that a row of zeros is an indication 
: that the row must be dropped. There will never be the case where there is a 
: row(of n columns) with less than 5 zeros in this case(n zeros
: 
: I am unsure how to manipulate the data frame to drop rows whiles keeping 
: row names.
: 
: Peter
: 
: the data (imagine separated by tabs):
: 
:SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
:   [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
:   [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
:   [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
:   [4,]   0.0.   0.   0.   0.
:   [5,]   0.0.   0.   0.   0.
:   [6,]   0.0.   0.   0.   0.
:   [7,]   0.0.   0.   0.   0.
:   [8,] 257.  257. 257. 257. 257.
:   [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
: [10,]   0.0.   0.   0.   0.
: 
: what I want it to look like:
: 
:SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
:   [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
:   [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
:   [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
:   [8,] 257.  257. 257. 257. 257.
:   [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485



Re: [R] drop rows

2004-06-30 Thread Peter Wilkinson
Hi ...
Hmm, looks like you're working with microarray data as well.  The actual
matrix that I am working with is 19000 x 340.

Thanks for the help, that was perfect.  I am still getting used to the R
language way of doing things.  I will look into the apply function as you
have written it.

Peter
At 12:22 AM 7/1/2004, Erin Hodgess wrote:
Hi Peter!
Here is an example:
> x
[,1][,2][,3]   [,4][,5]
 [1,]  2.1632497  0.43219960  0.05329827  0.1484550  2.12996660
 [2,]  0.000  0.  0.  0.000  0.
 [3,] -1.2230673  0.83467155 -0.14820752 -0.1012919 -0.04410457
 [4,] -0.5397403  0.92664487 -0.30390539  0.3105849 -0.69958321
 [5,]  1.0112805  1.13063148 -1.59802451  0.7597861 -0.72821421
 [6,] -1.1170756 -0.05128944  0.02755781 -0.8896866  0.12294861
 [7,]  0.000  0.  0.  0.000  0.
 [8,]  0.7043937  0.82557039 -1.38759266  0.5266536  0.67345991
 [9,]  0.7522765  0.25513348 -1.00076227  0.1141770  1.70003769
[10,]  0.3371948 -1.48590028 -0.67115529 -0.8242699  1.32741665
> # This takes out the rows with ANY zeros
> x[!apply(x, 1, function(r) any(r == 0)), ]
   [,1][,2][,3]   [,4][,5]
[1,]  2.1632497  0.43219960  0.05329827  0.1484550  2.12996660
[2,] -1.2230673  0.83467155 -0.14820752 -0.1012919 -0.04410457
[3,] -0.5397403  0.92664487 -0.30390539  0.3105849 -0.69958321
[4,]  1.0112805  1.13063148 -1.59802451  0.7597861 -0.72821421
[5,] -1.1170756 -0.05128944  0.02755781 -0.8896866  0.12294861
[6,]  0.7043937  0.82557039 -1.38759266  0.5266536  0.67345991
[7,]  0.7522765  0.25513348 -1.00076227  0.1141770  1.70003769
[8,]  0.3371948 -1.48590028 -0.67115529 -0.8242699  1.32741665
> # This takes out the rows with ALL zeros
> x[!apply(x, 1, function(r) all(r == 0)), ]
   [,1][,2][,3]   [,4][,5]
[1,]  2.1632497  0.43219960  0.05329827  0.1484550  2.12996660
[2,] -1.2230673  0.83467155 -0.14820752 -0.1012919 -0.04410457
[3,] -0.5397403  0.92664487 -0.30390539  0.3105849 -0.69958321
[4,]  1.0112805  1.13063148 -1.59802451  0.7597861 -0.72821421
[5,] -1.1170756 -0.05128944  0.02755781 -0.8896866  0.12294861
[6,]  0.7043937  0.82557039 -1.38759266  0.5266536  0.67345991
[7,]  0.7522765  0.25513348 -1.00076227  0.1141770  1.70003769
[8,]  0.3371948 -1.48590028 -0.67115529 -0.8242699  1.32741665

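The two apply() filters above can also be written without an explicit function, by counting the zero cells in each row; a vectorized sketch using an invented toy matrix:

```r
x <- rbind(c(1, 0, 2), c(0, 0, 0), c(3, 4, 5))
x[rowSums(x == 0) == 0, , drop = FALSE]  # rows containing ANY zero removed
x[rowSums(x != 0) > 0,  , drop = FALSE]  # rows that are ALL zeros removed
```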
 Hope this helps!
 Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]

From: Peter Wilkinson [EMAIL PROTECTED]
Subject: [R] how to drop rows from a data.frame
here is a snippet of data where I would like to drop all rows that have
zeros across them, and keep the rest of the rows while maintaining the row
names (1,2,3, ...10). The idea here is that a row of zeros is an indication
that the row must be dropped. There will never be the case where there is a
row(of n columns) with less than 5 zeros in this case(n zeros
I am unsure how to manipulate the data frame to drop rows whiles keeping
row names.
Peter
the data (imagine separated by tabs):
   SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
  [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
  [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
  [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
  [4,]   0.0.   0.   0.   0.
  [5,]   0.0.   0.   0.   0.
  [6,]   0.0.   0.   0.   0.
  [7,]   0.0.   0.   0.   0.
  [8,] 257.  257. 257. 257. 257.
  [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
[10,]   0.0.   0.   0.   0.
what I want it to look like:
   SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
  [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
  [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
  [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
  [8,] 257.  257. 257. 257. 257.
  [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485


Re: [R] how to drop rows from a data.frame

2004-06-30 Thread Peter Wilkinson
Thanks for everyone's help; there seem to be many ways of solving the
problem that work well.

Peter
At 12:36 AM 7/1/2004, Gabor Grothendieck wrote:
Assuming all the entries are non-negative and non-NA this will do it:
DF[rowSums(DF) > 0, ]

Peter Wilkinson pwilkinson at videotron.ca writes:
:
: here is a snippet of data where I would like to drop all rows that have
: zeros across them, and keep the rest of the rows while maintaining the row
: names (1,2,3, ...10). The idea here is that a row of zeros is an indication
: that the row must be dropped. There will never be the case where there is a
: row(of n columns) with less than 5 zeros in this case(n zeros
:
: I am unsure how to manipulate the data frame to drop rows whiles keeping
: row names.
:
: Peter
:
: the data (imagine separated by tabs):
:
:SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
:   [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
:   [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
:   [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
:   [4,]   0.0.   0.   0.   0.
:   [5,]   0.0.   0.   0.   0.
:   [6,]   0.0.   0.   0.   0.
:   [7,]   0.0.   0.   0.   0.
:   [8,] 257.  257. 257. 257. 257.
:   [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485
: [10,]   0.0.   0.   0.   0.
:
: what I want it to look like:
:
:SEKH0001  SEKH0002 SEKH0003 SEKH0004 SEKH0005
:   [1,] 256.1139  256.1139 256.1139 256.1139 256.1139
:   [2,] 283.0741  695.1000 614.5117 453.0342 500.1436
:   [3,] 257.3578  305.0818 257.3578 257.3578 257.3578
:   [8,] 257.  257. 257. 257. 257.
:   [9,] 305.7857 2450.0417 335.5428 305.7857 584.2485


[R] RGL on Mac OS X

2004-06-30 Thread Matthew Cohen
Scanning various lists for R, I've noticed that a few people have raised the
question of getting the rgl package to run in R on the Mac operating
system.  As far as I can tell (and I must admit to being a novice here), the
problem has to do with the inclusion of /usr/X11R6/lib in R's
environment, masking one framework with another.  More than one of the
people commenting on the problem seemed to suggest that it should be fairly
simple to fix or at least temporarily hack.  However, it is not clear to me
how to do it, given that I know very little about UNIX and even less about
R.  Could someone please give step-by-step instructions on what I need to do
to get rgl working in OS X?  Thanks a lot.

Matt


https://stat.ethz.ch/pipermail/r-sig-mac/2003-June/000868.html
