[R] GLARMA

2006-01-03 Thread 马丹
Hello, 
   I am a new R user and I need R code for GLARMA. 
I will be really thankful if you help me.

Yours sincerely,

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] problem with dse package (was KALMAN FILTER HELP)

2006-01-03 Thread Prof Brian Ripley
This has come up before: it needs a bug fix which Paul Gilbert has already 
implemented (but not yet released).

Please use an informative subject line, and don't SHOUT at us. (All caps 
is regarded as shouting, and BTW the package bundle is dse not DSE.)

On Tue, 3 Jan 2006, Sumanta Basak wrote:

 Currently I'm using DSE package for Kalman Filtering. I have a dataset
 of one dependent variable and seven other independent variables. I'm
 confused at one point: how do I declare the input-output series using the
 TSdata command? The example given on page 37 produces an error.

 rain <- matrix(rnorm(86*17), 86, 17)
 radar <- matrix(rnorm(86*5), 86, 5)
 mydata <- TSdata(input=radar, output=rain)

 input data:

 Error: evaluation nested too deeply: infinite recursion /
 options(expressions=)?

 Can anyone explain what is going wrong here? In my data set,
 I have Change in Exchange Rate as my dependent variable and seven
 other economic variables as independent variables. I'm trying to
 forecast Change in Exchange Rate using an available dataset of 244
 points. How can I declare the input and output dataset in this framework?
 I hope this explains what I am ultimately trying to do.
 After having a TSdata object, I want to use toSS to convert the TS model
 into a state space model, and then use l.SS. Am I right in my thinking?
 Please advise, and many thanks in advance.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] GLARMA

2006-01-03 Thread Kjetil Halvorsen
On 12/30/05, 马丹 [EMAIL PROTECTED] wrote:

 Hello,
I am a new R user and I need R code for GLARMA.
 I will be really thankful if you help me.


You should really spell out an acronym like GLARMA.
RSiteSearch("GLARMA") doesn't give anything. You could have a look at
package sspir.

Kjetil


Yours sincerely,




[R] Extending a data frame with S4

2006-01-03 Thread hadley wickham
I'm trying to create an extension to data.frame with more complex row
and column names, and have run into some problems:

> setClass("new-data.frame", representation("data.frame"))
[1] "new-data.frame"
Warning message:
old-style ('S3') class "data.frame" supplied as a superclass of
"new-data.frame", but no automatic conversion will be peformed for S3
classes in: .validDataPartClass(clDef, name)

Do I need to be worried about this?

> new("new-data.frame", data.frame())
Error in initialize(value, ...) : initialize method returned an object
of class "data.frame" instead of the required class "new-data.frame"

I guess this is related to the warning above.  I presume I can fix
this with an initialize function, but I'm not sure how to go about
referring to the data frame that is the object.
Is there a way to extend a data.frame, or do I need to create an
object that contains the data frame in a slot?

Thanks for your help,

Hadley



Re: [R] A comment about R:

2006-01-03 Thread Rau, Roland
  -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
 Grothendieck
 Sent: Monday, January 02, 2006 4:59 PM
 To: Philippe Grosjean
 Cc: Kort, Eric; Kjetil Halvorsen; R-help@stat.math.ethz.ch
 Subject: Re: [R] A comment about R:

 
 Probably what is needed is for someone familiar with both Stata and R
 to create a lexicon in the vein of the Octave to R lexicon
 
http://cran.r-project.org/doc/contrib/R-and-octave-2.txt
 
 to make it easier for Stata users to understand R.  Ditto for 
 SAS and SPSS.
 

IMO this is a very good proposal but I think that the main problem is
not the translation of one function in SPSS/Stata/SAS to the
equivalent in R.
What I remember from my first contact with R after using SPSS for some years
(and having some experience with Stata and SAS) is that your mental
framework is different. You think in SPSS-terms (i.e. you expect that
data are automatically a rectangular matrix, functions operate on
columns of this matrix, you have always only one dataset available,
...). This is why jumping from SPSS to Stata is relatively easy. But
to jump from any of the three to R is much more difficult. 
This mental barrier is also the main obstacle for me now when I try to
encourage people with a background similar to mine to use R.
What can be done about it? I guess the only answer is for the user to invest
time, which implies that R will probably never become the
language of choice for casual users. But popularity is probably not
the main goal of the R-Project (it would be rather a nice side-effect).

Just a few thoughts ...

Best,
Roland



Re: [R] A comment about R:

2006-01-03 Thread Philippe Grosjean
Roland,

Yes, indeed, you are perfectly right. The problem is that R's richness
means R's complexity: many different data types, sub-languages like
regexp or the formula interface, S3/S4 objects, classical versus lattice
(versus RGL versus iplots) graphs, etc. While translating R into
French, I was thinking of a subset of one or two hundred functions
that would be enough for beginners to start with, and of proposing a
French translation of that small subset of the online help. This is
still on my todo list, but I must admit it is not an easy task to decide
which functions should be kept in the subset and which should not!

In fact, that idea could perhaps be generalized to the whole online
help. It would be sufficient to add a flag somewhere (perhaps a keyword)
marking a page as fundamental, and to allow filtering of the index and pages
(fundamental only or full help). Even for advanced users, it
would be nice to have such a filter to display only the two or three
most important functions in a new package that offers perhaps a hundred
online help pages...

Using R Commander is also an interesting experiment. R Commander 
simplifies the use of R down to the manipulation of a single data frame 
(the so-called active dataset) + optionally one or two model objects. 
Just look at all you can do just with one active data frame with R 
Commander, and you will realize that it is perfectly manageable to learn 
R that way.

Best,

Philippe Grosjean


 




[R] lmer error message

2006-01-03 Thread Abderrahim Oulhaj
Dear All,

I got the following error message when I fitted lmer to binary data with
the AGQ option:

Error in family$mu.eta(eta) : NAs are not allowed in subscripted assignments
In addition: Warning message:
IRLS iterations for PQL did not converge 

Any help?

Thanks in advance,

Abderrahim




Re: [R] A comment about R:

2006-01-03 Thread Peter Flom
 Rau, Roland [EMAIL PROTECTED]   wrote

IMO this is a very good proposal but I think that the main problem is
not the translation of one function in SPSS/Stata/SAS to the
equivalent in R.
Remembering my first contact with R after using SPSS for some years (and
having some experience with Stata and SAS) was that your mental
framework is different. You think in SPSS-terms (i.e. you expect that
data are automatically a rectangular matrix, functions operate on
columns of this matrix, you have always only one dataset available,
...). This is why jumping from SPSS to Stata is relatively easy. But
to jump from any of the three to R is much more difficult. 
This mental barrier is also the main obstacle for me now when I try to
encourage the use of R to other people who have a similar background as
I had.
What can be done about it? I guess the only answer is investing time
from the user which implies that R will probably never become the
language of choice for casual users. But popularity is probably not
the main goal of the R-Project (it would be rather a nice side-effect).




As someone who uses SAS quite a bit and R somewhat less, I think Roland
makes some excellent points.  Going from SPSS to SAS (which I once did)
is like going from Spanish to French.  Going from SAS to R (which I am
trying to do) is like going from English to Chinese.

But it's more than that.

Beyond the obvious differences in the languages is a difference in how
they are written about, and how they are improved.  SAS documentation is
much lengthier than R's.  Some people like the terseness of R's help.
Some like the verboseness of SAS's.  Some of this difference is doubtless
due to the fact that SAS is commercial, and pays people to write the
documentation.  I have tremendous appreciation for the unpaid effort that
goes into R, and nothing I say here should be seen as detracting from that.

As to how they are improved, the fact that R is extended (in part) by
packages written by many, many different people is good, because it means
that the latest techniques can be written up, often by the people who
invent the techniques (and, again, I appreciate this tremendously), but
it does mean that a) it is hard to know what is out there at any given
time; b) the styles of packages differ somewhat.

In addition, I think the distinction between 'casual user' and serious
user is something of a false dichotomy.
It's really a continuum, or, probably, several continua, that make R
harder or easier for people to learn.

I like R.  I like it a lot.  I like that it's free.  I like that it's
cutting edge.  I like that it can do amazing graphics.
I like that the code is open.  I like that I can write my own functions
in the same language.  And, again,
I am amazed at the amount of time and effort people put into it.

 But I do think that the link in the original post made some good
points, and the writer
of that post is not the only one who has found R difficult to learn.


Peter



Re: [R] Extending a data frame with S4

2006-01-03 Thread Matthias Kohl
The help page for setOldClass might help you, in particular the section
"Register or Convert?".
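
One way to sidestep the S3/S4 conversion question altogether is the last option raised in the original post: keep the data frame in a slot. The following is only a minimal sketch of that idea, not a recommendation from the help page; the names LabelledFrame, row_meta, col_meta and frame_data are hypothetical and chosen purely for illustration.

```r
library(methods)

# Hypothetical class: the data frame lives in a slot, with extra
# row and column metadata kept alongside it.
setClass("LabelledFrame",
         representation(data = "data.frame",
                        row_meta = "list",
                        col_meta = "list"))

# Accessor generic and method for the wrapped data frame.
setGeneric("frame_data", function(obj) standardGeneric("frame_data"))
setMethod("frame_data", "LabelledFrame", function(obj) obj@data)

lf <- new("LabelledFrame",
          data = data.frame(x = 1:3, y = c("a", "b", "c")),
          row_meta = list(),
          col_meta = list(x = "count", y = "label"))

# The wrapped object is an ordinary data frame again once extracted.
nrow(frame_data(lf))
```

The trade-off of this design is that existing data frame functions have to be called on frame_data(lf) rather than on lf itself.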

Matthias

  



-- 
StaMatS - Statistik + Mathematik Service
Dipl.Math.(Univ.) Matthias Kohl
www.stamats.de



Re: [R] A comment about R:

2006-01-03 Thread Gabor Grothendieck
On 1/3/06, Peter Flom [EMAIL PROTECTED] wrote:
  Rau, Roland [EMAIL PROTECTED]   wrote
 
 IMO this is a very good proposal but I think that the main problem is
 not the translation of one function in SPSS/Stata/SAS to the
 equivalent in R.
 Remembering my first contact with R after using SPSS for some years (and
 having some experience with Stata and SAS) was that your mental
 framework is different. You think in SPSS-terms (i.e. you expect that
 data are automatically a rectangular matrix, functions operate on
 columns of this matrix, you have always only one dataset available,
 ...). This is why jumping from SPSS to Stata is relatively easy. But
 to jump from any of the three to R is much more difficult.
 This mental barrier is also the main obstacle for me now when I try to
 encourage the use of R to other people who have a similar background as
 I had.
 What can be done about it? I guess the only answer is investing time
 from the user which implies that R will probably never become the
 language of choice for casual users. But popularity is probably not
 the main goal of the R-Project (it would be rather a nice side-effect).
 



 As someone who uses SAS quite a bit and R somewhat less, I think Roland
 makes some excellent points.  Going from SPSS to SAS (which I once did)
 is like going from Spanish to French.  Going from SAS to R (which I am
 trying to do) is like going from English to Chinese.

 But it's more than that.

 Beyond the obvious differences in the languages is a difference in how
 they are written about;
 and how they are improved.  SAS documentation is much lengthier than
 R's.  Some people like
 the terseness of R's help.  Some like the verboseness of SAS's.  Some of

Note that at least some packages do have vignettes which are lengthier
discussions of the package than the help files, e.g.

   library(zoo)
   vignette("zoo")

 this difference is doubtless
 due to the fact that SAS is commercial, and pays people to write the
 documentation.  I have tremendous
 appreciation for the unpaid effort that goes into R, and nothing I say
 here should be seen as detracting from that.

 As to how they are improved, the fact that R is extended (in part) by
 packages written by many many different
 people is good, because it means that the latest techniques can be
 written up, often by the people who
 invent the techniques (and, again, I appreciate this tremendously), but
 it does mean that a) It is hard to know what
 is out there at any given time; b) the styles of packages differ
 somewhat.


Regarding (a), note that for certain areas CRAN Task Views
address this, at least in part.  See:

 http://cran.r-project.org/src/contrib/Views/

and R-News has a section on changes in CRAN which lists all new
packages since the prior issue.   See:

http://cran.r-project.org/doc/Rnews



[R] need to know some basic functionality features of R-Proj

2006-01-03 Thread Mohammed Asifulla - CTD , Chennai
Hi,

I am a newcomer to statistics and to the R Project. I would like to know whether
the following features are available in R. Please help.

1)  beta 1 and Beta 2, or gamma one and gamma two for skewness and kurtosis,
respectively, including standard errors and tests for significance (relative
to values for a Gaussian distribution).
2)  linear correlation
3)  quadratic regression
4)  polynomial regression
5)  moving averages
6)  chi-square for a two-by two table and for an n by m contingency table
7)  moving averages - with various (e.g. exponential) weighting
8)  cubic splines (smoothing, not interpolating)
9)  other types of splines, e.g. 'linear' splines
10) erfc-1  inverse error function complement (i.e. tables of integrals of
the normal (Gaussian) curve, or mathematical approximations)
11) erfc    error function complement
12) Table of significant values for t test at P < 0.01 one sided or two
sided - or polynomial approximation
13) Table of significance levels for chi square test
14) Table of significance levels for F distribution  as arising in ANOVA
15) Confidence limits for binomial variables; possibly for multinomial
variables

Thanks and Regards
-Asif



Re: [R] S3 vs. S4

2006-01-03 Thread Sean Davis



On 1/1/06 2:07 PM, Erin Hodgess [EMAIL PROTECTED] wrote:

 Dear R People: 
 
 Could someone direct me to some documentation on the
 difference between S3 and S4 classes, please?
 
 For example, why would a person use one as opposed to another?
 Maybe pros and cons of each?

The Bioconductor project has encouraged my use of S4 classes.  S4 allows
creation of data structures that have methods associated with them, so for
data-structure heavy programming, I think S4 might have some advantages, but
I am NOT an expert in the field.
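
To make the contrast concrete, here is a small sketch of the same idea written both ways. It is only an illustration under assumed names (account, Account and balance are hypothetical, not taken from Bioconductor): S3 attaches a class attribute to an ordinary object and dispatches by function name, while S4 declares the slots and their types up front and checks them when an object is created.

```r
library(methods)

# S3: a plain list with a class attribute; nothing checks its contents.
acct3 <- structure(list(balance = 100), class = "account")
print.account <- function(x, ...) {
  cat("S3 account, balance:", x$balance, "\n")
  invisible(x)
}

# S4: slots and their types are declared, and new() validates them.
setClass("Account", representation(balance = "numeric"))
setGeneric("balance", function(obj) standardGeneric("balance"))
setMethod("balance", "Account", function(obj) obj@balance)

acct4 <- new("Account", balance = 100)

# A wrongly typed slot is rejected at construction time in S4.
bad <- tryCatch(new("Account", balance = "lots"),
                error = function(e) "rejected")
```

The S3 version is shorter to write; the S4 version catches a badly formed object at the point where it is built, which is the kind of advantage for data-structure-heavy code mentioned above.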

Just one other link that I have found quite useful:

http://www.stat.auckland.ac.nz/S-Workshop/Gentleman/S4Objects.pdf

Sean



Re: [R] need to know some basic functionality features of R-Proj

2006-01-03 Thread Sean Davis



On 1/3/06 6:46 AM, Mohammed Asifulla - CTD , Chennai [EMAIL PROTECTED]
wrote:

 Hi,
 
 I am new-comer to statistics and R-Project. I would like to know if these
 features can be attained in R-Project.Please help.
 
 1)  beta 1 and Beta 2, or gamma one and gamma two for skewness and kurtosis,
 respectively, including standard errors and tests for significance (relative
 to values for a Gaussian distribution).
 2)  linear correlation
 3)  quadratic regression
 4)  polynomial regression
 5)  moving averages
 6)  chi-square for a two-by two table and for an n by m contingency table
 7)  moving averages - with various (e.g. exponential) weighting
 8)  cubic splines (smoothing, not interpolating)
 9)  other types of splines, e.g. 'linear' splines
 10) erfc-1  inverse error function complement (i.e. tables of integrals of
 the normal (Gaussian) curve, or mathematical approximations)
 11) erfc    error function complement
 12) Table of significant values for t test at P < 0.01 one sided or two
 sided - or polynomial approximation
 13) Table of significance levels for chi square test
 14) Table of significance levels for F distribution  as arising in ANOVA
 15) Confidence limits for binomial variables; possibly for multinomial
 variables

Asif,

It is highly likely that all these can be attained using R.  I think most
(if not all) of those on your list can be done with existing packages; for
those that can't, R is also a full-featured programming language, so you can
write functions to do what you like.  I would suggest starting with the
manual An Introduction to R to learn what R can do.  It can be obtained via the
Manuals link at the left side of the R home page:

http://www.r-project.org
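
As a rough illustration of how several items on the list map onto functions that ship with R, here is a sketch using only the base and stats packages. The data are simulated and the variable names are made up for the example; the numbers in parentheses refer to the items in the list above. The erfc helpers are not built in, but follow from the standard identity erfc(z) = 2 * (1 - pnorm(z * sqrt(2))).

```r
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 2 + 0.5 * x - 0.1 * x^2 + rnorm(50, sd = 0.2)

# (1) moment-based skewness (gamma 1) and excess kurtosis (gamma 2)
m  <- mean(y)
s2 <- mean((y - m)^2)
skew <- mean((y - m)^3) / s2^1.5
kurt <- mean((y - m)^4) / s2^2 - 3

r     <- cor(x, y)                               # (2) linear correlation
quad  <- lm(y ~ poly(x, 2, raw = TRUE))          # (3) quadratic regression
cubic <- lm(y ~ poly(x, 3, raw = TRUE))          # (4) polynomial regression
ma    <- stats::filter(y, rep(1 / 5, 5), sides = 2)  # (5) 5-point moving average

tab <- matrix(c(12, 5, 7, 9), nrow = 2)
chi <- chisq.test(tab)                           # (6) chi-square, 2 x 2 table

sm <- smooth.spline(x, y)                        # (8) smoothing cubic spline

# (10), (11) error function complement and its inverse via the normal cdf
erfc     <- function(z) 2 * pnorm(z * sqrt(2), lower.tail = FALSE)
erfc_inv <- function(p) qnorm(p / 2, lower.tail = FALSE) / sqrt(2)

t_crit   <- qt(0.99, df = 20)                    # (12) one-sided, P = 0.01
chi_crit <- qchisq(0.95, df = 3)                 # (13)
f_crit   <- qf(0.95, df1 = 2, df2 = 20)          # (14)
ci       <- binom.test(7, 20)$conf.int           # (15) binomial confidence limits
```

Standard errors and significance tests for skewness and kurtosis, and weighted moving averages, are not shown here; they would need either a contributed package or a few more lines of code.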

Also, if you are posting to the email list, it is quite helpful to read the
posting guide, available as a link at the bottom of all emails from this
list.

Sean



Re: [R] Q about RSQLite

2006-01-03 Thread Pfaff, Bernhard Dr.
Hello Liu,

this might be caused by NA entries in your SQLite table. Have a look at the
following code:


library(RSQLite)  # provides SQLite() and the db* functions used below
(test <- data.frame(matrix(c(1:10, NA, NA), ncol=2, nrow=6)))
con <- dbConnect(SQLite(), dbname = "test.db")
dbWriteTable(con, "test", test, type="BLOB", overwrite=TRUE)
d1 <- dbReadTable(con, "test")
dbDisconnect(con)
d1


HTH,
Bernhard  

-----Original Message-----
From: Wensui Liu [mailto:[EMAIL PROTECTED] 
Sent: Saturday, 31 December 2005 07:09
To: r-help@stat.math.ethz.ch
Subject: [R] Q about RSQLite

Happy new year, dear listers,

I have a question about RSQLite.

When I fetch the data out of an SQLite database, there is something like '\r\n'
at the end of the last column. Here is an example:
   Sepal_Length Sepal_Width Petal_Length Petal_Width    Species
1           5.1         3.5          1.4         0.2 setosa\r\n
2           4.9         3.0          1.4         0.2 setosa\r\n
3           4.7         3.2          1.3         0.2 setosa\r\n
4           4.6         3.1          1.5         0.2 setosa\r\n
5           5.0         3.6          1.4         0.2 setosa\r\n
6           5.4         3.9          1.7         0.4 setosa\r\n
7           4.6         3.4          1.4         0.3 setosa\r\n
8           5.0         3.4          1.5         0.2 setosa\r\n
9           4.4         2.9          1.4         0.2 setosa\r\n
10          4.9         3.1          1.5         0.1 setosa\r\n

Any idea?

Thank you so much



Re: [R] Q about RSQLite

2006-01-03 Thread bogdan romocea
Check the way you imported the data / the SQLite documentation. The
\r\n that you see (you're on Windows, right?) is used to indicate the
end of the data lines in the source file - \r is a carriage return,
and \n is a new line character.
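
If re-importing the data is not convenient, the stray line endings can also be stripped after fetching. This is a minimal sketch in base R; the data frame d below is a hypothetical stand-in for the fetched result, not the actual table from the question.

```r
# Hypothetical stand-in for the data frame returned from SQLite.
d <- data.frame(Sepal_Length = c(5.1, 4.9),
                Species = c("setosa\r\n", "setosa\r\n"),
                stringsAsFactors = FALSE)

# Remove any run of carriage returns / newlines at the end of each value.
d$Species <- sub("[\r\n]+$", "", d$Species)
d$Species
```

Cleaning the source file before loading it into SQLite is still the better fix, since it keeps the stored values themselves free of line endings.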






Re: [R] Bookmarking a page inside r-project.org

2006-01-03 Thread Friedrich . Leisch
 On Tue, 3 Jan 2006 07:29:27 +,
 hadley wickham (hw) wrote:

   A solution would be a content-management system that produced the HTML
   of the site from some other form of input.  Only the output HTML would
   need to be mirrored.  Care to put together such a thing, and import all
   the existing pages into it?

   One way to get around the offline problem is to have a dynamic copy
   somewhere and then spider and save it (eg. with wget -r).  This would
   (obviously) require a server somewhere - but with a post-commit svn
   hook could be kept up to date easily.  However, it is still difficult
   to view changes to the page immediately.

   What assumptions can I make about what tools are available to the
   editors?  Can I assume the standard unix tool chain?

Yes.

   What
   assumptions can I make about the people doing the editing?  How many
   people edit the pages?

For www.R-project.org all of R core have write access, but only a few
actually do it ;-)

   How familiar with html are they?

Hard to tell, let's assume at least basic familiarity with HTML (but
very good familiarity with the concept of markup languages per se).

   You say many
   of the pages are manually edited, which ones aren't?

Under www.r-project.org I think all are manual.

   How are they
   generated?

On CRAN all package listings are of course auto-generated (mostly
using perl scripts), the mirror list is created using R.

   Are all the pages under
   https://svn.r-project.org/R-project-web/trunk/ ?


No, CRAN is not, as it is pulled together from various sites where
maintainers of binary distributions etc. create their parts - the
CRAN master itself is a mirror for the bits and pieces (e.g., Windows
R base binaries are mirrored from Duncan Murdoch, Windows packages
from Uwe Ligges, etc. etc.).

Best,

-- 
---
Friedrich Leisch 
Institut für Statistik              Tel: (+43 1) 58801 10715
Technische Universität Wien         Fax: (+43 1) 58801 10798
Wiedner Hauptstraße 8-10/1071
A-1040 Wien, Austria                http://www.ci.tuwien.ac.at/~leisch



Re: [R] r: RODBC QUESTION

2006-01-03 Thread roger bos
Clark,

I agree with Vitor that working in R might be easier, but it seems that you
are working in the Excel VBA environment, and there may be a good reason for
that which we don't know about.  If so, why use an RExcel function
to read the file into Excel when you can use VBA code to open the file in
Excel and then send the data you need to analyse to R using RExcel?

And of course the great thing about VBA is that if you don't know how to
code what you want to do, you can always record it as a macro and then view
the code (a neat feature that S-PLUS has too).
Good luck and please follow up with more questions if our suggestions are of
no help to you.

Thanks,

Roger



On 12/31/05, Vitor Chagas [EMAIL PROTECTED] wrote:

 Hello Allan,

 You can work in two different ways, from Excel using
 RExcel or from R with RODBC. Personally i prefer
 working from R.

 You can start by giving names to the excel ranges
 (remember to put the var names in the 1st line), then
 run the following code
 to select the excel spreadsheet

 library(RODBC)
 # Select XLS File (filters takes a two-column matrix: description, pattern)
 xls.file = choose.files(filters = matrix(c("Excel files (*.xls)", "*.xls"),
                                          ncol = 2))
 # Use the directory containing the file as the working directory
 workdir = dirname(xls.file)
 setwd(workdir)
 # Connect to XLS Data
 channel <- odbcConnectExcel(xls.file)

 after this you can check what tables (ranges) are
 available to work with

 sqlTables(channel)

 if you have a range named «Claims», use something like
 this to load it in to R

 xlClaims = sqlQuery(channel, "SELECT * FROM Claims")

 and close the connection with

 close(channel)


 Hope it helps, and sorry for my poor english. Best
 regards,

 Vitor


 --- Clark Allan [EMAIL PROTECTED] wrote:

  hello all
 
 
  i have a quick question. i have been using the RODBC
  library (trying to
  read Excel data
   into R but i am doing this by using Rexcel. this is
  probably not the
  correct forum -
  sorry for this).
 
  my code is shown below:
 
  Sub A()
 
  'start the connection to R
  Call RInterface.StartRServer
 
  RInterface.RRun "library(RODBC)"
  RInterface.RRun "A = odbcConnectExcel('c:/TRY.xls')"
 
  RInterface.RRun "q1 = sqlFetch(A, 'Sheet1')"
 
  RInterface.RRun "odbcClose(A)"
 
  Worksheets("out").Activate
 
  Call RInterface.GetArray("q1", Range("A1"))
 
  Call RInterface.StopRServer
 
  End Sub
 
 
  i have included four examples below. on the left hand side we have the
  data as it appears in Excel and on the right hand side we have the
  output from the code (outputted to the 'out' sheet in excel). in the
  first example the code works while in the other three examples it does
  not. ('a' is some character) when i use the commands in r directly
  everything works correctly (ie missing values are treated as NA -
  characters are treated similarly)
 
  can anyone show me how to solve this!
 
  ANOTHER QUESTION: am i allowed to have numeric and
  character values in
  the same column when importing from Excel to R?
  (seems like i cant)
 
  thanking you in advance!
 
  wishing you all a happy new year (in advance)
  /
  allan
 
 
  Y X1  X2  1   6   3
  1 6   3   2   6   2
  2 6   2   3   5   2
  3 5   2
 
 
  Y X1  X2  0
  1 6   3
  2 6   2
  3 a   2
 
 
  Y X1  X2  0
  1 6   3
  2 6   2
  3 a   2
 
 
  Y X1  X2  0
  1 3
  2 6   2
  3 5   2

[R] Labels exceed the plot area

2006-01-03 Thread Kare Edvardsen
If I use cex.lab = 2 and cex.axis = 2, the y-axis label in a plot extends
beyond the plot area. How do I make the plot itself smaller to leave room for
the label while still using cex.lab = 2 and cex.axis = 2?

Kare

-- 
###
Kare Edvardsen [EMAIL PROTECTED]
Norwegian Institute for Air Research (NILU)
Polarmiljosenteret
NO-9296 Tromso   http://www.nilu.no
Swb. +47 77 75 03 75 Dir. +47 77 75 03 90
Fax. +47 77 75 03 76 Mob. +47 90 74 60 69
###
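
One common way to make room, offered here as a minimal sketch rather than a definitive answer, is to enlarge the left margin with par(mar = ...) before plotting. The margin width of 6 lines and the sample data are illustrative assumptions, not values from the question.

```r
# Default margins are c(5, 4, 4, 2) + 0.1 lines (bottom, left, top, right).
# Assumption: six lines on the left is enough for a label drawn at cex.lab = 2.
wide_mar <- c(5, 6, 4, 2) + 0.1

old_par <- par(mar = wide_mar)       # enlarge the left margin, keep the old settings
plot(1:10, (1:10)^2,
     xlab = "x", ylab = "y squared", # illustrative data and labels
     cex.lab = 2, cex.axis = 2)
par(old_par)                         # restore the previous margins
```

If the label is still clipped, the second element of the margin vector can be increased further.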



[R] Package for multiple membership model?

2006-01-03 Thread Brian Perron
Hello all:  
 
I am interested in computing what the multilevel modeling literature calls a 
multiple membership model.  More specifically, I am working with a data set 
involving clients and providers.  The clients are the lower-level units who are 
nested within providers (higher-level).  However, this is not nesting in the 
usual sense, as clients can belong to multiple providers, which I understand 
makes this a multiple membership model.  Right now, I would like to keep this 
simple, using only a continuous dependent variable, but would like to also 
extend this to a repeated measures design.  This doesn't seem to be possible 
with the lme package.  Is there something else I could consider?  
Thanks,
Brian
 

NIMH Training Fellow
GWB School of Social Work, PhD Program
Washington University in St. Louis
One Brookings Drive
St. Louis, MO  63130



Re: [R] KALMAN FILTER HELP

2006-01-03 Thread Paul Gilbert
Is this happening with the example as you show it, or are you trying to 
print mydata?

There is a bug in the print method for TSdata objects, which I have 
fixed and was intending to put on CRAN in a few days. This bug does give 
the infinite recursion error, but would only happen when you print the 
data by typing

mydata
or
print(mydata)

I don't think the assignment you show would produce this problem, but 
please send me more details if it does. The problem, which will be fixed 
in the next release, is only with the print method. Other things are 
working and you should be able to do model estimation, conversion, and 
plot the data, just not print it.
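
The distinction between assigning an object and printing it can be sketched in base R. The class name "buggy" and its deliberately failing print method below are hypothetical stand-ins; this is not the dse code.

```r
# Hypothetical S3 class whose print method fails, standing in for the
# situation described above (this is NOT the actual dse implementation).
obj <- structure(list(x = 1:3), class = "buggy")   # assignment: no printing happens
print.buggy <- function(x, ...) stop("simulated bug in the print method")

# Typing the object's name at the prompt, or calling print(), dispatches
# to the print method and therefore triggers the error.
res <- try(print(obj), silent = TRUE)
inherits(res, "try-error")

# The underlying data remain usable; unclass() bypasses the print method.
contents <- unclass(obj)
contents$x
```

With a real TSdata object the same idea applies: the object is created by the assignment, and only displaying it goes through the print method.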

Paul Gilbert

Sumanta Basak wrote:

 Hi All,

 Currently I’m using DSE package for Kalman Filtering. I have a dataset 
 of one dependent variable and seven other independent variables. I’m 
 confused at one point. How to declare the input-output series using 
 TSdata command. Because the given example at page 37 showing some error.

 rain <- matrix(rnorm(86*17), 86,17)
 radar <- matrix(rnorm(86*5), 86,5)
 mydata <- TSdata(input=radar, output=rain)

 *input data:*

 *Error: evaluation nested too deeply: infinite recursion / 
 options(expressions=)?*

 Can anyone explain it to me what’s going wrong in this? In my data 
 set, I have “Change in Exchange Rate” as my dependent variable and 
 seven other economic variables as independent variables. I’m trying to 
 forecast “Change in Exchange Rate” using available dataset of 244 
 points. How can declare the input and output dataset in this 
 framework? I hope I’m right to explain in this way what ultimately I’m 
 going to do. After having a TSdata object, I want to use toSS to 
 convert the TS model into state space model, and then use l.SS. Am I 
 right in my thinking? Please advice, and many thanks in advance.

 --

 SUMANTA BASAK.

 Analyst.

 Phone No. - 080 - 41989937 (O)

 09886047620 (M)

 Amba Research (India) Pvt Ltd.

 G02 Prestige Loka.

 7/1, Brunton Road.

 Bangalore - 560025.

 India.



[R] how to work on multiple R objects?...

2006-01-03 Thread Constantine Tsardounis
Hello, Happy New Year!...

I am encountering a problem trying to work on the data that I load in R.

I have loaded to R a series of stock data using
(csv files are named e.g. IBM.R)

length.R <- length(list.files(".", pattern=".R")) # the number of files
# with one column in the directory "./" ending in ".R"
for (i in 1:length.R) {
assign(list.files(".", pattern=".R")[i],
read.csv(list.files(".", pattern=".R")[i]))
}

I would like to perform various tasks on all these objects, but I cannot because
> ls(pattern=".R")[1]
is not a list, but a character string!!!:
> typeof(ls(pa=".R")[1])
[1] "character"

Exempli gratia:
> typeof(ls(pattern=".R")[45])
[1] "character"
> ls(pattern=".R")[45]   # I do not want that:
[1] "wmd.txt.R"
> typeof(wmd.txt.R)      # I want that:
[1] "list"

so that I can find the mean of the series on all of these files/loaded
objects with a loop that uses the command
mean(wmd.txt.R) instead of mean("wmd.txt.R") that does not work...

Could you help me, please or propose another way to achieve the same result?
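
One possible approach, sketched under the assumption that each loaded object is a one-column data frame, is to turn each name returned by ls() back into the object with get(). The two small data frames below are made up for illustration.

```r
# Illustrative stand-ins for the loaded objects (made-up values).
IBM.R     <- data.frame(price = c(1, 2, 3))
wmd.txt.R <- data.frame(price = c(10, 20))

# In practice this vector would come from ls(pattern = "\\.R$").
obj_names <- c("IBM.R", "wmd.txt.R")

typeof(obj_names[2])        # the name itself is a character string
typeof(get(obj_names[2]))   # get() returns the object that the name refers to

# Mean of the first (only) column of every object, looked up by name.
means <- sapply(obj_names, function(nm) mean(get(nm)[[1]]))
means
```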

Thank you very much for your assistance,

Tsardounis Constantine



Re: [R] how to work on multiple R objects?...

2006-01-03 Thread Sean Davis



On 1/3/06 10:24 AM, Constantine Tsardounis [EMAIL PROTECTED]
wrote:

 Hello, Happy New Year!...
 
 I am encountering a problem trying to work on the data that I load in R.
 
 I have loaded to R a series of stock data using
 (csv files are named e.g. IBM.R)
 
 length.R <- length(list.files(".", pattern=".R")) # the number of files
 # with one column in the directory "./" ending in ".R"
 for (i in 1:length.R) {
 assign(list.files(".", pattern=".R")[i],
 read.csv(list.files(".", pattern=".R")[i]))
 }

 mylist <- list()
 for (i in list.files('.',pattern='.R')) {
   mylist[[i]] <- read.csv(i)
 }

Sean
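
As a follow-up sketch of the list approach: once the files sit in a named list, a single sapply() call can compute the mean of every series. The temporary directory, file contents, and the assumption of one numeric column per file are illustrative.

```r
# Self-contained demo: write two small one-column CSV files to a temp directory.
demo_dir <- tempfile("stocks")
dir.create(demo_dir)
write.csv(data.frame(price = c(1, 2, 3)),
          file.path(demo_dir, "IBM.R"), row.names = FALSE)
write.csv(data.frame(price = c(10, 20)),
          file.path(demo_dir, "wmd.txt.R"), row.names = FALSE)

# Read every file ending in ".R" into one list, named by file name.
files  <- list.files(demo_dir, pattern = "\\.R$")
mylist <- lapply(file.path(demo_dir, files), read.csv)
names(mylist) <- files

# Mean of the first column of each data frame in the list.
series_means <- sapply(mylist, function(d) mean(d[[1]]))
series_means
```

Keeping the data in one list avoids having to look objects up by name afterwards.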



[R] How to set the size of a rgl window, par3d() ?

2006-01-03 Thread begert
Dear R- Users,

   is there a way to determine the size of
   an rgl window (rgl.open()) either in advance or
   afterwards, (without using the mouse, of course) ?

   Intuitively, one would assume to set the size by:

   library(rgl);
   par3d(viewport=c(0,0,500,500));
   #rgl.open();

   for example. As the parameter 'viewport' is 'readonly'
   this results in an error message:
   Error in par3d(viewport = c(0, 0, 500, 500)) :
   invalid value specified for rgl parameter viewport
   In addition: Warning message:
   parameter viewport cannot be set.

   Any possible workarounds ?

Thanks
Bjoern



Re: [R] A comment about R:

2006-01-03 Thread Patrick Burns
I have had an email conversation with the author of the
technical report from which the quote was taken.  I am
formulating a comment to the report that will be posted
with the technical report.

I would be pleased if this thread continued, so I will know
better what I want to say.  Plus I should be able to reference
this thread in the comment.

Regards,

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

Rau, Roland wrote:

  -Original Message-
  

From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
Grothendieck
Sent: Monday, January 02, 2006 4:59 PM
To: Philippe Grosjean
Cc: Kort, Eric; Kjetil Halvorsen; R-help@stat.math.ethz.ch
Subject: Re: [R] A comment about R:


Probably what is needed is for someone familiar with both Stata and R
to create a lexicon in the vein of the Octave to R lexicon

   http://cran.r-project.org/doc/contrib/R-and-octave-2.txt

to make it easier for Stata users to understand R.  Ditto for 
SAS and SPSS.




IMO this is a very good proposal but I think that the main problem is
not the translation of one function in SPSS/Stata/SAS to the
equivalent in R.
Remembering my first contact with R after using SPSS for some years (and
having some experience with Stata and SAS) was that your mental
framework is different. You think in SPSS-terms (i.e. you expect that
data are automatically a rectangular matrix, functions operate on
columns of this matrix, you have always only one dataset available,
...). This is why jumping from SPSS to Stata is relatively easy. But
to jump from any of the three to R is much more difficult. 
This mental barrier is also the main obstacle for me now when I try to
encourage the use of R to other people who have a similar background as
I had.
What can be done about it? I guess the only answer is investing time
from the user which implies that R will probably never become the
language of choice for casual users. But popularity is probably not
the main goal of the R-Project (it would be rather a nice side-effect).

Just a few thoughts ...

Best,
Roland



Re: [R] A comment about R:

2006-01-03 Thread Thomas Lumley
On Mon, 2 Jan 2006, Philippe Grosjean wrote:

 That said, I think one should interpret Mitchell's paper in a different
 way. Obviously, he is an unconditional and happy Stata user (he even
 wrote a book about graphs programming in Stata). His claim in favor of
 Stata (versus SAS and SPSS, and also, indirectly, versus R) is to be
 interpreted the same way as unconditional lovers of Macintoshes or PCs
 would argue against the other clan. Both architectures are good and have
 strengths and weaknesses. Real arguments are more sentimental, and could
 resume in: The more I use it, the more I like it,... and the aliens are
 bad, ugly and stupid! Would this apply to Stata versus R? I don't know
 Stata at all, but I imagine it could be the case from what I read in
 Mitchell's paper...


I think there are good reasons why Stata is becoming much more popular in 
epidemiology and biostatistics [and I'm not particularly prejudiced 
against R]. In my experience people who like R also like Stata, though 
clearly the reverse is not necessarily true.

Stata, like R, is readily programmable.  Users can -- and do -- write 
and distribute programs that look just like the built-in routines.  There 
is an active and helpful mailing list. However, Stata programming is very 
different from R programming, since it is macro-based (think Tcl/Tk) 
rather than function-based.

Stata is also easier to learn: it has a very consistent syntax and even 
better documentation than R.  We use Stata for all our service course 
teaching, and despite the fact that it is command-line based rather than 
GUI the students were no more unhappy than when SPSS was used for the 
lowest-level courses and Egret for the higher-level service courses. 
[Stata now has a GUI but it is awful and quite a lot of students prefer 
the command-line]


-thomas



Re: [R] bookmarking a page inside r-project.org

2006-01-03 Thread bogdan romocea
In fact it's just as easy in Internet Explorer: right-click + Open in
New Window, or Shift-Click, followed by Ctrl+D. Or, right-click + Add
to Favorites.


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of
 Charles Annis, P.E.
 Sent: Monday, January 02, 2006 8:15 PM
 To: 'Jonathan Baron'; r-help@stat.math.ethz.ch
 Subject: Re: [R] bookmarking a page inside r-project.org


 You can do something similar with Microsoft's browser, but it isn't quite
 as easy as in Firefox:

 Right-click on the frame and choose Properties.  Then highlight and copy
 the URL, paste it into the address window, and click Go.

 Then save the page.



 Charles Annis, P.E.

 [EMAIL PROTECTED]
 phone: 561-352-9699
 eFax:  614-455-3265
 http://www.StatisticalEngineering.com


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Jonathan Baron
 Sent: Monday, January 02, 2006 7:45 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] bookmarking a page inside r-project.org

 I'm replying to:
 https://stat.ethz.ch/pipermail/r-help/2006-January/083823.html

 In Firefox (a browser), right click on the frame.  Then you get a
 menu that has bookmark as one of the options.  Firefox is
 available from http://www.mozilla.org.

 Jon
 --
 Jonathan Baron, Professor of Psychology, University of Pennsylvania
 Home page: http://www.sas.upenn.edu/~baron



Re: [R] Package for multiple membership model?

2006-01-03 Thread Shige Song
Sounds like a model with cross-classified random effects. lme4 can handle
this easily.

Shige

On 1/3/06, Brian Perron [EMAIL PROTECTED] wrote:

 Hello all:

 I am interested in computing what the multilevel modeling literature calls
 a multiple membership model.  More specifically, I am working with a data
 set involving clients and providers.  The clients are the lower-level units
 who are nested within providers (higher-level).  However, this is not
 nesting in the usual sense, as clients can belong to multiple providers,
 which I understand makes this a multiple membership model.  Right now, I
 would like to keep this simple, using only a continuous dependent variable,
 but would like to also extend this to a repeated measures design.  This
 doesn't seem to be possible with the lme package.  Is there something else I
 could consider?
 Thanks,
 Brian


 NIMH Training Fellow
 GWB School of Social Work, PhD Program
 Washington University in St. Louis
 One Brookings Drive
 St. Louis, MO  63130



Re: [R] Package for multiple membership model?

2006-01-03 Thread Thomas Lumley

On Tue, 3 Jan 2006, Brian Perron wrote:

 Hello all:

 I am interested in computing what the multilevel modeling literature 
 calls a multiple membership model.  More specifically, I am working with 
 a data set involving clients and providers.  The clients are the 
 lower-level units who are nested within providers (higher-level). 
 However, this is not nesting in the usual sense, as clients can belong 
 to multiple providers, which I understand makes this a multiple 
 membership model.  Right now, I would like to keep this simple, using 
 only a continuous dependent variable, but would like to also extend this 
 to a repeated measures design.  This doesn't seem to be possible with 
 the lme package.  Is there something else I could consider? Thanks,

I think you want lmer() in the lme4 and Matrix packages. It allows crossed 
random effects.

-thomas
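
A minimal sketch of crossed random intercepts with lmer(), assuming simulated data with hypothetical columns client, provider and y. This illustrates the crossed-effects specification mentioned above; it is not a weighted multiple membership model.

```r
# Simulated data (all names and values are illustrative assumptions):
# 20 clients with 3 visits each, every visit assigned to one of 5 providers.
set.seed(1)
d <- expand.grid(client = factor(1:20), visit = 1:3)
d$provider <- factor(sample(1:5, nrow(d), replace = TRUE))
client_eff   <- rnorm(20)
provider_eff <- rnorm(5)
d$y <- client_eff[as.integer(d$client)] +
       provider_eff[as.integer(d$provider)] +
       rnorm(nrow(d))

# Crossed (non-nested) random intercepts for client and provider.
model_formula <- y ~ 1 + (1 | client) + (1 | provider)

# Fit only if lme4 is installed, so the sketch does not fail without it.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(model_formula, data = d)
  print(lme4::VarCorr(fit))
}
```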



[R] under (and over) dispersion in Poisson regression

2006-01-03 Thread John Sorkin
I am trying to use Poisson regression to model count data. My results are 
suggestive of under dispersion (0.79). How close to one does one want the 
measure of dispersion to be before one accepts the results of the analysis?

I know that there is no definitive answer to my question, but I would like to 
get some sense of general practice.
Thanks,
John
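
For readers who want to reproduce such a figure, here is a hedged sketch of one common dispersion estimate: the Pearson chi-square divided by the residual degrees of freedom. The simulated data are an assumption; the question does not say how the value 0.79 was computed.

```r
# Simulated counts (illustrative only).
set.seed(42)
x <- runif(200)
y <- rpois(200, lambda = exp(0.5 + x))

fit <- glm(y ~ x, family = poisson)

# Pearson chi-square statistic divided by the residual degrees of freedom.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion

# A quasi-Poisson fit keeps the same coefficients but scales the standard
# errors by the square root of this estimate.
fit_q <- glm(y ~ x, family = quasipoisson)
summary(fit_q)$dispersion
```

Comparing the standard errors from the two fits shows how much the conclusions depend on the dispersion estimate.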

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

410-605-7119 
- NOTE NEW EMAIL ADDRESS:
[EMAIL PROTECTED]



Re: [R] Package for multiple membership model?

2006-01-03 Thread Peter Dalgaard
Brian Perron [EMAIL PROTECTED] writes:

 Hello all:  
  
 I am interested in computing what the multilevel modeling literature
 calls a multiple membership model. More specifically, I am working
 with a data set involving clients and providers. The clients are the
 lower-level units who are nested within providers (higher-level).
 However, this is not nesting in the usual sense, as clients can
 belong to multiple providers, which I understand makes this a
 multiple membership model. Right now, I would like to keep this
 simple, using only a continuous dependent variable, but would like
 to also extend this to a repeated measures design. This doesn't seem
 to be possible with the lme package. Is there something else I could
 consider? Thanks, Brian

You could take a look at the lmer() function in the lme4/Matrix
packages - see the Rnews 2005/1 article. One potential problem is that
for repeated measurements, it is not (currently?) as strong on
correlation structure as lme().

You can actually deal with crossed random effects in lme() too, it
just gets a little more complicated, involving things like

library(nlme)
data(Assay)
as1 <- lme(logDens~sample*dilut, data=Assay,
   random=pdBlocked(list(
     pdIdent(~1),
     pdIdent(~sample-1),
     pdIdent(~dilut-1))))

as2 <- lme(logDens~sample*dilut, data=Assay,
   random=list(Block=pdBlocked(list(
     pdIdent(~1),
     pdIdent(~sample-1))),dilut=~1))

as3 <- lme(logDens~sample*dilut, data=Assay,
   random=list(Block=~1,
     Block=pdIdent(~sample-1),
     dilut=~1))

which all fit the same model (but get the DF wrong in three different
ways...)

This is slightly different from your example because the crossed
factors are nested in Block, but you can always fake a nesting using

one <- rep(1, length(logDens)) #or whatever 
lme(..., random=list(one=~...))

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] A comment about R:

2006-01-03 Thread John Fox
Dear Peter et al.,

It's not reasonable to argue with someone's experience -- that is, if people
tell me that they found R harder to learn than SAS, say, then I believe them
-- but that's not my experience in teaching relatively inexperienced
students to use statistical software. A few points:

(1) Casual and initial  use of statistical software is easier through a GUI,
so it's not reasonable, for example, to compare learning to use SPSS via its
GUI to learning R via commands.

(2) I don't believe that it's hard to teach a useful initial subset of R
commands. Which commands are in the subset will depend somewhat on what one
is trying to do. I believe that there are several examples of this approach,
including my R and S-PLUS Companion to Applied Regression. Likewise,
starting with a simple modus operandi, such as working with a single
attached data frame, can cut through a lot of the complexity. Once someone
is comfortable with basic use of R, expanding knowledge of functions,
packages, and other ways of handling data comes naturally. 

(3) I don't find R less uniform than SAS or SPSS, particularly in the way
that statistical models are handled. Moreover, trying to do something
innovative or non-standard in SAS is relatively difficult (in my
experience), and even harder in SPSS. I'm less familiar with Stata, but
uniformity seems one of its strengths. (The Stata scripting language puts me
off, however.)

(4) Not everyone has the same experience and thinks in the same way. I've
used many different statistical packages and computing environments, and
have learned quite a few programming languages (most of which I can no
longer use). Of these, I found APL and R the easiest to learn, and Lisp
(Lisp-Stat) the hardest. Sometimes, though, it's worth expending the effort
to learn something that's difficult -- I feel that I got a lot out of
learning to program in Lisp, for example.

(5) The essential point is that how hard one finds it to learn something is
a function of the intrinsic difficulty of the thing, the person's previous
experience, preferred modes of thinking, etc., and how learning is
approached.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Peter Flom
 Sent: Tuesday, January 03, 2006 6:28 AM
 To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Cc: R-help@stat.math.ethz.ch
 Subject: Re: [R] A comment about R:
 
  Rau, Roland [EMAIL PROTECTED]   wrote
 
 IMO this is a very good proposal but I think that the main 
 problem is not the translation of one function in 
 SPSS/Stata/SAS to the equivalent in R.
 Remembering my first contact with R after using SPSS for some 
 years (and having some experience with Stata and SAS) was 
 that your mental framework is different. You think in 
 SPSS-terms (i.e. you expect that data are automatically a 
 rectangular matrix, functions operate on columns of this 
 matrix, you have always only one dataset available, ...). 
 This is why jumping from SPSS to Stata is relatively easy. 
 But to jump from any of the three to R is much more difficult. 
 This mental barrier is also the main obstacle for me now when 
 I try to encourage the use of R to other people who have a 
 similar background as I had.
 What can be done about it? I guess the only answer is 
 investing time from the user which implies that R will 
 probably never become the language of choice for casual 
 users. But popularity is probably not the main goal of the 
 R-Project (it would be rather a nice side-effect).
 
 
 
 
 As someone who uses SAS quite a bit and R somewhat less, I 
 think Roland makes some excellent points.  Going from SPSS to 
 SAS (which I once did) is like going from Spanish to French.  
 Going from SAS to R (which I am trying to do) is like going 
 from English to Chinese.
 
 But it's more than that.  
 
 Beyond the obvious differences in the languages is a 
 difference in how they are written about, and how they are 
 improved.  SAS documentation is much lengthier than R's.  
 Some people like the terseness of R's help.  Some like the 
 verboseness of SAS's.  Some of this difference is doubtless 
 due to the fact that SAS is commercial, and pays people to 
 write the documentation.  I have tremendous appreciation for 
 the unpaid effort that goes into R, and nothing I say here 
 should be seen as detracting from that.
 
 As to how they are improved, the fact that R is extended (in 
 part) by packages written by many many different people is 
 good, because it means that the latest techniques can be 
 written up, often by the people who invent the techniques 
 (and, again, I appreciate this tremendously), but it does 
 mean that a) it is hard to know what is out there at any 
 given time; b) the styles of packages differ somewhat.
 

Re: [R] A comment about R:

2006-01-03 Thread Peter Flom
 John Fox [EMAIL PROTECTED] 1/3/2006 9:35 am  as always,
raises some excellent points.  I have some responses, interspersed


 It's not reasonable to argue with someone's experience -- that is, if people
 tell me that they found R harder to learn than SAS, say, then I believe them
 -- but that's not my experience in teaching relatively inexperienced
 students to use statistical software. A few points:

A lot of this probably has to do with what you learned first.  I learned SAS
long before I learned R.  Had it been reversed, I would probably find SAS
hard.

 (1) Casual and initial use of statistical software is easier through a GUI,
 so it's not reasonable, for example, to compare learning to use SPSS via its
 GUI to learning R via commands.

True, but I was comparing SAS and R, and this originally started
with STATA and R, and all 3 of those are command driven.

 (4) Not everyone has the same experience and thinks in the same way. I've
 used many different statistical packages and computing environments, and
 have learned quite a few programming languages (most of which I can no
 longer use). Of these, I found APL and R the easiest to learn, and Lisp
 (Lisp-Stat) the hardest. Sometimes, though, it's worth expending the effort
 to learn something that's difficult -- I feel that I got a lot out of
 learning to program in Lisp, for example.

This is, I think, a big part of it.  I think that R would be a lot easier to
learn for someone who has learned some other computer language.  I have not.

I agree that learning something difficult can often be worth it.


Peter



[R] cox model

2006-01-03 Thread billemont
I am a French medical student working in oncology, on the treatment of
breast cancer. I have 3 subgroups of patients. I plotted Kaplan-Meier
survival curves and fitted a Cox model.

Among the last observations, the survival curves of the subgroups cross
at about 150 months. I checked the validity of my model by studying the
residuals to test the proportional hazards assumption. I used cox.zph,
but the global test gives p < 0.05, so my model does not satisfy
proportional hazards.

Do you know how I can restrict the Cox model analysis to the period before
150 months, which is when the curves cross?

Thank you for your Help

Dr Billemont
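
One way to restrict the analysis, sketched under assumptions, is to censor every patient still at risk at 150 months administratively and then refit the model. The data below are simulated, all column names are hypothetical, and whether truncating follow-up is appropriate remains a statistical judgment.

```r
# Simulated data: three subgroups, follow-up time in months,
# status 1 = event, 0 = censored (all values are illustrative).
set.seed(1)
n <- 300
d <- data.frame(group = factor(sample(c("A", "B", "C"), n, replace = TRUE)))
rate <- c(A = 1, B = 1.5, C = 2)[as.character(d$group)] / 100
d$time   <- rexp(n, rate = rate)
d$status <- rbinom(n, 1, 0.8)

# Administrative censoring at the cutoff: follow-up beyond 150 months is
# truncated and any later event is recoded as censored.
cutoff <- 150
d$time_cut   <- pmin(d$time, cutoff)
d$status_cut <- ifelse(d$time > cutoff, 0, d$status)

# Refit and re-check proportional hazards if the survival package is available.
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  fit_cut <- coxph(Surv(time_cut, status_cut) ~ group, data = d)
  print(cox.zph(fit_cut))
}
```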



Re: [R] How to set the size of a rgl window, par3d() ?

2006-01-03 Thread Duncan Murdoch
On 1/3/2006 10:30 AM, begert wrote:
 Dear R- Users,
 
is there a way to determine the size of
an rgl window (rgl.open()) either in advance or
afterwards, (without using the mouse, of course) ?
 
Intuitively, one would assume to set the size by:
 
library(rgl);
par3d(viewport=c(0,0,500,500));
#rgl.open();
 
for example. As the parameter 'viewport' is 'readonly'
this results in an error message:
Error in par3d(viewport = c(0, 0, 500, 500)) :
invalid value specified for rgl parameter viewport
In addition: Warning message:
parameter viewport cannot be set.
 
Any possible workarounds ?

Not that I know of.  This is handled by OpenGL and the windowing system; 
rgl just queries OpenGL to give the par3d(viewport) response.

It would take a bit of time to add this, because it would need to be 
added for all 3 output devices (Windows, X11, OSX).

Duncan Murdoch



[R] Summary functions to dataframe

2006-01-03 Thread Mike Bock
I have written a few different summary functions. I want to calculate
the statistics by groups and I am having trouble getting the output as a
dataframe. I have attached one example with a small dataset that
calculates summary stats and percentiles, I have others that calculate
upper confidence limits etc. I would like the output to be converted to
a dataframe with one of the columns as the grouping variable. This seems
simple but my attempts with do.call(cbind) and rbind have not worked
so I have concluded I am missing something obvious. Any help is
appreciated.

Thanks,
Mike



areas <- structure (list(N_Type = structure(c(4, 1, 4, 1, 1, 4, 1, 4, 4,
1, 4, 1, 4, 1, 4, 1, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 4, 1,
4, 1, 4, 1, 4, 1, 4, 1, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1,
4, 1, 4, 4, 1, 4, 1, 2, 1, 2, 1, 4, 1, 4, 1, 4, 1, 4, 1, 1, 4,
1, 4, 1, 4, 1, 4, 4, 1, 4, 1, 2, 1, 2, 1, 1, 4, 1, 4, 4, 1, 4,
1, 4, 1, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 4, 1,
4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4, 1, 4,
1, 4, 1, 4, 1, 1, 4, 1, 4, 2, 1, 2, 1, 1, 4, 1, 4, 1, 4, 4, 1,
4, 1), .Label = c("All", "Inside 370", "Not Applicable", "Outside 370"
), class = "factor"), AdRes = c(23.7, 23.7, 42.4, 42.4, 630,
630, 990, 990, 72.85, 72.85, 70.6, 70.6, 10, 10, 21.7, 21.7,
171.66, 171.66, 306, 306, 62.1, 62.1, 53.25, 53.25, 208, 208,
64.8, 64.8, 87.3, 87.3, 356, 356, 25.8, 25.8, 156, 156, 166,
166, 135.5, 135.5, 170.5, 170.5, 203, 203, 227.5, 227.5, 224,
224, 123, 123, 140.66, 140.66, 142.5, 142.5, 44.65, 44.65, 50.3,
50.3, 1320, 1320, 577, 577, 71.1, 71.1, 411, 411, 104, 104, 122,
122, 201, 201, 230, 230, 192, 192, 304, 304, 184.5, 184.5, 350,
350, 536, 536, 470.5, 470.5, 172, 172, 166, 166, 205, 205, 595,
595, 227.5, 227.5, 9.1, 9.1, 14.6, 14.6, 10.9, 10.9, 11.1, 11.1,
313.5, 313.5, 53.8, 53.8, 29.8, 29.8, 29.5, 29.5, 34.05, 34.05,
21.8, 21.8, 385.5, 385.5, 541, 541, 168, 168, 119, 119, 376,
376, 91.9, 91.9, 97.76, 97.76, 164, 164, 244, 244, 303.5, 303.5,
388, 388, 59.8, 59.8, 227.5, 227.5, 165, 165, 19.15, 19.15, 651,
651, 195, 195, 190, 190, 164, 164, 190, 190, 334, 334)), .Names =
c("N_Type",
"AdRes"), row.names = c(8956, 8957, 8972, 8973, 8974,
8975, 8976, 8977, 8978, 8979, 8980, 8981, 8982,
8983, 8984, 8985, 9159, 9160, 9175, 9176, 9177,
9178, 9185, 9186, 9201, 9202, 9203, 9204, 9205,
9206, 9207, 9208, 9209, 9210, 9217, 9218, 9233,
9234, 9241, 9242, 9261, 9262, 9277, 9278, 9285,
9286, 9301, 9302, 9309, 9310, 9329, 9330, 9345,
9346, 9353, 9354, 9369, 9370, 9371, 9372, 9373,
9374, 9410, 9411, 9412, 9413, 9414, 9415, 9422,
9423, 9424, 9425, 9426, 9427, 9428, 9429, 9430,
9431, 9432, 9433, 9434, 9435, 9436, 9437, 9444,
9445, 9452, 9453, 9454, 9455, 9456, 9457, 9458,
9459, 9460, 9461, 9468, 9469, 9470, 9471, 9472,
9473, 9474, 9475, 9476, 9477, 9478, 9479, 9480,
9481, 9488, 9489, 9496, 9497, 9498, 9499, 9720,
9721, 9722, 9723, 9724, 9725, 9726, 9727, 9728,
9729, 9730, 9731, 9732, 9733, 9734, 9735, 9736,
9737, 9738, 9739, 9740, 9741, 9742, 9743, 9744,
9745, 9746, 9747, 9748, 9749, 9750, 9751, 9752,
9753, 9754, 9755, 9756, 9757, 9758, 9759, 9760,
9761), class = "data.frame")


Pstats <- function(x)
{
Max = max(x)
Min = min(x)
AMean = mean(x)
AStdev = sd(x)
Samples <- length(x)
p10 <- quantile(x,0.1,na.rm = TRUE, names = FALSE)
p20 <- quantile(x,0.2,na.rm = TRUE, names = FALSE)
p30 <- quantile(x,0.3,na.rm = TRUE, names = FALSE)
p40 <- quantile(x,0.4,na.rm = TRUE, names = FALSE)
p50 <- quantile(x,0.5,na.rm = TRUE, names = FALSE)
p60 <- quantile(x,0.6,na.rm = TRUE, names = FALSE)
p70 <- quantile(x,0.7,na.rm = TRUE, names = FALSE)
p80 <- quantile(x,0.8,na.rm = TRUE, names = FALSE)
p90 <- quantile(x,0.9,na.rm = TRUE, names = FALSE)
Result <- data.frame(Samples,AMean,AStdev,
Min,Max,p10,p20,p30,p40,p50,p60,p70,p80,p90)
return(Result)
#write.table(Result, file = "Results.csv", sep = ",", row.names = FALSE)
}

attach(areas)
res <- by(areas, N_Type, function (x)
  (Pstats(AdRes))) 

#need to convert res to a dataframe



Michael Bock, PhD
ENVIRON International Corporation
136 Commercial Street, Suite 402
Portland, ME 04101
phone: 207.347.4413
fax: 207.347.4384






Re: [R] A comment about R:

2006-01-03 Thread JRG
On 3 Jan 2006 at 7:35, Thomas Lumley wrote:

 On Mon, 2 Jan 2006, Philippe Grosjean wrote:
 
  That said, I think one should interpret Mitchell's paper in a different
  way. Obviously, he is an unconditional and happy Stata user (he even
  wrote a book about graphs programming in Stata). His claim in favor of
  Stata (versus SAS and SPSS, and also, indirectly, versus R) is to be
  interpreted the same way as unconditional lovers of Macintoshes or PCs
  would argue against the other clan. Both architectures are good and have
  strengths and weaknesses. Real arguments are more sentimental, and could
  resume in: The more I use it, the more I like it,... and the aliens are
  bad, ugly and stupid! Would this apply to Stata versus R? I don't know
  Stata at all, but I imagine it could be the case from what I read in
  Mitchell's paper...
 
 
 I think there are good reasons why Stata is becoming much more popular in 
 epidemiology and biostatistics [and I'm not particularly prejudiced 
 against R]. In my experience people who like R also like Stata, though 
 clearly the reverse is not necessarily true.
 
 Stata, like R, is readily programmable.  Users can -- and do -- write 
 and distribute programs that look just like the built-in routines.  There 
 is an active and helpful mailing list. However, Stata programming is very 
 different from R programming, since it is macro-based (think Tcl/Tk) 
 rather than function-based.
 
 Stata is also easier to learn: it has a very consistent syntax and even 
 better documentation than R.  We use Stata for all our service course 
 teaching, and despite the fact that it is command-line based rather than 
 GUI the students were no more unhappy than when SPSS was used for the 
 lowest-level courses and Egret for the higher-level service courses. 
 [Stata now has a GUI but it is awful and quite a lot of students prefer 
 the command-line]
 
 
   -thomas
 

I'll offer a Second to Thomas's motion.

I like R but I find Stata much easier to teach in service courses.  For most
of my students, the Stata learning curve is much more tolerable than that of
R (at a reduction in capability, of course).  I state on Day 1 that I think R
is the world's best package, and that Stata is my choice for a very
acceptable compromise --- for most purposes.  A few students go on to write
their own Stata programs, and a few go on to learn R and love it.

But the vast majority of my students learn enough Stata to get through the
courses, and afterward they do whatever their advisor wants them to do (the
First Law of Graduate School).  For a sizable fraction (maybe 25%), that also
proves to be Stata, as there is a solid core of Stata users among the faculty
here.

I'll also agree that Stata's GUI is ghastly; most of my students (both during
courses and any later use) quickly adapt to using Stata's command line, and
they use it quite effectively.

---JRG

John R. Gleason
Associate Professor

Syracuse University
430 Huntington Hall  Voice:   315-443-3107
Syracuse, NY 13244-2340  USA FAX: 315-443-4085

PGP public key at keyservers



Re: [R] A comment about R:

2006-01-03 Thread Peter Dalgaard
Patrick Burns [EMAIL PROTECTED] writes:

 I have had an email conversation with the author of the
 technical report from which the quote was taken.  I am
 formulating a comment to the report that will be posted
 with the technical report.
 
 I would be pleased if this thread continued, so I will know
 better what I want to say.  Plus I should be able to reference
 this thread in the comment.

One thing that is often overlooked, and hasn't yet been mentioned in
the thread, is how much *simpler* R can be for certain completely
basic tasks of practical or pedagogical relevance: Calculate a simple
derived statistic, confidence intervals from estimate and SE,
percentage points of the binomial distribution - using dbinom or from
the formula, take the sum of each of 10 random samples from a set of
numbers, etc. This is where other packages get stuck in the
procedure+dataset mindset.

For much the same reason, those packages make you tend to treat
practical data analysis as something distinct from theoretical
understanding of the methods: You just don't use SAS or SPSS or Stata
to illustrate the concept of a random sample by setting up a small
simulation study as the first thing you do in a statistics class,
whereas you could quite conceivably do it in R. (What *is* the
equivalent of rnorm(25) in those languages, actually?)
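To make the point concrete, here is a minimal sketch of the kinds of one-liners listed above. All the numbers (est, se, the vector called numbers, the binomial parameters) are made up for illustration and are not from the original post.

```r
# Confidence interval from an estimate and its standard error
# (est and se are hypothetical values)
est <- 1.8
se  <- 0.4
ci  <- est + c(-1, 1) * qnorm(0.975) * se

# A binomial probability, from dbinom and from the formula
p_builtin <- dbinom(3, size = 10, prob = 0.3)
p_formula <- choose(10, 3) * 0.3^3 * 0.7^7

# Sum of each of 10 random samples (size 4) from a set of numbers
set.seed(2006)
numbers <- c(2, 3, 5, 7, 11, 13, 17, 19)
sample_sums <- replicate(10, sum(sample(numbers, 4)))

# A small simulation: means of 1000 samples of 25 standard normals;
# their standard deviation should be near 1/sqrt(25) = 0.2
means <- replicate(1000, mean(rnorm(25)))
sd(means)
```

Each step is a single expression on ordinary vectors, with no data set to declare first.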

Even when using SAS in teaching, I sometimes fire up R just to
calculate simple things like

  pbar <- (p1+p2)/2
  sqrt(pbar*(1-pbar))

which you need to cheat SAS Analyst's sample size calculator to work
with proportions rather than means. SAS leaves you no way to do this
short of setting up a new data set. The Windows calculator will do it,
of course, but the students can't see what you are doing then.
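As a worked instance of the calculation above, assuming two hypothetical proportions p1 = 0.15 and p2 = 0.25 (not values from the original post):

```r
# Hypothetical proportions for the two groups
p1 <- 0.15
p2 <- 0.25

# Average proportion and the corresponding standard deviation,
# which can then be typed into a calculator that expects an SD for means
pbar <- (p1 + p2) / 2
sd_equiv <- sqrt(pbar * (1 - pbar))
sd_equiv   # 0.4, since pbar = 0.2 and sqrt(0.2 * 0.8) = sqrt(0.16)
```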


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] For loop gets exponentially slower as dataset gets larger...

2006-01-03 Thread r user
I am running R 2.1.1 in a Microsoft Windows XP environment.
   
  I have a matrix with three vectors (“columns”) and ~2 million “rows”.  The 
three vectors are date_, id, and price.  The data is ordered (sorted) by code 
and date_.
   
  (The matrix contains daily prices for several thousand stocks, and has ~2 
million “rows”. If a stock did not trade on a particular date, its price is set 
to “NA”)
   
  I wish to add a fourth vector that is “next_price”. (“Next price” is the 
current price as long as the current price is not “NA”.  If the current price 
is NA, the “next_price” is the next price that the security with this same ID 
trades.  If the stock does not trade again,  “next_price” is set to NA.)
   
  I wrote the following loop to calculate next_price.  It works as intended, 
but I have one problem.  When I have only 10,000 rows of data, the calculations 
are very fast.  However, when I run the loop on the full 2 million rows, it 
seems to take ~ 1 second per row.
   
  Why is this happening?  What can I do to speed the calculations when running 
the loop on the full 2 million rows?
   
  (I am not running low on memory, but I am maxing out my CPU at 100%)
   
  Here is my code and some sample data:
   
  data <- data[order(data$code,data$date_),]
  l <- dim(data)[1]
  w <- 3
  data[l,w+1] <- NA

  for (i in (l-1):(1)){
    data[i,w+1] <- ifelse(is.na(data[i,w])==F, data[i,w],
                          ifelse(data[i,2]==data[i+1,2], data[i+1,w+1], NA))
  }
   
   
  date        id    price     next_price
  6/24/2005   1635  444.7838  444.7838
  6/27/2005   1635  448.4756  448.4756
  6/28/2005   1635  455.4161  455.4161
  6/29/2005   1635  454.6658  454.6658
  6/30/2005   1635  453.9155  453.9155
  7/1/2005    1635  453.3153  453.3153
  7/4/2005    1635  NA        453.9155
  7/5/2005    1635  453.9155  453.9155
  7/6/2005    1635  453.0152  453.0152
  7/7/2005    1635  452.8651  452.8651
  7/8/2005    1635  456.0163  456.0163
  12/19/2005  1635  442.6982  442.6982
  12/20/2005  1635  446.5159  446.5159
  12/21/2005  1635  452.4714  452.4714
  12/22/2005  1635  451.074   451.074
  12/23/2005  1635  454.6453  454.6453
  12/27/2005  1635  NA        NA
  12/28/2005  1635  NA        NA
  12/1/2003   1881  66.1562   66.1562
  12/2/2003   1881  64.9192   64.9192
  12/3/2003   1881  66.0078   66.0078
  12/4/2003   1881  65.8098   65.8098
  12/5/2003   1881  64.1275   64.1275
  12/8/2003   1881  64.8697   64.8697
  12/9/2003   1881  63.5337   63.5337
  12/10/2003  1881  62.9399   62.9399



Re: [R] under (and over) dispersion in Poisson regression

2006-01-03 Thread Prof Brian Ripley
This most often indicates a problem with the dispersion estimate.  See the 
cautionary tale in MASS4 chapter 7.  If you have a reliable dispersion
estimate that low for genuine counts, they are either not independent or 
not Poisson (for example, limited), and one would want to find out what is 
going on.
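For readers unsure which "dispersion" is meant, here is a minimal sketch on simulated data (the variables x and y and the model are made up, not the poster's). It computes the Pearson-based estimate, which is what summary() reports for a quasipoisson fit, next to the deviance-based ratio. The two can differ noticeably, especially when the counts are small.

```r
# Simulated Poisson counts (hypothetical data, for illustration only)
set.seed(42)
x <- runif(200)
y <- rpois(200, lambda = exp(0.5 + x))

fit <- glm(y ~ x, family = poisson)

# Pearson chi-squared divided by residual degrees of freedom
pearson_disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Residual deviance divided by residual degrees of freedom
deviance_disp <- deviance(fit) / df.residual(fit)

# The quasipoisson summary reports the Pearson-based estimate
quasi_disp <- summary(update(fit, family = quasipoisson))$dispersion

c(pearson = pearson_disp, deviance = deviance_disp, quasi = quasi_disp)
```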

On Tue, 3 Jan 2006, John Sorkin wrote:

 I am trying to use Poisson regression to model count data. My results 
 are suggestive of under dispersion (0.79). How close to one does one 
 want the measure of dispersion to be before one accepts the results of 
 the analysis?

 I know that there is no definitive answer to my question, but I would 
 like to get some sense of general practice.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] Bootstrap w/ Clustered Data

2006-01-03 Thread Peter Muhlberger
Looks like I may have found a function that addresses my needs.  bootcov in
Design handles bootstrapping from clustered data and will save the
coefficients.  I'm not entirely sure it handles clusters the way I'd like,
but I'm going through the code.  If it doesn't, it looks easily
re-writeable.  As far as I can tell, boot in package boot would do clusters
only if the estimation function passed to it pastes together data based on
the clusters boot selects.
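The "paste together data based on the selected clusters" idea can be sketched in base R, without relying on either package. This is only an illustration on simulated data: the data frame d, its columns, and the linear model are assumptions, and whole clusters are resampled with replacement.

```r
# Simulated clustered data: 20 clusters of 5 observations each (hypothetical)
set.seed(1)
d <- data.frame(cluster = rep(1:20, each = 5))
d$x <- rnorm(nrow(d))
u <- rnorm(20)                                  # cluster-level random effect
d$y <- 1 + 2 * d$x + u[d$cluster] + rnorm(nrow(d))

ids <- unique(d$cluster)
by_cluster <- split(d, d$cluster)               # one data frame per cluster

# Resample clusters with replacement, rebuild the data, refit, keep coefficients
boot_coefs <- t(replicate(200, {
  picked <- sample(ids, length(ids), replace = TRUE)
  resampled <- do.call(rbind, by_cluster[as.character(picked)])
  coef(lm(y ~ x, data = resampled))
}))

# Bootstrap standard errors of the coefficients
apply(boot_coefs, 2, sd)
```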



Re: [R] Summary functions to dataframe

2006-01-03 Thread Gabor Grothendieck
Try this:

Pstats <- function(x) c(Max = max(x),
   Min = min(x),
   AMean = mean(x),
   AStdev = sd(x),
   Samples = length(x),
   quantile(x, 1:9/10, na.rm = TRUE))

res <- with(areas, by(AdRes, N_Type, Pstats))
do.call(rbind, res)

Also, check out summaryBy in the doBy package at
http://genetics.agrsci.dk/~sorenh/misc/index.html
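Since the request was for a data frame with the grouping variable as a column, here is a minimal sketch of that last step. It uses a small made-up data frame called toy in place of areas. do.call(rbind, res) returns a matrix whose row names are the group labels, so those are moved into a column.

```r
Pstats <- function(x) c(Max = max(x), Min = min(x), AMean = mean(x),
                        AStdev = sd(x), Samples = length(x),
                        quantile(x, 1:9/10, na.rm = TRUE))

# Hypothetical stand-in for the areas data
toy <- data.frame(N_Type = factor(rep(c("All", "Outside 370"), each = 5)),
                  AdRes  = c(1:5, seq(10, 50, by = 10)))

res <- with(toy, by(AdRes, N_Type, Pstats))
mat <- do.call(rbind, res)          # one row per group, row names = group labels

# Move the group labels into a regular column
out <- data.frame(N_Type = rownames(mat), mat,
                  row.names = NULL, check.names = FALSE)
out
```

check.names = FALSE keeps the percentile column names ("10%", "20%", ...) as they are rather than converting them to syntactic names.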



On 1/3/06, Mike Bock [EMAIL PROTECTED] wrote:
 I have written a few different summary functions. I want to calculate
 the statistics by groups and I am having trouble getting the output as a
 dataframe. I have attached one example with a small dataset that
 calculates summary stats and percentiles, I have others that calculate
 upper confidence limits etc. I would like the output to be converted to
 a dataframe with one of the columns as the grouping variable. This seems
 simple but my attempts with do.call(cbind) and rbind have not worked
 so I have concluded I a missing something obvious. Any help is
 appreciated.

 Thanks,
 Mike




Re: [R] A comment about R:

2006-01-03 Thread Wensui Liu
Another big difference between R and other computing languages such as
SPSS/SAS/Stata: you can easily get a job using SPSS/SAS/Stata, but it is
extremely difficult to find a job using R. ^_^





--
WenSui Liu
(http://statcompute.blogspot.com)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center


Re: [R] A comment about R:

2006-01-03 Thread Ben Fairbank
One implicit point in Kjetil's message is the difficulty of learning
enough of R to make its use a natural and desired first choice
alternative, which I see as the point at which real progress and
learning commence with any new language.  I agree that the long learning
curve is a serious problem, and in the past I have discussed, off list,
with one of the very senior contributors to this list the possibility of
splitting the list into sections for newcomers and for advanced users.
He gave some very cogent reasons for not splitting, such as the
possibility of newcomers' getting bad advice from others only slightly
more advanced than themselves.  And yet I suspect that a newcomers'
section would encourage the kind of mutually helpful collegiality among
newcomers that now characterizes the exchanges of the more experienced
users on this list.  I know that I have occasionally been reluctant to
post issues that seem too elementary or trivial to vex the others on the
list with and so have stumbled around for an hour or so seeking the
solution to a simple problem.  Had I the counsel of others similarly
situated, progress might have been far faster.  Have other newcomers or
occasional users had the same experience?

Is it time to reconsider splitting this list into two sections?
Certainly the volume of traffic could justify it.

Ben Fairbank

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Kjetil Halvorsen
Sent: Sunday, January 01, 2006 8:37 AM
To: R-help@stat.math.ethz.ch
Subject: [R] A comment about R:


Readers of this list might be interested in the following comments about
R.


In a recent report (http://www.ats.ucla.edu/stat/technicalreports/),
Michael N. Mitchell says about R:


Perhaps the most notable exception to this discussion is R, a language
for statistical computing and graphics. R is free to download under the
terms of the GNU General Public License (see http://www.r-project.org/).
Our web site has resources on R and I have tried, sometimes in
great earnest, to learn and understand R. I have learned and used a
number of statistical packages (well over 10) and a number of
programming languages (over 5), and I regret to say that I have had
enormous difficulties learning and using R. I know that R has a great fan
base composed of skilled and excellent statisticians, and that includes
many people from the UCLA statistics department. However, I feel like R
is not so much of a statistical package as much as it is a statistical
programming environment that has many new and cutting edge features. For
me learning R has been very difficult and I have had a very hard time
finding answers to many questions about using it. Since the R community
tends to be composed of experts deeply enmeshed in R, I often felt that
I was missing half of the pieces of the puzzle when reading information
about the use of R - it often feels like there is an assumption that
readers are also experts in R. I often found the documentation for R
quite sparse and many essential terms or constructs were used but not
defined or cross-referenced. While there are mailing lists regarding R
where people can ask questions, there is no official technical support.
Because R is free and is based on the contributions of the R community,
it is extremely extensible and programmable and I have been told that it
has many cutting edge features, some not available anywhere else.
Although R is free, it may be more costly in terms of your time to
learn, use, and obtain support for it. My feeling is that R is much more
suited to the sort of statistician who is oriented towards working very
deeply with it. I think R is the kind of package that you really need to
become immersed in (like a foreign language) and then need to use on a
regular basis. I think that it is much more difficult to use it casually
as compared to SAS, Stata or SPSS. But by devoting time and effort to it
would give you access to a programming environment where you can write R
programs and collaborate with others who are also using R. Those who are
able to access its power, even at an applied level, would be able to
access tools that may not be found in other packages, but this might
come with a serious investment of time to sufficiently use R and maintain
your skills with R.


Kjetil



Re: [R] A comment about R:

2006-01-03 Thread Peter Flom
 Ben Fairbank [EMAIL PROTECTED] 1/3/2006 12:42 pm  wrote

One implicit point in Kjetil's message is the difficulty of learning
enough of R to make its use a natural and desired first choice
alternative, which I see as the point at which real progress and
learning commence with any new language.  I agree that the long learning
curve is a serious problem, and in the past I have discussed, off list,
with one of the very senior contributors to this list the possibility of
splitting the list into sections for newcomers and for advanced users.
He gave some very cogent reasons for not splitting, such as the
possibility of newcomers' getting bad advice from others only slightly
more advanced than themselves.  And yet I suspect that a newcomers'
section would encourage the kind of mutually helpful collegiality among
newcomers that now characterizes the exchanges of the more experienced
users on this list.  I know that I have occasionally been reluctant to
post issues that seem too elementary or trivial to vex the others on the
list with and so have stumbled around for an hour or so seeking the
solution to a simple problem.  Had I the counsel of others similarly
situated progress might have been far faster.  Have other newcomers or
occasional users had the same experience?


I, for one, have had this experience.  I am usually hesitant to post
elementary questions here.

However, I think that the 'cogent reasons' given by 'one of the very
senior contributors' are valid.  I think that a 'newcomers list' would only
really be useful if it included some experts who could respond, out of
generosity.  I don't think the R community lacks generosity - obviously not,
given all the thousands of hours people have spent writing the language and
all the packages and so on.

But these generous people have different abilities and get pleasure in
different ways.  Some people get a thrill out of answering complex questions
that require them to come up with novel solutions involving complex code.
Some people get a thrill out of helping newbies over the humps.  Dividing
the lists might help the experts, as much as it helps the beginners.


Peter



Re: [R] A comment about R:

2006-01-03 Thread Berton Gunter

I cannot say how easy or hard R is to learn, but in response to the UCLA
commentary:

 However, I feel like R is not so much of a statistical package as much as
 it is a statistical programming environment that has many new and cutting
 edge features.

Please note: the first sentence of the Preface of THE Green Book
(PROGRAMMING WITH DATA: A GUIDE TO THE S LANGUAGE) by John Chambers, the
inventor of the S Language, explicitly states:
 
"S is a programming language and environment for all kinds of computing
involving data."

I think this says that R is **not** meant to be a statistical package in the
conventional sense and should not be considered one. As computing involving
data is a complex and frequently messy business on technical
(statistics), practical (messy data), and aesthetic (graphics, tables)
levels, it is perhaps to be expected that a "programming language and
environment for all kinds of computing involving data" is complex.
Personally, I find (Chambers's next sentence) R's ability "To turn
ideas into software, quickly and faithfully," to be a boon. But, then again,
I'm a statistical professional and not a casual user.

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box



Re: [R] A comment about R:

2006-01-03 Thread Patrick Burns
Wensui Liu wrote:

Another big difference between R and other computing languages such as
SPSS/SAS/Stata: you can easily get a job using SPSS/SAS/Stata, but it is
extremely difficult to find a job using R. ^_^
  


Actually in finance it is getting easier all the time for
knowledge of R to be a significant benefit.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)



Re: [R] A comment about R:

2006-01-03 Thread Kort, Eric


Berton Gunter writes:
 I cannot say how easy or hard R is to learn, but in response to the UCLA
 commentary:
 
  However, I feel like R is not so much of a statistical package as much
  as it is a statistical programming environment that has many new and
  cutting edge features.
 
 Please note: the first sentence of the Preface of THE Green Book
 (PROGRAMMING WITH DATA: A GUIDE TO THE S LANGUAGE) by John Chambers, the
 inventor of the S Language, explicitly states:
 
 "S is a programming language and environment for all kinds of computing
 involving data."
 
 I think this says that R is **not** meant to be a statistical package in
 the conventional sense and should not be considered one. As computing
 involving data is a complex and frequently messy business on both technical
 (statistics), practical (messy data), and aesthetic (graphics, tables)
 levels, it is perhaps to be expected that a "programming language and
 environment for all kinds of computing involving data" is complex.
 Personally, I find that (Chambers's next sentence) R's ability "To turn
 ideas into software, quickly and faithfully," to be a boon. <snip>

Right.  

So in 2 months I will finish my MD program here in the U.S.  I also have
a master's degree in Epidemiology (in which we used SAS)--but that
hardly qualifies me as a statistics expert.  Nonetheless, I have learned
to use R out of necessity without undue difficulty.  So have several of
my colleagues around me with MDs, PhDs, and Master's degrees.  We do
mainly microarray analysis, so the availability of a rapidly developing
and customizable toolset (BioC packages) is essential to our work.

And, in the same vein of others' comments, R's nuts and bolts
characteristics make me think, learn, and improve.  And the fear of
getting Ripleyed on the mailing list also makes me think, read, and
improve before submitting half baked questions to the list.

So in sum, I use R because it encourages thoughtful analysis, it is
flexible and extensible, and it is free.  I feel that these are
strengths of the environment, not weaknesses.  So if an individual finds
another tool better suited for their work that is obviously just fine,
but I hardly think these characteristics of R are grounds for criticism,
excellent proposals for evolution of documentation and mailing lists
notwithstanding.

-Eric


Re: [R] For loop gets exponentially slower as dataset gets larger...

2006-01-03 Thread bogdan romocea
Your 2-million loop is overkill, because apparently in the (vast)
majority of cases you don't need to loop at all. You could try
something like this:
1. Split the price by id, e.g.
price.list <- split(price,id)
For each id,
2a. When price is not NA, assign it to next price _without_ using a
for loop - e.g.
next.price[!is.na(price)] <- price[!is.na(price)]
2b. Use a for loop only when price is NA, but even then work with
vectors as much as you can, for example (untested)
for (i in setdiff(which(is.na(price)),length(price))) {
  remaining.prices <- price[(i+1):length(price)]
  of.interest <- head(remaining.prices[!is.na(remaining.prices)],1)
  # of.interest is empty when this id never trades again
  if (length(of.interest) == 0) next.price[i] <- NA else
    next.price[i] <- of.interest
}
To run (2a) and (2b) you could use lapply(); to paste the bits
together try do.call(rbind,price.list). You might also want to take
a look at ?Rprof and check the archives for efficiency suggestions.
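The same per-id idea can also be written with no explicit loop at all. The following is a minimal sketch on toy data, not a tested drop-in for the 2-million-row case. The helper names fill_forward and next_price_for and the data frame prices are made up for illustration. It carries the last non-NA value forward on the reversed vector, which amounts to taking the next available price, and applies that within each id using ave().

```r
# Carry the most recent non-NA value forward; leading NAs stay NA
fill_forward <- function(x) {
  seen <- cumsum(!is.na(x))          # count of non-NA values seen so far
  c(NA, x[!is.na(x)])[seen + 1]      # 0 seen -> NA, otherwise the latest value
}

# Next available price = forward fill applied to the reversed vector
next_price_for <- function(price) rev(fill_forward(rev(price)))

# Toy data, already sorted by id and date (hypothetical values)
prices <- data.frame(id    = c(1, 1, 1, 1, 2, 2, 2),
                     price = c(10, NA, 12, NA, NA, 5, 6))

# Apply within each id so prices never leak from one stock to another
prices$next_price <- ave(prices$price, prices$id, FUN = next_price_for)
prices$next_price   # 10 12 12 NA 5 5 6
```

Everything here works on whole vectors, so the cost grows roughly in proportion to the number of rows.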


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of r user
 Sent: Tuesday, January 03, 2006 11:59 AM
 To: rhelp
 Subject: [R] For loop gets exponentially slower as dataset
 gets larger...


 I am running R 2.1.1 in a Microsoft Windows XP environment.

   I have a matrix with three vectors (columns) and ~2
 million rows.  The three vectors are date_, id, and price.
 The data is ordered (sorted) by code and date_.

   (The matrix contains daily prices for several thousand
 stocks, and has ~2 million rows. If a stock did not trade
 on a particular date, its price is set to NA)

   I wish to add a fourth vector that is next_price. (Next
 price is the current price as long as the current price is
 not NA.  If the current price is NA, the next_price is
 the next price that the security with this same ID trades.
 If the stock does not trade again,  next_price is set to NA.)

   I wrote the following loop to calculate next_price.  It
 works as intended, but I have one problem.  When I have only
 10,000 rows of data, the calculations are very fast.
 However, when I run the loop on the full 2 million rows, it
 seems to take ~ 1 second per row.

   Why is this happening?  What can I do to speed the
 calculations when running the loop on the full 2 million rows?

   (I am not running low on memory, but I am maxing out my CPU at 100%)

   Here is my code and some sample data:

   data <- data[order(data$code,data$date_),]
   l <- dim(data)[1]
   w <- 3
   data[l,w+1] <- NA

   for (i in (l-1):(1)){
     data[i,w+1] <- ifelse(is.na(data[i,w])==F, data[i,w],
                           ifelse(data[i,2]==data[i+1,2], data[i+1,w+1], NA))
   }


   date        id    price     next_price
   6/24/2005   1635  444.7838  444.7838
   6/27/2005   1635  448.4756  448.4756
   6/28/2005   1635  455.4161  455.4161
   6/29/2005   1635  454.6658  454.6658
   6/30/2005   1635  453.9155  453.9155
   7/1/2005    1635  453.3153  453.3153
   7/4/2005    1635  NA        453.9155
   7/5/2005    1635  453.9155  453.9155
   7/6/2005    1635  453.0152  453.0152
   7/7/2005    1635  452.8651  452.8651
   7/8/2005    1635  456.0163  456.0163
   12/19/2005  1635  442.6982  442.6982
   12/20/2005  1635  446.5159  446.5159
   12/21/2005  1635  452.4714  452.4714
   12/22/2005  1635  451.074   451.074
   12/23/2005  1635  454.6453  454.6453
   12/27/2005  1635  NA        NA
   12/28/2005  1635  NA        NA
   12/1/2003   1881  66.1562   66.1562
   12/2/2003   1881  64.9192   64.9192
   12/3/2003   1881  66.0078   66.0078
   12/4/2003   1881  65.8098   65.8098
   12/5/2003   1881  64.1275   64.1275
   12/8/2003   1881  64.8697   64.8697
   12/9/2003   1881  63.5337   63.5337
   12/10/2003  1881  62.9399   62.9399

   


Re: [R] A comment about R:

2006-01-03 Thread Thomas Lumley
On Tue, 3 Jan 2006, Peter Dalgaard wrote:
 One thing that is often overlooked, and hasn't yet been mentioned in
 the thread, is how much *simpler* R can be for certain completely
 basic tasks of practical or pedagogical relevance: Calculate a simple
 derived statistic, confidence intervals from estimate and SE,
 percentage points of the binomial distribution - using dbinom or from
 the formula, take the sum of each of 10 random samples from a set of
 numbers, etc. This is where other packages get stuck in the
 procedure+dataset mindset.

Some of these things are actually fairly straightforward in Stata. For 
example, Stata will give confidence intervals and tests for linear 
combinations of coefficients and even (using symbolic differentiation and 
the delta method) for nonlinear combinations.  The first is available in 
packages in R, the second is in S Programming but doesn't seem to be 
packaged.

. di Binomial(10,4,0.2)
.12087388
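
For comparison, a small R sketch of the same number. The Stata figure above agrees with the upper tail P(X >= 4) rather than the point probability, so the R counterpart is pbinom() with lower.tail = FALSE; dbinom() gives the single-point mass. The variable names are only illustrative.

```r
# Upper tail P(X >= 4) for X ~ Binomial(10, 0.2); this matches .12087388
upper_tail <- pbinom(3, size = 10, prob = 0.2, lower.tail = FALSE)

# Point probability P(X = 4), which is what dbinom() returns
point_mass <- dbinom(4, size = 10, prob = 0.2)

c(upper_tail = upper_tail, point_mass = point_mass)
```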

Taking the sum of each of ten random samples, or other things of that 
sort, obviously requires creating a new data set, but again there are 
facilities to automate this.  I have, for example, computed bootstrap 
confidence intervals for ratio or difference of medians in a service 
course using Stata.  It would be easier in R, but not that much easier.
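
To make the R side of that comparison concrete, here is a minimal sketch of both tasks. All numbers are made up, the seed is arbitrary, and the percentile bootstrap is just one simple choice of interval, not necessarily what was used in the course.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
numbers <- c(3, 8, 1, 9, 4, 7)     # made-up values

# Sum of each of ten random samples: one sample per column
samples <- replicate(10, sample(numbers, size = 4, replace = TRUE))
sample_sums <- colSums(samples)

# Percentile bootstrap interval for a ratio of medians (made-up groups)
x <- c(12, 15, 9, 20, 14, 11, 18)
y <- c(8, 10, 7, 13, 9, 12, 6)
boot_ratio <- replicate(2000,
  median(sample(x, replace = TRUE)) / median(sample(y, replace = TRUE)))
ci <- quantile(boot_ratio, c(0.025, 0.975))
```

No new data set has to be created or swapped in; the samples live in ordinary vectors and matrices alongside the original data.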


 For much the same reason, those packages make you tend to treat
 practical data analysis as something distinct from theoretical
 understanding of the methods: You just don't use SAS or SPSS or Stata
 to illustrate the concept of a random sample by setting up a small
 simulation study as the first thing you do in a statistics class,
 whereas you could quite conceivably do it in R. (What *is* the
 equivalent of rnorm(25) in those languages, actually?)

set obs 25
gen x = invnorm(uniform())

[This does create a new data set, of course]

 Even when using SAS in teaching, I sometimes fire up R just to
 calculate simple things like

  pbar <- (p1+p2)/2
  sqrt(pbar*(1-pbar))

local pbar=(0.3+0.5)/2
display sqrt(`pbar'*(1-`pbar'))

Now, I still prefer R both for data analysis and (even more so) for 
programming. There are some things that it is genuinely difficult to 
program in Stata -- and as evidence that this isn't just my ignorance of 
the best approaches, the language was substantially reworked in both 
versions 8 and 9 to allow the vendor to implement better graphics and
linear mixed models.

On the question of which system really is easier to learn I can only 
comment that this isn't the only question where education, as a field, 
would benefit from some good randomized controlled trials.

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



[R] R fortunes candidate? (was A comment about R)

2006-01-03 Thread Berton Gunter

A candidate for the fortunes package? 

(Perhaps the highest honor one can receive: being verbified :-) )

  And the fear of
 getting Ripleyed on the mailing list also makes me think, read, and
 improve before submitting half baked questions to the list.
 

   -- Eric Kort



Cheers,
Bert



Re: [R] For loop gets exponentially slower as dataset gets larger...

2006-01-03 Thread Gabor Grothendieck
Accepting this stacked representation for the
moment try this.  When reordering the dates do it
in reverse order.  Then loop over all codes
applying the zoo function na.locf to the
prices for that code.  locf stands for last
observation carried forward.  Since our dates
are reversed it will bring the next one
backwards. Finally sort back into ascending order.

library(zoo) # needed for na.locf which also works for non-zoo objects
data <- data[order(data$code, - as.numeric(data$date_)),]
attach(data)
next_price <- price
for(i in unique(code)) next_price[code==i] <- na.locf(price[code==i], na.rm=F)
data$next_price <- next_price
data <- data[order(data$code, data$date_),]
detach()

Here it is again but this time we represent it as
a list of zoo objects with one component per code.
In the code below we split the data on code and
apply f to do that.  Note that na.locf replaces
NAs with the last observation carried forward so
by reversing the data, using na.locf and reversing
the data again we get the effect.

library(zoo)
f <- function(x) {
z <- zoo(x$price, x$date_)
next_price <- rev(na.locf(rev(coredata(z)), na.rm = FALSE))
merge(z, next_price)
}
z <- lapply(split(data, data$code), f)




Re: [R] A comment about R:

2006-01-03 Thread roger bos
As others have pointed out, since R is more of a programming language than a
statistical package, yes, it is _harder_ to learn.  I would say it's easier
to learn than C++, harder to learn than VBA, and on par with learning Java,
but that's all debatable.

One thing that makes R slightly more intimidating than it has to be is that
once a noob decides to download R, install it and open it, he gets a
semi-blank screen and that's it.  Eventually he may/may not find out that
what he needs to do next is to decide which text editor he wants to use.
They all have their pluses and minuses.  Some can be as intimidating as R
itself.  God help him if he tries to learn Xemacs at the same time as
learning R.

I learned C++ and other languages before/concurrently with learning R
(actually S+), but I have to admit it was still not easy.  It's been a long
road for me, but I hardly ever use spreadsheets anymore.  However, getting
the casual users to do things in R instead of a spreadsheet is not going to
be easy and I am not sure that that is the goal.

I am not sure how relevant this comment is, but there is something about a
product being free that makes it appear less valuable.  At the company I
used to work for, a group of people tried to persuade the managers to buy S+
licenses for them all. Whenever I would tell them that they could download R
right now for _free_ I would just get blank stares.

Thanks,

Roger



On 1/3/06, Kort, Eric [EMAIL PROTECTED] wrote:



 Berton Gunter writes
  U
 
  I cannot say how easy or hard R is to learn, but in response to the
 UCLA
  commentary:
 
   However, I
   feel like R
   is not so much of a statistical package as much as it is a
 statistical
   programming environment that has many new and cutting edge
   features.
 
  Please note: the first sentence of the Preface of THE Green Book
  (PROGRAMMING WITH DATA: A GUIDE TO THE S LANGUAGE) by John Chambers,
 the
  inventor of the S Language, explicitly states:
 
  S is a programming language and environment for all kinds of
 computing
  involving data.
 
  I think this says that R is **not** meant to be a statistical package
 in
  the
  conventional sense and should not be considered one. As computing
  involving
  data is a complex and frequently messy business on both technical
  (statistics), practical (messy data), and aesthetic (graphics, tables)
  levels, it is perhaps to be expected that a programming language and
  environment for all kinds of computing involving data  is complex.
  Personally, I find that (Chambers's next sentence) R's ability To
 turn
  ideas into software, quickly and faithfully, to be a boon. snip

 Right.

 So in 2 months I will finish my MD program here in the U.S.  I also have
 a master's degree in Epidemiology (in which we used SAS)--but that
 hardly qualifies me as statistics expert.  Nonetheless, I have learned
 to use R out of necessity without undue difficulty.  So have multiple of
 my colleagues around me with MDs, PhDs, and Master's degrees.  We do
 mainly microarray analysis, so the availability of a rapidly developing
 and customizable toolset (BioC packages) is essential to our work.

 And, in the same vein of others' comments, R's nuts and bolts
 characteristics make me think, learn, and improve.  And the fear of
 getting Ripleyed on the mailing list also makes me think, read, and
 improve before submitting half baked questions to the list.

 So in sum, I use R because it encourages thoughtful analysis, it is
 flexible and extensible, and it is free.  I feel that these are
 strengths of the environment, not weaknesses.  So if an individual finds
 another tool better suited for their work that is obviously just fine,
 but I hardly think these characteristics of R are grounds for criticism,
 excellent proposals for evolution of documentation and mailing lists
 notwithstanding.

 -Eric


Re: [R] A comment about R:

2006-01-03 Thread Gabor Grothendieck
On 1/3/06, Thomas Lumley [EMAIL PROTECTED] wrote:
 On Tue, 3 Jan 2006, Peter Dalgaard wrote:
  One thing that is often overlooked, and hasn't yet been mentioned in
  the thread, is how much *simpler* R can be for certain completely
  basic tasks of practical or pedagogical relevance: Calculate a simple
  derived statistic, confidence intervals from estimate and SE,
  percentage points of the binomial distribution - using dbinom or from
  the formula, take the sum of each of 10 random samples from a set of
  numbers, etc. This is where other packages get stuck in the
  procedure+dataset mindset.

 Some of these things are actually fairly straightforward in Stata. For

In fact there are some things that are very easy
to do in Stata and can be done in R but only with more difficulty.
For example, consider this introductory session in Stata:

http://www.stata.com/capabilities/session.html

Looking at the first few queries,
see how easy it is to take the top few in Stata whereas in R one would
have a complex use of order.  It's not hard in R to write a function
that would make it just as easy but it's not available off the top
of one's head though RSiteSearch("sort.data.frame") will find one
if one knew what to search for.
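
A minimal sketch of such a helper, using only base R. The name top_n_rows and its arguments are hypothetical (this is not the sort.data.frame function that the search turns up), and mtcars stands in for the Stata example data.

```r
# Return the n rows with the largest (or smallest) values of one column.
top_n_rows <- function(df, column, n = 5, decreasing = TRUE) {
  idx <- order(df[[column]], decreasing = decreasing)
  df[head(idx, n), , drop = FALSE]
}

# Three cars with the highest mileage in the built-in mtcars data
top_mpg <- top_n_rows(mtcars, "mpg", n = 3)
top_mpg[, c("mpg", "cyl", "wt")]
```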



[R] randomForest - classifier switch

2006-01-03 Thread Stephen Choularton
Hi
 
I am trying to use randomForest for classification. I am using this
code:
 
> set.seed(71)
> rf.model <- randomForest(similarity ~ ., data=set1[1:100,],
+ importance=TRUE, proximity=TRUE)
Warning message:
The response has five or fewer unique values.  Are you sure you want to
do regression? in: randomForest.default(m, y, ...)
> rf.model

Call:
 randomForest(x = similarity ~ ., data = set1[1:100, ], importance =
TRUE,  proximity = TRUE)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 10

          Mean of squared residuals: 0.1159130
                    % Var explained: 50.8

 
As you can see I get a regression model.  How can I make sure I get a
classification model?
 
Thanks .
 
Stephen



Re: [R] randomForest - classifier switch

2006-01-03 Thread Liaw, Andy
From: Stephen Choularton
 
 Hi
  
 I am trying to use randomForest for classification. I am using this
 code:
  
  set.seed(71)
  rf.model <- randomForest(similarity ~ ., data=set1[1:100,],
 importance=TRUE, proximity=TRUE)
 Warning message: 
 The response has five or fewer unique values.  Are you sure 
 you want to
 do regression? in: randomForest.default(m, y, ...) 
  rf.model
  
 Call:
  randomForest(x = similarity ~ ., data = set1[1:100, ], importance =
 TRUE,  proximity = TRUE) 
Type of random forest: regression
  Number of trees: 500
 No. of variables tried at each split: 10
  
   Mean of squared residuals: 0.1159130
 % Var explained: 50.8
 
  
 As you can see I get a regression model.  How can I make sure I get a
 classification model?

By making sure your response variable is a factor, e.g.,

  set1$similarity <- as.factor(set1$similarity)

Andy
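
A small sketch of the effect, under stated assumptions: the data frame below is a made-up stand-in for set1 (a response coded 0/1/2 plus two predictors), and the model is only fitted if the randomForest package happens to be installed.

```r
set.seed(71)
# made-up stand-in for set1
set1 <- data.frame(similarity = sample(0:2, 100, replace = TRUE),
                   x1 = rnorm(100),
                   x2 = rnorm(100))

# a numeric response is treated as regression; a factor as classification
set1$similarity <- as.factor(set1$similarity)
is.factor(set1$similarity)

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf.model <- randomForest::randomForest(similarity ~ ., data = set1,
                                         importance = TRUE, proximity = TRUE)
  print(rf.model$type)   # the fitted object records which kind of forest it is
}
```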

  


[R] newbie R question

2006-01-03 Thread Mark Leeds
I'm sorry to bother everyone with a stupid
question but, when I am at an R prompt in Windows,
is there a way to see what packages
you already have installed from the R site so
that you can just do library(name_of_package)
and it will work.
 
I've looked at help etc but I can't find
a command like this. Maybe there
isn't one which is fine.
 
 Mark 




[R] p-value of Logrank-Test

2006-01-03 Thread Verena Hoffmann
Hello!

I want to compare two Kaplan-Meier-Curves by using the Logrank-Test:

logrank(Surv(time[b], status[b]) ~ group[b])

This way I only get the value of the test-statistic, but not the p-value.

Does anybody know how I can get the p-value?

Thanks in advance!

Verena Hoffmann



Re: [R] newbie R question

2006-01-03 Thread Peter Dalgaard
Mark Leeds [EMAIL PROTECTED] writes:

 I'm sorry to bother everyone with a stupid
 question but, when I am at an R prompt in Windows,
 is there a way to see what packages
 you already have installed from the R site so
 that you can just do library(name_of_package)
 and it will work.
  
 I've looked at help etc but I can't find
 a command like this. Maybe there
 isn't one which is fine.

Just library() (w/no arguments)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] newbie R question

2006-01-03 Thread Eric Kort
On Tue, 2006-01-03 at 16:07 -0500, Mark Leeds wrote:
 I'm sorry to bother everyone with a stupid
 question but, when I am at an R prompt in Windows,
 is there a way to see what packages
 you already have installed from the R site so
 that you can just do library(name_of_package)
 and it will work.
  
 I've looked at help etc but I can't find
 a command like this. Maybe there
 isn't one which is fine.

library()


HTH,
Eric



Re: [R] newbie R question

2006-01-03 Thread Roger Bivand
On Tue, 3 Jan 2006, Mark Leeds wrote:

 I'm sorry to bother everyone with a stupid
 question but, when I am at an R prompt in Windows,
 is there a way to see what packages
 you already have installed from the R site so
 that you can just do library(name_of_package)
 and it will work.
  
 I've looked at help etc but I can't find
 a command like this. Maybe there
 isn't one which is fine.

library()

  
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]



Re: [R] newbie R question

2006-01-03 Thread Prof Brian Ripley
On Tue, 3 Jan 2006, Mark Leeds wrote:

 I'm sorry to bother everyone with a stupid
 question but, when I am at an R prompt in Windows,
 is there a way to see what packages
 you already have installed from the R site so
 that you can just do library(name_of_package)
 and it will work.

 I've looked at help etc but I can't find
 a command like this. Maybe there
 isn't one which is fine.

library() (no arguments) lists all the installed packages (by library).
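
If the list is wanted as data rather than as a printed page, installed.packages() returns the same information as a matrix. A minimal sketch (the variable names are illustrative only):

```r
pkgs <- installed.packages()      # one row per installed package
pkg_names <- rownames(pkgs)

# where each package lives and which version it is
head(pkgs[, c("Package", "LibPath", "Version")])

# check whether a given package can be loaded with library()
"stats" %in% pkg_names
```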

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] newbie R question

2006-01-03 Thread Mark Leeds
Thanks to all. I didn't realize that you got so many packages
automatically.

I've used S+ for roughly 10 years on and off and I am starting to
switch over (I was finally forced to because my new company preferred
me to use R for cost; I am the only user), and it's unbelievable what
has been done in R by everyone. Truly amazing. You should all be quite
proud of what you have created.

   Mark



Re: [R] p-value of Logrank-Test

2006-01-03 Thread Thomas Lumley
On Tue, 3 Jan 2006, Verena Hoffmann wrote:

 Hello!

 I want to compare two Kaplan-Meier-Curves by using the Logrank-Test:

 logrank(Surv(time[b], status[b]) ~ group[b])

 This way I only get the value of the test-statistic, but not the p-value.

 Does anybody know how I can get the p-value?


You don't say where you found the logrank() function, but

a)  The survdiff() function in the survival package gives the p-value as
well as the test statistic for the logrank test

b) The test statistic presumably has a chi-square null distribution, so
pchisq() would turn it into a p-value.
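
A minimal sketch of both routes. The statistic 3.84 is a made-up number, one degree of freedom assumes two groups, and the lung data shipped with the survival package are used purely as a stand-in for the poster's data.

```r
# (b) turn a chi-square statistic into a p-value; two groups give 1 df
stat <- 3.84                                   # made-up test statistic
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

# (a) survdiff(), if the survival package is available
if (requireNamespace("survival", quietly = TRUE)) {
  fit <- survival::survdiff(survival::Surv(time, status) ~ sex,
                            data = survival::lung)
  print(fit)                                   # printed output includes the p-value
  # the same p-value computed by hand from the stored statistic
  p_fit <- pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}
```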

-thomas



Re: [R] mixed effects models - negative binomial family?

2006-01-03 Thread Elizabeth Lawson
Have you tried nlme?
   
  I tried to do something similar.
   
  Here is the code that I used for a negative binomial random effects model
   
   
  library(nlme)
  mydata <- read.table("C:\\Plx\\plx.all\\plxall.txt",header=TRUE)
  loglike = function(PLX_NRX,PD4_42D,GAT_34D,VIS_42D,MSL_42D,SPE_ROL,XM2_DUM,THX_DUM,
                     b0,b1,b2,b3,b4,b5,b6,b7,alpha){
  lambda = exp(b0 + b1*GAT_34D+b2*VIS_42D+b3*MSL_42D+b4*PD4_42D+b5*SPE_ROL+
               b6*XM2_DUM+b7*THX_DUM)
  y=round(PLX_NRX)
  y <- table(y)
  freq <- as.vector(y)
  count <- as.numeric(names(y))
  count <- count[!(freq < 1)]
  freq <- freq[!(freq < 1)]
  n <- length(count)
  df <- n - 1
  df <- df - 2
  xbar <- weighted.mean(count, freq)
  s2 <- var(rep(count, freq))
  p <- xbar/s2
  alpha <- xbar^2/(s2 - xbar)

  ( dnbinom(y,alpha,(alpha/(alpha+lambda))) )
  }

  plx.nlme <- nlme(PLX_NRX~loglike(PLX_NRX,PD4_42D,GAT_34D,VIS_42D,MSL_42D,SPE_ROL,
                                   XM2_DUM,THX_DUM,b0,b1,b2,b3,b4,b5,b6,b7,alpha),
  data=mydata,
  fixed=list(b0 + b1+b2+b3+b4+b5+b6+b7+alpha~1),
  random=b0~1|menum,
  start=c(b0=0,b1=0,b2=0,b3=0,b4=0,b5=0,b6=0,b7=0,alpha=5)
  )
   
  I am not sure that this is what you are looking for, but I hope this helps!

   
  Elizabeth Lawson
Constantinos Antoniou [EMAIL PROTECTED] wrote:
  Hello all,

I would like to fit a mixed effects model, but my response is of the 
negative binomial (or overdispersed poisson) family. The only (?) 
package that looks like it can do this is glmm.ADMB (but it cannot 
run on Mac OS X - please correct me if I am wrong!) [1]

I think that glmmML {glmmML}, lmer {Matrix}, and glmmPQL {MASS} do 
not provide this family (i.e. nbinom, or overdispersed poisson). Is 
there any other package that offers this functionality?

Thanking you in advance,

Costas


[1] Yes, I know I can use this on another OS. But it is kind of a 
nuisance, as I have my whole workflow setup on a mac, including emacs 
+ess, the data etc etc. It will be non-trivial to start moving/ 
syncing files between >1 computers, in order to use this package...

--
Constantinos Antoniou, Ph.D.
Department of Transportation Planning and Engineering
National Technical University of Athens
5, Iroon Polytechniou str. GR-15773, Athens, Greece



[R] All possible subsets model selection using AICc

2006-01-03 Thread Matt Williamson
Hello List,
I was wondering if a package or piece of code exists that will allow all
possible subsets regression model selection within program R.  I have
already looked at step(AIC) which does not test differing combinations
of variables within a model as far as I can tell.  In addition I tried
to use the leaps command, but that does not use the criterion I am
looking for.  Any help or advice would be greatly appreciated.
Thanks
Matt Williamson
 
Matthew Williamson
Graduate Research Assistant
Department of Fishery and Wildlife Biology
Colorado State University, Fort Collins, CO 80523
Office: (970)491-5790
Cell:(970)412-0442

We are now confronted by the fact...that wars are no longer won;...all
wars are lost by all who wage them; the only difference between
participants is the degree and kind of losses they sustain.
...Science has so sharpened the fighter's sword that it is impossible
for him to cut his enemy without cutting himself.
--Aldo Leopold
 



Re: [R] All possible subsets model selection using AICc

2006-01-03 Thread Thomas Lumley
On Tue, 3 Jan 2006, Matt Williamson wrote:

 Hello List,
 I was wondering if a package or piece of code exists that will allow all
 possible subsets regression model selection within program R.  I have
 already looked at step(AIC) which does not test differing combinations
 of variables within a model as far as I can tell.  In addition I tried
 to use the leaps command, but that does not use the criterion I am
 looking for.

leaps() or regsubsets() in the leaps package almost certainly do use the 
criterion you are looking for (even though you don't tell us what that 
criterion is).

These functions produce one or more best models of each size, and for 
models of the same size all the commonly-used criteria reduce to ranking 
by residual sum of squares, which is what leaps() and regsubsets() do.
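
If the criterion is AICc, one way to see the whole ranking is to enumerate the subsets directly. This is a brute-force base-R sketch, not what leaps() or regsubsets() do internally; the helper names, the mtcars data and the choice of predictors are all assumptions for illustration, and it is only practical for a modest number of candidate variables.

```r
# Small-sample corrected AIC for a fitted lm
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")      # estimated parameters, incl. residual variance
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

# Fit every non-empty subset of the predictors and rank the fits by AICc
all_subsets_aicc <- function(response, predictors, data) {
  subsets <- unlist(
    lapply(seq_along(predictors),
           function(m) combn(predictors, m, simplify = FALSE)),
    recursive = FALSE
  )
  scores <- vapply(subsets, function(vars) {
    aicc(lm(reformulate(vars, response = response), data = data))
  }, numeric(1))
  out <- data.frame(
    model = vapply(subsets, paste, character(1), collapse = " + "),
    size  = lengths(subsets),
    AICc  = scores,
    stringsAsFactors = FALSE
  )
  out[order(out$AICc), ]
}

res <- all_subsets_aicc("mpg", c("wt", "hp", "qsec", "disp"), mtcars)
head(res, 3)
```

Within any one size the AICc ordering is the same as the residual-sum-of-squares ordering, so only the comparison across sizes depends on the criterion.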


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



[R] Including random effects in logistic regression.

2006-01-03 Thread Jason Marshal
I'm trying to analyse some data using logistic regression in R, but I 
want to include random effects in the model.  The glm function 
appears not to have options for including random effects, and the lme 
and nlme documentation indicates that these functions are for 
continuous, not dichotomous, response variables.  Are there options 
in R for this type of analysis?

Jason Marshal
Bariloche, Argentina



[R] abline in log-log plot

2006-01-03 Thread Jeffrey Ross-Ibarra
I'm working with a scatterplot of data, using plot() with log="xy" to get
log-log axes.  But I can't get the regression line to plot correctly.  I use
abline(lm(log(Y)~log(X))) and get a line that looks like the correct slope,
but the Y-intercept is messed up.  I haven't changed the y-axis other than
to use the log-transformation.  I can get the non-log regression line to
plot as a curve on the log-log axes by using abline(lm(Y~X),untf=T), but I
just want to plot the straight regression line of log(Y)~log(X).
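
One likely explanation, offered as a sketch rather than a confirmed diagnosis: on log axes abline() draws in the transformed coordinate system, which uses base-10 logs, while log() is the natural log. The slope is the same in either base but the intercept is not, which would match the symptom. The data below are made up.

```r
x <- c(1, 10, 100, 1000)          # made-up data following y = 2 * x^1.5
y <- 2 * x^1.5

fit10 <- lm(log10(y) ~ log10(x))  # base-10 logs, matching the log="xy" axes
fit_e <- lm(log(y) ~ log(x))      # natural logs: same slope, different intercept

rbind(log10 = coef(fit10), ln = coef(fit_e))

if (interactive()) {
  plot(x, y, log = "xy")
  abline(fit10)                   # straight line through the points
}
```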



[R] Connectivity across a grid above a variable surface

2006-01-03 Thread Waichler, Scott R
Hi, 
I'm looking for ideas or packages with relevant algorithms for
calculating the connectivity across a grid, where connectivity is
defined as the minimum amount of cross-sectional area along a continuous
path.  The upper boundary of the cross-sectional area is a fixed
elevation, and the lower boundary is a gridded surface of variable
elevation.  My variable elevation surface represents the top of an
impermeable geologic layer.  I would like to represent the degree to
which a fluid could flow from one end of my grid to another, above the
surface and below the fixed level.  I don't need to derive information
about path lengths and hydraulic gradient, but if I could, that would be
a plus.  A groundwater flow model would provide the exact answer, but
I'm looking for something more approximate and faster.  

My grids are such that there are many dead-end flow paths, where the
bottom boundary rises to meet the top boundary and the cross-sectional
area available for flow pinches out.  In plan view, fluid can enter all
along one boundary and leave all along the opposite boundary, but flow
connectivity across the grid varies between bottom boundary scenarios.

Scott Waichler
Pacific Northwest National Laboratory
scott.waichler _at_ pnl.gov



[R] all possible combinations of list elements

2006-01-03 Thread Eberhard F Morgenroth
I have a list as follows

P <- list(A  = c("CS", "CX"),
          B  = 1:4,
          Y  = c(4, 9))

I now would like to prepare a new list where the rows of the new list
provide all possible combinations of the elements in the original list.
Thus, the result should be the following

CS  1   4
CS  1   9
CS  2   4
CS  2   9
CS  3   4
CS  3   9
CS  4   4
CS  4   9
CX  1   4
CX  1   9
CX  2   4
CX  2   9
CX  3   4
CX  3   9
CX  4   4
CX  4   9

Is there a simple routine in R to create this list of all possible
combinations? The routine will be part of a function with the list P
as an input. P will not always have the same number of elements and
each element in the list P may have different numbers of values.

Thanks,
Eberhard Morgenroth
 
Eberhard Morgenroth, Assistant Professor of Environmental Engineering 
University of Illinois at Urbana-Champaign 
3219 Newmark Civil Engineering Laboratory, MC-250 
205 North Mathews Avenue, Urbana, IL 61801, USA 
Email: [EMAIL PROTECTED] 
http://cee.uiuc.edu/research/morgenroth



Re: [R] all possible combinations of list elements

2006-01-03 Thread Marc Schwartz
On Tue, 2006-01-03 at 18:57 -0600, Eberhard F Morgenroth wrote:
> I have a list as follows
> 
> P <- list(A  = c("CS", "CX"),
>           B  = 1:4,
>           Y  = c(4, 9))
> 
> I now would like to prepare a new list where the rows of the new list
> provide all possible combinations of the elements in the original list.
> Thus, the result should be the following
> 
> CS  1   4
> CS  1   9
> CS  2   4
> CS  2   9
> CS  3   4
> CS  3   9
> CS  4   4
> CS  4   9
> CX  1   4
> CX  1   9
> CX  2   4
> CX  2   9
> CX  3   4
> CX  3   9
> CX  4   4
> CX  4   9
> 
> Is there a simple routine in R to create this list of all possible
> combinations? The routine will be part of a function with the list P
> as an input. P will not always have the same number of elements and
> each element in the list P may have different numbers of values.

See ?expand.grid

> expand.grid(P)
    A B Y
1  CS 1 4
2  CX 1 4
3  CS 2 4
4  CX 2 4
5  CS 3 4
6  CX 3 4
7  CS 4 4
8  CX 4 4
9  CS 1 9
10 CX 1 9
11 CS 2 9
12 CX 2 9
13 CS 3 9
14 CX 3 9
15 CS 4 9
16 CX 4 9
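
expand.grid() varies its first factor fastest, so the rows above come out in a different order from the one requested. One way to get the requested ordering (first element varying slowest) is to expand the reversed list and then put the columns back in their original order. This is a sketch under that reading of the question; `combos` is a name chosen here for illustration.

```r
P <- list(A = c("CS", "CX"),
          B = 1:4,
          Y = c(4, 9))

# Expand the reversed list so that Y varies fastest and A slowest,
# then restore the original column order.
combos <- expand.grid(rev(P), stringsAsFactors = FALSE)[, names(P)]
rownames(combos) <- NULL

head(combos, 4)
```

Because it only uses names(P) and rev(P), the same two lines work for a list with any number of elements of any lengths.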

HTH,

Marc Schwartz



Re: [R] A comment about R:

2006-01-03 Thread Bob Green

Hello,


Unlike most posts on the R mailing list I feel qualified to comment on 
this one.  For about 3 months I have been trying to learn to use R, after 
having used various versions of SPSS for about 10 years.


I think it is far too simplistic to ascribe non-use of R to laziness.  This 
may well be the case for some, however, I have read 5-6 books on R, waded 
through on-line resources,  read the documentation and asked multiple 
questions via e-mails - and still find even some of the basics very difficult.

There are several reasons for this:

1. For some tasks R is extremely user-unfriendly.  Some comparative examples:

(a) In running a chi-square analysis in SPSS the following syntax is included

/STATISTIC=CHISQ
   /CELLS= COUNT EXPECTED ROW COLUMN TOTAL RESID .

this produces expected and observed counts, row & column percentages, 
residuals, chi-square & Fisher's exact test + other output.

In R, it is a herculean task to produce similar output. It certainly 
can't be produced in 2 lines as far as I can tell.

(b)  in SPSS if I want to compare multiple variables by a single dependent 
variable this is readily performed

CROSSTABS
   /TABLES=baserdis  baserenh  basersoc baseradd socbest disbest entbest 
addbest worsdis worsphy by group

I used the chi-square example again, but the same applies for a t-test. I 
started looking into how  to do something similar in R, with the t-test 
command but gave up. R does force the user to take a more considered 
approach to analysis.

(c) To obtain a correlation matrix in R with the correlation & p-value is 
no simple task -

In SPSS this is obtained via:

GET
   FILE='D:\a study\data\dat\key data\master data.sav'.
NONPAR CORR
   /VARIABLES= goodnum badnum good5 bad5 avfreq avdayamt
   /PRINT=KENDALL TWOTAIL
   /MISSING=PAIRWISE .

In R something like this is required -

> by(mydat, mydat$group, function(x) {
+ nm <- names(x)
+ rho <- matrix(, 6, 2)
+ rho.nm <- matrix(, 6, 2)
+ k <- 1
+ for(i in 2:4) {
+ for(j in (i + 1):5) {
+ x.i <- x[, i]
+ x.j <- x[, j]
+ ct <- cor.test(x.i, x.j, method = c("kendall"), alternative = c("two.sided"))
+ rho[k, 1] <- ct$estimate
+ rho[k, 2] <- round(ct$p.value, 3)
+ rho.nm[k, ] <- c(nm[i], nm[j])
+ k <- k + 1
+ }
+ }
+ rho <- cbind(as.data.frame(rho.nm), as.data.frame(rho))
+ names(rho) <- c("freq.i", "freq.j", "cor", "p-value")
+ rho
+ })

2) It is not always clear what the output produced by R is. The 
Mann-Whitney U-test is a good example. In R, it seems a standardised value 
is obtained. I was advised that it is easy enough to check this as R is 
open-source, but at least for me, I don't believe I would understand this 
code anyway. It is confusing when comparative programs such as R and SPSS 
produce dis-similar results. For the user it is important to be able to 
fairly easily reconcile such differences, to engender confidence in results.

3) I find the help files in R quite difficult to understand.  For example, 
see help(t.test).  It is almost assumed by the examples that you know what 
to do. Personally, I would find some form of simple decision tree easier, 
e.g. if you want to perform a t-test with the dependent variable in one 
column and the grouping variable in another, use t.test(AVFREQ~GROUP). If you 
want to perform a t-test with the dependent variable in separate columns 
(each column representing a different group), use t.test(AVFREQ1, AVFREQ2).

4) My initial approach to using R, was to run commands I had used commonly 
in SPSS and compare the results. I have only got as far  as basic ANOVA. 
This has been time-consuming and at times it has been difficult to obtain 
advice. Some people on the R list have been extremely generous with their 
time and knowledge, and I have much appreciated this assistance. At other 
times I see responses met  with something like arrogance. With the 
sophistication of R, there is also an elitism.  This is a barrier to R 
being more widely accepted and used.

5) differences in terminology - this is just part of the learning process, 
but I still found it took quite some time to work out simple commands and 
what different analyses were called.

6) system administrators may be wary of freeware.

No doubt for the sophisticated user, my comments may seem trite and easily 
resolved, however I believe my comments have some relevance as to why R is 
not more readily used or accepted.
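
For readers comparing with the SPSS output described in points 1(a) and 3, the following is a rough sketch of where the corresponding pieces live in base R. The counts and the AVFREQ/GROUP values are made up for illustration; only chisq.test(), fisher.test(), prop.table() and t.test() from base R are used, and this is not claimed to reproduce the SPSS layout.

```r
# Made-up 2 x 2 table of counts (illustrative only)
tab <- matrix(c(12, 5, 8, 15), nrow = 2,
              dimnames = list(group = c("a", "b"), outcome = c("no", "yes")))

ct <- chisq.test(tab, correct = FALSE)
ct$observed              # observed counts
ct$expected              # expected counts
ct$residuals             # Pearson residuals
prop.table(tab, 1)       # row proportions
prop.table(tab, 2)       # column proportions
fisher.test(tab)$p.value # Fisher's exact test

# The two t.test() layouts from point 3, on made-up data
d <- data.frame(AVFREQ = c(3.1, 4.0, 2.7, 5.2, 4.4, 6.3, 5.9, 7.1, 6.6, 5.5),
                GROUP  = factor(rep(c("a", "b"), each = 5)))
long <- t.test(AVFREQ ~ GROUP, data = d)           # response + grouping column
wide <- t.test(d$AVFREQ[d$GROUP == "a"],
               d$AVFREQ[d$GROUP == "b"])           # one vector per group
```

Both t.test() calls run the same Welch test, so they give the same p-value; the formula form is simply more convenient when the data are stored with one grouping column.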


Bob Green



Re: [R] Including random effects in logistic regression.

2006-01-03 Thread Doran, Harold
Jason

the lmer() function in the Matrix package is what you will need. 


-Original Message-
From:   [EMAIL PROTECTED] on behalf of Jason Marshal
Sent:   Tue 1/3/2006 8:10 AM
To: r-help@stat.math.ethz.ch
Cc: 
Subject:[R] Including random effects in logistic regression.

I'm trying to analyse some data using logistic regression in R, but I 
want to include random effects in the model.  The glm function 
appears not to have options for including random effects, and the lme 
and nlme documentation indicates that these functions are for 
continuous, not dichotomous, response variables.  Are there options 
in R for this type of analysis?

Jason Marshal
Bariloche, Argentina



Re: [R] Glimmix and glm

2006-01-03 Thread Spencer Graves
  I'm not certain what you are asking.  I just got 10 hits for 
'RSiteSearch("Glimmix")'.  Seven of them mentioned SAS PROC GLIMMIX:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/65945.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/65954.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/53310.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/53311.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/65935.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/65949.html
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57731.html

  If you'd like more help from this group, PLEASE do read the posting 
guide! www.R-project.org/posting-guide.html.  Anecdotal evidence 
suggests that posts that conform more closely to the suggestions there 
tend to get quicker, more useful replies.

  Best Wishes,
  Spencer Graves

[EMAIL PROTECTED] wrote:

> Hello.
> 
> Some months ago an e-mail was posted in which a comparison between Glimmix 
> and glm was discussed. I have not been able to find that e-mail on the R 
> archive. Does anyone recall the date of the above e-mail?
> 
> Thank you very much.
> 
> ***
> Antonio Paredes
> USDA- Center for Veterinary Biologics
> Biometrics Unit
> 510 South 17th Street, Suite 104
> Ames, IA 50010
> (515) 232-5785
 
  


Re: [R] A comment about R: (sort.data.frame)

2006-01-03 Thread Michael Prager

Gabor Grothendieck wrote on 1/3/2006 2:37 PM:

> Looking at the first few queries,
> see how easy it is to take the top few in Stata whereas in R one would
> have a complex use of order.  It's not hard in R to write a function
> that would make it just as easy but it's not available off the top
> of one's head though RSiteSearch("sort.data.frame") will find one
> if one knew what to search for.
  

Yes, R has a few peculiar gaps.  As to sort.data.frame(), it should be 
added to R base, in my opinion.  It is silly to make people download 
code for such a basic operation.
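
For context, the base-R idiom being discussed looks roughly like the sketch below: order() returns a row permutation, which is then used to index the data frame, and head() takes the top few. The data frame `scores` and its columns are made up for illustration.

```r
# Made-up data frame (illustrative only)
scores <- data.frame(name  = c("a", "b", "c", "d", "e"),
                     value = c(3, 9, 1, 7, 5),
                     stringsAsFactors = FALSE)

# Sort rows by 'value', largest first, then keep the top three
sorted <- scores[order(-scores$value), ]
top3   <- head(sorted, 3)
top3
```

Further sort keys can be passed as extra arguments to order(), e.g. order(scores$name, -scores$value).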

MHP



[R] Looking for packages to do Feature Selection and Classification

2006-01-03 Thread Frank Duan
Hi All,

Sorry if this is a repost (a quick browse didn't give me the answer).

I wonder if there are packages that can do the feature selection and
classification at the same time. For instance, I am using SVM to classify my
samples, but it's easy to get overfitted if using all of the features. Thus,
it is necessary to select good features to build an optimum hyperplane
(?). Here is a simple example: Suppose I have 100 useful features and 100
useless features (or noise features), I want the SVM to give me the
same results when 1) using only 100 useful features or 2) using all 200
features.

Any suggestions or point me to a reference?

Thanks in advance!

Frank



[R] newbie where to look question

2006-01-03 Thread Mark Leeds
I don't want to bother anyone with specific questions because I am an R
newbie and I see that there is a TON (emphasis on ton) of documentation
out there, but could someone just tell me the best place to look/read
for learning about (for R 2.2.1 on Windows):
 
.Rprofile
.Renviron
.RData
 
.First function ( analogous to the one in Splus ).
 
Analog of Splus Chapter 
 
Basically, I want to learn how to start R so that
my own source code and various
packages are already available when
I start up and how to make separate .Data ( I used to
do this in Splus with Splus Chapter ) directories etc.
 
I am willing to fight through it and try to figure it out
myself but there's so much stuff on the net in terms
of threads etc that I might be helped by knowing the best
place to start to learn. Thanks.
 
 Mark
 
 
 




Re: [R] newbie where to look question

2006-01-03 Thread Dirk Eddelbuettel

On 3 January 2006 at 22:56, Mark Leeds wrote:
| out there but could
| someone just tell me the
| best placed to look/read
| for learning about ( for R-2-2.1 in Windows )
|  
| .Rprofile
| .REnviron.
| .Rdata

?Startup

| .First function ( analogous to the one in Splus ).

?.First
  
| Analog of Splus Chapter 

Not sure. Running

RSiteSearch("S-Plus Chapter")

leads to the R Data Import/Export manual, and

RSiteSearch("SPlus Chapter")

has some hits too.

| Basically, I want to learn how to start R so that
| my own source code and various
| packages are already available when
| I start up and how to make separate .Data ( I used to
| do this in Splus with Splus Chapter ) directories etc.

That's done a little differently here but I do not know of a good migration
guide for users with prior S-Plus experience.
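
As a rough illustration of what ?Startup and ?.First describe, a .Rprofile file can define a .First() function that R calls at the start of each session. The sketch below is only an assumption about a typical setup: the package name (tools) stands in for whichever packages are wanted, and the path to myfunctions.R is hypothetical.

```r
# Possible contents of a .Rprofile file (see ?Startup for where R looks for it)
.First <- function() {
  # Attach the packages you always want; 'tools' is just a stand-in here
  library(tools)

  # Source your own code if the (hypothetical) file exists
  my_code <- file.path(Sys.getenv("HOME"), "R", "myfunctions.R")
  if (file.exists(my_code)) source(my_code)

  invisible(NULL)
}
```

Guarding source() with file.exists() means a missing file does not stop R from starting.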

Hth, Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison



Re: [R] unexpected false convergence

2006-01-03 Thread Spencer Graves
  I replicated your 'false convergence' using R 2.2.0:
> sessionInfo()
R version 2.2.0, 2005-10-06, i386-pc-mingw32

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

other attached packages:
    nlme     MASS
"3.1-66" "7.2-23"

  Since the error message said, "Error in lme.formula", I listed the 
code for lme.formula and traced it using debug(lme.formula).  The 
function glmmPQL calls lme.formula repeatedly.  The function 
lme.formula in turn calls nlminb when it's available, though it used 
to call optim.  The fifth time lme.formula was called, nlminb 
returned the error message "false convergence (8)".

  Under R 2.2, nlminb is part of the base package.  I'm not 
certain, but I don't think it was available in base under R 2.1.1.

  I think this explains the problem, but not how to fix it.  I tried 
modifying the code for lme.formula to force it to call optim, but 
this generated a different error.  I am therefore copying Professors 
Bates & Ripley in case one of them might want to look at this.

  hope this helps.
  spencer graves

Jack Tanner wrote:
> I've come into some code that produces different results under R 2.1.1 and R 
> 2.2.1. I'm really unfamiliar with the libraries in question (MASS and nlme), 
> so I don't know if this is a bug in my code, or a regression in R. If it's a 
> bug on my end, I'd appreciate any advice on potential causes and relevant 
> documentation.
> 
> The code:
> 
> score<-c(1,8,1,3,4,4,2,5,3,6,0,3,1,5,0,5,1,11,1,2,4,5,2,4,1,6,1,2,8,16,5,16,3,15,3,12,4,9,2,4,1,8,2,6,4,11,2,9,3,17,2,6)
> id<-rep(1:13,rep(4,13))
> test<-gl(2,1,52,labels=c("pre","post"))
> coder<-gl(2,2,52,labels=c("two","three"))
> il<-data.frame(id,score,test,coder)
> attach(il)
> cs1<-corSymm(value=c(.396,.786,.718,.639,.665,.849),form=~1|id)
> cs1<-Initialize(cs1,data=il)
> run<-glmmPQL(score~test+coder, 
> random=~1|id,family=poisson,data=il,correlation=cs1)
> 
> The output under R 2.2.1, which leaves the "run" object (last line of the 
> code) undefined:
> 
> iteration 1
> iteration 2
> iteration 3
> iteration 4
> Error in lme.formula(fixed = zz ~ test + coder, random = ~1 | id, data = 
> list( :
> false convergence (8)
> 
> Under R 2.1.1, I get exactly 4 iterations as well, but no "false 
> convergence" message, and "run" is defined.
 


Re: [R] newbie where to look question

2006-01-03 Thread Marc Schwartz
On Tue, 2006-01-03 at 22:56 -0500, Mark Leeds wrote: 
> I don't want to bother anyone
> with specific questions
> because I am a R newbie and
> I see that there is TON ( emphasis on
> Ton ) of documentation
> out there but could
> someone just tell me the
> best placed to look/read
> for learning about ( for R-2-2.1 in Windows )
>  
> .Rprofile
> .REnviron.
> .Rdata
>  
> .First function ( analogous to the one in Splus ).
>  
> Analog of Splus Chapter 
>  
> Basically, I want to learn how to start R so that
> my own source code and various
> packages are already available when
> I start up and how to make separate .Data ( I used to
> do this in Splus with Splus Chapter ) directories etc.
>  
> I am willing to fight through it and try to figure it out
> myself but there's so much stuff on the net in terms
> of threads etc that I might be helped by knowing the best
> place to start to learn. Thanks.


Mark,

One of the best places to start looking is actually the R e-mail list
Posting Guide, for which there is a link at the bottom of every e-mail
that comes through the list:

  http://www.r-project.org/posting-guide.html

Much of what you want to cover is in "An Introduction to R", which is
available from the menus in the Windows version or online at the main R
web site under Manuals. 

Additional information on your specific questions is available
using ?Startup and ?.First from within an R session.

For "Chapters", see ?save and ?load, which I believe will provide for
parallel functionality in a fashion.
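
A minimal sketch of the ?save / ?load idea, assuming one directory per project with its own saved workspace file. The directory name project_a and the objects are made up for illustration, and a temporary directory is used so that nothing is written to a real project.

```r
# Hypothetical per-project directory (a temp dir here, for illustration)
proj_dir <- file.path(tempdir(), "project_a")
dir.create(proj_dir, showWarnings = FALSE)

# Some objects belonging to this project
x <- 1:3
fit_notes <- "some object"

# Save them into the project's own .RData file, then clear them
save(x, fit_notes, file = file.path(proj_dir, ".RData"))
rm(x, fit_notes)

# Later: restore them; load() returns the names of the restored objects
loaded <- load(file.path(proj_dir, ".RData"))
loaded
```

When R is started with a given directory as its working directory, a .RData file found there is restored automatically (see ?Startup), which gives a loose analogue of separate S-PLUS chapters.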


The main R FAQ:

http://cran.r-project.org/doc/FAQ/R-FAQ.html

and the R Windows FAQ:

http://cran.r-project.org/bin/windows/base/rw-FAQ.html

are good resources as well. If you are transitioning from S-PLUS, you
might want to pay particular attention to section 3.3 of the main R FAQ
on the differences between R and S-PLUS.

Finally, thanks to Andy Liaw and Jon Baron, there is there RSiteSearch()
function, which will enable you to search the e-mail list archives and
documentation online from within an R session. See ?RSiteSearch.

HTH,

Marc Schwartz



[R] Questions about cbind

2006-01-03 Thread Vincent Deng
Dear R-helpers

I have a stupid question about cbind function. Suppose I have a
dataframe like this
Frame:
A 10
C 20
B 40

and a numeric matrix like this
Matrix:
A 1
B 2
C 3

cbind(Frame[,2],Matrix[,1]) simply binds these two columns without
checking the order, I mean, the result will be
A 10 1
B 20 2
C 30 3

rather than
A 10 1
B 30 2
C 20 3

So my problem is: Is there any solution for R to bind columns with
correct order?

Many thanks



Re: [R] Questions about cbind

2006-01-03 Thread Marc Schwartz
On Wed, 2006-01-04 at 13:28 +0800, Vincent Deng wrote:
> Dear R-helpers
> 
> I have a stupid question about cbind function. Suppose I have a
> dataframe like this
> Frame:
> A 10
> C 20
> B 40
> 
> and a numeric matrix like this
> Matrix:
> A 1
> B 2
> C 3
> 
> cbind(Frame[,2],Matrix[,1]) simply binds these two columns without
> checking the order, I mean, the result will be
> A 10 1
> B 20 2
> C 30 3
> 
> rather than
> A 10 1
> B 30 2
> C 20 3
> 
> So my problem is: Is there any solution for R to bind columns with
> correct order?
> 
> Many thanks

I presume that either the '40' in the first expression of Frame or the
'30's in the second and third outputs are typos?

See ?merge, which will perform SQL-like 'join' operations using a
primary key:

> Frame
  V1 V2
1  A 10
2  C 20
3  B 40


Note that despite the name, this is not a matrix, but also a data frame.
A matrix can only have one data type, while a data frame can have more
than one.

> Matrix
  V1 V2
1  A  1
2  B  2
3  C  3


> merge(Frame, Matrix, by = "V1")
  V1 V2.x V2.y
1  A   10    1
2  B   40    2
3  C   20    3
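
If the second object really is a numeric matrix with row names (as the question describes), a sketch of two options follows: index the matrix by the data frame's key column to keep the data frame's row order, or convert the matrix to a data frame and use merge(). The column name "value" and the object names `combined` and `merged` are chosen here for illustration.

```r
Frame <- data.frame(V1 = c("A", "C", "B"),
                    V2 = c(10, 20, 40),
                    stringsAsFactors = FALSE)

# A true numeric matrix, keyed by row names
Matrix <- matrix(1:3, ncol = 1,
                 dimnames = list(c("A", "B", "C"), "value"))

# Option 1: look up matrix rows by name, keeping Frame's row order
combined <- data.frame(Frame, value = unname(Matrix[Frame$V1, "value"]))

# Option 2: merge on the key after turning the matrix into a data frame
merged <- merge(Frame,
                data.frame(V1 = rownames(Matrix),
                           value = Matrix[, "value"],
                           stringsAsFactors = FALSE),
                by = "V1")
```

Option 1 preserves the original row order of Frame, while merge() returns rows sorted by the key.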


HTH,

Marc Schwartz



[R] silly, extracting the value of C from the results of somers2

2006-01-03 Thread ahimsa campos arceiz
Sorry I have a very simple question:

I used somers2 function from Design package:

> z <- somers2(x, y, weights = w)

results are:

> z
 C  Dxy  n  Missing
 0.88  0.76  5000.00

Now I want to call only the value of C to be used in further analyses, but I 
fail to do it. I have tried:

> z$C
NULL
> z[, "C"]
Error in z[, "C"]: incorrect number of dimensions

and some other silly things. If I do 
> list(z)
[[1]]
 C  Dxy  n  Missing
 0.88  0.76  5000.00

Can somebody tell me how I can obtain just the value of C?

Thank you useRs,

very grateful

Ahimsa

-- 
Ahimsa Campos Arceiz
The University Museum,
The University of Tokyo
Hongo 7-3-1, Bunkyo-ku,
Tokyo 113-0033
phone +81-(0)3-5841-2824



Re: [R] silly, extracting the value of C from the results of somers2

2006-01-03 Thread P Ehlers


ahimsa campos arceiz wrote:
> Sorry I have a very simple question:
> 
> I used somers2 function from Design package:
> 
>> z <- somers2(x, y, weights = w)
> 
> results are:
> 
>> z
> 
>  C  Dxy  n  Missing
>  0.88  0.76  5000.00
> 
> Now I want to call only the value of C to be used in further analyses, but I 
> fail to do it. I have tried:
> 
>> z$C
> 
> NULL
> 
>> z[, "C"]
> 
> Error in z[, "C"]: incorrect number of dimensions
> 
> and some other silly things. If I do 
> 
>> list(z)
> 
> [[1]]
>  C  Dxy  n  Missing
>  0.88  0.76  5000.00
> 
> Can somebody tell me how can I obtain just the value of c?

(I think that somers2() is in package:Hmisc.)
The help page clearly says that somers2 returns a vector and
there's an example on the help page that does _exactly_ what you ask!

z["C"]  or  z[1]
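
To illustrate why z$C and z[, "C"] fail while z["C"] works: the result is a named numeric vector, which has no dimensions and is not a list. The vector below is a made-up stand-in with the same element names as the somers2() result, not actual output.

```r
# Made-up stand-in for a somers2() result (illustrative values only)
z <- c(C = 0.88, Dxy = 0.76, n = 500, Missing = 0)

is.null(dim(z))  # TRUE: a plain vector, so z[, "C"] has nothing to index

z["C"]           # length-1 vector that keeps its name
z[["C"]]         # the bare number, without the name
unname(z[1])     # same value, selected by position
```

Use z[["C"]] or unname() when the name would get in the way of later calculations or output.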

Peter Ehlers

 
> Thank you useRs,
> 
> very gratefull
> 
> Ahimsa

