[R] Solving equations involving banded matrices

2007-08-27 Thread Anup Nandialath
Dear friends,

I'm looking for a function that solves the system of equations Ax = B, where A is
a positive definite banded matrix. I know that the command solve() can be used to
arrive at a solution, but does it also handle banded matrices well?

In GAUSS the command "bandsolpd" achieves this, so effectively my question is
whether there is an equivalent command in R.
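
A minimal base-R sketch of the computation being asked about, assuming A is
symmetric positive definite; it does not exploit the band structure for storage,
which a dedicated banded solver such as GAUSS's bandsolpd (or LAPACK's banded
routines) would:

# Solve Ax = b for a small positive definite tridiagonal A (illustrative data).
set.seed(42)
n <- 5
A <- diag(2, n)
A[abs(row(A) - col(A)) == 1] <- -1     # tridiagonal, positive definite
b <- rnorm(n)

x1 <- solve(A, b)                      # generic dense solve

R <- chol(A)                           # A = t(R) %*% R (Cholesky factor)
x2 <- backsolve(R, forwardsolve(t(R), b))

all.equal(x1, x2)                      # both give the same solution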

Thanks in advance
Regards

Anup
   


Re: [R] Excel

2007-08-27 Thread Moshe Olshansky
As far as I understand, changing the format changes the way the data are
displayed by Excel, but it does not change the data themselves: if, while
reading the data, Excel decided that a value was a date, it is converted to
an integer (the number of days since January 1, 1900 - and Excel mistakenly
treats 1900 as a leap year) and stored that way.

--- David Scott <[EMAIL PROTECTED]> wrote:

> On Tue, 28 Aug 2007, Robert A LaBudde wrote:
>
> > If you format the column as "Text", you won't have this problem. By
> > leaving the cells as "General", you leave it up to Excel to guess at
> > the correct interpretation.
>
> Not true actually. I had converted the column to Text because I saw the
> interpretation as a date in the .xls file. I saved the .csv file *after*
> the column had been converted to Text. Looking at the .csv file in a text
> editor, the entry is correct.
>
> I have just rechecked this.
>
> On reopening the .csv using Excel, the entry AUG2699 had been interpreted
> as a date, and was showing as Aug-99. Most bizarre is that the NHI value
> of AUG1838 has *not* been interpreted as a date.
>
> David Scott


Re: [R] Excel

2007-08-27 Thread David Scott
On Tue, 28 Aug 2007, Robert A LaBudde wrote:

> If you format the column as "Text", you won't have this problem. By
> leaving the cells as "General", you leave it up to Excel to guess at
> the correct interpretation.
>

Not true actually. I had converted the column to Text because I saw the 
interpretation as a date in the .xls file. I saved the .csv file *after* 
the column had been converted to Text. Looking at the .csv file in a text 
editor, the entry is correct.

I have just rechecked this.

On reopening the .csv using Excel, the entry AUG2699 had been interpreted 
as a date, and was showing as Aug-99. Most bizarre is that the NHI value 
of AUG1838 has *not* been interpreted as a date.

David Scott


> You will note that the conversion to a date occurs immediately in
> Excel when you enter the value. There are many formats to enter dates.
>
> Either pre-format the column as Text, or prefix the individual entry
> with an ' to indicate text.
>
> A similar problem occurs in R's read.table() function when a factor
> has levels that can be interpreted as numbers.
>
> At 10:11 PM 8/27/2007, David wrote:
>
>> A common process when data is obtained in an Excel spreadsheet is to save
>> the spreadsheet as a .csv file then read it into R. Experienced users
>> might have learned to be wary of dates (as I have) but possibly have not
>> experienced what just happened to me. I thought I might just share it with
>> r-help as a cautionary tale.
>>
>> I received an Excel file giving patient details. Each patient had an ID
>> code in the form of three letters followed by four digits. (Actually a New
>> Zealand National Health Identification.) I saved the .xls file as .csv.
>> Then I opened up the .csv (with Excel) to look at it. In the column of ID
>> codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.
>>
>> In a column of character data, Excel had interpreted AUG2699 as a date.
>>
>> The .csv did not actually have a date in that cell, but if I had saved the
>> .csv file it would have.
>>
>> David Scott
>
> 
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
> Least Cost Formulations, Ltd.URL: http://lcfltd.com/
> 824 Timberlake Drive Tel: 757-467-0954
> Virginia Beach, VA 23464-3239Fax: 757-467-2947
>
> "Vere scire est per causas scire"
>

_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics



Re: [R] Excel

2007-08-27 Thread Robert A LaBudde
If you format the column as "Text", you won't have this problem. By 
leaving the cells as "General", you leave it up to Excel to guess at 
the correct interpretation.

You will note that the conversion to a date occurs immediately in 
Excel when you enter the value. There are many formats to enter dates.

Either pre-format the column as Text, or prefix the individual entry 
with an ' to indicate text.

A similar problem occurs in R's read.table() function when a factor 
has levels that can be interpreted as numbers.
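
On the R side, one way to sidestep both kinds of guessing is to force the
relevant column to be read as plain character data; the file and column names
below are illustrative, not from this thread:

# Read the .csv forcing the ID column to stay character, so strings such as
# "AUG2699" are never reinterpreted (file and column names are hypothetical).
dat <- read.csv("ids.csv", colClasses = c(NHI = "character"))
str(dat$NHI)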

At 10:11 PM 8/27/2007, David wrote:

>A common process when data is obtained in an Excel spreadsheet is to save
>the spreadsheet as a .csv file then read it into R. Experienced users
>might have learned to be wary of dates (as I have) but possibly have not
>experienced what just happened to me. I thought I might just share it with
>r-help as a cautionary tale.
>
>I received an Excel file giving patient details. Each patient had an ID
>code in the form of three letters followed by four digits. (Actually a New
>Zealand National Health Identification.) I saved the .xls file as .csv.
>Then I opened up the .csv (with Excel) to look at it. In the column of ID
>codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.
>
>In a column of character data, Excel had interpreted AUG2699 as a date.
>
>The .csv did not actually have a date in that cell, but if I had saved the
>.csv file it would have.
>
>David Scott


Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

"Vere scire est per causas scire"



[R] Problem with lme using glht for multiple comparisons

2007-08-27 Thread Christian Kost
Hi everyone,

I am new to R and have a question that relates to unplanned post-hoc 
comparisons using the multcomp package after a mixed effects model. I couldn't 
find the answer to it in the archive or in any manual. 

I have a dataset in which several plants have been treated in a particular way 
and a continuous response variable has been measured depending on several 
leaves per plant. I am now interested in the effect of the treatment depending 
on the age of the leaves examined. So the dataset (L1) consists of a continuous 
response variable (EFN), a fixed factor (Leafage), and a random factor (Plant).

I have set up the following mixed effects model, which works fine:


LM <- lme(EFN ~ Leafage, L1, ~ 1 | Plant)

Now all I want to do is a post-hoc analysis (multiple comparisons) for the 
fixed factor Leafage. I tried the following code, which according to the 
documentation should work:


Post <- glht(LM, linfct = mcp(Leafage = "Tukey"))

However, I get this error message and don't know what to do:

Error in mcp2matrix(model, linfct = linfct) : 
Factor(s) Leafage have been specified in ‘linfct’ but cannot be found 
in ‘model’!


The factor is specified, right? So what is the problem? If I do the same with 
a normal ANOVA (command: aov), it works. What is the problem with the lme 
command?
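
One common cause of this particular glht()/mcp() error (an assumption here,
not something confirmed in the thread) is that the grouping variable entered
the model as numeric rather than as a factor. A hedged sketch of that check:

# Hedged sketch: make sure Leafage really is a factor before fitting, then
# request Tukey contrasts on it (object names follow the post above).
library(nlme)
library(multcomp)

L1$Leafage <- factor(L1$Leafage)
LM   <- lme(EFN ~ Leafage, data = L1, random = ~ 1 | Plant)
Post <- glht(LM, linfct = mcp(Leafage = "Tukey"))
summary(Post)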

Thank you very much in advance for your help.

Cheers,


Christian



Re: [R] Excel

2007-08-27 Thread Moshe Olshansky
This is very consistent with Microsoft's philosophy:
they know better than you what you want to do.

--- David Scott <[EMAIL PROTECTED]> wrote:

> A common process when data is obtained in an Excel spreadsheet is to save
> the spreadsheet as a .csv file then read it into R. Experienced users
> might have learned to be wary of dates (as I have) but possibly have not
> experienced what just happened to me. I thought I might just share it with
> r-help as a cautionary tale.
>
> I received an Excel file giving patient details. Each patient had an ID
> code in the form of three letters followed by four digits. (Actually a New
> Zealand National Health Identification.) I saved the .xls file as .csv.
> Then I opened up the .csv (with Excel) to look at it. In the column of ID
> codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.
>
> In a column of character data, Excel had interpreted AUG2699 as a date.
>
> The .csv did not actually have a date in that cell, but if I had saved the
> .csv file it would have.
>
> David Scott



[R] Excel

2007-08-27 Thread David Scott

A common process when data is obtained in an Excel spreadsheet is to save 
the spreadsheet as a .csv file then read it into R. Experienced users 
might have learned to be wary of dates (as I have) but possibly have not 
experienced what just happened to me. I thought I might just share it with 
r-help as a cautionary tale.

I received an Excel file giving patient details. Each patient had an ID 
code in the form of three letters followed by four digits. (Actually a New 
Zealand National Health Identification.) I saved the .xls file as .csv. 
Then I opened up the .csv (with Excel) to look at it. In the column of ID 
codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.

In a column of character data, Excel had interpreted AUG2699 as a date.

The .csv did not actually have a date in that cell, but if I had saved the 
.csv file it would have.

David Scott

_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics



Re: [R] How to provide argument when opening RGui from an external application

2007-08-27 Thread Sébastien
Thanks everyone. I actually thought about ?Rscript.exe but, having used 
only Rgui, I thought it was an instruction specific to that interface. I 
will look into it.

Sebastien
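
A minimal sketch of the Rscript route discussed here; the path and the
argument are illustrative only:

# myscript.r -- run non-interactively, e.g.:  Rscript.exe c:/scripts/myscript.r 10
args <- commandArgs(trailingOnly = TRUE)   # only the arguments after the script
cat("got", length(args), "argument(s):", args, "\n")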

Gabor Grothendieck wrote:
> There are also some batch files that can be used with Rscript on XP and info
> in the README here:
>
>http://batchfiles.googlecode.com
>
>
> On 8/26/07, Sébastien <[EMAIL PROTECTED]> wrote:
>   
>> Thanks for your reply.
>> When you say "look into Rscript.exe", do you have a specific document in
>> mind ? I tried to google it but could not find much... I forgot to
>> mention in my first email that I am working under the Windows XP
>> environment.
>>
>> Prof Brian Ripley wrote:
>> 
>>> Look into Rscript.exe (on Windows), which is a flexible way to run
>>> scripts.  Neither using a GUI nor using source() are recommended.
>>>
>>> On Fri, 24 Aug 2007, Sébastien wrote:
>>>
>>>   
 Dear R-users,

 I have written a small application (in visual basic) that automatically
 generate some R scripts. I would like to execute these scripts when my
 application is being closed.
 My problem is that I don't know how to pass the
 'source(c:/.../myscript.r)' instruction when I programmatically start
 RGui. Tinn-R is capable of doing such things, so I guess there must be a
 way to pass arguments to RGui.

 Any advice or link to relevant references would be greatly appreciated.

 Sebastien
 



Re: [R] validate (package Design): error message "subscript out of bounds"

2007-08-27 Thread Frank E Harrell Jr
Wentzel-Larsen, Tore wrote:
> Dear R users 
> 
> I use Windows XP, R2.5.1 (I have read the posting guide, I have 
> contacted the package maintainer first, it is not homework).
> 
> In a research project on renal cell carcinoma we want to compute 
> Harrell's c index, with optimism correction, for a multivariate 
> Cox regression and also for some univariate Cox models.
> For some of these univariate models I have encountered an error
> message (and no result produced) from the function validate in
> Frank Harrell's Design package:
> 
> Error in Xb(x[, xcol, drop = FALSE], coef, non.slopes, non.slopes.in.x,  : 
> subscript out of bounds
> 
> The following is an artificial example wherein I have been able to 
> reproduce this error message (actual data has been changed to preserve
> confidentiality):

I could not reproduce the error on R 2.5.1 on linux using version 2.0-12 
of Design (you did not provide this information).

Your code involved a good deal of extra typing.  Here is a streamlined 
version:

bc <- data.frame(time1 = c(9,24,28,43,58,62,66,107,116,118,123,
127,129,131,137,138,139,140,148,169,176,179,188,196,210,218,

bc

library(Design)

dd <- with(bc, datadist(bc1, age, adjto.cat='first'))
options(datadist = 'dd')

f <- cph(Surv(time1,status1) ~ bc1,
  data = bc, x=TRUE, y=TRUE, surv=TRUE)
anova(f)
f
summary(f)

val <- validate(f, B=200, dxy=TRUE)

I don't see much value in putting the type of an object into the object's 
name, as the information within an object already defines its type/class.

There is little reason to validate a one degree of freedom model.

Frank
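
Since the streamlined code above is truncated, here is a self-contained
sketch of the datadist()/cph()/validate() pattern it follows, on simulated
data; all names and values are illustrative, not from the thread:

library(Design)                       # provides datadist(), cph(), validate()

set.seed(1)
n      <- 200
age    <- rnorm(n, 50, 10)
grp    <- factor(sample(c("bc.1", "bc.2"), n, replace = TRUE))
time1  <- rexp(n, rate = 0.02 * ifelse(grp == "bc.2", 1.5, 1))
status <- rbinom(n, 1, 0.7)
d      <- data.frame(time1, status, grp, age)

dd <- datadist(d); options(datadist = "dd")
f  <- cph(Surv(time1, status) ~ grp + age, data = d,
          x = TRUE, y = TRUE, surv = TRUE)
validate(f, B = 50, dxy = TRUE)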

> 
> library(Design)
> 
> # an example data frame:
> frame.bc <- data.frame(time1 = c(9,24,28,43,58,62,66,107,116,118,123,
>   127,129,131,137,138,139,140,148,169,176,179,188,196,210,218,
>   1,1,1,2,2,3,4,8,23,32,33,34,43,44,48,51,52,54,59,59,60,60,62,
>   65,65,68,70,72,73,74,81,84,88,98,99,106,107,115,115,117,119,
>   120,122,122,122,122,126,128,130,135,136,136,138,149,151,154,
>   157,159,161,164,164,164,166,172,172,176,179,180,183,183,184,
>   187,190,197,201,201,203,203,203,209,210,214,219,227,233,4,18,
>   49,113,147,1,1,2,2,2,2,2,3,4,6,6,6,6,6,6,6,6,9,9,9,9,9,10,10,
>   10,11,12,12,12,13,14,14,17,18,18,19,19,20,20,21,21,21,21,22,23,
>   23,24,28,28,29,29,32,34,35,38,38,48,48,52,52,54,54,56,64,67,67,
>   69,70,70,72,84,88,90,114,115,140,142,154,171,195),
>   status1 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
>   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
>   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
>   1,1,1,1,1),
>   bc1 = factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
>   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
>   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
>   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,
>   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
>   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
>   labels=c('bc.1','bc.2')),
>   age = c(58,68,23,20,50,43,41,69,20,48,19,27,39,20,65,49,70,59,31,43,25,
>   61,60,45,34,59,32,58,30,62,26,44,52,29,40,57,33,18,50,50,55,51,38,34,
>   69,56,67,38,66,21,48,39,62,62,29,68,66,19,60,39,55,42,24,29,56,61,40,
>   52,19,40,33,67,66,51,48,63,60,58,68,60,53,20,45,62,37,38,61,63,43,67,
>   49,39,43,67,49,69,32,37,32,63,33,47,66,39,23,57,26,61,20,49,69,30,40,
>   29,38,66,60,69,69,44,65,25,41,53,18,55,45,59,49,27,51,29,67,26,24,26,
>   47,23,50,27,35,45,32,26,45,45,63,39,39,22,38,27,31,27,49,65,66,49,39,
>   21,51,49,55,63,19,26,50,21,24,34,65,33,55,33,36,53,48,25,54,58,60,34,
>   47,23,34,60,39,34,22,30,41,55,64,48,34,54))
> frame.bc
> 
> # preparing for a simple univariate Cox regression:
> dd.bc <- datadist(frame.bc[, c('bc1','age')], adjto.cat='first')
> options(datadist = 'dd.bc')
> 
> # a univariate Cox regression:
> cph.bc <- cph(formula = Surv(time1,status1)~bc1,
>   data = frame.bc, x=TRUE, y=TRUE, surv=TRUE)
> anova(cph.bc)
> cph.bc
> summary(cph.bc)
> 
> # the validate command for the Cox model:
> val.cph.bc <- validate(cph.bc, B=200, dxy=TRUE , pr=TRUE)
> 
> --
> Output from the validate command:
> 
>training   test
> Dxy   -0.124360 -0.1423409
> R2 1.00  1.000
> Slope  1.00  0.7919584
> D  0.016791  0.0147536
> U -0.002395  0.0006448
> Q  0.019186  0.0141088
>training   test
> Dxy   -0.191875 -0.1423409
> R2 1.00  1.000
> Slope  1.00  0.8936724
> D  0.022397  0.0147536
> U -0.002339  0.0001367
> Q  0.024

Re: [R] subset question

2007-08-27 Thread jim holtman
Here is one way of checking to see if a row contains a particular
value and setting the contents of a new column:

n <- 20
# create test data
x <- data.frame(sample(letters, n), sample(letters, n),
                sample(letters, n), sample(letters, n))
# add a column indicating if the row contains 'a', 'b' or 'c'
x$a <- apply(x[, 1:4], 1, function(.row) any(.row %in% c('a','b','c'))) + 0
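
Applied to the original question quoted below, the same idiom would be
(column range and code taken from the question, data frame name from the
poster):

# Flag rows of 'data' whose columns 9-67 contain the code "12345" (1/0).
data$flag <- apply(data[, 9:67], 1, function(r) any(r == "12345")) + 0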


On 8/27/07, Kirsten Beyer <[EMAIL PROTECTED]> wrote:
> I would like to code records in a dataset with a 1 if any of the
> columns 9-67 contain a particular code, and zero if they don't.  I've
> been working with "subset" and it seems that something like
> subset(data, data[9:67]--"12345") would work, but I have been
> unsuccessful so far.  It seems like a simple problem - any help is
> appreciated!
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] Calculating diameters of cirkels in a picture.

2007-08-27 Thread Moshe Olshansky
Hi Bart,

Let's assume that your situation was simpler - you have
a BW (Black and White) image containing circles (in
white) and you need to find the diameter of each
circle (and of course to know how many circles you
have). This can be done with labeling of connected
components. You say that two pixels are neighbors if
they have common edge (4-connectivity) or at least a
common vertex (8-connectivity). So now you can treat
your image (white pixels) as a graph (with edges
connecting any two neighbors). Then each connected
component of that graph corresponds to a circle. There
exists a well-known algorithm to do this. It takes the
original BW image (where every image pixel has the
value of 1 and background pixel the value of 0) and
produces an image where every background pixel still
has the value of 0, every pixel of the first connected
component has the value of 1, every pixel of the
second connected component has the value of 2, etc.
So now you can process each connected component (circle
in your case) separately.
Basically this is all you need. You can either count
the number of pixels having the value of k to find the
area (and then the diameter) or just take (maximal x
value) - (minimal x value) + 1.
In your case it can happen that after you convert your
image into BW image some circles will have holes
inside with some small objects inside these holes, and
you do not want to consider these small objects as
additional circles. So I thought of using
morphological closing to get rid of small holes, but
as I wrote in the following note you do not need this.
When you get the BW image, take the complementary one
(i.e. background pixels have the value of 1 and image
pixels the value of 0). Label the connected components
of the background. Only one of them is real background
- all others are inside circles. Real background
touches the image boundaries. Now go to the original
BW image and give all the pixels outside the "real"
background the value of 1. Now all your circles are
full (no holes) and you can proceed as above.
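
Once such a labeling exists, the area-to-diameter step is a one-liner;
'lab' below stands for an assumed integer matrix of labels (0 = background)
produced by whatever labeling routine is used:

areas     <- table(lab[lab > 0])                 # pixels per connected component
diameters <- 2 * sqrt(as.numeric(areas) / pi)    # from S = pi * (D/2)^2
diameters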

Best regards,

Moshe.

--- Bartjoosen <[EMAIL PROTECTED]> wrote:

> 
> Hi All,
> 
> I really like to thank you for the answers, while I
> was searching for some
> edge detection and clustering algorithms, Moshe came
> with a simple but
> effective solution: use the area to find the
> diameter!
> 
> But I tried Moshe's solution, but I couldn't figure
> out what you mean with
> morphological closing and the labeling to split the
> images.
> Could you please clarify this a bit?
> 
> Thanks for your support
> 
> 
> Bart
> 
> 
> Moshe Olshansky-2 wrote:
> > 
> > Hi Bart,
> > 
> > One more comment:
> > 
> > You do not really need the morphological closing
> to
> > close the "holes" inside the circles. Another
> > possibility is to reverse the black-and-withe
> picture,
> > i.e. make the holes and background be 1 and the
> > circles 0, label the connected components and then
> > only the component which touches the boundaries is
> the
> > background while all other components are "holes"
> and
> > you can make them white (1) in the original
> > black-and-white image.
> > 
> > --- Moshe Olshansky <[EMAIL PROTECTED]> wrote:
> > 
> >> Hi Bart,
> >> 
> >> I have never used image processing software in R
> (I
> >> was doing this with Matlab), but here is what I
> >> would
> >> have done algorithmically:
> >> 1) convert the picture to gray-scale
> >> 2) find a threshold value which separates the
> >> circles
> >> from the background and convert your image to
> black
> >> and white
> >> 3) if the circles are far apart use morphological
> >> closing to fill in small holes inside the circles
> >> (may
> >> be do this several times)
> >> 4) use labeling to split the image into connected
> >> components
> >> 5) for each connected component get it's area
> (the
> >> number of pixels) and use the formula S = Pi*R^2
> to
> >> find the approximate radii.
> >> 
> >> Regards,
> >> 
> >> Moshe.
> >> 
> >> --- Julian Burgos <[EMAIL PROTECTED]>
> wrote:
> >> 
> >> > Hi Bart,
> >> > 
> >> > If you only have 36 circles, the fastest way
> would
> >> > be to use some image 
> >> > processing software and measure the circles "by
> >> > hand".  One option is to 
> >> > use ImageJ, which you can download here
> >> > 
> >> > http://rsb.info.nih.gov/ij/
> >> > 
> >> > Julian
> >> > 
> >> > Bart Joosen wrote:
> >> > > Hi,
> >> > >
> >> > > Maybe this is more a programming questions
> than
> >> a
> >> > specific R-project question, but maybe there is
> >> > someone who can point me in the right
> direction.
> >> > >
> >> > > I have a picture of cirkels which I took with
> a
> >> > digital camera.
> >> > > Now I want to use the diameter of the cirkels
> on
> >> > the picture for analysis in R.
> >> > > I can use pixmap to import the picture, but
> how
> >> do
> >> > I find the outside cirkels and calculate the
> >> > diameter?
> >> > > I pointed out that I can use the edci
> package,
> >> but
> >> > then I need to preprocess the dat

Re: [R] oddity with method definition

2007-08-27 Thread Thomas Lumley
On Mon, 27 Aug 2007, Faheem Mitha wrote:

>
> Just wondered about this curious behaviour. I'm trying to learn about
> classes. Basically setMethod works the first time, but does not seem to
> work the second time.
> Faheem.
> *
> setClass("foo", representation(x="numeric"))
>
> bar <- function(object)
>   {
> return(0)
>   }
>
> bar.foo <- function(object)
>   {
> print(object@x)
>   }
> setMethod("bar", "foo", bar.foo)
>
> bar(f)
>
> # bar(f) gives 1.

Not for me. It gives
> bar(f)
Error: object "f" not found
Error in bar(f) : error in evaluating the argument 'object' in selecting a
method for function 'bar'

However, if I do
f = new("foo", x= 1)
first, it gives 1.

> bar <- function(object)
>   {
> return(0)
>   }

Here you have masked the generic bar() with a new function bar(). Redefining 
bar() is the problem, not the second setMethod().

> bar.foo <- function(object)
>   {
> print(object@x)
>   }
> setMethod("bar", "foo", bar.foo)

Because there was a generic bar(), even though it is overwritten by the new 
bar(), setMethod() doesn't automatically create another generic.

> f = new("foo", x= 1)
>
> bar(f)
>
> # bar(f) gives 0, not 1.
>

Because bar() isn't a generic function
> bar
function(object)
   {
 return(0)
   }


If you had used setGeneric() before setMethod(), as recommended, your example 
would have done what you expected, but it would still have wiped out any 
previous methods for bar() -- eg, try
  setMethod("bar","baz", function(object) print("baz"))
before you redefine bar(), and notice that getMethod("bar","baz") no longer 
finds it.
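
Spelled out as a short sketch continuing the example above, the recommended
pattern is:

setClass("foo", representation(x = "numeric"))

setGeneric("bar", function(object) standardGeneric("bar"))   # explicit generic
setMethod("bar", "foo", function(object) print(object@x))

f <- new("foo", x = 1)
bar(f)    # prints 1; the method is attached to an explicitly created generic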



-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] oddity with method definition

2007-08-27 Thread Duncan Murdoch
On 27/08/2007 5:47 PM, Faheem Mitha wrote:
> Just wondered about this curious behaviour. I'm trying to learn about 
> classes. Basically setMethod works the first time, but does not seem to 
> work the second time.
>  Faheem.
> *
> setClass("foo", representation(x="numeric"))
> 
> bar <- function(object)
>{
>  return(0)
>}
> 
> bar.foo <- function(object)
>{
>  print(object@x)
>}
> setMethod("bar", "foo", bar.foo)

This changes the definition of bar:  now it becomes a generic function 
instead of a simple function.

> 
> bar(f)
> 
> # bar(f) gives 1.

(You forgot the f = new("foo", x= 1) line, but that's somewhat obvious.)
> 
> bar <- function(object)
>{
>  return(0)
>}

Now bar is a regular function again.
> 
> bar.foo <- function(object)
>{
>  print(object@x)
>}
> setMethod("bar", "foo", bar.foo)

Now the generic would call that method, but you've wiped out the generic.

> 
> f = new("foo", x= 1)
> 
> bar(f)
> 
> # bar(f) gives 0, not 1.

The problem is that setting a method on a regular function automagically 
creates a generic for it, but redefining a function doesn't remove the 
generic.  It's still there, somewhere in R's insides, and if you could 
find it to call it your method would get called.  But you're calling the 
plain old bar() instead.

This behaviour makes more sense if you think about generics in other 
packages.  There's a generic called "show" in the methods package.  But 
you can define your own function called "show", and in your workspace, 
you'd want to call that, not the one from methods.

I'd recommend using setGeneric() to create a generic, rather than 
depending on the automatic creation, to avoid this kind of confusion.

Duncan Murdoch



[R] oddity with method definition

2007-08-27 Thread Faheem Mitha

Just wondered about this curious behaviour. I'm trying to learn about 
classes. Basically setMethod works the first time, but does not seem to 
work the second time.
 Faheem.
*
setClass("foo", representation(x="numeric"))

bar <- function(object)
   {
 return(0)
   }

bar.foo <- function(object)
   {
 print(object@x)
   }
setMethod("bar", "foo", bar.foo)

bar(f)

# bar(f) gives 1.

bar <- function(object)
   {
 return(0)
   }

bar.foo <- function(object)
   {
 print(object@x)
   }
setMethod("bar", "foo", bar.foo)

f = new("foo", x= 1)

bar(f)

# bar(f) gives 0, not 1.



Re: [R] grouping scat1d/rug and plotting to 2 axes

2007-08-27 Thread Frank E Harrell Jr
Mike wrote:
> Hi,
> 
> I'm wondering if anybody can offer a bit of  guidance on how to add a 
> couple of features to a plot. 
> 
> I'm using Frank Harrell's Design library to model some survival data in 
> R (2.3.1, windows platform).  I'm fairly comfortable with the survival 
> modeling in Design, but am still at a frustratingly low level of 
> competence when it comes to creating anything beyond simple plots in R.
> 
> A simplified version of the model is:
> 
> fit <- cph(Surv(survtime,deceased) ~ rcs(smw,4), 
> data=survdata,x=T,y=T,surv=T )
> 
> And the basic plot is:
> 
> plot(fit,smw=NA, fun=function(x) 1/(1+exp(-x)))

or plot(fit, smw=NA, fun=plogis).  But what does the logistic model have 
to do with the Cox model you fitted?  You can instead do plot(fit, 
smw=NA, time=1) to plot estimated 1-year survival prob.

> 
> I know that if I add
> 
> scat1d(smw)
> 
> I get a nice jittered rug plot of all values of the predictor smw on the 
> top axis.
> 
> What I'd like to do, however, is to plot on bottom axis the values of 
> smw for only those participants who are alive, and then on the top axis, 
> plot the values of smw for those who are deceased.  I'd appreciate any 
> tips as to how I might approach this.

That isn't so well defined because of variable follow-up time.  I would 
not get very much out of such a plot.
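
Mechanically, and setting aside the caveat just given, scat1d() accepts a
'side' argument, so the two groups could be drawn on opposite axes roughly
as follows (data frame and variable names are taken from the question):

scat1d(survdata$smw[survdata$deceased == 0], side = 1)   # bottom axis: alive
scat1d(survdata$smw[survdata$deceased == 1], side = 3)   # top axis: deceased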

Frank

> 
> Thanks,
> 
> Mike Babyak
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



[R] Rmpi and x86

2007-08-27 Thread Edna Bell
Dear R Gurus:

Is there a problem with Rmpi on x86 with SUSE 10.1, please?

I've tried everything and it still won't load.

Has anyone else dealt with this please?

Thanks,
Edna Bell



Re: [R] use apply function with which

2007-08-27 Thread Charles C. Berry
On Mon, 27 Aug 2007, [EMAIL PROTECTED] wrote:

> Dear R-users,
>
> For a data frame (say in this example X) I want to look up the
> corresponding value in a 'look-up data frame' (in this example Y). The
> for-loop works but is very time-consuming because 'X' in reality is very
> big.
> Therefore I would like to have a solution with apply. However, I do not
> succeed. Any suggestions?
>
> Thanks in advance,
>
> Hanneke
>
> c1=c('a','a','b')
> c2=c('j','k','k')
>
> V1=c('a','a','a','a','b','b','b','b'))

You have a syntax error in the previous line - '))'


> V2=c('i','j','k','l','i','j','k','l')
> V3=c(4,3,2,1,8,5,2,-1)
>
>
> X=NULL
> X$c1=c1
> X$c2=c2
> X=as.data.frame(X)
> Y=NULL
> Y$V1=V1
> Y$V2=V2
> Y$V3=V3
> Y=as.data.frame(Y)
>
> result=NULL
> for (i in 1:dim(X)[1])
> {
> result=rbind(result, Y$V3[which(Y$V1==as.character(X[i,]$c1) &
> Y$V2==as.character(X[i,]$c2))])
> }
>
> ###
> which.search=function(X,Y,c1,c2,V1,V2,V3)
> Y$V3[which(Y$V1==as.character(X$c1) & Y$V2==as.character(X$c2))]
>
> apply(X,1,which.search,X=X,Y=Y,c1='c1',c2='c2',V1='V1',V2='V2',V3='V3')

^^^^...

You use X twice in this expression. If you delete 'X=X,' and revise 
which.search to

  which.search <- function( X, Y, c1, c2, V1, V2, V3 )
 Y$V3[ which( Y$V1==as.character( X[c1] ) &
  Y$V2 == as.character( X[ c2 ] ) ) ]

to get rid of the $ operator which is deprecated for atomic vectors,

(and fix the above syntax error) then this expression agrees with 'result'

If you know that the matches are unique (only one row in Y will match any 
row of X), then

match( paste( X$c1, X$c2 ) , paste( Y$V1, Y$V2 ))

will be fast.

If nrow(Y) is small,

which(
outer(Y$V1, as.character(X$c1), "==" ) &
outer(Y$V2, as.character(X$c2), "==" ),
  arr.ind = TRUE )

will also be quick.


Otherwise something like

unlist( lapply( paste( X$c1, X$c2 ), match, paste( Y$V1, Y$V2 )) )

may be a good bet.
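
One more option, not mentioned above: a relational join via merge(). Note
that merge() does not promise to preserve the row order of X, so the result
may need reordering.

merged <- merge(X, Y, by.x = c("c1", "c2"), by.y = c("V1", "V2"),
                all.x = TRUE, sort = FALSE)
merged$V3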


Please learn to use the space key to format your code in a more readable 
fashion!


HTH,

Chuck

>
> ###
>> sessionInfo()
> R version 2.5.1 (2007-06-27)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=Dutch_Netherlands.1252;LC_CTYPE=Dutch_Netherlands.1252;LC_MONETARY=Dutch_Netherlands.1252;LC_NUMERIC=C;LC_TIME=Dutch_Netherlands.1252
>
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
> "base"
>

Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] grouping scat1d/rug and plotting to 2 axes

2007-08-27 Thread Mike
Hi,

I'm wondering if anybody can offer a bit of  guidance on how to add a 
couple of features to a plot. 

I'm using Frank Harrell's Design library to model some survival data in 
R (2.3.1, windows platform).  I'm fairly comfortable with the survival 
modeling in Design, but am still at a frustratingly low level of 
competence when it comes to creating anything beyond simple plots in R.

A simplified version of the model is:

fit <- cph(Surv(survtime,deceased) ~ rcs(smw,4), 
data=survdata,x=T,y=T,surv=T )

And the basic plot is:

plot(fit,smw=NA, fun=function(x) 1/(1+exp(-x)))

I know that if I add

scat1d(smw)

I get a nice jittered rug plot of all values of the predictor smw on the 
top axis.

What I'd like to do, however, is to plot on bottom axis the values of 
smw for only those participants who are alive, and then on the top axis, 
plot the values of smw for those who are deceased.  I'd appreciate any 
tips as to how I might approach this.

Thanks,

Mike Babyak



[R] use apply function with which

2007-08-27 Thread schuurmans
Dear R-users,

For a data frame (say in this example X) I want to look up the
corresponding value in a 'look-up data frame' (in this example Y). The
for-loop works but is very time-consuming because 'X' in reality is very
big.
Therefore I would like to have a solution with apply. However, I do not
succeed. Any suggestions?

Thanks in advance,

Hanneke

c1=c('a','a','b')
c2=c('j','k','k')

V1=c('a','a','a','a','b','b','b','b'))
V2=c('i','j','k','l','i','j','k','l')
V3=c(4,3,2,1,8,5,2,-1)


X=NULL
X$c1=c1
X$c2=c2
X=as.data.frame(X)
Y=NULL
Y$V1=V1
Y$V2=V2
Y$V3=V3
Y=as.data.frame(Y)

result=NULL
for (i in 1:dim(X)[1])
{
result=rbind(result, Y$V3[which(Y$V1==as.character(X[i,]$c1) &
Y$V2==as.character(X[i,]$c2))])
}

###
which.search=function(X,Y,c1,c2,V1,V2,V3)
Y$V3[which(Y$V1==as.character(X$c1) & Y$V2==as.character(X$c2))]

apply(X,1,which.search,X=X,Y=Y,c1='c1',c2='c2',V1='V1',V2='V2',V3='V3')

###
> sessionInfo()
R version 2.5.1 (2007-06-27)
i386-pc-mingw32

locale:
LC_COLLATE=Dutch_Netherlands.1252;LC_CTYPE=Dutch_Netherlands.1252;LC_MONETARY=Dutch_Netherlands.1252;LC_NUMERIC=C;LC_TIME=Dutch_Netherlands.1252

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods" 
 "base"



Re: [R] how to include bar values in a barplot?

2007-08-27 Thread Frank E Harrell Jr
Donatas G. wrote:
> On Tuesday 07 August 2007 22:09:52 Donatas G. wrote:
>> How do I include bar values in a barplot (or other R graphics, where this
>> could be applicable)?
>>
>> To make sure I am clear I am attaching a barplot created with
>> OpenOffice.org which has barplot values written on top of each barplot.
> 
> Here is the barplot mentioned above:
> http://dg.lapas.info/wp-content/barplot-with-values.jpg
> 
> it appeaars that this list does not allow attachments...
> 
That is a TERRIBLE graphic.  Can't we finally leave this subject alone?

Frank Harrell



Re: [R] Max vs summary inconsistency

2007-08-27 Thread Adam D. I. Kramer


On Mon, 27 Aug 2007, François Pinard wrote:


>> summary(m)
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>       1   13000   26280   25890   38550   50910
>> max(m)
>> [1] 50912
>>
>> ...it seems to me like max() and summary(m)[6] ought to return the same
>> number.  Am I doing something wrong?
>
> Some may say that you did not scrutinize the documentation enough, as
> "summary" artificially limits the number of significant digits.

Indeed, several have said so in private email as well as email to the list.
Thanks to all, apologies for my lack of scrutiny.

> However, this question reoccurs often and regularly in these mailing
> lists, so at last, maybe something should be done about it, beyond
> documenting how it works.  Overall, too many users got misled, that one
> may not so bluntly assert they are all wrong.

I would agree, and not only because I was misled: several people ARE
scrutinizing summary()'s output, and noticing it is incorrect.

However, it is very VERY likely that many more are NOT scrutinizing it, and
as such are forming false beliefs about their data sets, which may be
subsequently published or used in further analyses.

Taking a small step in the implementation of summary() to potentially
prevent the publication of incorrect data seems worthwhile. Certainly, any
researcher should check their output in many ways, but it makes no sense to
me that summary() would round its output to 4 significant digits by default.

> For example, resorting to scientific notation whenever non significant
> zero digits would have otherwise been printed.  This should clarify a bit
> that the printing precision got artificially limited.

I think this is a great solution, though I'm not sure whether scripts that
use summary() would break if passed a number in scientific notation.

That said, scripts that use summary() are probably assuming that the number
reported is maximally precise, and thus are making the same mistake I
did...and thus should indeed break!

--
Adam Kramer
Ph.D. Student, Social Psychology
University of Oregon
[EMAIL PROTECTED]


Re: [R] Max vs summary inconsistency

2007-08-27 Thread François Pinard
[Adam D. I. Kramer]

>I'm having the following questionable behavior:

>> summary(m)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       1   13000   26280   25890   38550   50910
>> max(m)
>[1] 50912

>...it seems to me like max() and summary(m)[6] ought to return the same
>number.  Am I doing something wrong?

Some may say that you did not scrutinize the documentation enough, as 
"summary" artificially limits the number of significant digits.

However, this question reoccurs often and regularly in these mailing 
lists, so at last, maybe something should be done about it, beyond 
documenting how it works.  Overall, too many users got misled, that one 
may not so bluntly assert they are all wrong.

For example, resorting to scientific notation whenever non significant 
zero digits would have otherwise been printed.  This should clarify 
a bit that the printing precision got artificially limited.

-- 
François Pinard   http://pinard.progiciels-bpi.ca



Re: [R] subset using noncontiguous variables by name (not index)

2007-08-27 Thread Muenchen, Robert A (Bob)
Thanks for helping me see why R doesn't have the "obvious"! -Bob
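
For reference, the existing facility the discussion refers to - the select
argument of subset() - already accepts ranges of column names; the anscombe
data set is the one mentioned in the quoted exchange below.

data(anscombe)                           # columns x1 x2 x3 x4 y1 y2 y3 y4
subset(anscombe, select = x2:y1)         # columns x2 through y1, in frame order
subset(anscombe, select = c(x1, x3:y2))  # a noncontiguous selection by name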

> -Original Message-
> From: Thomas Lumley [mailto:[EMAIL PROTECTED]
> Sent: Monday, August 27, 2007 2:12 PM
> To: Muenchen, Robert A (Bob)
> Subject: RE: [R] subset using noncontiguous variables by name (not
> index)
> 
> On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:
> 
> > Thomas, that's a good point. I was thinking of anscombe[x1::y1]
> making
> > it clear which one, but you would then want just x1::y1 to have
> > unambiguous meaning on its own, which is impossible.
> >
> > As for x1:xN, it's unambiguous on its own.
> 
> 
> It actually isn't. We already have a meaning. Consider
>x1<-4
>xN<-6
>x1:xN
> It also breaks R's argument passing rules by treating x1 as string
> rather than a name.
> 
> What would be unambiguous at the moment is "x1":"x4", provided there
> was a sufficiently precise set of rules on what was allowed. Consider
>   "x1":"x-1"(negative?)
>   "x1":"x3.14"  (non-integer?)
>   "x3.12":"x3.14" (is the prefix x or x3.?)
>   "x1":"X4" (the prefix changes)
>   "01":"14" (is the prefix empty or 0?)
>   "x09":"xA2" (is this illegal decimal or legal hexadecimal?)
>   "IL23R1":"IL23R4" (what is the prefix?)
>   "x1a":"x4a"(infix numbering?)
> 
> 
> 
>   -thomas
> 
> Thomas Lumley Assoc. Professor, Biostatistics
> [EMAIL PROTECTED] University of Washington, Seattle
>



Re: [R] Max vs summary inconsistency

2007-08-27 Thread Thomas Lumley
On Mon, 27 Aug 2007, Adam D. I. Kramer wrote:

> Hello,
>
> I'm having the following questionable behavior:
>
>> summary(m)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       1   13000   26280   25890   38550   50910
>> max(m)
> [1] 50912
>
>> typeof(m)
> [1] "integer"
>> class(m)
> [1] "integer"
>
> ...it seems to me like max() and summary(m)[6] ought to return the same
> number. Am I doing something wrong?
>

They do return the same number, they just print it differently. summary() 
prints four significant digits by default.
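
A quick illustration with made-up values; asking summary() for more digits
shows the agreement:

m <- c(1L, 13000L, 26280L, 38550L, 50912L)   # illustrative, not the poster's data
summary(m)               # Max. displays as 50910 with the default 4 digits
summary(m, digits = 7)   # Max. displays as 50912
max(m)                   # 50912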

  -thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



[R] subset question

2007-08-27 Thread Kirsten Beyer
I would like to code records in a dataset with a 1 if any of the
columns 9-67 contain a particular code, and zero if they don't.  I've
been working with "subset" and it seems that something like
subset(data, data[9:67]--"12345") would work, but I have been
unsuccessful so far.  It seems like a simple problem - any help is
appreciated!



[R] Max vs summary inconsistency

2007-08-27 Thread Adam D. I. Kramer
Hello,

I'm having the following questionable behavior:

> summary(m)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1   13000   26280   25890   38550   50910
> max(m)
[1] 50912

> typeof(m)
[1] "integer"
> class(m)
[1] "integer"

...it seems to me like max() and summary(m)[6] ought to return the same
number. Am I doing something wrong?

I'm running R 2.5.1 (2007-06-27), installed on MacOSX from the dmg file
found on CRAN.

--
Adam D. I. Kramer
Ph.D. Student, University of Oregon
[EMAIL PROTECTED]



Re: [R] Formatting Sweave in R-News

2007-08-27 Thread Arjun Narayan
Dear Paul,

I stand corrected. Your solution was the right way. The following code now
works:
(Apparently I still need to specify the width command as my pdf is
incorrectly sized by default)

\begin{figure*}[b]
\begin{center}
\includegraphics[width=8in]{generatedPDF.pdf}
\end{center}
\end{figure*}

There is a full explanation in the template.tex file which can be found in
the RNews tutorial here: http://cran.r-project.org/doc/Rnews/template.tex

Thank you for your time.

Best regards,
Arjun

>
>
> Try putting your image in a figure* environment (should go full width of
> the page).
>
> Paul
>
> Dr Paul Murrell
> Department of Statistics
> The University of Auckland
> Private Bag 92019
> Auckland
> New Zealand
> 64 9 3737599 x85392
> [EMAIL PROTECTED]
> http://www.stat.auckland.ac.nz/~paul/
>



Re: [R] Formatting Sweave in R-News

2007-08-27 Thread Arjun Narayan
>
> Thank you Paul for your response. Unfortunately that did not work. A
> figure environment frames it neatly, but still contained in only one column.
> I have tried various methods, but they all seem to not work, or if the
> solutions involve manually setting the size, the grey column separator still
> runs through the middle of the page.
>
> I know a solution exists, because on page 21, Vol 1/1 of R-News, there is
> an image that spans both columns. Do you know where I could get the Rnw
> source files for R-news articles? That would at least allow me to trawl for
> a solution.
>
> Best regards,
> Arjun
>
> On 8/22/07, Paul Murrell <[EMAIL PROTECTED]> wrote:
> >
> > Hi
> >
> >
> > Arjun Ravi Narayan wrote:
> > > Hi,
> > >
> > > I am editing a document for submission to the R-news newsletter, and
> > > in my article my Sweave code inserts a dynamically generated PDF
> > > report that my R program generates.
> > >
> > > However, when I insert the PDF using the following Sweave code:
> > >
> > > \newpage
> > > \includegraphics[scale=1.0]{\Sexpr{print(location)}}
> > > \newpage
> > >
> > > (in tex this looks like):
> > > \newpage
> > > \includegraphics[scale=1.0]{/home/arjun/sample.pdf}
> > > \newpage
> >
> >
> > Try putting your image in a figure* environment (should go full width of
> > the page).
> >
> > Paul
> >
> >
> > >
> > > However, the r-news style package over-rides everything that I can set
> > > (including using the minipage option) to make my included PDF small
> > > sized. Part of the problem is that the R-news style specifies a
> > > two-column formatting, and so the PDF is shrunk to fit in one column.
> > > How can I, for just one page, over-ride the styles to include the PDF?
> > > Even if I hard-hack the graphics to be scaled up in size, that does
> > > not get rid of the vertical line that in between the two columns, and
> > > thus breaking my image.
> > >
> > > I realise that this is not an R problem, but more a latex problem, but
> > > I am hoping that somebody has faced similar problems with the Rnews
> > > styles and has an idea on how to do this.
> > >
> > >
> > > Thank you,
> > >
> > > Yours sincerely,
> >
> >
> > --
> > Dr Paul Murrell
> > Department of Statistics
> > The University of Auckland
> > Private Bag 92019
> > Auckland
> > New Zealand
> > 64 9 3737599 x85392
> > [EMAIL PROTECTED]
> > http://www.stat.auckland.ac.nz/~paul/
> >
>
>



[R] FW: subset using noncontiguous variables by name (not index)

2007-08-27 Thread Muenchen, Robert A (Bob)
Thomas, that's a good point. I was thinking of anscombe[x1::y1] making
it clear which one, but you would then want just x1::y1 to have
unambiguous meaning on its own, which is impossible.

As for x1:xN, it's unambiguous on its own. I thought one of the great
advantages of R was that it could use different methods so that a new
operator would not be needed. The colon operator would just have a new
method for when stringN appeared, one that would be very useful and have an
obvious meaning.

Thanks,
Bob

> -Original Message-
> From: Thomas Lumley [mailto:[EMAIL PROTECTED]
> Sent: Monday, August 27, 2007 10:25 AM
> To: Muenchen, Robert A (Bob)
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] subset using noncontiguous variables by name (not
> index)
> 
> On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:
> 
> > Gabor, That works great!
> >
> > I think this would be a very helpful addition to the main R
> > distribution. Perhaps with a single colon representing numerical
> order
> > (exactly as you have written it) and two colons representing the
> order
> > of the variables as they appear in the data frame (your first
> example).
> > That's analogous to SAS' x1-xN, which you know gets those N
> variables,
> > and a--z, which selects an unknown number of variables a through z.
> How
> > many that is depends upon their order in the data frame. That would
> not
> > only be very useful in general, but it would also make transitioning
> to
> > R from SAS or SPSS less confusing.
> >
> > Is R still being extended in such basic ways, or does that muck up
> > existing programs too much?
> >
> 
> In principle base R can be extended like that, but a strong case is
> needed
> for non-standard evaluation rules and for depleting the restricted
> supply
> of short binary operator names.
> 
> The reason for subset() and its behaviour is that 'variables as they
> appear the in data frame' is typically ambiguous -- which data frame?
> In
> SPSS you have only one and in SAS there is a default one, so there is
> no
> ambiguity in X1--Y2, but in R it needs another argument specifying the
> data frame, so it can't really be a binary operator.
> 
> The double colon :: and triple colon ::: are already used for
> namespaces,
> and a search of r-help reveals two previous, different, suggestions
for
> %:%.
> 
> 
>   -thomas
> 
> Thomas Lumley Assoc. Professor, Biostatistics
> [EMAIL PROTECTED] University of Washington, Seattle



Re: [R] Sequential Rank Test

2007-08-27 Thread Birgit Lemcke
I looked for the same topic today and found ?wilcox.test in the stats  
package.

B

Am 27.08.2007 um 17:33 schrieb Bernardo Rangel Tura:

> Hi R-Masters
>
>
> I need to use a sequential approach in a series of cases, but my data
> are not normal.
>
> If the data were normally distributed it would be very easy to set up the
> analysis using a likelihood ratio test such as the Wald test.
>
> But in my case I need to use a non-parametric test (Mann-Whitney).
>
> I tried RSiteSearch("sequential rank test"), but it did not solve my
> problem.
>
> Do you know of a routine or package that implements a sequential rank test in R?
>
> Thanks in advance
>
>
> -- 
> Bernardo Rangel Tura, M.D,Ph.D
> National Institute of Cardiology
> Brazil
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]






[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sequential Rank Test

2007-08-27 Thread Henrique Dallazuanna
Hi Bernardo,

I think that ?wilcox.test will help you.


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

On 27/08/07, Bernardo Rangel Tura <[EMAIL PROTECTED]> wrote:
>
> Hi R-Masters
>
>
> I need to use a sequential approach in a series of cases, but my data are not
> normal.
>
> If the data were normally distributed it would be very easy to set up the
> analysis using a likelihood ratio test such as the Wald test.
>
> But in my case I need to use a non-parametric test (Mann-Whitney).
>
> I tried RSiteSearch("sequential rank test"), but it did not solve my
> problem.
>
> Do you know of a routine or package that implements a sequential rank test in R?
>
> Thanks in advance
>
>
> --
> Bernardo Rangel Tura, M.D,Ph.D
> National Institute of Cardiology
> Brazil
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sequential Rank Test

2007-08-27 Thread Bernardo Rangel Tura
Hi R-Masters


I need to use a sequential approach in a series of cases, but my data are not
normal.

If the data were normally distributed it would be very easy to set up the
analysis using a likelihood ratio test such as the Wald test.

But in my case I need to use a non-parametric test (Mann-Whitney).

I tried RSiteSearch("sequential rank test"), but it did not solve my
problem.

Do you know of a routine or package that implements a sequential rank test in R?

Thanks in advance


-- 
Bernardo Rangel Tura, M.D,Ph.D
National Institute of Cardiology
Brazil

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Robust Standar Errors in Zero-Truncated Poisson

2007-08-27 Thread Pedro Mota Veiga

Hi.

I would like to know whether it is possible to estimate zero-truncated count
models with robust standard errors in R. In Stata this is possible. I have
already made some searches and attempts but have not managed it. In R I fitted
the truncated Poisson with the vglm command of the VGAM package.
-- 
View this message in context: 
http://www.nabble.com/Robust-Standar-Errors-in-Zero-Truncated-Poisson-tf4336437.html#a12351638
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] p-Value

2007-08-27 Thread Daniel Lakeland
On Mon, Aug 27, 2007 at 11:49:19AM +0500, amna khan wrote:
> Hi Sir
> 
> When we use the Kendall package to obtain Kendall's tau statistic,
> we also get a two-sided p-value. What does "two-sided p-value" mean?
> The term "two-sided" is confusing.

Two-sided is sometimes also called two-tailed. It refers to the
probability of being farther away from 0 than the observed value, *in
either direction*.
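
A small numeric sketch (the value of z is made up; this assumes a statistic
that is approximately standard normal):

   z <- 1.96                 # an observed standardized statistic
   2 * pnorm(-abs(z))        # two-sided p-value: both directions count
   pnorm(-abs(z))            # one-sided p-value: only one direction counts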


-- 
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-2.5.1 RedHat EL5 compilation failed

2007-08-27 Thread Stefan Grosse
 Original Message  
Subject: [R] R-2.5.1 RedHat EL5 compilation failed
From: Wang Chengbin <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 26.08.2007 15:22
> I can't get R-2.5.1 compiled under RedHat EL5 with gcc 4.1.1. Configure
> failed at the following:
>   
You don't need to compile; you could also use the Fedora Core 6 Extras
repository package(s) of R (currently R-2.5.1-2.fc6.i386.rpm) to
install the necessary rpm packages from there. (It is easiest to use the smart
package manager; there you can simply activate "channels", which are
repositories.) As far as I understand, FC6 is the base of RHEL 5.

Stefan
-=-=-
... Time is an illusion, lunchtime doubly so. (Ford Prefect)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-27 Thread Thomas Lumley
On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:

> Gabor, That works great!
>
> I think this would be a very helpful addition to the main R
> distribution. Perhaps with a single colon representing numerical order
> (exactly as you have written it) and two colons representing the order
> of the variables as they appear in the data frame (your first example).
> That's analogous to SAS' x1-xN, which you know gets those N variables,
> and a--z, which selects an unknown number of variables a through z. How
> many that is depends upon their order in the data frame. That would not
> only be very useful in general, but it would also make transitioning to
> R from SAS or SPSS less confusing.
>
> Is R still being extended in such basic ways, or does that muck up
> existing programs too much?
>

In principle base R can be extended like that, but a strong case is needed 
for non-standard evaluation rules and for depleting the restricted supply 
of short binary operator names.

The reason for subset() and its behaviour is that 'variables as they
appear in the data frame' is typically ambiguous -- which data frame?  In
SPSS you have only one and in SAS there is a default one, so there is no 
ambiguity in X1--Y2, but in R it needs another argument specifying the 
data frame, so it can't really be a binary operator.
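
A minimal illustration with the built-in anscombe data (select= is evaluated
with the data frame's column names in scope, which is why the names can be
resolved there):

   subset(anscombe, select = c(x1, x3:x4, y2))  # x3:x4 resolves to column positions
   subset(anscombe, select = -c(y1, y3))        # negative selection also works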

The double colon :: and triple colon ::: are already used for namespaces, 
and a search of r-help reveals two previous, different, suggestions for 
%:%.


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] validate (package Design): error message "subscript out of bounds"

2007-08-27 Thread Wentzel-Larsen, Tore
Dear R users 

I use Windows XP, R2.5.1 (I have read the posting guide, I have 
contacted the package maintainer first, it is not homework).

In a research project on renal cell carcinoma we want to compute 
Harrell's c index, with optimism correction, for a multivariate 
Cox regression and also for some univariate Cox models.
For some of these univariate models I have encountered an error
message (and no result produced) from the function validate in
Frank Harrell's Design package:

Error in Xb(x[, xcol, drop = FALSE], coef, non.slopes, non.slopes.in.x,  : 
subscript out of bounds

The following is an artificial example wherein I have been able to 
reproduce this error message (actual data has been changed to preserve
confidentiality):

library(Design)

# an example data frame:
frame.bc <- data.frame(time1 = c(9,24,28,43,58,62,66,107,116,118,123,
127,129,131,137,138,139,140,148,169,176,179,188,196,210,218,
1,1,1,2,2,3,4,8,23,32,33,34,43,44,48,51,52,54,59,59,60,60,62,
65,65,68,70,72,73,74,81,84,88,98,99,106,107,115,115,117,119,
120,122,122,122,122,126,128,130,135,136,136,138,149,151,154,
157,159,161,164,164,164,166,172,172,176,179,180,183,183,184,
187,190,197,201,201,203,203,203,209,210,214,219,227,233,4,18,
49,113,147,1,1,2,2,2,2,2,3,4,6,6,6,6,6,6,6,6,9,9,9,9,9,10,10,
10,11,12,12,12,13,14,14,17,18,18,19,19,20,20,21,21,21,21,22,23,
23,24,28,28,29,29,32,34,35,38,38,48,48,52,52,54,54,56,64,67,67,
69,70,70,72,84,88,90,114,115,140,142,154,171,195),
status1 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1),
bc1 = factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
labels=c('bc.1','bc.2')),
age = c(58,68,23,20,50,43,41,69,20,48,19,27,39,20,65,49,70,59,31,43,25,
61,60,45,34,59,32,58,30,62,26,44,52,29,40,57,33,18,50,50,55,51,38,34,
69,56,67,38,66,21,48,39,62,62,29,68,66,19,60,39,55,42,24,29,56,61,40,
52,19,40,33,67,66,51,48,63,60,58,68,60,53,20,45,62,37,38,61,63,43,67,
49,39,43,67,49,69,32,37,32,63,33,47,66,39,23,57,26,61,20,49,69,30,40,
29,38,66,60,69,69,44,65,25,41,53,18,55,45,59,49,27,51,29,67,26,24,26,
47,23,50,27,35,45,32,26,45,45,63,39,39,22,38,27,31,27,49,65,66,49,39,
21,51,49,55,63,19,26,50,21,24,34,65,33,55,33,36,53,48,25,54,58,60,34,
47,23,34,60,39,34,22,30,41,55,64,48,34,54))
frame.bc

# preparing for a simple univariate Cox regression:
dd.bc <- datadist(frame.bc[, c('bc1','age')], adjto.cat='first')
options(datadist = 'dd.bc')

# a univariate Cox regression:
cph.bc <- cph(formula = Surv(time1,status1)~bc1,
data = frame.bc, x=TRUE, y=TRUE, surv=TRUE)
anova(cph.bc)
cph.bc
summary(cph.bc)

# the validate command for the Cox model:
val.cph.bc <- validate(cph.bc, B=200, dxy=TRUE , pr=TRUE)

--
Output from the validate command:

   training   test
Dxy   -0.124360 -0.1423409
R2 1.00  1.000
Slope  1.00  0.7919584
D  0.016791  0.0147536
U -0.002395  0.0006448
Q  0.019186  0.0141088
   training   test
Dxy   -0.191875 -0.1423409
R2 1.00  1.000
Slope  1.00  0.8936724
D  0.022397  0.0147536
U -0.002339  0.0001367
Q  0.024736  0.0146169
   training   test
Dxy   -0.199514 -0.1423409
R2 1.00  1.000
Slope  1.00  0.8075246
D  0.025717  0.0147536
U -0.002447  0.0005348
Q  0.028163  0.0142188
Error in Xb(x[, xcol, drop = FALSE], coef, non.slopes, non.slopes.in.x,  : 
subscript out of bounds


Any help/suggestions will be highly appreciated.


Sincerely,
Tore Wentzel-Larsen
statistician
Centre for Clinical research
Armauer Hansen house 
Haukeland University Hospital
N-5021 Bergen
tlf   +47 55 97 55 39 (a)
faks  +47 55 97 60 88 (a)
email [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R 2.5.1 - Rscript through tee

2007-08-27 Thread Dirk Eddelbuettel

On 26 August 2007 at 22:47, François Pinard wrote:
| I met a little problem for which someone might have a solution.  Let's 
| say I have an executable file (named "pp.R") with this contents:
| 
|#!/usr/bin/Rscript
|options(echo=TRUE)
|a <- 1
|Sys.sleep(3)
|a <- 2
| 
| If I execute "./pp.R" at the shell prompt, the output shows the timely 
| progress of the script as expected.  If I use "./pp.R | tee OUT" 
| instead, the output seems buffered and I see it all at once at the end.
| 
| The problem does not come from the "tee" program, as if I use this 
| command:
| 
|(echo a; sleep 5; echo b) | tee OUT
| 
| the output is timely, not batched.
| 
| So, is there a way to tell R (or Rscript) that standard output should be 
| unbuffered, even if it is not directly connected to a terminal?

Use explicit print statements, e.g.  print(a <- 1)

Also, you still have littler as an alternative, at least on Unix [1]. Littler
actually won't show anything unless you explicitly call cat() or print(), but
then it does:

qa-v40z1:~/svn/hancock/app/aggposview> cat /tmp/fp2.r
#!/usr/bin/env r

options(echo=TRUE)
cat(a <- 1, "\n")
Sys.sleep(3)
cat(a <- 2, "\n")
foo:~> /tmp/fp2.r | tee /tmp/fp2.r.out
1
2
foo:~> 

Littler is an 'all-in' binary and starts and runs demonstrably faster than
Rscript. 
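
One workaround worth trying with Rscript itself (a sketch, not something I
have tested on every platform) is to flush the connection explicitly after
each write:

   #!/usr/bin/Rscript
   a <- 1; cat(a, "\n"); flush(stdout())   # push the line out even when piped
   Sys.sleep(3)
   a <- 2; cat(a, "\n"); flush(stdout())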

Hth, Dirk
 
[1] And despite the rather petty refusal of Rscript's main author to at least
give a reference to littler in Rscript's documentation, let alone credit as
'we were there first', the fact remains that littler became available in Sep
2006 whereas Rscript was not released until R 2.5.0 a good six months
later. Oh well. 



| In case useful, here is local R information:
| 
| Version:
|  platform = x86_64-unknown-linux-gnu
|  arch = x86_64
|  os = linux-gnu
|  system = x86_64, linux-gnu
|  status = 
|  major = 2
|  minor = 5.1
|  year = 2007
|  month = 06
|  day = 27
|  svn rev = 42083
|  language = R
|  version.string = R version 2.5.1 (2007-06-27)
| 
| Locale:
| 
LC_CTYPE=fr_CA.UTF-8;LC_NUMERIC=C;LC_TIME=fr_CA.UTF-8;LC_COLLATE=fr_CA.UTF-8;LC_MONETARY=fr_CA.UTF-8;LC_MESSAGES=fr_CA.UTF-8;LC_PAPER=fr_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=fr_CA.UTF-8;LC_IDENTIFICATION=C
| 
| Search Path:
|  .GlobalEnv, package:stats, package:utils, package:datasets, fp.etc, 
package:graphics, package:grDevices, package:methods, Autoloads, package:base
| 
| -- 
| François Pinard   http://pinard.progiciels-bpi.ca
| 
| __
| R-help@stat.math.ethz.ch mailing list
| https://stat.ethz.ch/mailman/listinfo/r-help
| PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
| and provide commented, minimal, self-contained, reproducible code.

-- 
Three out of two people have difficulties with fractions.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-27 Thread Muenchen, Robert A (Bob)
Gabor, That works great!

I think this would be a very helpful addition to the main R
distribution. Perhaps with a single colon representing numerical order
(exactly as you have written it) and two colons representing the order
of the variables as they appear in the data frame (your first example).
That's analogous to SAS' x1-xN, which you know gets those N variables,
and a--z, which selects an unknown number of variables a through z. How
many that is depends upon their order in the data frame. That would not
only be very useful in general, but it would also make transitioning to
R from SAS or SPSS less confusing.

Is R still being extended in such basic ways, or does that muck up
existing programs too much?

Thanks,
Bob
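
For readers of the archive, a minimal sketch of how Gabor's helper (quoted
below) combines with ordinary character indexing; the definition is his,
reproduced here only to make the snippet self-contained:

   "%:%" <- function(x, y) {
     prex <- gsub("[0-9]", "", x); postx <- gsub("[^0-9]", "", x)
     prey <- gsub("[0-9]", "", y); posty <- gsub("[^0-9]", "", y)
     stopifnot(prex == prey)
     paste(prex, seq(from = as.numeric(postx), to = as.numeric(posty)), sep = "")
   }
   anscombe[, c("x1" %:% "x3", "y2")]   # character names index the columns directly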

> -Original Message-
> From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
> Sent: Sunday, August 26, 2007 8:52 PM
> To: Muenchen, Robert A (Bob)
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] subset using noncontiguous variables by name (not
> index)
> 
> Try this:
> 
> > "%:%" <- function(x, y) {
> +prex <- gsub("[0-9]", "", x); postx <- gsub("[^0-9]", "", x)
> +prey <- gsub("[0-9]", "", y); posty <- gsub("[^0-9]", "", y)
> +stopifnot(prex == prey)
> +paste(prex, seq(from = as.numeric(postx), to =
> as.numeric(posty)), sep = "")
> + }
> > "x2" %:% "x4"
> [1] "x2" "x3" "x4"
> 
> 
> On 8/26/07, Muenchen, Robert A (Bob) <[EMAIL PROTECTED]> wrote:
> > Thanks Bert & Gabor for two very interesting solutions!
> >
> > It would be very handy in R if string1:stringN generated
> > "string1","string2"..."stringN" it would make selections like this
> much
> > more obvious. I know it's easy to with the colon operator and paste
> > function but that's quite a step up in complexity compared to SAS'
x1
> > x3-x4 y2 or SPSS' x1,x3 to x4, y2. And it's complexity that
beginners
> > face early in learning R.
> >
> > While on the subject of the colon operator, why doesn't
> anscombe[[1:4]]
> > select the x variables in list form as anscombe[,1:4] or
> anscombe[1:4]
> > do in data frame form?
> >
> > Thanks,
> >
> > Bob
> >
> > =
> > Bob Muenchen (pronounced Min'-chen), Manager
> > Statistical Consulting Center
> > U of TN Office of Information Technology
> > 200 Stokely Management Center, Knoxville, TN 37996-0520
> > Voice: (865) 974-5230
> > FAX: (865) 974-4810
> > Email: [EMAIL PROTECTED]
> > Web: http://oit.utk.edu/scc,
> > News: http://listserv.utk.edu/archives/statnews.html
> > =
> >
> >
> > > -Original Message-
> > > From: Bert Gunter [mailto:[EMAIL PROTECTED]
> > > Sent: Sunday, August 26, 2007 6:50 PM
> > > To: 'Gabor Grothendieck'; Muenchen, Robert A (Bob)
> > > Cc: r-help@stat.math.ethz.ch
> > > Subject: RE: [R] subset using noncontiguous variables by name (not
> > > index)
> > >
> > > The problem is that "x3:x5" does not mean what you think it means.
> The
> > > only
> > > reason it does the right thing in subset() is because a clever
> trick
> > is
> > > used
> > > there (read the code -- it's not hard to understand) to ensure
that
> it
> > > does.
> > > Gabor has essentially mimicked that trick in his solution.
> > >
> > > However, it is not necessary do this. You can construct the call
> > > directly as
> > > you tried to do. Using the anscombe example, here's how:
> > >
> > > chooz <- "c(x1,x3:x4,y2)"  ## enclose the desired expression in
> quotes
> > > do.call (subset, list( x = anscombe, select = parse(text =
chooz)))
> > >
> > > -- Bert Gunter
> > > Genentech Non-Clinical Statistics
> > > South San Francisco, CA
> > >
> > > "The business of the statistician is to catalyze the scientific
> > > learning
> > > process."  - George E. P. Box
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: [EMAIL PROTECTED]
> > > > [mailto:[EMAIL PROTECTED] On Behalf Of Gabor
> > > > Grothendieck
> > > > Sent: Sunday, August 26, 2007 2:10 PM
> > > > To: Muenchen, Robert A (Bob)
> > > > Cc: r-help@stat.math.ethz.ch
> > > > Subject: Re: [R] subset using noncontiguous variables by name
> > > > (not index)
> > > >
> > > > Using builtin data frame anscombe try this. First we set up a
> > > > data frame
> > > > anscombe.seq which has one row containing 1, 2, 3, ... .  Then
> > select
> > > > out from that data frame and unlist it to get the desired
> > > > index vector.
> > > >
> > > > > anscombe.seq <- replace(anscombe[1,], TRUE,
> seq_along(anscombe))
> > > > > idx <- unlist(subset(anscombe.seq, select = c(x1, x3:x4, y2)))
> > > > > anscombe[idx]
> > > >x1 x3 x4   y2
> > > > 1  10 10  8 9.14
> > > > 2   8  8  8 8.14
> > > > 3  13 13  8 8.74
> > > > 4   9  9  8 8.77
> > > > 5  11 11  8 9.26
> > > > 6  14 14  8 8.10
> > > > 7   6  6  8 6.13
> > > > 8   4  4 19 3.10
> > > > 9  12 12  8 9.13
> > > > 10  7  7  8 7.26
> > > > 11  5  5  8 4.74
> > > >
> > > >
> > > > On 8/26/07, Muenchen, Robert A (Bob) <[EMAIL PROTECTED]> wrote:
> > > > > Hi All,
> > > > >

Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-27 Thread Duncan Murdoch
On 8/27/2007 8:52 AM, John Kane wrote:
> --- Duncan Murdoch <[EMAIL PROTECTED]> wrote:

>> I like the first, simple suggestion best; I'll put
>> it into R-devel.  
>> (With the slight change to use ul.menu instead
>> of just ul, because FAQ 2.7 includes a plain ul
>> list.)
>> 
>> Duncan Murdoch
> Thanks Deepayan and Duncan.  
> 
> It is not a make or break point in using R but it does
> seem to make the FAQ a bit more user-friendly.

I'm about to commit the change, but it's not perfect.  I've applied the 
change to the css used in all the manuals, not just the FAQ, so the HTML 
versions of the manuals now end up with numbered contents listings too. 
  However, appendices continue the chapter numbering, rather than 
switching to letters.  I think this is preferable to no numbering at 
all, but if others object to it, we can make this change for the FAQ only.

Another way to do this is what's used in the texinfo manual
http://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html
but I find that ugly and inconsistent.  The contents listing gets the 
numbering and lettering right (but not well formatted), but within each 
chapter the menus are unnumbered.

The texinfo format is just a bit limited for this kind of thing.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I interpret this test hypothesis test

2007-08-27 Thread Tom La Bone
 
Since no reply has been posted yet I will give it a shot. runs.test uses the
normal approximation and in your case it returned a z score of -1.8732. This
z score has a cumulative probability of 
 
pnorm(-1.8732,0,1)
[1] 0.03052039
 
If you are concerned about both too many and too few runs you would
select the "two.sided" option for runs.test, which gives a p-value of 0.0610
(0.0305 in each tail of the normal distribution). If you are concerned only
with too few runs you would select the "less" option, which will give a
p-value of 0.0305. Finally, if you are concerned only with too many runs you
would select the "greater" option, which will give a p-value of 1 - 0.0305 =
0.9695. If your significance level is 0.05, you would compare 0.05 to 0.0610
and not reject the null hypothesis in the two-sided case, and compare 0.05
to 0.0305 in the one-sided case and reject the null hypothesis. Note that
the normal approximation is OK for large samples but may give unacceptable
results for small samples. I am unaware of any package in R that performs an
exact runs test.
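
A small sketch of the three alternatives on simulated data (this assumes the
tseries package; the data here are made up):

   library(tseries)
   set.seed(1)
   x <- factor(rbinom(30, 1, 0.5))          # runs.test() expects a two-level factor
   runs.test(x, alternative = "two.sided")
   runs.test(x, alternative = "less")
   runs.test(x, alternative = "greater")
   z <- runs.test(x)$statistic              # the same normal statistic by hand:
   pnorm(z)                                 # "less"
   1 - pnorm(z)                             # "greater"
   2 * pnorm(-abs(z))                       # "two.sided"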
 
Tom

 

>I have used "runs.test" (package tseries) to compute the runs test
>for randomness, but I get this result:
> 
>Runs test
>-1.8732   P-value = 0.0610
> 
>Alternative Hypothesis : Two sided
> 
>How can I interpret this result ?
 
 

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to write nicely a condition on a loop "for" (that is, not like I did)

2007-08-27 Thread Ptit_Bleu

Hi again,

This is the follow-up to my post "Problem with save or/and if (I think but
maybe not ...)".
In that post, I wrote that I had solved my main problem. And it is true.
I also wrote that there was still another problem, which I managed to solve.

But I think there must be another way to solve it, taking advantage of the R
language (which I don't master at all), that is, with fewer "if" tests.

To sum up :
nfichiers is a list of files (with .P0 or .Px (x>0) extension) I have to
copy to a database.
nfichiers can also be "0" if there is no file to copy
p0fichiers is the list of files having the .P0 extension if there are such
files to copy
And p0fichiers can also be "0" if there are only .Px files to copy

So, before doing the "for" loop, I want to test if p0fichiers really
contains something.
Thanks for your comments and your advice on improving this script.
Ptit Bleu.

-

So here is my solution :

p0fichiers<-"0" #initialization of
p0fichiers
if (length(nfichiers)>0)  # if nfichiers contains
file names
{
  if (length(grep(".P0", nfichiers))>0) {p0fichiers<-nfichiers[grep(".P0",
nfichiers)]}  #look if there is .P0
  if (p0fichiers[1]>"0")   # if .P0 has been updated
with the test above
{
for (i in 1:length(p0fichiers))  # do the loop "for"
{
 donnees<-read.table(p0fichiers[i], quote="\"", sep=";",
dec=",", skip=18)
 jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
 donnees[1]<-jourheure
 donnees<-donnees[,-2]
rm(donnees, jourheure)
}
}
}
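
A minimal alternative sketch, assuming nfichiers is kept as a plain character
vector of file names (possibly empty) instead of using "0" as a sentinel; a
zero-length vector then simply gives zero loop iterations, so no "if" tests
are needed:

p0fichiers <- grep("\\.P0$", nfichiers, value = TRUE)  # the .P0 files, possibly none
for (fichier in p0fichiers)
{
  donnees <- read.table(fichier, quote = "\"", sep = ";", dec = ",", skip = 18)
  donnees[1] <- paste(donnees$V1, donnees$V2, sep = " ")
  donnees <- donnees[, -2]
}
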
-- 
View this message in context: 
http://www.nabble.com/how-to-write-nicely-a-condition-on-a-loop-%22for%22-%28that-is%2C-not-like-I-did%29-tf4335310.html#a12347016
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Column naming mystery

2007-08-27 Thread Werner Wernersen
Sorry that the problem description was not sufficient.
Here is a self-contained code replicating the problem:

require(doBy)
x <-
as.data.frame(matrix(ncol=3,seq(1,12),dimnames=list(c(),c("hh","total","total.inf"))))
summaryBy(total+total.inf~hh,x,FUN=sum)

What surprises me are the zeros in the resulting
total.sum column. The problem remains if total.inf is
renamed to totalinf or total_inf but not if renamed to
ttotal.inf .

Can anyone explain to me what the rules for naming
columns are so that I can avoid such mistakes in the
future?
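
Not an explanation of the naming rules, but a base-R cross-check of what the
sums should be, using the same toy data frame (no extra packages assumed):

aggregate(x[c("total", "total.inf")], by = list(hh = x$hh), FUN = sum)
# each hh value occurs only once here, so every group sum equals the original value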

Thanks a lot!


  

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-27 Thread John Kane

--- Duncan Murdoch <[EMAIL PROTECTED]> wrote:

> Deepayan Sarkar wrote:
> > On 8/23/07, Duncan Murdoch <[EMAIL PROTECTED]>
> wrote:
> >   
> >> On 8/23/2007 11:28 AM, Prof Brian Ripley wrote:
> >> 
> >>> On Thu, 23 Aug 2007, John Kane wrote:
> >>>
> >>>   
>  The FAQ Section 7 is a very useful place for
> new users
>  to find out any number of R idiosycracies. 
> However
>  there is no numbering on the FAQ Table of
> Content or
>  on the Sections Tables of Contents.
>  
> >>> Hmm, doc/FAQ does have a numbered table of
> contents and numbered sections
> >>> and doc/manual/R-FAQ.html does have numbered
> sections and my browser's
> >>> search finds 7.10 straight away.
> >>>   
> >> I think the suggestion is to change the contents
> >> lists in HTML from <ul>
> >> lists to <ol> lists.  Then one would see
> >>
> >> 1. Introduction
> >> 2. R Basics
> >> 3. R and S
> >> 4. R Web Interfaces
> >> 5. R Add-On Packages
> >> 6. R and Emacs
> >> 7. R Miscellanea
> >> 8. R Programming
> >> 9. R Bugs
> >>10. Acknowledgments
> >>
> >> instead of
> >>
> >>  * Introduction
> >>  * R Basics
> >>  * R and S
> >>  * R Web Interfaces
> >>  * R Add-On Packages
> >>  * R and Emacs
> >>  * R Miscellanea
> >>  * R Programming
> >>  * R Bugs
> >>  * Acknowledgments
> >>
> >> in a browser, and I agree that would be
> preferable (assuming the
> >> numbering is consistent with what we get in the
> other formats).
> >> However, I don't see how to tell makeinfo --html
> to do this.  Adding
> >> --number-sections isn't enough.
> >> 
> >
> > A simple CSS hack is to have
> >
> > ul{
> > list-style-type: decimal;
> > }
> >
> > in the style. The result can be seen in
> >
> > http://dsarkar.fhcrc.org/R/RFAQ-1.png
> >
> > A more sophisticated hack is to have something
> like
> >
> > ---
> > body{
> > counter-reset: chapter;
> > counter-reset: section;
> > }
> > h2.chapter {
> > counter-increment: chapter;
> > counter-reset: section;
> > }
> >
> > ul {
> > list-style-type: none;
> > }
> >
> > li:before {
> > counter-increment: section;
> > content: counter(chapter) "." counter(section)
> " " ;
> > }
> > -
> >
> > which results in
> >
> > http://dsarkar.fhcrc.org/R/RFAQ-2.png
> >
> > The only problem here is that there is no way to
> distinguish between
> > the chapter listing and the section listings (both
> > are <ul class="menu">). If that could be made to have a
> > different class, the
> > chapter listing could be improved.
> 
> I like the first, simple suggestion best; I'll put
> it into R-devel.  
> (With the slight change to use ul.menu instead
> of just ul, because FAQ 2.7 includes a plain ul
> list.)
> 
> Duncan Murdoch
Thanks Deepayan and Duncan.  

It is not a make or break point in using R but it does
seem to make the FAQ a bit more user-friendly.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] proftools package now available from CRAN

2007-08-27 Thread Luke Tierney
PROFILE OUTPUT PROCESSING TOOLS FOR R
=====================================


This package provides some simple tools for examining Rprof output
and, in particular, extracting and viewing call graph information.
Call graph information, including which direct calls were observed
and how much time was spent in these calls, can be very useful in
identifying performance bottlenecks.

One important caution: because of lazy evaluation a nested call
f(g(x)) will appear on the profile call stack as if g had been called
by f or one of f's callees, because it is the point at which the value
of g(x) is first needed that triggers the evaluation.
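
A small made-up illustration of that caution (not part of the package):

   f <- function(x) x + 1                            # f forces its argument here
   g <- function(x) { for (i in 1:1e6) sqrt(i); x }  # g burns a little CPU time
   Rprof("lazy.out")
   f(g(2))              # g(2) is evaluated inside f, so g shows up under f
   Rprof(NULL)
   pd <- readProfileData("lazy.out")
   printProfileCallGraph(pd)
   # increase the loop count if the sampler catches too few ticks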


EXPORTED FUNCTIONS

The package exports five functions:

 readProfileData reads the data in the file produced by Rprof into a
 data structure used by the other functions in the package.
 The format of the data structure is subject to change.

 flatProfile is similar to summaryRprof.  It returns either a
 matrix with output analogous to gprof's flat profile or a
 matrix like the by.total component returned by summaryRprof;
 which is returned depends on the value of an optional second
 argument.

 printProfileCallGraph produces a printed representation of the
 call graph.  It is analogous to the call graph produced by
 gprof with a few minor changes.  Reading the gprof manual
 section on the call graph should help understanding this
 output.  The output is similar enough to gprof output for the
 cgprof (http://mvertes.free.fr/) script to be able to produce
 a call graph via Graphviz.

 profileCallGraph2Dot prints out a Graphviz .dot file representing
 the profile graph.  Times spent in calls can be mapped to node
 and edge colors.  The resulting files can then be viewed with
 the Graphviz command line tools.

 plotProfileCallGraph uses the graph and Rgraphviz packages to
 produce call graph visualizations within R.  You will need to
 install these packages to use this function.


A SIMPLE EXAMPLE

Collect profile information  for the examples for glm:

   Rprof("glm.out")
   example(glm)
   Rprof(NULL)
   pd <- readProfileData("glm.out")

Obtain flat profile information:

   flatProfile(pd)
   flatProfile(pd, FALSE)

Obtain a printed call graph on the standard output:

   printProfileCallGraph(pd)

If you have the cgprof script and the Graphviz command line tools
available on a UNIX-like system, then you can save the printed graph
to a file,

   printProfileCallGraph(pd, "glm.graph")

and either use

   cgprof -TX glm.graph

to display the graph in the interactive graph viewer dotty, or use

   cgprof -Tps glm.graph > glm.ps
   gv glm.ps

to create a PostScript version of the call graph and display it with
gv.

Instead of using the printed graph and cgprof you can use create a
Graphviz .dot file representation of the call graph with

   profileCallGraph2Dot(pd, filename = "glm.dot", score = "total")

and view the graph interactively with dotty using

   dotty glm.dot

or as a postscript file with

   dot -Tps glm.dot > glm.ps
   gv glm.ps

Finally, if you have the graph package from CRAN and the Rgraphviz
package from Bioconductor installed, then you can view the call graph
within R using

   plotProfileCallGraph(pd, score = "total")

The default settings for this version need some work.


OPEN ISSUES

My intention was to handle cycles roughly the same way that gprof
does.  I am not completely sure that I have managed to do this; I am
also not completely sure this is the best approach.

The graphs produced by cgprof and by plotProfileGraph and friends when
mergeEdges is false differ a bit.  I think this is due to the
heuristics of cgprof not handling cycle entries ideally and that the
plotProfileGraph graphs are actually closer to what is wanted.  When
mergeEdges is true the resulting graphs are DAGs, which simplifies
interpretation, but at the cost of lumping all cycle members together.

gprof provides options for pruning graph printouts by omitting
specified nodes.  It may be useful to allow this here as well.

Probably more use should be made of the graph package.


IMPLEMENTATION NOTES

The implementation is extremely crude (a real mess would be more
accurate) and will hopefully be improved over time--at this point it
is more of an existence proof than a final product.

Performance is less than ideal, though using these tools it was
possible to identify some problem points and speed up computing the
profile data by a factor of two (in other words, it may be bad now but
it used to be worse).  More careful design of the data structures and
memoizing calculations that are now repeated is likely to improve
performance substantially.




-- 
Luke Tierney
Chair, Statist

[R] Confidence intervals for ccf()

2007-08-27 Thread Gustaf Rydevik
Hello,

This is not a purely R-question, but perhaps someone can help me anyway.

I am trying to estimate the correlation between two time series (which
are both basically different types of  measurements of the same
phenomena), using both cor.test() (with pearson as method) and ccf().

Now, cor.test gives a confidence interval for the pearson correlation,
while ccf does not. I've tried to use bootstrap methods to get
confidence interval for the ccf function, but no luck. It is a bit
tricky, since the time series are non-stationary, and so I'm not sure
how to go about generating the bootstrap sample.

Does anyone have any ideas on how to do this, i.e. get a confidence
interval for the ccf at different time lags?
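
For what it is worth, the dashed lines that plot() draws for a ccf object are
not lag-specific confidence intervals but the usual white-noise bounds,
+/- qnorm((1 + ci)/2)/sqrt(n). Assuming approximate stationarity (which, as
noted above, may not hold here), a rough first cut is:

   set.seed(42)
   x <- rnorm(100); y <- x + rnorm(100)     # toy series standing in for the real data
   r  <- ccf(x, y, plot = FALSE)
   ci <- qnorm(0.975) / sqrt(r$n.used)      # the bound the plot method would draw
   head(data.frame(lag = drop(r$lag), ccf = drop(r$acf), lower = -ci, upper = ci))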

Many thanks in advance,

Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Monmonier algorithm

2007-08-27 Thread Thibaut Jombart
Hello,

Here is a late answer, but an answer nonetheless to the question I asked 
almost one year ago on this list:

> On Wed, 29 Mar 2006, Thibaut Jombart wrote:
>
>> Hello list,
>>
>> does anyone know if Monmonier algorithm is available in R? I've checked
>> several spatial libraries, but I didn't find anything related to it.
>> However, there is a huge documentation and I may have missed it.
>>
>> Before coding it, I'd like to be sure it doesn't already exist.
>
> Googling, I found:
>
> http://www-med-physik.vu-wien.ac.at/staff/rub/abstracts/ISCB_2005.pdf
>
> which is a poster, and refers to using R for boundary finding, and
> other software for data management and display. Perhaps the authors
> are able to help by making code available, the poster looks like a nice
> example of spatial data analysis.

> -- 
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
> e-mail: [EMAIL PROTECTED]

Basically, the Monmonier algorithm aims at finding maximum-difference
boundaries between geo-referenced objects. It requires a set of
georeferenced objects along with a matrix of distances among these objects.

The Monmonier algorithm is now implemented in the adegenet package
(http://pbil.univ-lyon1.fr/software/adegenet/). The main functions are
'monmonier' and 'optimize.monmonier'. Although the package is devoted to
genetic data analysis, these functions can handle other kinds of data as
well.

The main difference I can see between this implementation and the
original algorithm is that here the function uses objects connected by
a neighbouring graph rather than the polygons of a Voronoi tessellation.
Thus, a Delaunay triangulation can be used to recover the original
version of the algorithm, but other graphs are also possible (e.g.
Gabriel's graph).

Regards,

Thibaut.

-- 
##
Thibaut JOMBART
CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive
Universite Lyon 1
43 bd du 11 novembre 1918
69622 Villeurbanne Cedex
Tél. : 04.72.43.29.35
Fax : 04.72.43.13.88
[EMAIL PROTECTED]
http://lbbe.univ-lyon1.fr/-Jombart-Thibault-.html?lang=en

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [SOLVED] save/load - I finally found (to be honnest : jholtman found)

2007-08-27 Thread Ptit_Bleu

The post of jholtman gave me the solution :
http://www.nabble.com/problems-saving-and-loading-%28PLMset%29-objects-tf4179541.html#a11885136

Like Quin Wills, I was trying to assign the result of load("tfichiers.rda") to tfichiers.
I now just write load("tfichiers.rda") instead of
tfichiers <- load("tfichiers.rda")
And now it works ... for this part (if the new files are only .P0 there is
still a problem when the script tries to read a .Px (x > 0) file, as there is none).
But this is not so difficult to solve even for me (I think, well, I hope).
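
For other readers who hit the same thing, a tiny sketch of the difference
(load() invisibly returns the *names* of the restored objects, not the
objects themselves):

   tfichiers <- c("a.P0", "b.P1")          # made-up file names
   save(tfichiers, file = "tfichiers.rda")
   rm(tfichiers)
   load("tfichiers.rda")                   # recreates the object tfichiers
   tfichiers                               # the file names again
   val <- load("tfichiers.rda")
   val                                     # "tfichiers" -- the name, not the value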

Thanks to Prof Ripley and to all people helping people like me (maybe one
day I will also be able to help people).
Have a nice week,
Ptit Bleu.




-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12345123
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Ptit_Bleu

Dear Prof Ripley, 

You wrote :
What did you intend there?  It is not a test of no difference, but a test 
that each element of the difference is not "0", and furthermore if() 
expects a test of length one, not the length of nfichiers.  I suspect you 
intended to test length(nfichiers) > 0.

And of course, you were right.
With the condition length(nfichiers) > 0, there is no more warning.
And I tested manually length(nfichiers)>0 for different cases and it gave
the result I expected.

But I still have a problem with saving and retrieving tfichiers.
I keep looking at the help files and manually testing alternative scripts ...

Hoping to read you again,
Ptit Bleu. 

   
-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12344227
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Column naming mystery

2007-08-27 Thread Werner Wernersen
Hi,

I hope somebody can help me explain something that seems
mysterious to me.

I use this line on a dataframe ae:
summaryBy(total_inflated+total~gr1, data=ae, FUN=sum,
na.rm=T)

and it returns 3 columns as expected; the columns "gr1"
and "total_inflated.sum" are correct, but the
"total.sum" column consists of only zeros, which is not
correct. The same happens when I rename the
"total_inflated" to "total.inflated" or
"totalinflated" but not when I rename it to
"ttotal_inflated". In the latter case I get the
correct result also for the "total.sum" column.

Could anyone explain the rules for the column naming
to me?

Thank you very much in advance!
  Werner



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Ptit_Bleu

Dear Prof Ripley,

I thank you for your fast answer.
In order to follow your advice:

I deleted all the objects and the tfichiers.r already created.
I changed all the "tfichiers.t" of the script into "tfichiers.rda"

Then I launched the script twice.
The first time, as tfichiers.rda didnt' exist, it created one.
During the script, I got this warning :
1: la condition a une longueur > 1 et seul le premier élément est utilisé
in: if (nfichiers != "0")
(translate with my words : the condition has a length superior to 1 and only
the first element is used in ...)
Below, you will find the results.

The second launch gave the same results for nfichiers and rfichiers but for
tfichiers I obtained
"tfichiers".

Do you have any ideas to help me (because I really have none ...)?
Again thank you,
Ptit Bleu.

--
FIRST LAUNCH

>nfichiers
[1] "d:/Mydata/31_07_07.P0"   "d:/Mydata/31_07_2007.P0"
[3] "d:/Mydata/31_07_2007.P1" "d:/Mydata/31_07_2007.P2"
[5] "d:/Mydata/31_07_2007.P3"

> nfichiers!="0"
[1] TRUE TRUE TRUE TRUE TRUE

>rfichiers
[1] "d:/Mydata/31_07_07.P0"   "d:/Mydata/31_07_2007.P0"
[3] "d:/Mydata/31_07_2007.P1" "d:/Mydata/31_07_2007.P2"
[5] "d:/Mydata/31_07_2007.P3"

>tfichiers
[1] "d:/Mydata/31_07_07.P0"   "d:/Mydata/31_07_2007.P0"
[3] "d:/Mydata/31_07_2007.P1" "d:/Mydata/31_07_2007.P2"
[5] "d:/Mydata/31_07_2007.P3"
--

SECOND LAUNCH
with these changes in order not to change tfichiers.rda
#tfichiers<-rfichiers
#save(tfichiers, file="tfichiers.rda")

>nfichiers
[1] "d:/Mydata/31_07_07.P0"   "d:/Mydata/31_07_2007.P0"
[3] "d:/Mydata/31_07_2007.P1" "d:/Mydata/31_07_2007.P2"
[5] "d:/Mydata/31_07_2007.P3"

> nfichiers!="0"
[1] TRUE TRUE TRUE TRUE TRUE

>rfichiers
[1] "d:/Mydata/31_07_07.P0"   "d:/Mydata/31_07_2007.P0"
[3] "d:/Mydata/31_07_2007.P1" "d:/Mydata/31_07_2007.P2"
[5] "d:/Mydata/31_07_2007.P3"

>tfichiers
"tfichiers"

 
-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12344036
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating diameters of cirkels in a picture.

2007-08-27 Thread Bartjoosen

Hi All,

I would really like to thank you for the answers. While I was searching for
edge detection and clustering algorithms, Moshe came up with a simple but
effective solution: use the area to find the diameter!

I tried Moshe's solution, but I couldn't figure out what you mean by
morphological closing and by the labelling used to split the image.
Could you please clarify this a bit?

Thanks for your support


Bart
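
A rough sketch of steps 4 and 5 of the algorithm Moshe describes below, for
the archive. It assumes the Bioconductor package EBImage for the
connected-component labelling, and img stands for the thresholded binary
(0/1) image, which is not shown here:

   library(EBImage)                 # assumed: provides bwlabel()
   lab   <- bwlabel(img)            # label each connected blob 1, 2, ...
   areas <- table(lab[lab > 0])     # pixel count (area) of every blob
   radii <- sqrt(areas / pi)        # S = pi * R^2  =>  R = sqrt(S / pi)
   diam  <- 2 * radii               # diameters, in pixels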


Moshe Olshansky-2 wrote:
> 
> Hi Bart,
> 
> One more comment:
> 
> You do not really need the morphological closing to
> close the "holes" inside the circles. Another
> possibility is to reverse the black-and-withe picture,
> i.e. make the holes and background be 1 and the
> circles 0, label the connected components and then
> only the component which touches the boundaries is the
> background while all other components are "holes" and
> you can make them white (1) in the original
> black-and-white image.
> 
> --- Moshe Olshansky <[EMAIL PROTECTED]> wrote:
> 
>> Hi Bart,
>> 
>> I have never used image processing software in R (I
>> was doing this with Matlab), but here is what I
>> would
>> have done algorithmically:
>> 1) convert the picture to gray-scale
>> 2) find a threshold value which separates the
>> circles
>> from the background and convert your image to black
>> and white
>> 3) if the circles are far apart use morphological
>> closing to fill in small holes inside the circles
>> (may
>> be do this several times)
>> 4) use labeling to split the image into connected
>> components
>> 5) for each connected component get it's area (the
>> number of pixels) and use the formula S = Pi*R^2 to
>> find the approximate radii.
>> 
>> Regards,
>> 
>> Moshe.
>> 
>> --- Julian Burgos <[EMAIL PROTECTED]> wrote:
>> 
>> > Hi Bart,
>> > 
>> > If you only have 36 circles, the fastest way would
>> > be to use some image 
>> > processing software and measure the circles "by
>> > hand".  One option is to 
>> > use ImageJ, which you can download here
>> > 
>> > http://rsb.info.nih.gov/ij/
>> > 
>> > Julian
>> > 
>> > Bart Joosen wrote:
>> > > Hi,
>> > >
>> > > Maybe this is more a programming questions than
>> a
>> > specific R-project question, but maybe there is
>> > someone who can point me in the right direction.
>> > >
>> > > I have a picture of cirkels which I took with a
>> > digital camera.
>> > > Now I want to use the diameter of the cirkels on
>> > the picture for analysis in R.
>> > > I can use pixmap to import the picture, but how
>> do
>> > I find the outside cirkels and calculate the
>> > diameter?
>> > > I pointed out that I can use the edci package,
>> but
>> > then I need to preprocess the data to reduce the
>> > points, otherwise it takes a long time, and my
>> > computer crashes.
>> > >
>> > > If you want to see such a picture, I cropped a
>> > larger one, and highlighted the cirkel which is of
>> > interest.
>> > > In a real world, this is a plate with 36
>> cirkels,
>> > which all should be measured.
>> > > www.users.skynet.be/fa244930/fotos/outlined.jpg
>> > >
>> > >
>> > > Thanks for your time
>> > >
>> > > Bart
>> > >  [[alternative HTML version deleted]]
>> > >
>> > > __
>> > > R-help@stat.math.ethz.ch mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > > and provide commented, minimal, self-contained,
>> > reproducible code.
>> > >
>> > >
>> > 
>> > __
>> > R-help@stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained,
>> > reproducible code.
>> >
>> 
>> __
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained,
>> reproducible code.
>>
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Calculating-diameters-of-cirkels-in-a-picture.-tf4319669.html#a12343143
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Prof Brian Ripley

On Mon, 27 Aug 2007, Ptit_Bleu wrote:



Hi,

I recently discovered the R program and I thought it could be useful to me.
I have to analyse data saved as .Px file (x between 0 and 8 - .P0 files have
18 lines at the beginning that I have to skip). New files are generated
everyday.


relrfichiers<-dir(chemin, pattern=".P")

does not do that, though.  Better to use

dir(chemin, pattern="\\.[0-8]$", full.names=TRUE)

or

Sys.glob(file.path(chemin, "*.P[0-8]"))



This is my strategy :

In order to analyse the data, I first want to copy the new data in a
database in MySQL (which already contains the previous data).
So the first task is to compare the list of the files in the directory
(object : rfichiers) to the list of the files already saved (object :
tfichiers). The list containing the new files is then given by
nfichiers<-setdiff(rfichiers, tfichiers).

It sounds easy ...
... but it doesn't work !!!

Up to now, I am able to connect to MySQL and, if the file "tfichiers.r"
doesn't exist, I can copy data files to the MySQL database.
But if "tfichiers.r" already exists and there is no new file to save, it
ignores the condition if (nfichiers!="0") and saves all the files in the
directory to the database.


What did you intend there?  It is not a test of no difference, but a test 
that each element of the difference is not "0", and furthermore if() 
expects a test of length one, not the length of nfichiers.  I suspect you 
intended to test length(nfichiers) > 0.


It often helps to print (or use str on) the objects you create.  Try this 
on


nfichiers
nfichiers!="0"
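
A small sketch of what that shows, with made-up values:

   nfichiers <- c("a.P0", "b.P1")
   nfichiers != "0"            # length 2: TRUE TRUE -- if() only uses the first element
   length(nfichiers) > 0       # length 1: TRUE
   nfichiers <- character(0)   # no new files at all
   length(nfichiers) > 0       # FALSE, so the block would be skipped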


Is it a problem with the way I save tfichiers or is it a problem with the
condition if (nfichiers!="0") ?


Saving in R save format with extension .r is going to confuse others. 
Extension .rda is conventional for save format (and I doubt you need an 
ascii save).



Could you please give me some advice to correct my script (written with
Tinn-R)?

I thank you in advance for your help.
Have a nice week,
Ptit Bleu.

PS : Ptit Bleu means something like "Full Newbye" in french. So thanks to be
patient :-)
PPS : I hope you understand my french english

--


# Connect to the MySQL database 'database'

library(DBI)
library(RMySQL)
drv<-dbDriver("MySQL")
con<-dbConnect(drv, username="user", password="password", dbname="database",
host="localhost")


# Create the objects containing the file lists (rel for relative path)
# - in the directory: object rfichiers
# - already processed: object tfichiers
# - new since the last connection: object nfichiers
# chemin is the directory where the data are stored
# RWork is the R working directory
# sep='' to avoid adding a space after Mydata/

setwd("D:/RWork")
chemin<-"d:/Mydata/"
relrfichiers<-dir(chemin, pattern=".P")
rfichiers<-paste(chemin,relrfichiers, sep='')
if (file.exists("tfichiers.r"))
 {
   tfichiers<-load("tfichiers.r")
   nfichiers<-setdiff(rfichiers,tfichiers)
 } else {
   nfichiers<-rfichiers
 }


# p0fichiers: files with the .P0 extension (files containing info lines
# that must not be loaded)
# pxfichiers: files with the extensions .P1, ..., .P8 (no info lines at the
# start)

if (nfichiers!="0")
{
 p0fichiers<-nfichiers[grep(".P0", nfichiers)]
 pxfichiers<-setdiff(nfichiers, p0fichiers)


# Merge the day and time columns so that variations can be plotted against time
# Each file listed in p0fichiers is loaded, skipping the first 18 lines,
# and jourheure receives the merged day column (V1) and time column (V2)
# jourheure is copied back into the first column of donnees
# The second column (containing the times), now superfluous, is then removed
# donnees is copied into the MySQL database Mydata
# Note: R understands the day/month/year format - MySQL: year/month/day
# -> stored as CHAR in MySQL

 for (i in 1:length(p0fichiers))
   {
 donnees<-read.table(p0fichiers[i], quote="\"", sep=";", dec=",",
skip=18)
 jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
 donnees[1]<-jourheure
 donnees<-donnees[,-2]
#  assignTable(con, "Datatable", donnees, append=TRUE) - does not work
 dbWriteTable(con, "Datatable", donnees, append=TRUE)
 rm(donnees, jourheure)
   }


# Same for the files with a .Px extension, loading all the lines (skip=0)
# Possible improvement: create a function taking p0fichiers or pxfichiers
# as an argument

 for (i in 1:length(pxfichiers))
   {
 donnees<-read.table(pxfichiers[i], quote="\"", sep=";", dec=",",
skip=0)
 jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
 donnees[1]<-jourheure
 donnees<-donnees[,-2]
#   assignTable(con, "Datatable", donnees, append=TRUE) - does not work
  dbWriteTable(con, "Datatable", donnees, append=TRUE)
 rm(donnees, jourheure)
   }
}

tfichiers<-rfichiers
save(rfichiers, file="tfichiers.r", ascii=TRUE)
rm(list=ls())

[R] I again with shorter message and script

2007-08-27 Thread Ptit_Bleu

Hi,

I realized that my first message and the script were (maybe) too long and
difficult to read.
So I tested this shorter one :

- 

setwd("D:/RWork")
chemin<-"d:/Mydata/"
relrfichiers<-dir(chemin, pattern=".P")
rfichiers<-paste(chemin,relrfichiers, sep='')

tfichiers<-rfichiers
save("tfichiers", file="tfichiers.r", ascii=TRUE)

if (file.exists("tfichiers.r"))
  {
tfichiers<-load("tfichiers.r")
nfichiers<-setdiff(rfichiers,tfichiers)
  }



The result is :
nfichiers is equal to rfichiers
and when I ask tfichiers, I obtain ... "tfichiers" :-(

I read ?save and saw the warning about the arguments, but I have no idea
how to solve this problem, which must be a basic one (but do not forget that
I'm a newbie and that I'm French :-)

Thanks again for your comments and help,
Ptit Bleu.

-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12343633
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Ptit_Bleu

Hi,

I recently discovered the R program and I thought it could be useful to me.
I have to analyse data saved as .Px file (x between 0 and 8 - .P0 files have
18 lines at the beginning that I have to skip). New files are generated
everyday.

This is my strategy :

In order to analyse the data, I first want to copy the new data in a
database in MySQL (which already contains the previous data).
So the first task is to compare the list of the files in the directory
(object : rfichiers) to the list of the files already saved (object :
tfichiers). The list containing the new files is then given by
nfichiers<-setdiff(rfichiers, tfichiers).

It sounds easy ...
... but it doesn't work !!!

Up to now, I am able to connect to MySQL and, if the file "tfichiers.r"
doesn't exist, I can copy data files to the MySQL database.
But if "tfichiers.r" already exists and there is no new file to save, it
ignores the condition if (nfichiers!="0") and saves all the files in the
directory to the database.

Is it a problem with the way I save tfichiers or is it a problem with the
condition if (nfichiers!="0") ?
Could you please give me some advice to correct my script (written with
Tinn-R)?

I thank you in advance for your help.
Have a nice week,
Ptit Bleu.

PS : Ptit Bleu means something like "Full Newbye" in french. So thanks to be
patient :-)
PPS : I hope you understand my french english

--


# Connect to the MySQL database 'database'

library(DBI)
library(RMySQL)
drv<-dbDriver("MySQL")
con<-dbConnect(drv, username="user", password="password", dbname="database",
host="localhost")


# Create the objects containing the file lists (rel for relative path)
# - in the directory: object rfichiers
# - already processed: object tfichiers
# - new since the last connection: object nfichiers
# chemin is the directory where the data are stored
# RWork is the R working directory
# sep='' to avoid adding a space after Mydata/

setwd("D:/RWork")
chemin<-"d:/Mydata/"
relrfichiers<-dir(chemin, pattern=".P")
rfichiers<-paste(chemin,relrfichiers, sep='')

if (file.exists("tfichiers.r"))
  {
tfichiers<-load("tfichiers.r")
nfichiers<-setdiff(rfichiers,tfichiers)
  } else {
nfichiers<-rfichiers
  }


# p0fichiers : files with the .P0 extension (they contain header lines that
#   must not be loaded)
# pxfichiers : files with the extensions .P1, ..., .P8 (no header lines at
#   the start)

if (nfichiers!="0")
{
  p0fichiers<-nfichiers[grep(".P0", nfichiers)]
  pxfichiers<-setdiff(nfichiers, p0fichiers)


# Merge the day and time columns so that variations can be plotted against
#   time
# Each file listed in p0fichiers is loaded, skipping the first 18 lines,
#   and the object jourheure receives the day column (V1) pasted to the
#   time column (V2)
# jourheure is copied back into the first column of donnees
# The second column (the times), now redundant, is then removed
# donnees is copied into the MySQL database Mydata
# Note: R understands the day/month/year format, MySQL expects
#   year/month/day -> stored as CHAR in MySQL

  for (i in 1:length(p0fichiers))
{
  donnees<-read.table(p0fichiers[i], quote="\"", sep=";", dec=",",
skip=18)
  jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
  donnees[1]<-jourheure
  donnees<-donnees[,-2]
#  assignTable(con, "Datatable", donnees, append=TRUE) - does not work
  dbWriteTable(con, "Datatable", donnees, append=TRUE)
  rm(donnees, jourheure)
}


# Same thing for the .Px files, loading every line (skip=0)
# Possible improvement: write a function taking p0fichiers or pxfichiers as
#   an argument (see the sketch after the script)

  for (i in 1:length(pxfichiers))
{
  donnees<-read.table(pxfichiers[i], quote="\"", sep=";", dec=",",
skip=0)
  jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
  donnees[1]<-jourheure
  donnees<-donnees[,-2]
#   assignTable(con, "Datatable", donnees, append=TRUE) - does not work
  dbWriteTable(con, "Datatable", donnees, append=TRUE)
  rm(donnees, jourheure)
}
}

tfichiers<-rfichiers 
save(rfichiers, file="tfichiers.r", ascii=TRUE) 
rm(list=ls())  

# Disconnect from MySQL

dbDisconnect(con)
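
A hedged sketch of the improvement mentioned in the comments above; the
helper name charge_fichiers and its arguments are mine, not from the original
script, and it assumes the DBI connection con created earlier:

charge_fichiers <- function(fichiers, skip, con, table="Datatable") {
  for (f in fichiers) {
    donnees <- read.table(f, quote="\"", sep=";", dec=",", skip=skip)
    donnees[1] <- paste(donnees$V1, donnees$V2, sep=" ")  # merge day and time
    donnees <- donnees[, -2]                              # drop the time column
    dbWriteTable(con, table, donnees, append=TRUE)
  }
}
# charge_fichiers(p0fichiers, skip=18, con)
# charge_fichiers(pxfichiers, skip=0, con)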
-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12343236
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (coxph, se) Obtaining standard errors of coefficients from coxph to store

2007-08-27 Thread joris . dewolf


David,

It would be helpful to give an example of what you would like to extract.

I guess you know how to extract elements from vectors and lists.
However, the objects returned by functions can sometimes be rather complex
(the output of coxph() is...).
A general method to capture printed output is capture.output(). It is maybe
not fast, but useful if you have no other solution...

Joris

> a <- rnorm(10,1,1)
> b <- rnorm(10,1,1)
> mod <- lm(a~b)
> smod <- summary(mod)
> smod

Call:
lm(formula = a ~ b)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.7482 -0.5991  0.1211  0.8341  1.4975 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   1.6210     0.5332   3.040   0.0161 *
b            -0.7667     0.5037  -1.522   0.1664  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.142 on 8 degrees of freedom
Multiple R-Squared: 0.2246, Adjusted R-squared: 0.1277
F-statistic: 2.317 on 1 and 8 DF,  p-value: 0.1664

> output <- capture.output(print(smod))
> output
 [1] ""
 [2] "Call:"
 [3] "lm(formula = a ~ b)"
 [4] ""
 [5] "Residuals:"
 [6] "Min  1Q  Median  3Q Max "
 [7] "-1.7482 -0.5991  0.1211  0.8341  1.4975 "
 [8] ""
 [9] "Coefficients:"
[10] "Estimate Std. Error t value Pr(>|t|)  "
[11] "(Intercept)   1.6210 0.5332   3.040   0.0161 *"
[12] "b-0.7667 0.5037  -1.522   0.1664  "
[13] "---"
[14] "Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 "
[15] ""
[16] "Residual standard error: 1.142 on 8 degrees of freedom"
[17] "Multiple R-Squared: 0.2246,\tAdjusted R-squared: 0.1277 "
[18] "F-statistic: 2.317 on 1 and 8 DF,  p-value: 0.1664 "
[19] ""





   
 "David Lloyd" 
 <[EMAIL PROTECTED] 
 lloyd.com> To 
 Sent by:
 [EMAIL PROTECTED]  cc 
 at.math.ethz.ch   
   Subject 
   [R] (coxph,  se) Obtaining standard 
 16/08/2007 11:31  errors of coefficients from coxph   
   to store
   
   
   
   
   
   




Hi all,

I'm wanting to be able to find and store the z-score of coxph below: -

modz=coxph(Surv(TSURV,STATUS)~RAGE+DAGE+REG_WTIME_M+CLD_ISCH+POLY_VS,
data=kidneyT,method="breslow")


I know summary(modz) will give me this, but how do I extract the
standard error or z-score values in a similar way to obtaining the
coefficients with coef(modz)? I think it must be something to do with
modz$var but I'm having a complete mental blank.
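
A hedged sketch of one way to do this, using the modz fit above; the column
names are those reported by current versions of the survival package's
summary method and may differ in older releases:

library(survival)
se.coef <- sqrt(diag(modz$var))    # standard errors from the
                                   # variance-covariance matrix
z.score <- coef(modz) / se.coef    # Wald z-scores
ctab <- summary(modz)$coefficients # the same values, as a matrix
ctab[, "se(coef)"]
ctab[, "z"]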

I need this info so I can write a function to use within a bootstrap,
so that I can record the proportion of bootstrap resamples in which each
variable in the Cox PH model is actually significant.
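
A hypothetical sketch of that counting idea, resampling rows of kidneyT with
replacement; the number of resamples B, the 0.05 cut-off, and the resampling
scheme are assumptions rather than anything from the original post:

library(survival)
B <- 200
signif.count <- 0
for (b in seq_len(B)) {
  rows <- sample(nrow(kidneyT), replace=TRUE)
  fit <- coxph(Surv(TSURV,STATUS)~RAGE+DAGE+REG_WTIME_M+CLD_ISCH+POLY_VS,
               data=kidneyT[rows, ], method="breslow")
  z <- coef(fit) / sqrt(diag(fit$var))   # Wald z-scores for this resample
  p <- 2 * pnorm(-abs(z))                # two-sided p-values
  signif.count <- signif.count + (p < 0.05)
}
signif.count / B   # proportion of resamples in which each term is significant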

Any assistance is greatly appreciated

DL


 [[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.