date:20070518

Re: [R] R2 always increases as variables are added?

2007-05-18 Thread 李俊杰

I know that "-1" removes the intercept term. My question is why the
intercept term cannot be treated as just another variable when we place a
column of 1s in the predictor matrix.

If I insist on comparing a model with an intercept against one without on
the adjusted R-squared, I now think the strategy is to always use the other
definition of R-squared (or adjusted R-squared), in which
r-square=sum((y.hat)^2)/sum((y)^2).

Am I on the right track?

Thanks

Li Junjie
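
For reference, question (3) in the quoted post comes down to which total sum of squares is used. A minimal sketch, assuming simulated data similar to the quoted example (the seed and the names fit0, r2.uncentered and r2.centered are made up here): for a model fitted without an intercept, summary.lm() measures the fit against sum(y^2) rather than sum((y-mean(y))^2), so it matches the definition given in this message.

```r
set.seed(1)                               # arbitrary seed, only for reproducibility
x <- matrix(rnorm(20), ncol = 2)
y <- drop(x %*% c(1, 2)) + rnorm(10)

fit0  <- lm(y ~ x - 1)                    # model without an intercept
y.hat <- fitted(fit0)

# uncentered definition: r-square = sum(y.hat^2) / sum(y^2)
r2.uncentered <- sum(y.hat^2) / sum(y^2)
# centered definition used in the quoted post
r2.centered   <- sum((y.hat - mean(y))^2) / sum((y - mean(y))^2)
r2.reported   <- summary(fit0)$r.squared

all.equal(r2.uncentered, r2.reported)     # the reported value is the uncentered one
r2.centered                               # generally a different number
```

A model with an explicit column of 1s and "-1" in the formula is also treated as having no intercept, which would explain why lm(y~xx-1) and lm(y~x) report different values for the same fitted model.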


2007/5/19, Paul Lynch <[EMAIL PROTECTED]>:
>
> In case you weren't aware, the meaning of the "-1" in y ~ x - 1 is to
> remove the intercept term that would otherwise be implied.
> --Paul
>
> On 5/17/07, 李俊杰 <[EMAIL PROTECTED]> wrote:
> > Hi, everybody,
> >
> > 3 questions about R-square:
> > -(1)--- Does R2 always increase as variables are added?
> > -(2)--- Can R2 be greater than 1?
> > -(3)--- How is R2 in summary(lm(y~x-1))$r.squared
> > calculated? It is different from
> > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> >
> > I will illustrate these problems with the following code:
> > -(1)---  R2 doesn't always increase as variables are added
> >
> > > x=matrix(rnorm(20),ncol=2)
> > > y=rnorm(10)
> > >
> > > lm=lm(y~1)
> > > y.hat=rep(1*lm$coefficients,length(y))
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > [1] 2.646815e-33
> > >
> > > lm=lm(y~x-1)
> > > y.hat=x%*%lm$coefficients
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > [1] 0.4443356
> > >
> > > # This is the biggest model, but its R2 is not the biggest, why?
> > > lm=lm(y~x)
> > > y.hat=cbind(rep(1,length(y)),x)%*%lm$coefficients
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > [1] 0.2704789
> >
> >
> > -(2)---  R2 can be greater than 1
> >
> > > x=rnorm(10)
> > > y=runif(10)
> > > lm=lm(y~x-1)
> > > y.hat=x*lm$coefficients
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > [1] 3.513865
> >
> >
> >  -(3)--- How is R2 in summary(lm(y~x-1))$r.squared
> > calculated? It is different from
> > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > > x=matrix(rnorm(20),ncol=2)
> > > xx=cbind(rep(1,10),x)
> > > y=x%*%c(1,2)+rnorm(10)
> > > ### r2 calculated by lm(y~x)
> > > lm=lm(y~x)
> > > summary(lm)$r.squared
> > [1] 0.9231062
> > > ### r2 calculated by lm(y~xx-1)
> > > lm=lm(y~xx-1)
> > > summary(lm)$r.squared
> > [1] 0.9365253
> > > ### r2 calculated by me
> > > y.hat=xx%*%lm$coefficients
> > > (r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
> > [1] 0.9231062
> >
> >
> > Thanks a lot for any clue :)
> >
> >
> >
> >
> > --
> > Junjie Li,  [EMAIL PROTECTED]
> > Undergraduate in DEP of Tsinghua University,
> >
>
>
> --
> Paul Lynch
> Aquilent, Inc.
> National Library of Medicine (Contractor)
>



-- 
Junjie Li,  [EMAIL PROTECTED]
Undergraduate in DEP of Tsinghua University,


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] displaying intensity through opacity on an image (ONE SOLUTION)

2007-05-18 Thread Ranjan Maitra

Dear list, 

I did not get any response yet, but after looking around R and other things, I 
came up with something that works.

Basically, I use the rgb() function in R [though I could also use the hsv() 
function] to help me with the colormap.

Anyway, doing a help on rgb gives:

 This function creates "colors" corresponding to the given
 intensities (between 0 and 'max') of the red, green and blue
 primaries.

 An alpha transparency value can also be specified (0 means fully
 transparent and 'max' means opaque). If 'alpha' is not specified,
 an opaque colour is generated.

 The names argument may be used to provide names for the colors.

 The values returned by these functions can be used with a 'col='
 specification in graphics functions or in 'par'.

and later on.

 Semi-transparent colors ('0 < alpha < 1') are supported only on a
 few devices: at the time of writing only on the 'pdf' and (on
 MacOS X) 'quartz' devices.

The hsv() function has a similar point on semi-transparent colors.

Ok, looks promising: I don't use a Mac, and my potential journal does not 
accept .pdf, only .tiff or .eps, but we are not totally lost here.

So, I tried the following silly example in R:

> pdf()

> image(matrix(rep(1:5, 5), nrow = 5), col = gray(0:16/16))

> image(matrix(1:25, nrow = 5),
        col = rgb(rep(1, 15), green = 0, blue = 0, alpha = (1:15)/15),
        add = TRUE) # red with different opacities

> q()

(we are out of R).

And then look at the pdf file created: by default it is Rplots.pdf.

OK, now we can use gimp simply to convert this to .eps. Alternatively, on 
Linux, running pdftops and then ps2epsi on the result would also work.

Yippee! Isn't R wonderful??

Hope this helps: though others may have known about this before, I certainly 
did not know how to do this in R.
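
To see what the colormap above actually contains, here is a small sketch (the object name reds and the output file name are assumptions made for illustration). It builds the same 15 red levels, prints the first and last colour strings, and repeats the overlay on an explicitly named PDF file instead of the default Rplots.pdf.

```r
# fully red, with 15 opacity levels from 1/15 up to 1
reds <- rgb(rep(1, 15), green = 0, blue = 0, alpha = (1:15) / 15)
reds[c(1, 15)]   # "#FF000011" "#FF0000FF": the last two hex digits are the alpha

# the same overlay, written to a named file (file name is an assumption)
out <- file.path(tempdir(), "overlay.pdf")
pdf(out)
image(matrix(rep(1:5, 5), nrow = 5), col = gray(0:16 / 16))  # greyscale background
image(matrix(1:25, nrow = 5), col = reds, add = TRUE)        # red overlay
dev.off()
```

Calling dev.off() closes the file without leaving R, so the session can continue after the plot is written.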

Best wishes,
Ranjan

On Thu, 17 May 2007 19:16:18 -0500 Ranjan Maitra <[EMAIL PROTECTED]> wrote:

> Dear colleagues,
> 
> I have an image which I can display in the greyscale using image. On this 
> image, for some pixels, which I know, I want to display their activity based 
> on a third measure. One way to do that would be to color these differently, 
> and use an opacity measure to display the third measure. An example of what I 
> am trying to do is at:
> 
> http://www.public.iastate.edu/~maitra/papers/mrm02.pdf
> 
> page 26, for instance. There are two different kinds of voxels, given by 
> greens and red. At the low end, there is transparency on the red scale and at 
> the upper end there is opacity in the red and the green. 
> 
> A simpler example involving only one kind of voxel is on page 24 of the same 
> paper. Either way, that figure was done using Matlab, but I was wondering how 
> to do this using R.
> 
> Any suggestions, please?
> 
> Many thanks and best wishes,
> Ranjan 
>


Re: [R] my ugly apply/sweep code needs help

2007-05-18 Thread Gabor Grothendieck

Please include test data in your posts.  We define
sweep.med to perform the sweep on an entire matrix.  Then
we lapply f over group.sel where f(g) combines a column of all
g with sweep.med applied to the submatrix of data.mat whose
rows correspond to group.vec of g.

sweep.median2 <- function(data.mat, group.vec, group.sel) {
   # subtract each column's median from that column
   sweep.med <- function(x) sweep(x, 2, apply(x, 2, median))
   # for group g: a column of g bound to the centered rows of that group
   f <- function(g) cbind(g+0, sweep.med(data.mat[group.vec == g,,drop = FALSE ]))
   do.call(rbind, lapply(group.sel, f))
}

# test (sweep.median is the function from the original post, quoted below)
data.mat <- matrix(1:24, 6)
group.sel <- 1:2
group.vec <- rep(1:3, 2)

sweep.median(data.mat, group.vec, group.sel)
sweep.median2(data.mat, group.vec, group.sel)
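
As a possible alternative, here is a sketch that avoids splitting the matrix at all. It is not the approach used above: it relies on ave() to compute the group median for every element of a column, and the helper name center.by.group is hypothetical. Unlike sweep.median2, it keeps the rows in their original order rather than stacking them group by group.

```r
# hypothetical helper: center every column of data.mat on its group median
center.by.group <- function(data.mat, group.vec, group.sel) {
  # ave() returns, for each element, the median of that element's group
  centered <- apply(data.mat, 2,
                    function(col) col - ave(col, group.vec, FUN = median))
  keep <- group.vec %in% group.sel        # rows belonging to the selected groups
  cbind(group.vec[keep], centered[keep, , drop = FALSE])
}

data.mat  <- matrix(as.numeric(1:24), 6)
group.vec <- rep(1:3, 2)
res <- center.by.group(data.mat, group.vec, group.sel = 1:2)
res
```

With this test matrix each group has two rows that differ by 3 in every column, so every centered value is either -1.5 or 1.5.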


On 5/18/07, Tyler Smith <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a matrix of data from several groups. I need to center the
> data by group, subtracting the group median from each value, initially
> for two groups at a time. I have a working function to do this, but it
> looks quite inelegant. There must be a more straightforward way to do
> this, but I always get tangled up in apply/sweep/subset
> operations. Any suggestions welcome!
>
> Thanks,
>
> Tyler
>
> My code:
>
> Notes: data.mat is an nxm matrix of data. group.vec is a vector of
> length n with grouping factors. group.sel is a vector of length 2 of
> the groups to include in the analysis.
>
> sweep.median <- function (data.mat, group.vec, group.sel) {
>
>  data.sub1 <- data.mat[group.vec %in% group.sel[1],]
>  data.sub2 <- data.mat[group.vec %in% group.sel[2],]
>
>  data.sub1.med <- apply(data.sub1, MAR=2, median)
>  data.sub1.cent <- sweep(data.sub1, MARGIN=2, data.sub1.med)
>
>  data.sub2.med <- apply(data.sub2, MAR=2, median)
>  data.sub2.cent <- sweep(data.sub2, MARGIN=2, data.sub2.med)
>
>  data.comb <- rbind(data.sub1.cent, data.sub2.cent)
>  data.comb <- cbind(c(rep(group.sel[1],nrow(data.sub1.cent)),
>   rep(group.sel[2],nrow(data.sub2.cent))),
> data.comb)
>
>  return(data.comb)
> }
>

Re: [R] graphics on Ubuntu

2007-05-18 Thread Dirk Eddelbuettel

On 18 May 2007 at 20:25, Erin Hodgess wrote:
| Dear R People:
| 
| I'm working with R on the latest version of Ubuntu.
| 
| However, I can't get graphics to appear, even with the
| simplest plot commands.

What does this show for you:

> capabilities()["X11"]
 X11 
TRUE 
> 

If you get 'FALSE', and by chance you built this yourself, then you probably
omitted to install the X11 development packages, and overlooked the warning
that configure gave you.  

You could consider installing the prebuilt Ubuntu binaries that are provided
via CRAN and its mirrors; see the R FAQ.
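
While sorting out X11, a file device can confirm that plotting itself works. A minimal sketch, assuming only base R; the names has.x11 and out and the file name are made up for illustration:

```r
# TRUE only if the X11 device is available in this R session
has.x11 <- isTRUE(unname(capabilities("X11")))
has.x11

# a file device does not need X11, so this should work either way
out <- file.path(tempdir(), "test-plot.pdf")   # file name is an assumption
pdf(out)
plot(1:10)
dev.off()
file.exists(out)
```

If the PDF is written but no window ever appears, the problem is likely limited to the X11 device rather than to the plotting code.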

Hth, Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison


[R] penalized maximum likelihood estimator

2007-05-18 Thread rupos sujon

Dear R helpers,
Is there any package in R in which I can find a
penalized maximum likelihood estimator with a beta pdf
for the generalized extreme value distribution? I tried
to find such a package on CRAN but have not found one
yet. I did find some penalized functions, but
they are not related to my work. If possible, please help
me find such a package. Thanks,
S.Murshed


   

[R] graphics on Ubuntu

2007-05-18 Thread Erin Hodgess

Dear R People:

I'm working with R on the latest version of Ubuntu.

However, I can't get graphics to appear, even with the
simplest plot commands.

Has anyone run into that, please?

I'm using R-2.5.0 on Ubuntu Feisty Fawn.

(Please don't puke.  That's really the name)

Thanks in advance,
Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]


Re: [R] How can we add a legend to a set of graphs?

2007-05-18 Thread Duncan Murdoch

On 18/05/2007 7:33 PM, Judith Flores wrote:
> Hi,
> 
>I have a set of 4 graphs and I need to add a legend
> that is shared by those 4 graphs. This is what I
> tried:
> 
>> locator(1) # I placed the cursor in the center of the
> 4 graphs
> $x
> [1] 9.299001
> 
> $y
> [1] 226.3201
> 
> 
>> legend(9.3,226.3,"and the rest of the legend
> arguments")# but the legend didn't show.
> 
> The legend only appears when I place it inside any
> of the four plots. How can I place it outside these
> plots, in the center?

RSiteSearch("legend outside")

suggests using par(xpd=TRUE).
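
One way to get a single legend centred over all four plots, sketched under assumptions: instead of placing the legend in one panel's coordinates with par(xpd=TRUE), this variant overlays an empty plot that spans the whole device and draws the legend there. The panel contents, legend labels and output file name are made up for illustration.

```r
out <- file.path(tempdir(), "shared-legend.pdf")   # file name is an assumption
pdf(out)

# four panels with made-up data
par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(1:10, (1:10) * i, type = "l", col = "blue", main = paste("Panel", i))
  lines(1:10, rev(1:10) * i, col = "red")
}

# overlay an invisible plot covering the whole device, then add the legend to it
par(fig = c(0, 1, 0, 1), oma = c(0, 0, 0, 0), mar = c(0, 0, 0, 0), new = TRUE)
plot(0, 0, type = "n", bty = "n", xaxt = "n", yaxt = "n", xlab = "", ylab = "")
legend("center", legend = c("series A", "series B"),
       col = c("blue", "red"), lty = 1, bg = "white")

dev.off()
```

Because the legend is positioned by keyword in the overlay plot, no locator() coordinates are needed.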

Duncan Murdoch


[R] my ugly apply/sweep code needs help

2007-05-18 Thread Tyler Smith

Hi,

I have a matrix of data from several groups. I need to center the
data by group, subtracting the group median from each value, initially
for two groups at a time. I have a working function to do this, but it
looks quite inelegant. There must be a more straightforward way to do
this, but I always get tangled up in apply/sweep/subset
operations. Any suggestions welcome!

Thanks,

Tyler

My code:

Notes: data.mat is an nxm matrix of data. group.vec is a vector of
length n with grouping factors. group.sel is a vector of length 2 of
the groups to include in the analysis.

sweep.median <- function (data.mat, group.vec, group.sel) {

  data.sub1 <- data.mat[group.vec %in% group.sel[1],]
  data.sub2 <- data.mat[group.vec %in% group.sel[2],]

  data.sub1.med <- apply(data.sub1, MAR=2, median)
  data.sub1.cent <- sweep(data.sub1, MARGIN=2, data.sub1.med)

  data.sub2.med <- apply(data.sub2, MAR=2, median)
  data.sub2.cent <- sweep(data.sub2, MARGIN=2, data.sub2.med)

  data.comb <- rbind(data.sub1.cent, data.sub2.cent)
  data.comb <- cbind(c(rep(group.sel[1],nrow(data.sub1.cent)),
   rep(group.sel[2],nrow(data.sub2.cent))),
 data.comb)

  return(data.comb)
}


[R] How can we add a legend to a set of graphs?

2007-05-18 Thread Judith Flores

Hi,

   I have a set of 4 graphs and I need to add a legend
that is shared by those 4 graphs. This is what I
tried:

>locator(1) # I placed the cursor in the center of the
4 graphs
$x
[1] 9.299001

$y
[1] 226.3201


>legend(9.3,226.3,"and the rest of the legend
arguments")# but the legend didn't show.

The legend only appears when I place it inside any
of the four plots. How can I place it outside these
plots, in the center?

   Your help will be very much appreciated.

Sincerely,

Judith 


   

Re: [R] BATCH

2007-05-18 Thread Stefan Grosse

It states that it cannot find the object

donParEssai


which is not there. It is not there because you gave the object the
name

donParCara

from what I see... 



elyakhlifi mustapha wrote:
> hello,
> I tried to run programs in BATCH as you told me, but reading the results 
> is a little hard.
> To read the results it is important to write an outfile first,
> but when I do this I still get the same answer:
>
> R version 2.4.1 (2006-12-18)
> Copyright (C) 2006 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> [Previously saved workspace restored]
>   
>> # program to compute the characters
>>
>> donParCara <- read.table("C:/Documents and Settings/melyakhlifi/Mes documents/feuilles excel/copi_donnees6.csv",header=TRUE,sep=";",quote="",dec=",")
>> #print(subset(donParCara, select = c(Id_Cara,Form_C,X)))
>>
>>
>>
>> # observed values for the observed characters
>>
>> C103 <- as.numeric(as.character(subset(donParEssai, Id_Essai == 1006961 & 
>> Id_Cara == 103, c(Val_O,Surf_O))[,1]))
>> 
> Error in subset(donParEssai, Id_Essai == 1006961 & Id_Cara == 103, 
> c(Val_O,  : 
>  object "donParEssai" not found
> Execution halted
>
> I think that it's a problem with the options, but I don't know how to 
> correct the errors.
>
>
>   

Re: [R] Trouble compiling XML package [Broadcast]

2007-05-18 Thread Wiener, Matthew

Oops - I see I was unclear.  I'm using R-2.4.1 to install; 1.9.0 is the
version of XML I was trying to load.  I'll try your change - I'm really
using this for reading XML.

Thanks,
Matt

-Original Message-
From: Duncan Temple Lang [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 18, 2007 4:18 PM
To: Wiener, Matthew
Cc: R-help
Subject: Re: [R] Trouble compiling XML package [Broadcast]



Wiener, Matthew wrote:
> Dear Prof. Lang  - 
> 
> I am trying to install the XML library on a 64-bit SUSE linux system
> (version info below) running 2.4.1.
> 
> I have gcc version 3.3.3, and libxml2 version 2.6.7.  I know this is not
> current, but I'm on a machine used and administered by others, and
> updating libxml2 would require updating libc, and things get pretty
> complicated from there.
> 
> Trying to install through R (1.9-0), 

Wow, that's really old!


> I eventually get the error:
> XMLTree.c: In function `xmlBufferWrite': 
> XMLTree.c:729: error: void value not ignored as it ought to be 
> make: *** [XMLTree.o] Error 1 
> 


You might try changing the routine xmlBufferWrite to

static int
xmlBufferWrite (void * context, const char * buffer, int len) {
  xmlBufferAdd((xmlBufferPtr) context, (const xmlChar *) buffer,len);
  return(len);
}

That will work if there are no errors when adding to the buffer.
But this is used when generating XML in a particular way, i.e. using
internal nodes.  So let's just hope that you don't invoke
this code and if you do, that there are no errors.


> I manually downloaded version 1.8-0 and got the same problem.  I took a
> look at that part of the code, but do not understand enough to start
> tinkering with it.  I was able to install an earlier version a couple of
> years ago, and it was extremely useful (thanks!) but the relevant
> machine has been decommissioned.
> 
> Can you make any suggestions about which component of my system this
> might indicate needs to be changed?  I checked the mailing list
> archives, but didn't find anything.  I'm hoping there's an alternative
> to changing libxml2, with all the cascading requirements that would
> bring (and no guarantee, with what I know now, that that's the problem).
> 
> Thanks,
> 
> Matt Wiener
> 
>> version
>                _
> platform       x86_64-unknown-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          4.1
> year           2006
> month          12
> day            18
> svn rev        40228
> language       R
> version.string R version 2.4.1 (2006-12-18)
> 

Re: [R] Trouble compiling XML package

2007-05-18 Thread Duncan Temple Lang



Wiener, Matthew wrote:
> Dear Prof. Lang  - 
> 
> I am trying to install the XML library on a 64-bit SUSE linux system
> (version info below) running 2.4.1.
> 
> I have gcc version 3.3.3, and libxml2 version 2.6.7.  I know this is not
> current, but I'm on a machine used and administered by others, and
> updating libxml2 would require updating libc, and things get pretty
> complicated from there.
> 
> Trying to install through R (1.9-0), 

Wow, that's really old!


> I eventually get the error:
> XMLTree.c: In function `xmlBufferWrite': 
> XMLTree.c:729: error: void value not ignored as it ought to be 
> make: *** [XMLTree.o] Error 1 
> 


You might try changing the routine xmlBufferWrite to

static int
xmlBufferWrite (void * context, const char * buffer, int len) {
  xmlBufferAdd((xmlBufferPtr) context, (const xmlChar *) buffer,len);
  return(len);
}

That will work if there are no errors when adding to the buffer.
But this is used when generating XML in a particular way, i.e. using
internal nodes.  So let's just hope that you don't invoke
this code and if you do, that there are no errors.


> I manually downloaded version 1.8-0 and got the same problem.  I took a
> look at that part of the code, but do not understand enough to start
> tinkering with it.  I was able to install an earlier version a couple of
> years ago, and it was extremely useful (thanks!) but the relevant
> machine has been decommissioned.
> 
> Can you make any suggestions about which component of my system this
> might indicate needs to be changed?  I checked the mailing list
> archives, but didn't find anything.  I'm hoping there's an alternative
> to changing libxml2, with all the cascading requirements that would
> bring (and no guarantee, with what I know now, that that's the problem).
> 
> Thanks,
> 
> Matt Wiener
> 
>> version
>                _
> platform       x86_64-unknown-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          4.1
> year           2006
> month          12
> day            18
> svn rev        40228
> language       R
> version.string R version 2.4.1 (2006-12-18)
> 

[R] help with macros

2007-05-18 Thread Dony Henry Clavel Quijada

I need someone to send me, as soon as possible, the macro that fits the 
generalized linear model 'glm' for a binomial family, with the probit and the 
complementary log-log link functions. I already have the macro for the logit 
link function; it is:

# logit macro
macro.logit <- function( y, m, X, b.0, mu, niter )
{
# Initializations (BOPTIM = B0 or ) MU = y/m C5=g(y/m)=ETA
# y: observed numbers of occurrences
# Take MU=PI, i.e. Y/M follows B(M,PI)/M
# Recall V(y(NITER))=MUx(1-MU)/m
# X is the design matrix
c1 <- y/m
mu <- c1
c5 <- log(c1/(1-c1)) # C5 = g(y/M) LINK FUNCTION
for (i in 1:niter)
{
c1 <- mu # C1 = MU
c2 <-((1/c1)*(1/(1-c1)))^2   # C2 = g'(C1)**2 (g'(MU)**2)
c3 <-c1*(1-c1)*c2/m  # C3 = var(C1)*C2 = (C1(1-C1)/M)*C2 (= 1/wkk)
# Depends on M and MU !!!
c4 <- 1/c3 # C4 = 1/C3 (=wkk) (diagonal of W)
M6 <- as.matrix(diag(c4)) # M6 = W
c6 <- c5 # C6 = ETA (C5)
c7 <- (1/c1)*(1/(1-c1)) # C7 = g'(C1) (= g'(MU))
c8 <- y
c8 <- (c8/m)-c1# C8 = y/m-MU
c9 <- c6+c7*c8 # C9 = C6 + C7*C8 (z=ETA+g'(MU)*(Y/M-MU))
M11 <- t(X)%*%M6 %*% as.vector(c9) # M11 = XTWz
M12 <- t(X)%*%M6 %*% X # M12 = XTWX
M13 <- solve( M12 ) # M13 = (XTWX)-1
# browser() # > n # to run the next command line
b.fi <- as.vector( M13 %*% M11 ) # BOPTIM = (XTWX)-1 XTWz
c5 <- as.vector(X %*% b.fi) # C5 = ETA = X*BOPTIM (=Xb)
mu <- exp(c5)/(1+exp(c5)) # MU = LINK-1(C5)
}
list(estimadors=b.fi,prediccions=mu*m)
}

b.0 <- as.vector(c(0,0))
mu <- as.vector(rep(0.1,6))
X <- as.matrix(data.frame(dobson$uns,dobson$log.x))
niter<-30
exe1 <- macro.logit( dobson$y, dobson$m, X, b.0, mu, 30 )
exe1

# THE DIRECT INSTRUCTION IN R IS:
rexe1 <- glm(cbind(y,m-y)~log.x,data=dobson, family=binomial)
rexe1


Well, this is an example of what I want. I will be infinitely grateful.
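
For the two other links, a rough sketch under stated assumptions. The function macro.link is hypothetical and is not the poster's macro: it generalizes the same iteratively reweighted least squares steps by taking the link function, its inverse and the derivative d(mu)/d(eta) from make.link(). The data are made up for illustration and are not the dobson data. The direct glm() calls use family = binomial(link = "probit") and binomial(link = "cloglog").

```r
# hypothetical generalization of macro.logit: IRLS for any binomial link
macro.link <- function(y, m, X, link = "probit", niter = 30) {
  lk  <- make.link(link)                # supplies linkfun, linkinv and mu.eta
  mu  <- (y + 0.5) / (m + 1)            # starting values kept away from 0 and 1
  eta <- lk$linkfun(mu)
  for (i in seq_len(niter)) {
    d   <- lk$mu.eta(eta)               # d(mu)/d(eta) = 1 / g'(mu)
    w   <- m * d^2 / (mu * (1 - mu))    # diagonal of W
    z   <- eta + (y / m - mu) / d       # working response z
    b   <- solve(t(X) %*% (w * X), t(X) %*% (w * z))   # (XTWX)^-1 XTWz
    eta <- drop(X %*% b)
    mu  <- lk$linkinv(eta)
  }
  list(estimadors = drop(b), prediccions = mu * m)
}

# made-up dose-response data, for illustration only
log.x <- c(0, 0.5, 1, 1.5, 2, 2.5)
m     <- rep(50, 6)
y     <- c(3, 8, 17, 28, 40, 46)
X     <- cbind(1, log.x)

exe.probit  <- macro.link(y, m, X, link = "probit")
exe.cloglog <- macro.link(y, m, X, link = "cloglog")

# the direct instructions in R (tight tolerance so the two can be compared)
ctrl         <- glm.control(epsilon = 1e-12, maxit = 100)
rexe.probit  <- glm(cbind(y, m - y) ~ log.x,
                    family = binomial(link = "probit"), control = ctrl)
rexe.cloglog <- glm(cbind(y, m - y) ~ log.x,
                    family = binomial(link = "cloglog"), control = ctrl)

rbind(macro = exe.probit$estimadors,  glm = coef(rexe.probit))
rbind(macro = exe.cloglog$estimadors, glm = coef(rexe.cloglog))
```

Taking the derivative from make.link() means the same loop should serve any link that make.link() supports, at the cost of hiding the explicit formulas that the logit macro spells out.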


Re: [R] partial correlation significance

2007-05-18 Thread gatemaze

On 18/05/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Among the many (5) methods that I found on the list for computing partial
> correlations, the following two that I looked at give me different
> t-values. Does anyone have any clue why that is? The source code is
> below. Thanks.
>
> pcor3 <- function (x, test = T, p = 0.05) {
>   nvar <- ncol(x)
>   ndata <- nrow(x)
>   conc <- solve(cor(x))
>   resid.sd <- 1/sqrt(diag(conc))
>   pcc <- -sweep(sweep(conc, 1, resid.sd, "*"), 2, resid.sd, "*")
>   #colnames(pcc) <- rownames(pcc) <- colnames(x)
>   if (test) {
>     t.df <- ndata - nvar
>     t <- pcc/sqrt((1 - pcc^2)/t.df)
>     print(t);
>     # two-sided test: compare |t| with the critical value
>     pcc <- list(coefs = pcc, sig = abs(t) > qt(1 - (p/2), df = t.df))
>   }
>   return(pcc)
> }
>
>
> pcor4 <- function(x, y, z) {
>   return(cor.test (lm(x~z)$resid,lm(y~z)$resid));
> }
>
>

Just to reply to my own question, since I found the answer: the difference is
in the degrees of freedom. The variable t.df in pcor3 is smaller than the df
used for the test in pcor4, and how much smaller depends on the number of
variables used in the partial correlation.
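
A small sketch of that difference, assuming simulated data with three variables (the seed, sample size and object names are made up for illustration). Both routes give the same partial correlation; only the degrees of freedom behind the t-value differ.

```r
set.seed(42)                      # arbitrary seed; simulated data only
n <- 30
z <- rnorm(n)
x <- z + rnorm(n)
y <- z + rnorm(n)

# partial correlation of x and y given z, computed both ways
conc  <- solve(cor(cbind(x, y, z)))
r.inv <- -conc[1, 2] / sqrt(conc[1, 1] * conc[2, 2])   # as in pcor3
ct    <- cor.test(resid(lm(x ~ z)), resid(lm(y ~ z)))  # as in pcor4
r.res <- unname(ct$estimate)

t.pcor3 <- r.inv / sqrt((1 - r.inv^2) / (n - 3))       # df = ndata - nvar = n - 3
t.pcor4 <- unname(ct$statistic)                        # cor.test uses df = n - 2

all.equal(r.inv, r.res)                                # same coefficient
all.equal(t.pcor4 / t.pcor3, sqrt((n - 2) / (n - 3)))  # t differs only through df
```

Since cor.test() does not know that one variable was already partialled out of the residuals, the pcor3 choice of n minus the number of variables is the usual degrees of freedom for a partial correlation.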


Re: [R] How to extract R codes that embedded in a HTML file

2007-05-18 Thread Tao Shi


Works perfectly!

Thank you very much, Fritz!

Tao



From: Friedrich Leisch <[EMAIL PROTECTED]>
To: Tao Shi <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
[EMAIL PROTECTED], r-help@stat.math.ethz.ch

Subject: Re: [R] How to extract R codes that embedded in a HTML file
Date: Fri, 18 May 2007 15:14:48 +0200


  >  Original Message 
  > Subject: Re: [R] How to extract R codes that embedded in a HTML file
  > using Stangle?
  > Date: Thu, 17 May 2007 17:01:30 +
  > From: Tao Shi <[EMAIL PROTECTED]>
  > To: [EMAIL PROTECTED], [EMAIL PROTECTED]
  > CC: r-help@stat.math.ethz.ch

  > Hi Uwe,

  > Thanks for the answer, but I still need a bit more clarification.  I
  > always thought that a .rnw or .snw file is a file mixing word-processing
  > markup (e.g. tex or HTML) and R/S code using noweb syntax.  Is the reason
  > 'Stangle' does not work with an .rnw file containing HTML that there is no
  > proper driver available (like the RweaveHTML driver for Sweave)?  If yes,
  > does the R2HTML package have plans to provide such a driver?

  > Tao


The following does the trick for me:

R> mytangle <- function ()
  list(setup = RtangleSetup, runcode = utils:::RtangleRuncode,
   writedoc = RtangleWritedoc,
   finish = utils:::RtangleFinish, checkopts = RweaveHTMLOptions)

R> Stangle("/PATH/TO/R/SITE-LIBRARY/R2HTML/samples/example1.snw",
   driver=mytangle)
Writing to file example1.R


Best,
Fritz





Re: [R] {10,20,30}>={25,30,15}

2007-05-18 Thread Kyle.

Van---

Perhaps I'm misunderstanding your question, but in a null hypothesis  
framework, the only conclusion you can draw from failing to reject  
the null hypothesis is that, based on your observed data,  you were  
unable to conclude that your null hypothesis was false.  Put another  
way, the correct conclusion for both of your hypothesis tests is  
"inconclusive."

Kyle H. Ambert
Graduate Student, Dept. Behavioral Neuroscience
Oregon Health & Science University
[EMAIL PROTECTED]

On May 18, 2007, at 11:07 AM, [EMAIL PROTECTED] wrote:

> Hi There,
>
> Using t.test to test hypothesis about which one is greater, A or B?
> where A={10,20,30},B={25,30,15}.
>
> My question is which of the following conclusions is right?
>
> #hypothesis testing 1
>
> h0: A greater than or equal to B
> h1: A less than B
>
> below is splus code
> A=c(10,20,30)
> B=c(25,30,15)
> t.test(c(10,20,30),c(25,30,15),alternative="less")
>
> output:
> p-value=0.3359
>
> because p-value is not less than alpha (0.05), we
> cannot reject h0.
>
> so A greater than or equal to B.
>
>
> #hypothesis testing 2
>
> h0: A less than or equal to B
> h1: A greater than B
>
> below is splus code
>
> A=c(10,20,30)
> B=c(25,30,15)
> t.test(c(10,20,30),c(25,30,15),alternative="greater")
>
> output:
> p-value=0.6641
>
> because p-value is not less than alpha (0.05), we
> cannot reject h0.
>
> so A less than or equal to B.
> #
>
> Thank you very much.
> Van
>

Re: [R] {10,20,30}>={25,30,15}

2007-05-18 Thread francogrex


At the alpha level you set, the data do not show A to be either greater or
less than B. Supposing you don't use a paired t.test, the two-sided test gives:
data:  c(10, 20, 30) and c(25, 30, 15) 
t = -0.4588, df = 3.741, p-value = 0.6717
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -24.06675  17.40008 
sample estimates:
mean of x mean of y 
 20.0  23.3 

The output tells you what the alternative hypothesis is, and hence what the null is.
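
The three tests discussed in this thread can be run side by side. A minimal sketch using the data from the question (the names p.less, p.greater and p.two are made up here); R's t.test() defaults to the unpaired Welch test, as in the output above.

```r
A <- c(10, 20, 30)
B <- c(25, 30, 15)

p.less    <- t.test(A, B, alternative = "less")$p.value     # h1: A less than B
p.greater <- t.test(A, B, alternative = "greater")$p.value  # h1: A greater than B
p.two     <- t.test(A, B)$p.value                           # h1: means differ

round(c(less = p.less, greater = p.greater, two.sided = p.two), 4)
```

The two one-sided p-values always sum to 1, so a large p-value in one direction cannot be read as evidence for the opposite null; none of the three tests rejects here, which is the "inconclusive" outcome described in this thread.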



genomenet wrote:
> 
> Hi There,
> Using t.test to test hypothesis about which one is greater, A or B?
> where A={10,20,30},B={25,30,15}.
> My question is which of the following conclusions is right?
> #hypothesis testing 1
> h0: A greater than or equal to B
> h1: A less than B
> below is splus code
> A=c(10,20,30)
> B=c(25,30,15)
> t.test(c(10,20,30),c(25,30,15),alternative="less")
> output:
> p-value=0.3359
> because p-value is not less than alpha (0.05), we
> cannot reject h0.
> so A greater than or equal to B.
> #hypothesis testing 2
> h0: A less than or equal to B
> h1: A greater than B
> below is splus code
> A=c(10,20,30)
> B=c(25,30,15)
> t.test(c(10,20,30),c(25,30,15),alternative="greater")
> output:
> p-value=0.6641
> because p-value is not less than alpha (0.05), we
> cannot reject h0.
> so A less than or equal to B.
> #
> Thank you very much.
> Van
> 


[R] partial correlation significance

2007-05-18 Thread gatemaze

Hi,

Among the many (5) methods for computing partial correlations that I found on
the list, the following two give me different t-values. Does anyone have any
clue why that is? The source code is below. Thanks.

pcor3 <- function (x, test = TRUE, p = 0.05) {
  nvar <- ncol(x)
  ndata <- nrow(x)
  # Concentration (inverse correlation) matrix
  conc <- solve(cor(x))
  resid.sd <- 1/sqrt(diag(conc))
  # Partial correlation of each pair given all the other variables
  pcc <- -sweep(sweep(conc, 1, resid.sd, "*"), 2, resid.sd, "*")
  #colnames(pcc) <- rownames(pcc) <- colnames(x)
  if (test) {
    t.df <- ndata - nvar
    t <- pcc/sqrt((1 - pcc^2)/t.df)
    print(t)
    # Two-sided test, so compare |t| with the critical value
    pcc <- list(coefs = pcc, sig = abs(t) > qt(1 - (p/2), df = t.df))
  }
  return(pcc)
}


pcor4 <- function(x, y, z) {
  return(cor.test(lm(x~z)$resid,lm(y~z)$resid));
}
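
One possible explanation, offered as a sketch rather than a confirmed
diagnosis: both approaches give the same partial correlation, but pcor3 uses
n - nvar degrees of freedom while cor.test() on the residuals uses n - 2, so
the t-values differ by a constant factor. The simulated data and the names
below (m, r_inv, r_res, t_pcor3, t_pcor4) are made up for illustration.

```r
set.seed(1)
n <- 50
z <- rnorm(n)
x <- z + rnorm(n)
y <- z + rnorm(n)
m <- cbind(x, y, z)

# Partial correlation of x and y given z, from the inverse correlation matrix
# (the pcor3 approach).
conc  <- solve(cor(m))
r_inv <- -conc[1, 2] / sqrt(conc[1, 1] * conc[2, 2])

# The same quantity from regression residuals (the pcor4 approach).
rx <- resid(lm(x ~ z))
ry <- resid(lm(y ~ z))
r_res <- cor(rx, ry)

# t-values: pcor3 divides by n - nvar, cor.test() by n - 2.
t_pcor3 <- r_inv * sqrt((n - ncol(m)) / (1 - r_inv^2))
t_pcor4 <- unname(cor.test(rx, ry)$statistic)

c(r_inv = r_inv, r_res = r_res, t_pcor3 = t_pcor3, t_pcor4 = t_pcor4)
```

With three variables the ratio of the two t-values is sqrt((n - 3)/(n - 2)),
because cor.test() does not know that one degree of freedom was already spent
on z.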


[R] Trouble compiling XML package

2007-05-18 Thread Wiener, Matthew


Dear Prof. Lang  - 

I am trying to install the XML library on a 64-bit SUSE linux system
(version info below) running R 2.4.1.

I have gcc version 3.3.3, and libxml2 version 2.6.7.  I know this is not
current, but I'm on a machine used and administered by others, and
updating libxml2 would require updating libc, and things get pretty
complicated from there.

Trying to install the package (version 1.9-0) through R, I eventually get the error:
XMLTree.c: In function `xmlBufferWrite': 
XMLTree.c:729: error: void value not ignored as it ought to be 
make: *** [XMLTree.o] Error 1 

I manually downloaded version 1.8-0 and got the same problem.  I took a
look at that part of the code, but do not understand enough to start
tinkering with it.  I was able to install an earlier version a couple of
years ago, and it was extremely useful (thanks!) but the relevant
machine has been decommissioned.

Can you make any suggestions about which component of my system this
might indicate needs to be changed?  I checked the mailing list
archives, but didn't find anything.  I'm hoping there's an alternative
to changing libxml2, with all the cascading requirements that would
bring (and no guarantee, with what I know now, that that's the problem).

Thanks,

Matt Wiener

> version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R
version.string R version 2.4.1 (2006-12-18)


[R] {10,20,30}>={25,30,15}

2007-05-18 Thread genomenet

Hi There,

Using t.test to test hypothesis about which one is greater, A or B?
where A={10,20,30},B={25,30,15}.

My question is which of the following conclusions is right?

#hypothesis testing 1

h0: A greater than or equal to B
h1: A less than B

below is splus code
A=c(10,20,30)
B=c(25,30,15)
t.test(c(10,20,30),c(25,30,15),alternative="less")

output:
p-value=0.3359

because p-value is not less than alpha (0.05), we
cannot reject h0.

so A greater than or equal to B.


#hypothesis testing 2

h0: A less than or equal to B
h1: A greater than B

below is splus code

A=c(10,20,30)
B=c(25,30,15)
t.test(c(10,20,30),c(25,30,15),alternative="greater")

output:
p-value=0.6641

because p-value is not less than alpha (0.05), we
cannot reject h0.

so A less than or equal to B.
#

Thank you very much.
Van


Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread Gabor Grothendieck

On 5/18/07, jiho <[EMAIL PROTECTED]> wrote:
> On 2007-May-18  , at 18:21 , Gabor Grothendieck wrote:
> > In particular, we can use "[" directly instead of subset.  This is the
> > same as your function except for the line marked ### :
> >
> > myfun2 <- function() {
> >   foo = data.frame(1:10,10:1)
> >   foos = list(foo)
> >   fooCollumn=2
> >   cFoo = lapply(foos, "[", fooCollumn) ###
> >   return(cFoo)
> > }
> > myfun2() # test
> >
> > On 5/18/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> >> You need to study carefully what the semantics of 'subset' are.  The
> >> function body of myfun is not in the evaluation environment.  (The
> >> issue
> >> is 'subset', not 'lapply': select is an *expression* and not a
> >> value.)
> >>
> >> Hint: using subset() programmatically is almost always a mistake.
> >> R's
> >> subsetting function is '[': subset is a convenience wrapper.
>
> Thank you very much. Indeed it is much better this way. I got used to
> subset for data.frames because [ does not work with negative named
> arguments while select does. E.g.:
>    x[,-c("name1","name2")]
> does not work while
>    subset(x,select=-c("name1","name2"))
> works (it eliminates columns named name1 and name2 from x). But I
> guess in most cases another syntax can achieve the same thing with
> [, like:
>    x[,-which(names(x)%in%c("name1","name2"))]
> it's just a little less clear.

which is not needed.  Using builtin CO2:

  CO2[ ! names(CO2) %in% c("Type", "conc") ]

or

  CO2[ setdiff(names(CO2), c("Type", "conc")) ]
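
As a quick check of the two forms above, the following sketch drops the same
columns both ways and also shows why the negative-name attempt fails. The
helper names unwanted, neg_attempt, kept_logical and kept_setdiff are only for
illustration.

```r
# CO2 ships with R (datasets package); its columns are
# Plant, Type, Treatment, conc, uptake.
unwanted <- c("Type", "conc")

# Unary minus is not defined for character vectors, so this raises an error,
# which is captured here as a message string instead of stopping the script.
neg_attempt <- tryCatch(CO2[, -unwanted],
                        error = function(e) conditionMessage(e))

# Two equivalent ways to drop columns by name with "[".
kept_logical <- CO2[!names(CO2) %in% unwanted]
kept_setdiff <- CO2[setdiff(names(CO2), unwanted)]

names(kept_logical)
names(kept_setdiff)
```

Both forms also behave sensibly when none of the unwanted names is present,
whereas -which(...) with an empty match would select no columns at all.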


Re: [R] How to repress the annoying complains from X window system

2007-05-18 Thread Marc Schwartz

You might want to find out how R was installed and suggest that he/she
install the latest version of R. 

If from a pre-compiled binary, the update will likely be 2.5.0 and if
from source, be sure that they install using the latest R 2.5.0 Patched
source tarball from:

  ftp://ftp.stat.math.ethz.ch/Software/R/

In addition, have them run, as 'root', update.packages() from within an R
session started using:

  R --vanilla

to update any installed packages that might be causing a conflict.

HTH,

Marc

On Fri, 2007-05-18 at 13:14 -0400, Hao Liu wrote:
> Thanks for the input, however, I am using R 2.4.0. I don't know how the 
> SysAdmin installed or configured it though.
> 
> I am not running Rcmdr; I am developing R GUI applications using the Tcl/Tk 
> package. For some weird reason, those messages come and go...
> 
> Thanks
> Hao
> 
> Marc Schwartz wrote:
> 
> >On Fri, 2007-05-18 at 11:25 -0400, Hao Liu wrote:
> >  
> >
> >>Dear All:
> >>
> >>I am running some GUI functions in linux environment, they runs fine, 
> >>however I constantly get this kind of message in R console:
> >>
> >>Warning: X11 protocol error: BadWindow (invalid Window parameter)
> >>
> >>Is there a way to repress it? Or am I doing something wrong here.. it 
> >>does not interfere with the running of the function though.
> >>
> >>Thanks
> >>Hao
> >>
> >>
> >
> >Upgrade your version of R.
> >
> >You have not provided sufficient details, but if I had to guess, you are
> >either running RCmdr or using other tcl/tk based widgets.
> >
> >If correct, the error message that you are seeing was fixed back in R
> >2.4.0:
> >
> >o  The X11() device no longer produces (apparently spurious)
> >   'BadWindow (invalid Window parameter)' warnings when run from
> >   Rcmdr.
> >
> >HTH,
> >
> >Marc Schwartz
> >
> >
> >  
> >
> 

Re: [R] How to repress the annoying complains from X window system

2007-05-18 Thread Hao Liu

Thanks for the input, however, I am using R 2.4.0. I don't know how the 
SysAdmin installed or configured it though.

I am not running Rcmdr; I am developing R GUI applications using the Tcl/Tk 
package. For some weird reason, those messages come and go...

Thanks
Hao

Marc Schwartz wrote:

>On Fri, 2007-05-18 at 11:25 -0400, Hao Liu wrote:
>  
>
>>Dear All:
>>
>>I am running some GUI functions in linux environment, they runs fine, 
>>however I constantly get this kind of message in R console:
>>
>>Warning: X11 protocol error: BadWindow (invalid Window parameter)
>>
>>Is there a way to repress it? Or am I doing something wrong here.. it 
>>does not interfere with the running of the function though.
>>
>>Thanks
>>Hao
>>
>>
>
>Upgrade your version of R.
>
>You have not provided sufficient details, but if I had to guess, you are
>either running RCmdr or using other tcl/tk based widgets.
>
>If correct, the error message that you are seeing was fixed back in R
>2.4.0:
>
>o  The X11() device no longer produces (apparently spurious)
>   'BadWindow (invalid Window parameter)' warnings when run from
>   Rcmdr.
>
>HTH,
>
>Marc Schwartz
>
>
>  
>


Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread jiho

On 2007-May-18  , at 18:21 , Gabor Grothendieck wrote:
> In particular, we can use "[" directly instead of subset.  This is the
> same as your function except for the line marked ### :
>
> myfun2 <- function() {
>   foo = data.frame(1:10,10:1)
>   foos = list(foo)
>   fooCollumn=2
>   cFoo = lapply(foos, "[", fooCollumn) ###
>   return(cFoo)
> }
> myfun2() # test
>
> On 5/18/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>> You need to study carefully what the semantics of 'subset' are.  The
>> function body of myfun is not in the evaluation environment.  (The  
>> issue
>> is 'subset', not 'lapply': select is an *expression* and not a  
>> value.)
>>
>> Hint: using subset() programmatically is almost always a mistake.   
>> R's
>> subsetting function is '[': subset is a convenience wrapper.

Thank you very much. Indeed it is much better this way. I got used to
subset for data.frames because [ does not work with negative named
arguments while select does. E.g.:
    x[,-c("name1","name2")]
does not work while
    subset(x,select=-c("name1","name2"))
works (it eliminates columns named name1 and name2 from x). But I
guess in most cases another syntax can achieve the same thing with
[, like:
    x[,-which(names(x)%in%c("name1","name2"))]
it's just a little less clear.
Thanks again.

JiHO
---
http://jo.irisson.free.fr/


Re: [R] A programming question

2007-05-18 Thread Marc Schwartz

On Fri, 2007-05-18 at 09:01 -0700, Anup Nandialath wrote:
> Dear Friends,
> 
> My problem is related to how to measure probabilities from a probit
> model by changing one independent variable keeping the others
> constant. 
> 
> A simple toy example is like this
> 
> Range for my variables is defined as follows
> 
> y=0 or 1,  x1 = -10 to 10, x2=-40 to 100, x3 = -5 to 5
> 
> Model
> 
> output <- glim(y ~ x1+x2+x3 -1, family=binomial(link="probit"))
> outcoef <- output$coef
> xbeta <- as.matrix(cbind(x1, x2, x3))
> 
> predprob <- pnorm(xbeta%*%outcoef)
> 
> now I have the predicted probabilities for y=1 as defined above. My
> problem is as follows
> 
> Keep X2 at 20 and X3 at 2. Then compute the predicted probability
> (predprob) for the entire range of X1 ie from -10 to 10 with an
> increment of 1.
> 
> Therefore i need the predicted probabilities when x1=-10,
> x1=-9,x1=9, x1=10 keeping the other constant. 
> 
> Can somebody give me some direction on how this can be programmed. 
> 
> Thanks in advance for your help
> 
> Sincerely
> 
> Anup

Anup,

What glim() function are you using? 

Or is that a typo and should be glm()?

In either case, take a look at ?predict.glm which takes your fitted
glm() model and generates predicted values based upon specifying a data
frame ('newdata' argument) containing new values.

Be sure that your 'newdata' data frame contains the same columns AND
names as the data used to fit the model.

So you could do something like:

  newdata <- data.frame(x2 = 20, x3 = 2, x1 = -10:10)

  predict(model, newdata, type = "response")

BTW, if you also want the fitted values for the actual data used to
create the model, you can use fitted(model) rather than doing the matrix
multiplications directly.  See ?fitted for more information.
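
A self-contained sketch of this approach follows. The data are simulated
purely for illustration (the coefficients 0.2, -0.01 and 0.1 and the names
dat, manual and predprob are assumptions, not from the original post); the
point is only that predict() with 'newdata' reproduces the manual pnorm()
calculation.

```r
set.seed(42)
n <- 500

# Simulated predictors over the ranges given in the question.
dat <- data.frame(x1 = runif(n, -10, 10),
                  x2 = runif(n, -40, 100),
                  x3 = runif(n, -5, 5))
dat$y <- rbinom(n, 1, pnorm(0.2 * dat$x1 - 0.01 * dat$x2 + 0.1 * dat$x3))

# Probit model without an intercept, as in the question.
model <- glm(y ~ x1 + x2 + x3 - 1,
             family = binomial(link = "probit"), data = dat)

# Vary x1 from -10 to 10 in steps of 1, holding x2 = 20 and x3 = 2.
# Column names must match the names used in the model formula.
newdata <- data.frame(x1 = -10:10, x2 = 20, x3 = 2)
predprob <- predict(model, newdata, type = "response")

# The same probabilities computed by hand, for comparison.
manual <- pnorm(as.matrix(newdata) %*% coef(model))

cbind(x1 = newdata$x1, predprob = predprob)
```

Using predict() avoids having to keep the column order of the design matrix
in step with the coefficient vector by hand.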

HTH,

Marc Schwartz


Re: [R] How to extract R codes that embedded in a HTML file

2007-05-18 Thread Friedrich Leisch


  >  Original Message 
  > Subject: Re: [R] How to extract R codes that embedded in a HTML file 
  > using   Stangle?
  > Date: Thu, 17 May 2007 17:01:30 +
  > From: Tao Shi <[EMAIL PROTECTED]>
  > To: [EMAIL PROTECTED], [EMAIL PROTECTED]
  > CC: r-help@stat.math.ethz.ch

  > Hi Uwe,

  > Thanks for the answer, but I still need a bit more clarification.  I
  > always thought that a .rnw or .snw file is a file mixing word processing
  > markup (e.g. tex or HTML) and R/S code using noweb syntax.  Is the reason
  > that 'Stangle' does not work with an .rnw file containing HTML that no
  > proper driver is available (like the RweaveHTML driver for Sweave)?  If
  > yes, does the R2HTML package have plans to provide such a driver?

  > Tao


The following does the trick for me:

R> mytangle <- function ()
  list(setup = RtangleSetup, runcode = utils:::RtangleRuncode,
   writedoc = RtangleWritedoc,
   finish = utils:::RtangleFinish, checkopts = RweaveHTMLOptions)

R> Stangle("/PATH/TO/R/SITE-LIBRARY/R2HTML/samples/example1.snw",
   driver=mytangle)
Writing to file example1.R


Best,
Fritz


[R] A programming question

2007-05-18 Thread Anup Nandialath

Dear Friends,

My problem is related to how to measure probabilities from a probit model by 
changing one independent variable keeping the others constant. 

A simple toy example is like this

Range for my variables is defined as follows

y=0 or 1,  x1 = -10 to 10, x2=-40 to 100, x3 = -5 to 5

Model

output <- glim(y ~ x1+x2+x3 -1, family=binomial(link="probit"))
outcoef <- output$coef
xbeta <- as.matrix(cbind(x1, x2, x3))

predprob <- pnorm(xbeta%*%outcoef)

now I have the predicted probabilities for y=1 as defined above. My problem is 
as follows

Keep X2 at 20 and X3 at 2. Then compute the predicted probability (predprob) 
for the entire range of X1 ie from -10 to 10 with an increment of 1.

Therefore i need the predicted probabilities when x1=-10, x1=-9,x1=9, x1=10 
keeping the other constant. 

Can somebody give me some direction on how this can be programmed. 

Thanks in advance for your help

Sincerely

Anup

   

Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread Gabor Grothendieck

In particular, we can use "[" directly instead of subset.  This is the
same as your function except for the line marked ### :

myfun2 <- function() {
   foo = data.frame(1:10,10:1)
   foos = list(foo)
   fooCollumn=2
   cFoo = lapply(foos, "[", fooCollumn) ###
   return(cFoo)
}
myfun2() # test
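
For comparison, here is a small sketch that puts the "[" version next to the
anonymous-function version suggested elsewhere in this thread. The function
names myfun_bracket and myfun_closure and the spelling fooColumn are
illustrative only.

```r
# Variant 1: index with "[" directly; fooColumn is passed as an ordinary value.
myfun_bracket <- function() {
  foo <- data.frame(1:10, 10:1)
  foos <- list(foo)
  fooColumn <- 2
  lapply(foos, "[", fooColumn)
}

# Variant 2: keep subset(), but call it from an anonymous function so that
# fooColumn is found through lexical scoping.
myfun_closure <- function() {
  foo <- data.frame(1:10, 10:1)
  foos <- list(foo)
  fooColumn <- 2
  lapply(foos, function(d) subset(d, select = fooColumn))
}

res_bracket <- myfun_bracket()
res_closure <- myfun_closure()
names(res_bracket[[1]])
```

Both variants return a list holding a one-column data frame; the "[" form is
the one to prefer in code that is not typed interactively.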

On 5/18/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> You need to study carefully what the semantics of 'subset' are.  The
> function body of myfun is not in the evaluation environment.  (The issue
> is 'subset', not 'lapply': select is an *expression* and not a value.)
>
> Hint: using subset() programmatically is almost always a mistake.  R's
> subsetting function is '[': subset is a convenience wrapper.
>
> On Fri, 18 May 2007, jiho wrote:
>
> > Hello,
> >
> > I am facing a problem with lapply which I think may be a bug.
> > This is the most basic function in which I can reproduce it:
> >
> > myfun <- function()
> > {
> >   foo = data.frame(1:10,10:1)
> >   foos = list(foo)
> >   fooCollumn=2
> >   cFoo = lapply(foos,subset,select=fooCollumn)
> >   return(cFoo)
> > }
> >
> > I am building a list of dataframes, in each of which I want to keep
> > only column 2 (obviously I would not do it this way in real life but
> > that's just to demonstrate the bug).
> > If I execute the commands inline it works but if I clean my
> > environment, then define the function and then execute:
> >   > myfun()
> > I get this error:
> >   Error in eval(expr, envir, enclos) : object "fooCollumn" not found
> > while fooCollumn is defined, in the function, right before lapply. In
> > addition, if I define it outside the function and then execute the
> > function:
> >   > fooCollumn=1
> >   > myfun()
> > it works but uses the value defined in the general environment and
> > not the one defined in the function.
> > This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
> > What did I do wrong? Is this indeed a bug? An intended behavior?
>
> It is a bug, in your function.
>
> --
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>

Re: [R] Simple programming question

2007-05-18 Thread Bert Gunter

?cut

This would recode to a factor with numeric labels for its levels.
as.numeric(as.character(...)) would then convert the labels to numeric values
that you can manipulate. This presumes that the variable you are coding is
numeric and you want to recode by binning the values into ordered bins. 
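
A minimal sketch of both routes, using the example data frame from this
thread. The scoring function f is the corrected one quoted below; the lookup
vector points and the cut() breaks are assumptions chosen only to illustrate
the 3/1/0 recoding that was asked for.

```r
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

# Rank within category: 3 = highest, 2 = second highest, 1 = everything lower.
f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
dfr$score <- factor(ave(dfr$var3, dfr$categ, FUN = f),
                    labels = c("low", "mid", "high"))

# Route 1: recode the labels to arbitrary numbers with a named lookup vector.
points <- c(low = 0, mid = 1, high = 3)
dfr$points <- unname(points[as.character(dfr$score)])

# Route 2: cut() a numeric vector into bins with numeric labels, then convert
# the resulting factor back to numbers.
binned <- cut(c(1, 5, 9), breaks = c(0, 3, 6, 10), labels = c(0, 1, 3))
binned_num <- as.numeric(as.character(binned))

dfr
```

The lookup-vector route generalises to any recoding: change the names and
values of points and the rest stays the same.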

Bert Gunter
Genentech Nonclinical Statistics

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Lauri Nikkinen
Sent: Friday, May 18, 2007 8:02 AM
To: Gabor Grothendieck
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Simple programming question

Thank you all for your answers. Actually Gabor's first post was right in
that sense that I wanted to have "low" to all cases which are lower than
second highest. But how about if I want to convert/recode those "high",
"mid" and "low" to numeric to make some calculations, e.g. 3, 1, 0
respectively. How do I have to modify your solutions? I would also like to
apply this solution to many kinds of recoding situations.

-Lauri

2007/5/18, Gabor Grothendieck <[EMAIL PROTECTED]>:
>
> There was a problem in the first line in the case that the highest number
> is not unique within a category.   In this example its not apparent since
> that never occurs.  At any rate, it should be:
>
> f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
> factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
>
> Also note that the factor labels were arranged so that
> "low", "mid" and "high" correspond to levels 1, 2 and 3
> respectively.
>
> On 5/18/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> > Try this.  f assigns 1, 2 and 3 to the highest, second highest and third
> highest
> > within a category.  ave applies f to each category.  Finally we convert
> it to a
> > factor.
> >
> > f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
> > factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
> >
> >
> >
> > On 5/18/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> > > Hi R-users,
> > >
> > > I have a simple question for R heavy users. If I have a data frame
> like this
> > >
> > >
> > > dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> > > var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> > > dfr <- dfr[order(dfr$categ),]
> > >
> > > and I want to score values or points in variable named "var3"
> following this
> > > kind of logic:
> > >
> > > 1. the highest value of var3 within category (variable named "categ")
> ->
> > > "high"
> > > 2. the second highest value -> "mid"
> > > 3. lowest value -> "low"
> > >
> > > This would be the output of this reasoning:
> > >
> > > dfr$score <-
> > > factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low",
> > > "low","high","mid","low","low"))
> > > dfr
> > >
> > > The question is how I do this programmatically in R (i.e. if I have
> 2000
> > > rows in my dfr)?
> > >
> > > I appreciate your help!
> > >
> > > Cheers,
> > > Lauri
> > >

Re: [R] How to repress the annoying complains from X window system

2007-05-18 Thread Marc Schwartz

On Fri, 2007-05-18 at 11:25 -0400, Hao Liu wrote:
> Dear All:
> 
> I am running some GUI functions in linux environment, they runs fine, 
> however I constantly get this kind of message in R console:
> 
> Warning: X11 protocol error: BadWindow (invalid Window parameter)
> 
> Is there a way to repress it? Or am I doing something wrong here.. it 
> does not interfere with the running of the function though.
> 
> Thanks
> Hao

Upgrade your version of R.

You have not provided sufficient details, but if I had to guess, you are
either running RCmdr or using other tcl/tk based widgets.

If correct, the error message that you are seeing was fixed back in R
2.4.0:

o   The X11() device no longer produces (apparently spurious)
'BadWindow (invalid Window parameter)' warnings when run from
Rcmdr.

HTH,

Marc Schwartz


Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread Prof Brian Ripley

You need to study carefully what the semantics of 'subset' are.  The 
function body of myfun is not in the evaluation environment.  (The issue 
is 'subset', not 'lapply': select is an *expression* and not a value.)

Hint: using subset() programmatically is almost always a mistake.  R's 
subsetting function is '[': subset is a convenience wrapper.

On Fri, 18 May 2007, jiho wrote:

> Hello,
>
> I am facing a problem with lapply which I think may be a bug.
> This is the most basic function in which I can reproduce it:
>
> myfun <- function()
> {
>   foo = data.frame(1:10,10:1)
>   foos = list(foo)
>   fooCollumn=2
>   cFoo = lapply(foos,subset,select=fooCollumn)
>   return(cFoo)
> }
>
> I am building a list of dataframes, in each of which I want to keep
> only column 2 (obviously I would not do it this way in real life but
> that's just to demonstrate the bug).
> If I execute the commands inline it works but if I clean my
> environment, then define the function and then execute:
>   > myfun()
> I get this error:
>   Error in eval(expr, envir, enclos) : object "fooCollumn" not found
> while fooCollumn is defined, in the function, right before lapply. In
> addition, if I define it outside the function and then execute the
> function:
>   > fooCollumn=1
>   > myfun()
> it works but uses the value defined in the general environment and
> not the one defined in the function.
> This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
> What did I do wrong? Is this indeed a bug? An intended behavior?

It is a bug, in your function.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread jiho

On 2007-May-18  , at 17:09 , Thomas Lumley wrote:
> On Fri, 18 May 2007, jiho wrote:
>> I am facing a problem with lapply which I think may be a bug.
>> This is the most basic function in which I can reproduce it:
>>
>> myfun <- function()
>> {
>>  foo = data.frame(1:10,10:1)
>>  foos = list(foo)
>>  fooCollumn=2
>>  cFoo = lapply(foos,subset,select=fooCollumn)
>>  return(cFoo)
>> }
>>
> 
>> I get this error:
>>  Error in eval(expr, envir, enclos) : object "fooCollumn" not found
>> while fooCollumn is defined, in the function, right before lapply.
> 
>> This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
>> What did I do wrong? Is this indeed a bug? An intended behavior?
>
> The problem is that subset() evaluates its "select" argument in an  
> unusual way. Usually the argument would be evaluated inside myfun()  
> and the value passed to lapply(), and everything would work as you  
> expect.
> subset() bypasses the normal evaluation and explicitly evaluates  
> the "select" argument in the calling frame, ie, inside lapply(),  
> where fooCollumn is not visible.
> You could do
>   lapply(foos, function(foo) subset(foo, select=fooCollumn))
> capturing fooCollumn by lexical scope.  In R this is often a better  
> option than passing extra arguments to lapply (or other functions  
> that take function arguments).

Thank you very much, this works well indeed. I agree it is a bit  
confusing, to say the least. The point is that supplying other  
arguments in the ... of lapply worked for all other functions I tried  
before (mean, sd, summary and even spline) so it is really a problem  
with subset. Anyway, R is great even with such little flaws here and  
there and as long as the community is there to support it, it will rule.

Cheers,

JiHO
---
http://jo.irisson.free.fr/


Re: [R] Execute expression ( R -e) without close

2007-05-18 Thread Prof Brian Ripley

You cannot.  '-e' is modelled on batch tools, and is really back-end 
support for Rscript -e.

See ?Startup for other ways to achieve what you seem to be trying to do.

On Fri, 18 May 2007, marcelll wrote:

>
> I need to find a way how to execute an expression with the new command line
> parameter in  R (R -e "AnExpression") without R get closed after the
> expression has been evaluated.
> For example if i want to start R with a plot or start R with a *.RData file.
> Any help or suggestions are appreciated !
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] ordering in list.files

2007-05-18 Thread Gabor Grothendieck

Try this:

library(gtools)
mixedsort(x)


On 5/18/07, Shubha Vishwanath Karanth <[EMAIL PROTECTED]> wrote:
> Hi R,
>
>
>
> My csv files are stored in the order, '1abc.csv', '2def.csv',
> '3ghi.csv', '10files.csv' in a folder. When I read this into R from
> list.files (R command: x=list.files("Z:/CSV/fold",full.names=F)), I don't
> get the same order, instead I get the order as "10files.csv" "1abc.csv"
> "2def.csv" "3ghi.csv". But I don't want this ordering. So, how do I
> maintain the order which I have in my physical folder?
>
>
>
> Thanks in advance
>
> Shubha
>
>

Re: [R] ordering in list.files

2007-05-18 Thread Marc Schwartz

On Fri, 2007-05-18 at 20:16 +0530, Shubha Vishwanath Karanth wrote:
> Hi R,

> My csv files are stored in the order, '1abc.csv', '2def.csv',
> '3ghi.csv', '10files.csv' in a folder. When I read this into R from
> list.files (R command: x=list.files("Z:/CSV/fold",full.names=F)), I don't
> get the same order, instead I get the order as "10files.csv" "1abc.csv"
> "2def.csv" "3ghi.csv". But I don't want this ordering. So, how do I
> maintain the order which I have in my physical folder?

> Thanks in advance
> 
> Shubha

From ?list.files in the Value section:

"The files are sorted in alphabetical order, on the full path if
full.names = TRUE."

Presumably you are on Windows and you have the folder view set to sort
the files in some order, possibly by the date/time of creation?  Check
the folder settings to see how you have this set.

In R the list of files is sorted in alpha order and in this case, the
numbers are sorted based upon the order of the ASCII values of the
numeric characters, not in numeric value order.

You can try this approach using a regex and sub() to get the numeric
value parts of the file names, get the ordered indices and then pass
them back to the vector of file names:

> Files
[1] "10files.csv" "1abc.csv"    "2def.csv"    "3ghi.csv"

> Files[order(as.numeric(sub("([0-9]*).*", "\\1", Files)))]
[1] "1abc.csv"    "2def.csv"    "3ghi.csv"    "10files.csv"

See ?sub, ?regex and ?order for more information.

HTH,

Marc Schwartz


[R] How to repress the annoying complains from X window system

2007-05-18 Thread Hao Liu

Dear All:

I am running some GUI functions in a Linux environment. They run fine,
but I constantly get this kind of message in the R console:

Warning: X11 protocol error: BadWindow (invalid Window parameter)

Is there a way to suppress it? Or am I doing something wrong here? It
does not interfere with the running of the functions, though.

Thanks
Hao


Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread Thomas Lumley

On Fri, 18 May 2007, jiho wrote:

> Hello,
>
> I am facing a problem with lapply which I think may be a bug.
> This is the most basic function in which I can reproduce it:
>
> myfun <- function()
> {
>   foo = data.frame(1:10,10:1)
>   foos = list(foo)
>   fooCollumn=2
>   cFoo = lapply(foos,subset,select=fooCollumn)
>   return(cFoo)
> }
>

> I get this error:
>   Error in eval(expr, envir, enclos) : object "fooCollumn" not found
> while fooCollumn is defined, in the function, right before lapply.

> This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
> What did I do wrong? Is this indeed a bug? An intended behavior?

No, it isn't a bug (though it may be confusing).

The problem is that subset() evaluates its "select" argument in an unusual 
way. Usually the argument would be evaluated inside myfun() and the value 
passed to lapply(), and everything would work as you expect.

subset() bypasses the normal evaluation and explicitly evaluates the 
"select" argument in the calling frame, i.e., inside lapply(), where 
fooCollumn is not visible.

You could do
   lapply(foos, function(foo) subset(foo, select=fooCollumn))
capturing fooCollumn by lexical scope.  In R this is often a better option 
than passing extra arguments to lapply (or other functions that take 
function arguments).
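
A minimal sketch of the lexical-scope approach, plus a standard-evaluation
alternative that avoids subset() altogether. The function names
myfun_closure and myfun_index are made up for illustration and are not from
the original post; the data are the ones from the question.

```r
# Wrapping subset() in an anonymous function: fooCollumn is found through
# the closure's enclosing environment (the body of myfun_closure).
myfun_closure <- function() {
  foo <- data.frame(1:10, 10:1)
  foos <- list(foo)
  fooCollumn <- 2
  lapply(foos, function(d) subset(d, select = fooCollumn))
}

# Standard-evaluation alternative: "[" evaluates its arguments normally,
# so the value of fooCollumn travels through lapply() as usual.
myfun_index <- function() {
  foo <- data.frame(1:10, 10:1)
  foos <- list(foo)
  fooCollumn <- 2
  lapply(foos, "[", fooCollumn)
}

res_closure <- myfun_closure()
res_index <- myfun_index()
str(res_closure)
```

Both versions should return a one-element list holding a data frame with
only the second column.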

-thomas


[R] BATCH

2007-05-18 Thread elyakhlifi mustapha

hello,
I tried to run programs in BATCH mode as you told me, but reading the results 
is a little hard.
To read the results, it is important to write an output file first,
but when I do this I still get the same answer:

R version 2.4.1 (2006-12-18)
Copyright (C) 2006 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
> # program to compute the characteristics
> 
> donParCara <- read.table("C:/Documents and Settings/melyakhlifi/Mes documents/feuilles excel/copi_donnees6.csv",header=TRUE,sep=";",quote="",dec=",")
> #print(subset(donParCara, select = c(Id_Cara,Form_C,X)))
> 
> 
> 
> # observed values for the observed characteristics
> 
> C103 <- as.numeric(as.character(subset(donParEssai, Id_Essai == 1006961 & Id_Cara == 103, c(Val_O,Surf_O))[,1]))
Error in subset(donParEssai, Id_Essai == 1006961 & Id_Cara == 103, c(Val_O,  : 
 object "donParEssai" not found
Execution halted

I think that it's a problem in the options, but I don't know how to correct 
the errors.


  

Re: [R] length, mean, na.rm, na.omit...

2007-05-18 Thread Duncan Murdoch

On 5/18/2007 10:32 AM, Muenchen, Robert A (Bob) wrote:
> Hi All,
> 
> Can anyone tell me why the length function does not use na.rm? I know
> how to work around it, I'm just curious to know why such a useful option
> was left out.

length() is used very frequently in other functions, so it is encoded as 
a primitive for speed.  Adding an optional argument to it would slow it 
down.

> I'm also interested in the logic of setting na.rm=TRUE as the default on
> mean, sd, etc. This is the opposite of the many other stat packages I
> have used, so I assume it provides some programming benefit that is not
> obvious to me.

That's also the opposite of what R does.  Did you mean to ask why 
na.rm=FALSE is the default?  I think it follows from thinking of NA as 
meaning "not known", rather than "missing at random".  If you don't know 
why values are missing, you may get biased results by calculating the 
mean of the others:  and R would rather not give you biased results.
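
A small illustration of both points, using a made-up vector (the values are
only an example): counting non-missing values without an na.rm argument on
length(), and the default NA-propagating behaviour of mean().

```r
x <- c(4, NA, 6, 8, NA)

length(x)                  # counts every element, including the NAs
n_obs <- sum(!is.na(x))    # number of non-missing values
n_obs2 <- length(na.omit(x))

m_default <- mean(x)               # NA: na.rm = FALSE is the default
m_removed <- mean(x, na.rm = TRUE) # mean of the known values only
```

Here n_obs and n_obs2 agree, m_default is NA, and m_removed is the mean of
4, 6 and 8.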

Duncan Murdoch


Re: [R] Simple programming question

2007-05-18 Thread Lauri Nikkinen

Thank you all for your answers. Actually, Gabor's first post was right in
the sense that I wanted "low" for all cases which are lower than the
second highest. But what if I want to convert/recode those "high",
"mid" and "low" to numeric to make some calculations, e.g. 3, 1, 0
respectively? How do I have to modify your solutions? I would also like to
apply this solution to many kinds of recoding situations.

-Lauri


2007/5/18, Gabor Grothendieck <[EMAIL PROTECTED]>:
>
> There was a problem in the first line in the case that the highest number
> is not unique within a category.   In this example its not apparent since
> that never occurs.  At any rate, it should be:
>
> f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
> factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
>
> Also note that the factor labels were arranged so that
> "low", "mid" and "high" correspond to levels 1, 2 and 3
> respectively.
>
> On 5/18/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> > Try this.  f assigns 1, 2 and 3 to the highest, second highest and third
> highest
> > within a category.  ave applies f to each category.  Finally we convert
> it to a
> > factor.
> >
> > f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
> > factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
> >
> >
> >
> > On 5/18/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> > > Hi R-users,
> > >
> > > I have a simple question for R heavy users. If I have a data frame
> like this
> > >
> > >
> > > dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> > > var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> > > dfr <- dfr[order(dfr$categ),]
> > >
> > > and I want to score values or points in variable named "var3"
> following this
> > > kind of logic:
> > >
> > > 1. the highest value of var3 within category (variable named "categ")
> ->
> > > "high"
> > > 2. the second highest value -> "mid"
> > > 3. lowest value -> "low"
> > >
> > > This would be the output of this reasoning:
> > >
> > > dfr$score <-
> > >
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> > > dfr
> > >
> > > The question is how I do this programmatically in R (i.e. if I have
> 2000
> > > rows in my dfr)?
> > >
> > > I appreciate your help!
> > >
> > > Cheers,
> > > Lauri
> > >

Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Gabor Grothendieck

Try this:

e <- quote(summary(y + z))
all.vars(e)
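
A sketch of how all.vars() could be combined with the earlier
with(subset(...)) idea so that only the needed columns are copied. The
helper below borrows the name myevalq from the original question, but its
argument order and implementation are assumptions made for illustration,
and the extra column "unused" is invented.

```r
set.seed(1)
n <- 100
data <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n), unused = rnorm(n))

e <- quote(summary(y + z))
vars <- all.vars(e)   # variable names in the expression; function names are excluded

# Hypothetical helper: evaluate `expr` on the rows selected by `subset`,
# copying only the columns that `expr` actually mentions.
myevalq <- function(expr, data, subset) {
  e <- substitute(expr)
  rows <- eval(substitute(subset), data, parent.frame())
  cols <- intersect(all.vars(e), names(data))
  eval(e, data[rows, cols, drop = FALSE], parent.frame())
}

res <- myevalq(summary(y + z), data, x > 0)
ref <- with(subset(data, x > 0), summary(y + z))
```

With no missing values in x, res should match ref. Note that all.vars()
only sees names written in the expression itself, not names looked up
inside the functions it calls.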


On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> Sorry, I didn't explain myself clear enough. I knew about the select arg in
> subset(). My question was, given the expression expression(summary(x+y)),
> how to extract all names that will be looked up during its evaluation.
>
> As to checking performance assumptions, you are right, in most cases the
> overhead is negligible, but sometimes I work with really big data sets.
>
> Thanks a lot for your help,
> Vadim
>
>
> - Original Message -
> From: "Gabor Grothendieck" <[EMAIL PROTECTED]>
> To: "Vadim Ogranovich" <[EMAIL PROTECTED]>
> Cc: r-help@stat.math.ethz.ch
> Sent: Friday, May 18, 2007 9:53:26 AM (GMT-0600) America/Chicago
> Subject: Re: [R] subset arg in (modified) evalq
>
> I would check your performance assumption with an actual test before
> concluding such but at any rate subset does have a select argument. See
> ?subset
>
> On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> > Thanks Gabor!  This does exactly what I wanted.
> >
> > One follow-up question, how to extract the var names, in this case y, z,
> > from the expression? The subset function creates a new object and this may
> > be expensive when the data has a lot of irrelevant collumns. So I thougth
> > that I could reduce this to the columns I actually need.
> >
> > Thanks,
> > Vadim
> >
> >
> >
> > - Original Message -
> > From: "Gabor Grothendieck" <[EMAIL PROTECTED]>
> > To: "Vadim Ogranovich" <[EMAIL PROTECTED]>
> > Cc: r-help@stat.math.ethz.ch
> > Sent: Friday, May 18, 2007 9:19:49 AM (GMT-0600) America/Chicago
> > Subject: Re: [R] subset arg in (modified) evalq
> >
> > Try this:
> >
> >with(subset(data, x > 0), summary(y + z))
> >
> >
> > On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > When using evalq to evaluate expressions within a say data.frame context
> I
> > often wish there was a 'subset' argument, much like in lm() or any ather
> > advanced regression model. I would be grateful for a tip how to do this.
> > >
> > > Here is an illustration of what I want:
> > >
> > > n <- 100
> > > data <- data.frame(x=rnorm(n), y=rnorm(y), z=rnorm(z))
> > >
> > > # this works
> > > evalq({ i <- 0 > >
> > > # I want to do the above w/o explicit subscripting, e.g.
> > > myevalq(summary(y + z), subset=0 > >
> > > Thanks,
> > > Vadim
> > >

Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Gabor Grothendieck

I would check your performance assumption with an actual test before
concluding such but at any rate subset does have a select argument. See
?subset
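
For example, a rough sketch with simulated data (the data frame and the
extra column w are made up here; timings will vary by machine):

```r
set.seed(42)
n <- 100
data <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n), w = rnorm(n))

# Same result with and without restricting the columns via select.
full <- with(subset(data, x > 0), summary(y + z))
slim <- with(subset(data, x > 0, select = c(y, z)), summary(y + z))

# An actual test of the cost of subsetting, rather than an assumption.
timing <- system.time(for (i in 1:200) subset(data, x > 0, select = c(y, z)))
```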

On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> Thanks Gabor!  This does exactly what I wanted.
>
> One follow-up question, how to extract the var names, in this case y, z,
> from the expression? The subset function creates a new object and this may
> be expensive when the data has a lot of irrelevant collumns. So I thougth
> that I could reduce this to the columns I actually need.
>
> Thanks,
> Vadim
>
>
>
> - Original Message -
> From: "Gabor Grothendieck" <[EMAIL PROTECTED]>
> To: "Vadim Ogranovich" <[EMAIL PROTECTED]>
> Cc: r-help@stat.math.ethz.ch
> Sent: Friday, May 18, 2007 9:19:49 AM (GMT-0600) America/Chicago
> Subject: Re: [R] subset arg in (modified) evalq
>
> Try this:
>
>with(subset(data, x > 0), summary(y + z))
>
>
> On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > When using evalq to evaluate expressions within a say data.frame context I
> often wish there was a 'subset' argument, much like in lm() or any ather
> advanced regression model. I would be grateful for a tip how to do this.
> >
> > Here is an illustration of what I want:
> >
> > n <- 100
> > data <- data.frame(x=rnorm(n), y=rnorm(y), z=rnorm(z))
> >
> > # this works
> > evalq({ i <- 0 >
> > # I want to do the above w/o explicit subscripting, e.g.
> > myevalq(summary(y + z), subset=0 >
> > Thanks,
> > Vadim
> >

Re: [R] ordering in list.files

2007-05-18 Thread jim holtman

Rename them so that they all have the same number of zero-filled leading
digits:

01abc.csv, 02def.csv, ...

'10' comes before '1a' in the standard sorting sequence.

The other way is to get a list of all your files, parse off the numerics at
the beginning and then sort in numerical order.
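
Both ideas can be sketched as follows, using the file names from the
question. The two-digit padding width is an assumption, and the sketch
assumes every name starts with digits.

```r
files <- c("10files.csv", "1abc.csv", "2def.csv", "3ghi.csv")

# Split each name into its leading number and the remainder.
num  <- as.integer(sub("^([0-9]+).*", "\\1", files))
rest <- sub("^[0-9]+", "", files)

# Option 1: zero-padded names, which sort correctly as plain text.
padded <- sprintf("%02d%s", num, rest)
sorted_padded <- sort(padded)

# Option 2: keep the names and order them by the numeric prefix.
sorted_numeric <- files[order(num)]
```

Actually renaming the files on disk would need file.rename(), which is left
out here on purpose.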

On 5/18/07, Shubha Vishwanath Karanth <[EMAIL PROTECTED]> wrote:
>
> Hi R,
>
>
>
> My csv files are stored in the order '1abc.csv', '2def.csv',
> '3ghi.csv', '10files.csv' in a folder. When I read them into R with
> list.files (R command: x=list.files("Z:/CSV/fold",full.names=F)), I don't
> get the same order; instead I get "10files.csv" "1abc.csv"
> "2def.csv" "3ghi.csv". I don't want this ordering. So how do I
> maintain the order which I have in my physical folder?
>
>
>
> Thanks in advance
>
> Shubha
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Simple programming question

2007-05-18 Thread Gabor Grothendieck

The solution already calculates it as numeric and only after that
does it convert it to factor so just omit the conversion:

f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
score <- ave(dfr$var3, dfr$categ, FUN = f)

As mentioned, this assigns 1 to low (everything other than the highest
two numbers in a category), 2 to the second highest and 3 to the highest.

If you want some other assignment, e.g. 3 is low, 1 is mid and 0 is high
then try:

c(3, 1, 0)[score]
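
The lookup-vector idea generalises to other recodings: position i of the
vector holds the new code for score i. As a sketch on the data from the
question, the mapping asked for (high = 3, mid = 1, low = 0) would be
c(0, 1, 3)[score]. The names pts, pts2 and lab are made up for this
example.

```r
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
score <- ave(dfr$var3, dfr$categ, FUN = f)   # 1 = low, 2 = mid, 3 = high

# Recode by position: low -> 0, mid -> 1, high -> 3.
pts <- c(0, 1, 3)[score]

# The same recoding starting from the labelled factor, via a named lookup.
lab  <- factor(score, levels = 1:3, labels = c("low", "mid", "high"))
pts2 <- unname(c(low = 0, mid = 1, high = 3)[as.character(lab)])
```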

On 5/18/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> Thank you all for your answers. Actually, Gabor's first post was right in
> the sense that I wanted "low" for all cases which are lower than the
> second highest. But what if I want to convert/recode those "high",
> "mid" and "low" to numeric to make some calculations, e.g. 3, 1, 0
> respectively? How do I have to modify your solutions? I would also like to
> apply this solution to many kinds of recoding situations.
>
> -Lauri
>
>
> 2007/5/18, Gabor Grothendieck <[EMAIL PROTECTED]>:
> > There was a problem in the first line in the case that the highest number
> > is not unique within a category.   In this example its not apparent since
> > that never occurs.  At any rate, it should be:
> >
> > f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
> > factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
> >
> > Also note that the factor labels were arranged so that
> > "low", "mid" and "high" correspond to levels 1, 2 and 3
> > respectively.
> >
> > On 5/18/07, Gabor Grothendieck < [EMAIL PROTECTED]> wrote:
> > > Try this.  f assigns 1, 2 and 3 to the highest, second highest and third
> highest
> > > within a category.  ave applies f to each category.  Finally we convert
> it to a
> > > factor.
> > >
> > > f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
> > > factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
> > >
> > >
> > >
> > > On 5/18/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> > > > Hi R-users,
> > > >
> > > > I have a simple question for R heavy users. If I have a data frame
> like this
> > > >
> > > >
> > > > dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> > > > var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> > > > dfr <- dfr[order(dfr$categ),]
> > > >
> > > > and I want to score values or points in variable named "var3"
> following this
> > > > kind of logic:
> > > >
> > > > 1. the highest value of var3 within category (variable named "categ")
> ->
> > > > "high"
> > > > 2. the second highest value -> "mid"
> > > > 3. lowest value -> "low"
> > > >
> > > > This would be the output of this reasoning:
> > > >
> > > > dfr$score <-
> > > >
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> > > > dfr
> > > >
> > > > The question is how I do this programmatically in R (i.e. if I have
> 2000
> > > > rows in my dfr)?
> > > >
> > > > I appreciate your help!
> > > >
> > > > Cheers,
> > > > Lauri
> > > >

[R] R: Simple programming question

2007-05-18 Thread Guazzetti Stefano

try also this

dfr$score<-factor(dfr$var3 %in% sort(unique(dfr$var3), decr=T)[1:2] * dfr$var3,
   labels=c("low", "mid", "high"))
Hope this helps, 

Stefano

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On behalf of Lauri Nikkinen
Sent: Friday, 18 May 2007 15:15
To: r-help@stat.math.ethz.ch
Subject: [R] Simple programming question


Hi R-users,

I have a simple question for R heavy users. If I have a data frame like this


dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ),]

and I want to score values or points in variable named "var3" following this
kind of logic:

1. the highest value of var3 within category (variable named "categ") ->
"high"
2. the second highest value -> "mid"
3. lowest value -> "low"

This would be the output of this reasoning:

dfr$score <-
factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
dfr

The question is how I do this programmatically in R (i.e. if I have 2000
rows in my dfr)?

I appreciate your help!

Cheers,
Lauri


Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Vadim Ogranovich

Sorry, I didn't explain myself clearly enough. I knew about the select arg in 
subset(). My question was, given the expression expression(summary(x+y)), how 
to extract all names that will be looked up during its evaluation. 

As to checking performance assumptions, you are right, in most cases the 
overhead is negligible, but sometimes I work with really big data sets. 

Thanks a lot for your help, 
Vadim 



Re: [R] lapply not reading arguments from the correct environment

2007-05-18 Thread Dimitris Rizopoulos

subset() was not defined inside myfun(); try this version instead:

myfun <- function () {
    foo <- data.frame(1:10, 10:1)
    foos <- list(foo)
    fooCollumn <- 2
    my.subset <- function(...) subset(...)
    cFoo <- lapply(foos, my.subset, select = fooCollumn)
    cFoo
}
myfun()


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: "jiho" <[EMAIL PROTECTED]>
To: 
Sent: Friday, May 18, 2007 4:41 PM
Subject: [R] lapply not reading arguments from the correct environment


> Hello,
>
> I am facing a problem with lapply which I think may be a bug.
> This is the most basic function in which I can reproduce it:
>
> myfun <- function()
> {
> foo = data.frame(1:10,10:1)
> foos = list(foo)
> fooCollumn=2
> cFoo = lapply(foos,subset,select=fooCollumn)
> return(cFoo)
> }
>
> I am building a list of dataframes, in each of which I want to keep
> only column 2 (obviously I would not do it this way in real life but
> that's just to demonstrate the bug).
> If I execute the commands inline it works but if I clean my
> environment, then define the function and then execute:
> > myfun()
> I get this error:
> Error in eval(expr, envir, enclos) : object "fooCollumn" not found
> while fooCollumn is defined, in the function, right before lapply. 
> In
> addition, if I define it outside the function and then execute the
> function:
> > fooCollumn=1
> > myfun()
> it works but uses the value defined in the general environment and
> not the one defined in the function.
> This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
> What did I do wrong? Is this indeed a bug? An intended behavior?
> Thanks in advance.
>
> JiHO
> ---
> http://jo.irisson.free.fr/
>



Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Vadim Ogranovich

Thanks Gabor! This does exactly what I wanted. 

One follow-up question: how do I extract the var names, in this case y, z, from 
the expression? The subset function creates a new object, and this may be 
expensive when the data has a lot of irrelevant columns. So I thought that I 
could reduce this to the columns I actually need. 

Thanks, 
Vadim 

- Original Message - 
From: "Gabor Grothendieck" <[EMAIL PROTECTED]> 
To: "Vadim Ogranovich" <[EMAIL PROTECTED]> 
Cc: r-help@stat.math.ethz.ch 
Sent: Friday, May 18, 2007 9:19:49 AM (GMT-0600) America/Chicago 
Subject: Re: [R] subset arg in (modified) evalq 

Try this: 

with(subset(data, x > 0), summary(y + z)) 

On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote: 
> Hi, 
> 
> When using evalq to evaluate expressions within a say data.frame context I 
> often wish there was a 'subset' argument, much like in lm() or any ather 
> advanced regression model. I would be grateful for a tip how to do this. 
> 
> Here is an illustration of what I want: 
> 
> n <- 100 
> data <- data.frame(x=rnorm(n), y=rnorm(y), z=rnorm(z)) 
> 
> # this works 
> evalq({ i <- 0 
> # I want to do the above w/o explicit subscripting, e.g. 
> myevalq(summary(y + z), subset=0 
> Thanks, 
> Vadim 
> 

[R] ordering in list.files

2007-05-18 Thread Shubha Vishwanath Karanth

Hi R,

 

My csv files are stored in the order '1abc.csv', '2def.csv',
'3ghi.csv', '10files.csv' in a folder. When I read them into R with
list.files (R command: x=list.files("Z:/CSV/fold",full.names=F)), I don't
get the same order; instead I get "10files.csv" "1abc.csv"
"2def.csv" "3ghi.csv". I don't want this ordering. So how do I
maintain the order which I have in my physical folder?

 

Thanks in advance

Shubha



[R] lapply not reading arguments from the correct environment

2007-05-18 Thread jiho

Hello,

I am facing a problem with lapply which I think may be a bug.
This is the most basic function in which I can reproduce it:

myfun <- function()
{
foo = data.frame(1:10,10:1)
foos = list(foo)
fooCollumn=2
cFoo = lapply(foos,subset,select=fooCollumn)
return(cFoo)
}

I am building a list of dataframes, in each of which I want to keep  
only column 2 (obviously I would not do it this way in real life but  
that's just to demonstrate the bug).
If I execute the commands inline it works but if I clean my  
environment, then define the function and then execute:
> myfun()
I get this error:
Error in eval(expr, envir, enclos) : object "fooCollumn" not found
while fooCollumn is defined, in the function, right before lapply. In  
addition, if I define it outside the function and then execute the  
function:
> fooCollumn=1
> myfun()
it works but uses the value defined in the general environment and  
not the one defined in the function.
This is with R 2.5.0 on both OS X and Linux (Fedora Core 6)
What did I do wrong? Is this indeed a bug? An intended behavior?
Thanks in advance.

JiHO
---
http://jo.irisson.free.fr/


[R] length, mean, na.rm, na.omit...

2007-05-18 Thread Muenchen, Robert A (Bob)

Hi All,

Can anyone tell me why the length function does not use na.rm? I know
how to work around it, I'm just curious to know why such a useful option
was left out.

I'm also interested in the logic of setting na.rm=TRUE as the default on
mean, sd, etc. This is the opposite of the many other stat packages I
have used, so I assume it provides some programming benefit that is not
obvious to me.

Thanks,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Statistical Consulting Center
  U of TN Office of Information Technology
  200 Stokely Management Center, Knoxville, TN 37996-0520
  Voice: (865) 974-5230  
  FAX:   (865) 974-4810
  Email: [EMAIL PROTECTED]
  Web:   http://oit.utk.edu/scc, 
  News:  http://listserv.utk.edu/archives/statnews.html


Re: [R] Simple programming question

2007-05-18 Thread Gabor Grothendieck

There was a problem in the first line in the case that the highest number
is not unique within a category.  In this example it's not apparent since
that never occurs.  At any rate, it should be:

f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))

Also note that the factor labels were arranged so that
"low", "mid" and "high" correspond to levels 1, 2 and 3
respectively.
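
As a self-contained check, the corrected function can be run against the data frame from the original question (the explicit labels = spelling and the score column name are choices made for this sketch):

```r
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

# 3 for the highest distinct value, 2 for the second highest, 1 otherwise
f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))

dfr$score <- factor(ave(dfr$var3, dfr$categ, FUN = f),
                    labels = c("low", "mid", "high"))
print(dfr)
```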

On 5/18/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Try this.  f assigns 3, 2 and 1 to the highest, second highest and all
> lower values within a category.  ave applies f to each category.  Finally
> we convert it to a factor.
>
> f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
> factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
>
>
>
> On 5/18/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> > Hi R-users,
> >
> > I have a simple question for R heavy users. If I have a data frame like this
> >
> >
> > dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> > var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> > dfr <- dfr[order(dfr$categ),]
> >
> > and I want to score values or points in variable named "var3" following this
> > kind of logic:
> >
> > 1. the highest value of var3 within category (variable named "categ") ->
> > "high"
> > 2. the second highest value -> "mid"
> > 3. lowest value -> "low"
> >
> > This would be the output of this reasoning:
> >
> > dfr$score <-
> > factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> > dfr
> >
> > The question is how I do this programmatically in R (i.e. if I have 2000
> > rows in my dfr)?
> >
> > I appreciate your help!
> >
> > Cheers,
> > Lauri
> >
> >
>


Re: [R] Simple programming question

2007-05-18 Thread Adaikalavan Ramasamy

According to your post you are assuming that there are only 3 unique 
values for var3 within each category. But category C and D have 4 unique 
values for var3.

split(dfr, dfr$categ)
...
$C
   id categ var3 score
3   3     C    6  high
7   7     C    5   mid
11 11     C    3   low
15 15     C    1   low
...

If you meant something different, then just change myfun() below


  gmax <- function(x, rnk=1){
   ## generalized maximum with rnk=1 being the biggest value (i.e. max)
   return( sort( unique(x), decreasing=TRUE )[rnk] )
  }

  myfun <- function(x){ ifelse( x==gmax(x,1), "high",
                        ifelse( x==gmax(x,2), "mid", "low" ) ) }

  out   <- lapply( split(dfr$var3, dfr$categ), myfun )

  data.frame( dfr, my.score = unsplit(out, dfr$categ) )

Regards, Adai



Lauri Nikkinen wrote:
> Hi R-users,
> 
> I have a simple question for R heavy users. If I have a data frame like this
> 
> 
> dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> dfr <- dfr[order(dfr$categ),]
> 
> and I want to score values or points in variable named "var3" following this
> kind of logic:
> 
> 1. the highest value of var3 within category (variable named "categ") ->
> "high"
> 2. the second highest value -> "mid"
> 3. lowest value -> "low"
> 
> This would be the output of this reasoning:
> 
> dfr$score <-
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> dfr
> 
> The question is how I do this programmatically in R (i.e. if I have 2000
> rows in my dfr)?
> 
> I appreciate your help!
> 
> Cheers,
> Lauri
> 
> 
> 
>


Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Gabor Grothendieck

Try this:

   with(subset(data, x > 0), summary(y + z))
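
If a function with an explicit subset argument is preferred, a minimal sketch could look like the following. The name my_evalq and its argument order are hypothetical, chosen only to mirror the wish in the question; the data frame d is simulated.

```r
# Hypothetical helper: evaluate expr within the rows of data selected by subset
my_evalq <- function(expr, data, subset) {
  keep <- eval(substitute(subset), data, parent.frame())
  keep <- keep & !is.na(keep)            # treat NA conditions as FALSE
  eval(substitute(expr), data[keep, , drop = FALSE], parent.frame())
}

set.seed(1)
n <- 100
d <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n))

a <- with(subset(d, x > 0), summary(y + z))   # the approach shown above
b <- my_evalq(summary(y + z), d, x > 0)       # same result via the helper
print(b)
```

Because the helper relies on substitute(), it shares the usual caveats of non-standard evaluation when called from inside other functions.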


On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> Hi,
>
> When using evalq to evaluate expressions within, say, a data.frame context I
> often wish there was a 'subset' argument, much like in lm() or any other
> advanced regression model. I would be grateful for a tip how to do this.
>
> Here is an illustration of what I want:
>
> n <- 100
> data <- data.frame(x=rnorm(n), y=rnorm(n), z=rnorm(n))
>
> # this works
> evalq({ i <- 0
> # I want to do the above w/o explicit subscripting, e.g.
> myevalq(summary(y + z), subset=0
> Thanks,
> Vadim
>
>


Re: [R] Simple programming question

2007-05-18 Thread Gabor Grothendieck

Try this.  f assigns 3, 2 and 1 to the highest, second highest and all lower
values within a category.  ave applies f to each category.  Finally we convert
it to a factor.

f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))



On 5/18/07, Lauri Nikkinen <[EMAIL PROTECTED]> wrote:
> Hi R-users,
>
> I have a simple question for R heavy users. If I have a data frame like this
>
>
> dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> dfr <- dfr[order(dfr$categ),]
>
> and I want to score values or points in variable named "var3" following this
> kind of logic:
>
> 1. the highest value of var3 within category (variable named "categ") ->
> "high"
> 2. the second highest value -> "mid"
> 3. lowest value -> "low"
>
> This would be the output of this reasoning:
>
> dfr$score <-
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> dfr
>
> The question is how I do this programmatically in R (i.e. if I have 2000
> rows in my dfr)?
>
> I appreciate your help!
>
> Cheers,
> Lauri
>
>


Re: [R] time series

2007-05-18 Thread Rogerio Porto

Jessica,

> I am working with a data file which is the record of precipitation
> measurements normally done every 10 minutes. I would like to check if there
> are missing times in my data file.
> 
> Is there a function existing able to check for that in R ?

I'd use max(diff(time))==min(diff(time)).
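
The check above only says whether the spacing is regular. A small sketch that also lists the missing time stamps, assuming the times are stored as POSIXct and using a made-up series with two readings removed:

```r
# Simulated 10-minute series with two readings deliberately dropped
times <- seq(as.POSIXct("2007-05-18 00:00:00", tz = "UTC"),
             by = "10 min", length.out = 12)
times <- times[-c(4, 9)]

# Spacing between consecutive records, in minutes
steps_min  <- diff(as.numeric(times)) / 60
is_regular <- all(steps_min == 10)

# Full expected grid, then the stamps that are absent from the data
expected      <- seq(min(times), max(times), by = "10 min")
missing_times <- expected[!(as.numeric(expected) %in% as.numeric(times))]

print(is_regular)
print(missing_times)
```

This sketch assumes the first and last records are present; gaps at the very start or end of the file would need a known start and end time.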

HTH,

Rogerio


Re: [R] Simple programming question

2007-05-18 Thread Dimitris Rizopoulos

try this:

dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

dfr$score <- unlist(tapply(dfr$var3, dfr$categ, function (x) {
    sn <- sort(unique(x), decreasing = TRUE)
    # max(0, ...) guards against categories with fewer than two distinct values
    labs <- c("high", "mid", rep("low", max(0, length(sn) - 2)))
    labs[match(x, sn)]
}))


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: "Lauri Nikkinen" <[EMAIL PROTECTED]>
To: 
Sent: Friday, May 18, 2007 3:15 PM
Subject: [R] Simple programming question


> Hi R-users,
>
> I have a simple question for R heavy users. If I have a data frame 
> like this
>
>
> dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> dfr <- dfr[order(dfr$categ),]
>
> and I want to score values or points in variable named "var3" 
> following this
> kind of logic:
>
> 1. the highest value of var3 within category (variable named 
> "categ") ->
> "high"
> 2. the second highest value -> "mid"
> 3. lowest value -> "low"
>
> This would be the output of this reasoning:
>
> dfr$score <-
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> dfr
>
> The question is how I do this programmatically in R (i.e. if I have 
> 2000
> rows in my dfr)?
>
> I appreciate your help!
>
> Cheers,
> Lauri
>
> 



[R] subset arg in (modified) evalq

2007-05-18 Thread Vadim Ogranovich

Hi, 

When using evalq to evaluate expressions within, say, a data.frame context I 
often wish there was a 'subset' argument, much like in lm() or any other 
advanced regression model. I would be grateful for a tip how to do this. 

Here is an illustration of what I want: 

n <- 100 
data <- data.frame(x=rnorm(n), y=rnorm(n), z=rnorm(n)) 

# this works 
evalq({ i <- 0

[R] Execute expression ( R -e) without close

2007-05-18 Thread marcelll


I need to find a way to execute an expression with the new command-line
parameter in R (R -e "AnExpression") without R closing after the
expression has been evaluated.
For example, I want to start R with a plot or start R with a *.RData file.
Any help or suggestions are appreciated!


[R] Simple programming question

2007-05-18 Thread Lauri Nikkinen

Hi R-users,

I have a simple question for R heavy users. If I have a data frame like this


dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ),]

and I want to score values or points in variable named "var3" following this
kind of logic:

1. the highest value of var3 within category (variable named "categ") ->
"high"
2. the second highest value -> "mid"
3. lowest value -> "low"

This would be the output of this reasoning:

dfr$score <-
factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
dfr

The question is how I do this programmatically in R (i.e. if I have 2000
rows in my dfr)?

I appreciate your help!

Cheers,
Lauri


[R] svychisq

2007-05-18 Thread Moss, Angela \(Dudley PCT\)

Dear All

I am trying to use svychisq with a two-dimensional 4 x 5 table. The
command I am using is
summary(svytable(~dietperception+dietstatus,dudleyls1rake,na.rm=TRUE),"Chisq")

 It is throwing up an error message as follows:

Error in NCOL(y) : only 0's may be mixed with negative subscripts

In addition: Warning messages:

1: is.na() applied to non-(list or vector) in: is.na(rowvar) 

2: is.na() applied to non-(list or vector) in: is.na(colvar)

 

The dietperception variable does have some NA's, whereas there are none
in dietstatus.

 

The table is

svytable(~dietperception+dietstatus,dudleyls1rake)

 
                            dietstatus
dietperception                      Good          OK        Poor   Very Poor Unclassified
  Perceive healthy            6669.15287  6306.38635 47563.49174 80030.97096  12340.28453
  Neither agree not disagree   250.68278   204.88193  6086.84308 35575.32925   2158.47668
  Perceive unhealthy             0.0       171.49710  2075.80230 26390.92946    318.73213
  Don't know                     0.0        22.33107   334.44880  4402.99293    562.91532

 

I wondered if you could give me any idea where I am going wrong.

 

Many thanks

Angela

 

 

Dr Angela Moss 
Public Health Information Analyst 
Dudley PCT 
St. John's House 
Union Street 
Dudley 
DY2 8PP 

Tel: 01384 366091 
Fax: 01384 366485 

 



[R] 9 Courses: Upcoming July 2007 R/S+ schedule by XLSolutions Corp / New Website!

2007-05-18 Thread Sue Turner



[R] penalized maximum likelihood estimator

2007-05-18 Thread rupos sujon

Dear R-helpers,
I tried to find a package that provides a penalized maximum likelihood
estimator for the generalized extreme value distribution (with beta
function), but could not. Would you please tell me the name of such a
package? Thanks for your help.
S.Murshed

[R] time series

2007-05-18 Thread jessica . gervais


Dear all,

I am working with a data file which is the record of precipitation
measurements normally done every 10 minutes. I would like to check if there
are missing times in my data file.

Is there a function existing able to check for that in R ?

Thanks in advance,


Jessica


[R] Cross-validation for logistic regression with lasso2

2007-05-18 Thread francogrex


Hello, I am trying to shrink the coefficients of a logistic regression for a
sparse dataset. I am using the lasso (lasso2) and I am trying to determine
the shrinkage factor by cross-validation. I would like some of the
experts here to tell me whether I'm doing it correctly or not. Below are my
dataset and the functions I use.

w=
a   b   c   d   e   P   A
0   0   0   0   0   1   879
1   0   0   0   0   1   3
0   1   0   0   0   7   7
0   0   1   0   0   230 2
0   0   0   1   0   450 7
0   0   0   0   1   4   

#The GLM output shows that the coefficients c and d are larger than 10:
resp=cbind(w$P,w$A)
summary(glm(resp~a+b+c+d+e,data=w,family=binomial))
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -6.779      1.001  -6.775 1.24e-11 ***
a              5.680      1.528   3.718 0.000201 ***
b              6.779      1.134   5.976 2.29e-09 ***
c             11.524      1.227   9.392  < 2e-16 ***
d             10.942      1.071  10.220  < 2e-16 ***
e              3.688      1.124   3.282 0.001031 ** 

#so I wrote this below using the lasso2 package to determine the best
#shrinkage factor using the gcv cross-validation:

for (i in seq(1,40,1)) {
glmba=gl1ce(resp~a+b+c+d+e, data = w, family = binomial(),bound=i) 
ecco=round(gcv(glmba,type="Tibshirani",gen.inverse.diag =1e11),digits=3)
print(ecco)
}
#and it gives me 21 with the lowest gcv.

#then I determine the shrunken coefficients:
>gl1ce( resp ~ a + b + c + d + e, data = w, family = binomial(),  bound = 21)
Coefficients:
(Intercept)           a           b           c           d           e 
  -4.749816    2.776215    4.342661    8.956583    8.661593    1.264660 
Family:
Family: binomial 
Link function: logit 
The absolute L1 bound was   :  21 
The Lagrangian for the bound is :  1.843283 

Thanks


Re: [R] Inverse gamma

2007-05-18 Thread Alberto Monteiro

Patrick Wang wrote:
> 
> assume I need to generate X from inverse gamma with parameter (k,
>  beta).
> 
> should I generate from Y from gamma(-k, beta),
> 
> then take X=1/Y?
> 
Check the Borg of All Wisdom...
http://en.wikipedia.org/wiki/Inverse-gamma_distribution

Generate Y from gamma(k, 1/beta) (using...
  rgamma(n = number.of.points, shape = k, scale = 1/beta)
... or ...
  rgamma(n = number.of.points, shape = k, rate = beta)
) and take X = 1/Y

(unless your beta is not the rate parameter...)
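
A quick simulation sketch of the recipe above, assuming k is the shape and beta the scale of the inverse gamma (so beta is the rate of the underlying gamma). The values of k, beta and n are arbitrary; for k > 1 the theoretical mean is beta / (k - 1).

```r
set.seed(1)
k    <- 5      # shape (assumed value for illustration)
beta <- 2      # inverse-gamma scale, i.e. the gamma rate (assumed value)
n    <- 1e5

y <- rgamma(n, shape = k, rate = beta)
x <- 1 / y     # draws from the inverse gamma

theo_mean <- beta / (k - 1)
print(c(simulated = mean(x), theoretical = theo_mean))
```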

Alberto Monteiro


[R] unscuscribe

2007-05-18 Thread Wassim Kamoum

unsuscribe


Re: [R] using lm() with variable formula [Broadcast]

2007-05-18 Thread Liaw, Andy

One way to do it is by giving a data frame with the right variables to
lm() as the first argument each time.  If lm() is given a data frame as
the first argument, it will treat the first variable as the LHS and the
rest as the RHS of the formula.

As examples, you can do:

lm(myData[c("height", "weight", "BP", "Cals")])

(The drawback to this is that the "formula" in the fitted model object
looks a bit strange...)
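
Another option, sketched here under the assumption that the variable names are available as character strings, is to build each formula with reformulate() so that the fitted object shows a readable formula. The data below are simulated and the helper name fit_one is hypothetical.

```r
set.seed(42)
myData <- data.frame(height = rnorm(30, 170, 10), weight = rnorm(30, 70, 8),
                     BP = rnorm(30, 120, 10), Cals = rnorm(30, 2200, 200))

# Hypothetical helper: fit response ~ terms, both given as strings
fit_one <- function(response, terms, data) {
  lm(reformulate(terms, response = response), data = data)
}

fit <- fit_one("height", c("weight", "BP"), myData)

# Every column as response, against every non-empty subset of the others
vars <- names(myData)
fits <- list()
for (resp in vars) {
  others <- setdiff(vars, resp)
  for (k in seq_along(others)) {
    for (terms in combn(others, k, simplify = FALSE)) {
      key <- paste(resp, "~", paste(terms, collapse = " + "))
      fits[[key]] <- fit_one(resp, terms, myData)
    }
  }
}
print(names(fits))
```

Lagged terms are not handled in this sketch; they would need to be computed as extra columns or written into the term strings.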

Andy


From: Chris Elsaesser
> 
> New to R; please excuse me if this is a dumb question.  I 
> tried to RTFM;
> didn't help.
> 
> I want to do a series of regressions over the columns in a data.frame,
> systematically varying the response variable and the terms; and not
> necessarily including all the non-response columns.  In my case, the
> columns are time series. I don't know if that makes a difference; it
> does mean I have to call lag() to offset non-response terms. I can not
> assume a specific number of columns in the data.frame; might 
> be 3, might
> be 20. 
> 
> My central problem is that the formula given to lm() is different each
> time.  For example, say a data.frame had columns with the following
> headings:  height, weight, BP (blood pressure), and Cals 
> (calorie intake
> per time frame).  In that case, I'd need something like the following:
> 
>   lm(height ~ weight + BP + Cals)
>   lm(height ~ weight + BP)
>   lm(height ~ weight + Cals)
>   lm(height ~ BP + Cals)
>   lm(weight ~ height + BP)
>   lm(weight ~ height + Cals)
>   etc.
> 
> In general, I'll have to read the header to get the argument labels.
> 
> Do I have to write several functions, each taking a different number of
> arguments?  I'd like to construct a string or list representing the
> variables in the formula and apply lm(), so to say  [I'm mainly a Lisp
> programmer where that part would be very simple. Anyone have a Lisp API
> for R? :-}]
> 
> Thanks,
> chris
> 
> Chris Elsaesser, PhD
> Principal Scientist, Machine Learning
> SPADAC Inc.
> 7921 Jones Branch Dr. Suite 600  
> McLean, VA 22102  
> 
> 703.371.7301 (m)
> 703.637.9421 (o)
> 
> 
> 
> 



Re: [R] naive question about using an object as the name of another object

2007-05-18 Thread Duncan Murdoch

On 17/05/2007 10:11 PM, Andrew Yee wrote:
> This is a dumb question, but I'm having trouble finding the answer to this.
> 
> I'd like to do the following:
> 
> x<-"asdf"
> 
> and then have
> 
> the object x.y become automatically converted/represented as asdf.y (sort of
> akin to macro variables in SAS where you would do:
> %let x=asdf and do &x..y)
> 
> What is the syntax for having x represented as "asdf" in x.y ?

You can use assign( gsub("x", x, "x.y"), x.y ), but this is not a normal 
thing to do in R programs:  R is not SAS. You should investigate using a 
list and setting a member of it, e.g.

asdf$y <- value

or

f <- function(value) list(y=value)
asdf <- f(value)

depending on what you are trying to do.
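
A short sketch contrasting the two approaches; the object names value and results are made up for illustration:

```r
x <- "asdf"
value <- 42

# Approach 1: build the object name as a string (works, but not idiomatic R)
assign(paste0(x, ".y"), value)        # creates an object called asdf.y
from_assign <- get(paste0(x, ".y"))

# Approach 2: keep the values in a named list and index by the string
results <- list()
results[[x]] <- list(y = value)
from_list <- results[[x]]$y           # same element as results$asdf$y

print(c(from_assign, from_list))
```

The list version keeps related values together and can be looped over with lapply(), which is usually easier than tracking generated variable names.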

Duncan Murdoch


Re: [R] .jinit() problem

2007-05-18 Thread mister_bluesman

Please let me know if you require any more information.

Thanks

mister_bluesman wrote:
> 
> Hello there.
> 
> When I try to start the jvm using .jinit() after loading library(rJava) I
> don't seem to be able to as I get the message: 
> 
> Error in .jinit() : Cannot create Java Virtual Machine 
> 
> What is going on here? I have java 1.6 installed on my XP machine
> 
> Thanks again. 
> 
> 


Re: [R] .jinit() problem

2007-05-18 Thread mister_bluesman


Please let me know if you need any other information.


mister_bluesman wrote:
> 
> Hello there.
> 
> When I try to start the jvm using .jinit() after loading library(rJava) I
> don't seem to be able to as I get the message: 
> 
> Error in .jinit() : Cannot create Java Virtual Machine 
> 
> What is going on here? I have java 1.6 installed on my XP machine
> 
> Thanks again. 
> 
> 


[R] Anderson-Darling GoF (re-sent)

2007-05-18 Thread Shiazy

Hi,
I'm not a statistician so sorry for possible trivial questions ...

I want to perform a GoF test on sample data against several distributions 
(like Extreme Value, Phase Type, Pareto, ...).

Since I suspect long-tailed behaviour in the data I want to use the 
Anderson-Darling (AD) GoF test because it is well known to be more 
sensitive to tail data.

Looking at R packages the only AD test is the AD normality test 
("ad.test") in the "nortest" package. So I think this function is not 
for me since long-tailed samples aren't normally distributed (right?!)

I've found the Marsaglia article ("Evaluating the Anderson Darling 
distribution") where it seems I can transform the data with the 
theoretical CDF, treat the result as uniformly [0,1] distributed data, 
and then perform the test against the uniform distribution. The 
problem is that the theoretical CDF (i.e. the parameters of the theoretical 
distribution) has been estimated from the same data against which I want to 
make the test. I've read somewhere that it's not a good technique to compare 
the distributions in this way because the resulting AD test might 
be biased.
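
One commonly used way around the estimated-parameter problem is a parametric bootstrap: refit the distribution on each simulated sample so that the reference distribution of the statistic reflects the estimation step. The sketch below is only an illustration under stated assumptions: it uses the exponential distribution as a stand-in for the candidate families, and the function names ad_stat and ad_boot_exp are hypothetical.

```r
# Anderson-Darling statistic A^2 for values u that should be Uniform(0,1)
ad_stat <- function(u) {
  u <- sort(u)
  n <- length(u)
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log(u) + log1p(-rev(u))))
}

# Parametric bootstrap p-value for an exponential fit (rate estimated by MLE)
ad_boot_exp <- function(x, B = 500) {
  n <- length(x)
  rate_hat <- 1 / mean(x)
  obs <- ad_stat(pexp(x, rate = rate_hat))
  sim <- replicate(B, {
    xs <- rexp(n, rate = rate_hat)          # simulate from the fitted model
    ad_stat(pexp(xs, rate = 1 / mean(xs)))  # re-estimate, then recompute A^2
  })
  list(statistic = obs, p.value = mean(sim >= obs))
}

set.seed(123)
x_exp   <- rexp(200, rate = 2)   # data that really are exponential
x_heavy <- exp(rnorm(200))       # lognormal data with a heavier tail
res_exp   <- ad_boot_exp(x_exp)
res_heavy <- ad_boot_exp(x_heavy)
print(res_exp)
print(res_heavy)
```

For another family, pexp() and rexp() would be replaced by that family's CDF and random generator, together with a suitable parameter estimator.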

So, finally, I don't know how to proceed ...

Can anyone give me a help or any reference (please remember I'm not a 
statistician so do not write too technically)??

Thanks a lot to everyone!!

-- Marco


[R] Fwd: Re: Goodness-of-fit test for gamma distribution?

2007-05-18 Thread Sean Connolly

Thanks Petr. Comments below:

At 03:40 PM 18/05/2007, Petr Klasterecky wrote:

>Sean Connolly napsal(a):
>>Hi all,
>>I am wondering if anyone has written (or knows of) a function that 
>>will conduct a goodness-of-fit test for a gamma distribution. I am 
>>especially interested in test statistics that have some asymptotic 
>>parametric distribution that is independent of sample size or 
>>values of fitted parameters (e.g., a chi-squared distribution with some

Petr's reply:

>The GOF test will always depend on the parameter values, since it 
>has to estimate them (if you don't provide them yourself). Anyway, 
>the gamma family is so versatile that you can fit *some* gamma 
>distribution to almost any nonnegative continuous data.

Sean's reply to Petr:

An example of what I'm looking for would be the "K-squared" statistic 
that tests for normality (D'Agostino and Pearson 1973, Biometrika 60: 
613, also in Zar, 1996, Biostatistical Analysis, p89). The expected 
distribution of the test statistic is approximately chi-squared with 
2df, regardless of values of estimated parameters or sample size 
(provided sample size is sufficiently large).

Petr's reply:

>Maybe it is easier and sufficient to use the Kolmogorov-Smirnov 
>test, which is implemented as ks.test() in R. However, I am not able 
>to check your reference, so my comment may not be what you want at all.

Sean's reply to Petr:

My understanding is that the K-S test requires that parameters be 
specified (i.e., not estimated from data), and that the test 
statistic depends on sample size. Am I missing something?
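
One common workaround when the parameters are estimated from the same data is to calibrate the K-S distance by parametric bootstrap rather than using the tabulated K-S distribution. The following is only a sketch in base R, not something proposed in this thread: the function names are hypothetical, and method-of-moments estimates are assumed purely to keep the example self-contained (a maximum-likelihood fit could be substituted).

```r
## Sketch only: K-S distance for a gamma fit with estimated parameters,
## calibrated by parametric bootstrap. All function names are hypothetical.

# Method-of-moments estimates for the gamma distribution (an assumption;
# any other estimator could be plugged in here).
fit_gamma_mom <- function(x) {
  m <- mean(x)
  v <- var(x)
  list(shape = m^2 / v, rate = m / v)
}

# K-S distance between the sample and the gamma fitted to that same sample.
ks_stat_gamma <- function(x) {
  p <- fit_gamma_mom(x)
  unname(ks.test(x, "pgamma", shape = p$shape, rate = p$rate)$statistic)
}

# Bootstrap p-value: simulate from the fitted gamma, re-estimate the
# parameters on every simulated sample, and compare the distances.
ks_gamma_boot <- function(x, B = 500) {
  p <- fit_gamma_mom(x)
  d_obs <- ks_stat_gamma(x)
  d_sim <- replicate(B, {
    xs <- rgamma(length(x), shape = p$shape, rate = p$rate)
    ks_stat_gamma(xs)
  })
  list(statistic = d_obs, p.value = (1 + sum(d_sim >= d_obs)) / (B + 1))
}

set.seed(1)
x <- rgamma(100, shape = 2, rate = 0.5)   # simulated example data
res <- ks_gamma_boot(x, B = 200)
res
```

Because the parameters are re-estimated inside every bootstrap replicate, the reference distribution reflects the estimation step, which is the point the plain ks.test() p-value misses.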

Thanks again.

Sean

>>Sean R. Connolly, PhD
>>Associate Professor
>>ARC Centre of Excellence for Coral Reef Studies, and
>>School of Marine and Tropical Biology
>>James Cook University
>>Townsville, QLD 4811
>>AUSTRALIA
>
>--
>Petr Klasterecky
>Dept. of Probability and Statistics
>Charles University in Prague
>Czech Republic


[R] Anderson-Darling GoF

2007-05-18 Thread Shiazy Fuzzy

Hi,
I'm not a statistician so sorry for possible trivial questions ...

I want to perform a GoF test on sample data against several distributions
(like Extreme Value, Phase Type, Pareto, ...).

Since I suspect long-tailed behaviour in the data, I want to use the
Anderson-Darling (AD) GoF test because it is well known to be more sensitive
to tail data.

Looking at R packages, the only AD test I can find is the AD normality test
("ad.test") in the "nortest" package. So I think this function is not for me,
since long-tailed samples aren't normally distributed (right?!)

I've found the Marsaglia article ("Evaluating the Anderson Darling
distribution"), where it seems I can treat the ECDF (empirical CDF) and the
theoretical CDF as uniformly distributed data on [0,1] and then perform the
test as if I were comparing two uniform distributions. The problem is that
the theoretical CDF (i.e. the parameters of the theoretical distribution) has
been estimated from the same data against which I want to run the test. I've
read somewhere that this is not a good way to compare the distributions,
because the resulting AD test might be biased.
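
For illustration, the AD statistic itself is easy to compute in base R once the fitted CDF has been applied to the data, and one possible way around the estimated-parameter problem is to get the p-value by parametric bootstrap. This is only a sketch under stated assumptions: ad_stat() and ad_boot() are hypothetical helper names, and the exponential distribution stands in for whichever long-tailed family is actually of interest.

```r
## Sketch only: Anderson-Darling statistic with a bootstrap p-value when the
## parameters are estimated from the data. Function names are hypothetical.

# A^2 for a sample x against a CDF evaluated at parameter value `par`.
# Assumes no transformed value is exactly 0 or 1.
ad_stat <- function(x, cdf, par) {
  u <- sort(cdf(x, par))          # probability integral transform
  n <- length(u)
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log(u) + log1p(-rev(u))))
}

# fit(x) returns the estimate, cdf(x, est) the fitted CDF values, and
# rsim(n, est) a simulated sample of size n from the fitted distribution.
ad_boot <- function(x, fit, cdf, rsim, B = 500) {
  est <- fit(x)
  a_obs <- ad_stat(x, cdf, est)
  a_sim <- replicate(B, {
    xs <- rsim(length(x), est)
    ad_stat(xs, cdf, fit(xs))     # re-estimate on every simulated sample
  })
  list(estimate = est, statistic = a_obs,
       p.value = (1 + sum(a_sim >= a_obs)) / (B + 1))
}

set.seed(42)
x <- rexp(150, rate = 2)          # simulated example data
res <- ad_boot(x, fit = function(x) 1 / mean(x),
               cdf = pexp, rsim = rexp, B = 200)
res
```

Other families can be tried by swapping in a different fit, cdf and rsim triple; the bootstrap loop itself does not change.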

So, finally, I don't know how to proceed ...

Can anyone give me some help or a reference (please remember I'm not a
statistician, so please don't write too technically)??

Thanks a lot to everyone!!

-- Marco


Re: [R] add info

2007-05-18 Thread Dimitris Rizopoulos

you could use attributes, e.g.,

dat <- data.frame(x = 1:3, y = letters[1:3])
attr(dat, "name") <- "my data.frame"
attr(dat, "author") <- "John Smith"
attr(dat, "date") <- "2007-05-18"

## print the data frame, then list its attributes (including those added above)

dat
attributes(dat)
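
If the goal is to have the info literally above the data in a text file, one possible approach is to write the info as comment lines first and then the table to the same connection. This is a minimal sketch, not part of the original suggestion: the "# " prefix, the info vector and the temporary file are assumptions made for the example.

```r
## Sketch only: write metadata lines above a data frame in a text file.
dat <- data.frame(x = 1:3, y = letters[1:3])
info <- c(name = "my data.frame", date = "2007-05-18", author = "John Smith")

out <- tempfile(fileext = ".txt")   # hypothetical output file
con <- file(out, open = "w")
# Header lines such as "# name: my data.frame"
writeLines(paste0("# ", names(info), ": ", info), con)
# The table follows on the same open connection
write.table(dat, con, quote = FALSE, row.names = FALSE)
close(con)

cat(readLines(out), sep = "\n")

# read.table() skips lines starting with "#" by default (comment.char = "#"),
# so the file can be read back without stripping the header by hand.
back <- read.table(out, header = TRUE)
back
```

Prefixing the info lines with "#" is a design choice that keeps the file readable by read.table(); without the prefix, the reader would need skip = 3 instead.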


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: "XinMeng" <[EMAIL PROTECTED]>
To: 
Sent: Friday, May 18, 2007 9:05 AM
Subject: [R] add info


> hi all:
> If there's a dataframe:
>
> x y
> 1 a
> 2 b
> 3 c
>
> The info of the data such as :
>
> name
> date
> author
>
> The result I want is:
>
> name
> date
> author
> x y
> 1 a
> 2 b
> 3 c
>
> In other words, I want to add the info above the data frame.
>
> How can I do it ?
>
> Thanks a lot!
>
>
> My best
>

