[R] question about non-linear least squares in R
Hi, everyone,

My question is: you cannot always get a converged result from the nls function. Is there any way for me to get a reasonable result? For example:

x <- c(-0.06,-0.04,-0.025,-0.015,-0.005,0.005,0.015,0.025,0.04,0.06)
y <- c(1866760,1457870,1314960,1250560,1184850,1144920,1158850,1199910,1263850,1452520)
fitOup <- nls(y ~ constant + A*(x-MA)^4 + B*(x-MA)^2,
              start=list(constant=1000, A=1, B=-100, MA=0),
              control=nls.control(maxiter=100, minFactor=1/4096), trace=TRUE)

For this one, I cannot get a converged result; how can I reach it? Should I use another function or modify some settings for nls? Thank you very much!

Yours,
Warren

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about non-linear least squares in R
Below is one possibility: if you knew MA, you would have an ordinary linear least-squares problem in the parameters A, B and constant, which is easily solved. So you can define a function f(MA) which returns the value of that fit. Then you only need to minimize f, a function of one argument. It can have several local minima, so you must be careful, but I believe that minimizing even a badly behaved function of one argument should be easier than your original problem.

Regards,
Moshe.

P.S. If you do this, I would be interested to know whether it works.
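Moshe's profiling idea can be sketched in a few lines of base R. The search interval for MA below is an assumption chosen from the range of x, not something given in the thread:

```r
x <- c(-0.06,-0.04,-0.025,-0.015,-0.005,0.005,0.015,0.025,0.04,0.06)
y <- c(1866760,1457870,1314960,1250560,1184850,1144920,1158850,1199910,1263850,1452520)

## For a fixed MA the model is linear in constant, A and B, so lm() solves it
## exactly; f(MA) returns the residual sum of squares of that conditional fit.
f <- function(MA) {
  fit <- lm(y ~ I((x - MA)^4) + I((x - MA)^2))
  sum(resid(fit)^2)
}

## One-dimensional minimisation of the profiled objective.
opt <- optimize(f, interval = c(-0.05, 0.05))
opt$minimum    # the profiled estimate of MA
```

With the profiled MA in hand, the remaining linear coefficients drop out of one final lm() fit.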
Re: [R] question about non-linear least squares in R
Here is one way of getting a reasonable fit:

1. Scale your y's by dividing all values by 1e6.
2. Plot x vs. y. The plot looks like a quadratic function.
3. Fit a quadratic const. + B*x^2; this is a linear regression problem, so use lm.
4. Plot the predictions.
5. Eyeball the necessary shift: MA is around 0.01. Refit const. + B*(x-.01)^2. You should get const.=1.147 and B=139.144.
6. Use start=list(const.=1.147, A=0, B=139.144, MA=.01). nls should converge in 4 iterations.

In general, good starting points may be crucial to nls convergence. Scaling the y's to reasonable values also helps.

Hope this helps,
Andy

Andy Jaworski
518-1-01 Process Laboratory
3M Corporate Research Laboratory
E-mail: [EMAIL PROTECTED]
Tel: (651) 733-6092
Fax: (651) 736-3122
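Andy's recipe, written out as runnable code; this is a sketch in which the lm-derived coefficients stand in for the eyeballed starting values:

```r
x <- c(-0.06,-0.04,-0.025,-0.015,-0.005,0.005,0.015,0.025,0.04,0.06)
y <- c(1866760,1457870,1314960,1250560,1184850,1144920,1158850,1199910,1263850,1452520)

y2 <- y / 1e6                            # step 1: rescale y
quad <- lm(y2 ~ I((x - 0.01)^2))         # steps 3-5: quadratic with the eyeballed shift
st <- list(const = unname(coef(quad)[1]), A = 0,
           B = unname(coef(quad)[2]), MA = 0.01)

## step 6: hand the linear-fit coefficients to nls as starting values
fit <- nls(y2 ~ const + A*(x - MA)^4 + B*(x - MA)^2, start = st)
coef(fit)
```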
[R] Question about making array dataset inside a package
Hello, I have a question about how to include an array dataset inside a package. In our package, one R function is designed to take as input an array of 2 rows * 2 columns * k levels entered by the user, where k is a positive integer. I am trying to include a 3-D array of 2-by-2-by-8 as a dataset in the package, so users can load the data using data(dataname). We prefer to ship the data as a dataset rather than include the long syntax in the help file for users to copy and paste. I can generate the array in the R console using a long chain of syntax like: arraydat <- array(.omitted...) However, I cannot figure out a way to save the data in 3-D array format using write(), write.table(), or data.frame() etc. If I directly copy-paste the screen output to a text file, I cannot read it back into R using something like: arraydat <- read.table("array.txt", header=TRUE). This gives me a 2-row by (2*k)-column two-dimensional data set and loses all the level (or depth) structure needed by our R function. The header is messed up too. I would appreciate it if anyone could tell me how to include/save an array structure as a dataset in R, and also how to read it back in so as to preserve the header and array structure. Thank you for your time.
Re: [R] Question about making array dataset inside a package
Please read http://cran.r-project.org/doc/manuals/R-exts.html especially Section 1.1.3. Use save() to create an .RData file.

Max

-----Original Message-----
From: Wallace Hui
Sent: Wednesday, September 05, 2007 4:18 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Question about making array dataset inside a package
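Max's suggestion in code form; the array contents and file name here are made-up placeholders:

```r
## build a 2 x 2 x 8 array (placeholder values) and save it as an .RData file
arraydat <- array(seq_len(2 * 2 * 8), dim = c(2, 2, 8))
f <- file.path(tempdir(), "arraydat.RData")
save(arraydat, file = f)

## reload it: the full array structure, including dim, survives the round trip
rm(arraydat)
load(f)
dim(arraydat)

## In a package, the .RData file goes in the package's data/ directory,
## after which users can load it with data(arraydat).
```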
[R] question: randomization t-test function already defined in R?
Dear R Users, I am hoping you can help me. I have received code from a colleague who uses Matlab. I need to translate it into R. I am wondering if there is a randomization t-test (from non-parametric statistics) function already defined in R. (In Matlab the function is randtest.m.) ** QUESTION: Is anyone aware of a randomization t-test function in R? ** Thank you for your help, Karen --- Karen M. Green, Ph.D. Research Investigator Drug Design Group Sanofi Aventis Pharmaceuticals 1580 E. Hanley Blvd. Tucson, AZ 85737-9525 USA
Re: [R] question: randomization t-test function already defined in R?
On Wed, 5 Sep 2007, [EMAIL PROTECTED] wrote: I am wondering if there is a randomization t-test (from non-parametric statistics) function already defined in R. (In Matlab the function is randtest.m.)

I don't know what randtest in MATLAB does exactly, but I guess that you might want to look at the function independence_test() in package coin. The distribution argument should probably be set to use either exact or approximate p values.

hth,
Z
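For readers without the coin package, the idea behind a two-sample randomization t-test can be sketched in base R; this is a toy illustration with made-up data, not MATLAB's randtest.m or coin's implementation:

```r
set.seed(1)
a <- c(5.1, 4.9, 6.0, 5.5, 5.8)          # made-up group A
b <- c(4.2, 4.5, 4.0, 4.8, 4.3)          # made-up group B
obs <- mean(a) - mean(b)                 # observed mean difference

## reference distribution: reshuffle the group labels many times
pool <- c(a, b)
perm <- replicate(9999, {
  idx <- sample(length(pool), length(a))
  mean(pool[idx]) - mean(pool[-idx])
})

## two-sided permutation p-value (with the usual +1 adjustment)
p <- (sum(abs(perm) >= abs(obs)) + 1) / (length(perm) + 1)
p
```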
[R] Question on graphs and simple if statements
Hi all, I'm a relatively new user to the R interface, and am trying to do some basic operations. So far I've used the curve() command to plot graphs. This has a couple of limitations, namely with piecewise graphs. Basically, I'm looking for how to do if statements and while loops when making a graph. In simple code, something like:

if x is on the interval [-1,1], x=x^2+1
otherwise, x=x^2

something along those lines. I also am wondering if it's possible to graph something like this:

while (n < 20) {
  curve(1/n^2, -1, 1)
  n++
}

Along these same lines, is there a way to graph finite series? Thank you for your time and your replies.
Re: [R] Question on graphs and simple if statements
The answer to your first question is

curve(ifelse((x > -1) & (x < 1), x^2 + 1, x^2), from = -10, to = 10)

Sorry, I cannot understand the next part of your query.

Ross Darnell
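The while loop in the question can be written in valid R with curve(..., add = TRUE) so that successive curves land on one plot; a sketch, not from the thread:

```r
## draw the horizontal lines y = 1/n^2 for n = 1..19 on one plot;
## curve() needs an expression in x, hence the 0 * x + constant trick
n <- 1
curve(0 * x + 1/n^2, from = -1, to = 1, ylim = c(0, 1), ylab = "1/n^2")
while (n < 20) {
  n <- n + 1                       # R has no n++ operator
  curve(0 * x + 1/n^2, add = TRUE)
}

## a finite series: partial sums of sum(1/n^2), plotted as a step function
s <- cumsum(1 / (1:20)^2)
plot(1:20, s, type = "s", xlab = "n", ylab = "partial sum")
```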
[R] Question on shardsplot
Dear All, Would you please tell me how to display the sample numbers on the map? The commands below don't display the sample numbers (from 1 to 150).

library(som)
library(klaR)
iris.som3 <- som(iris[,1:4], xdim = 14, ydim = 6)
opar <- par(xpd = NA)
shardsplot(iris.som3, data.or = iris, label = TRUE)
legend(3.5, 14.3, col = rainbow(3), xjust = 0.5, yjust = 0,
       legend = levels(iris[, 5]), pch = 16, horiz = TRUE)
par(opar)

Ebi
Re: [R] Question on shardsplot
Ebi, keisyu, or whatever your name is: I know that this question has already been answered by the shardsplot author in a private thread, where it was posted under a different name. Why do you obscure your real name on the list?

The answer by Nils Raabe was that shardsplot is intended to draw the cluster map, and adding sample numbers is more difficult. His quick proposal, which needs some minor tweaking, is:

xycords <- cbind(kronecker(1:14, rep(1, 6)), rep(1:6, 14))
labs <- sapply(1:nrow(xycords), function(x)
  paste(as.character(which(
    (iris.som3$visual[,1] + 1 == xycords[x,1]) *
    (iris.som3$visual[,2] + 1 == xycords[x,2]) == 1)),
    collapse = ";"))
text(cbind(xycords[,2], xycords[,1]), labs, cex = 0.75)

Uwe Ligges
[R] Question about writing some code
Just a short question: Why does this work:

x <- c(1.6,1.8,2.4,2.7,2.9,3.3,3.4,3.4,4,5.2)
F26 <- boxplot(x, add=FALSE, horizontal=TRUE, main="Figur 2.6 Boxplot", axes=FALSE)

...but this doesn't...

x <- c(1.6,1.8,2.4,2.7,2.9,3.3,3.4,3.4,4,5.2)
F26 <- boxplot(x)
plot(F26, add=FALSE, horizontal=TRUE, main="Figur 2.6 Boxplot", axes=FALSE)

Thanks for any help, Squall44
Re: [R] Question about writing some code
squall44 wrote: Why does this work ... but this doesn't ...

boxplot returns some information on the boxplot, but that result cannot be plotted by boxplot itself. You can use bxp() to do so.

Uwe Ligges
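Uwe's bxp() suggestion spelled out; a sketch in which plot = FALSE computes the statistics without drawing:

```r
x <- c(1.6, 1.8, 2.4, 2.7, 2.9, 3.3, 3.4, 3.4, 4, 5.2)

## compute the boxplot statistics without drawing anything
F26 <- boxplot(x, plot = FALSE)

## bxp() draws a boxplot from such a stats object
bxp(F26, horizontal = TRUE, main = "Figur 2.6 Boxplot", axes = FALSE)
```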
Re: [R] Question on R server and TinnR
So if you go to the TinnR menu in R on the menu bar and choose 'R server: connections and tests', what does that do exactly? It seems to me that there is a way to have R listen for incoming requests from a remote machine running TinnR. If anyone could share your knowledge, I'd really appreciate it. Thank you again. - adschai

----- Original Message -----
From: Adaikalavan Ramasamy
Date: Monday, August 20, 2007 8:19 am
Subject: Re: [R] Question on R server and TinnR
To: [EMAIL PROTECTED]

I don't understand your question on how to make R act as a server. If you want to execute different pre-written R scripts, then have a look at help(BATCH) and help(commandArgs). Regards, Adai
Re: [R] Question about lme and AR(1)
On Sat, 18 Aug 2007, Francisco Redelico wrote: As far as I know, the EM algorithm can only be applied to estimate parameters from a regular exponential family.

No. The EM algorithm will converge to a stationary point (and, except in artificial cases, to a local maximum) for any likelihood. The special case you may be thinking of is that in some problems the E-step is equivalent to computing E[missing data | observed data] rather than the more general E[log-likelihood | observed data].

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
[R] Question on R server and TinnR
Hi - Classic question that I tried to look up online without finding a clear answer. It seems that I can have R act as a server, but I have never known how this works. Would anyone please provide an example or introductory material where I can learn about this? I'm trying to build an environment where computations are distributed/delegated among different servers on request whenever I need. Thank you. - adschai
Re: [R] Question about unicode characters in tcltk
R Help wrote: Can someone help me figure out why the following code doesn't work? I'm trying to put both Greek letters and subscripts into a tcltk menu. The code creates all the mu's, and the 1 and 2 subscripts, but it won't create the 0.

Which OS was this? I can reproduce the issue on SuSE, but NOT Fedora 7.
Re: [R] Question about unicode characters in tcltk
On Sat, 2007-08-18 at 14:40 +0200, Peter Dalgaard wrote: Which OS was this? I can reproduce the issue on SuSE, but NOT Fedora 7.

I can reproduce this on Fedora 7, in that the \u2080 is reproduced as-is and not as a subscript, unlike the other \u characters, which appear as subscripted characters.

sessionInfo()
R version 2.5.1 Patched (2007-08-02 r42389)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tcltk stats graphics grDevices utils datasets
[7] methods base

If there is something specific to my Fedora installation that differs from Peter's that I can ascertain from installed packages/fonts etc., then let me know and I can provide the output from my laptop.
G
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
[R] Question about lme and AR(1)
Dear R users, As far as I know, the EM algorithm can only be applied to estimate parameters from a regular exponential family. A multivariate normal distribution with an AR(1) matrix as covariance matrix does not belong to a regular exponential family; it belongs to a curved exponential family, so the EM algorithm cannot be applied to estimate parameters of this kind of distribution. I have used the lme function from the nlme package to estimate variance components with correlation=corAR1(); this function first uses the EM algorithm to refine the initial estimates of the random-effects variance-covariance coefficients and then feeds them into a Newton-Raphson algorithm. Does anyone know what kind of modification of the EM algorithm the lme function uses to solve the problem mentioned above? Thank you in advance for your help. Francisco
Re: [R] Question about sm.options sm.survival
On Thu, 16 Aug 2007, Rachel Jia wrote: But the xlim option does not work. Can anyone help me with this problem?

The help page suggests that you need to use xlim as an inline option (part of ...). Following the help page example,

sm.survival(x, y, status, h=2, xlim=c(0,4))

works. So I think you need to follow the help page exactly.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
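A runnable sketch of the fix, with simulated data standing in for the poster's min3.base.prog.cen; the sm.survival() call is guarded because the sm package may not be installed:

```r
set.seed(42)
x <- runif(50, 0, 10)            # simulated covariate
y <- rexp(50)                    # simulated survival times
status <- rbinom(50, 1, 0.7)     # simulated event indicator

if (requireNamespace("sm", quietly = TRUE)) {
  ## xlim is passed directly to sm.survival() (through ...), not via sm.options()
  sm::sm.survival(x, y, status, h = 2, xlim = c(0, 7))
}
```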
[R] Question about sm.options sm.survival
Hi, there: It's my first time posting a question in this forum, so thanks for your tolerance if my question is too naive. I am using a nonparametric smoothing procedure from the sm package to generate smoothed survival curves for a continuous covariate. I want to truncate the survival curve and only display the part with covariate values between 0 and 7. The following is the code I wrote:

sm.options(list(xlab="log_BSI_min3_to_base", xlim=c(0,7), ylab="Median Progression Prob"))
sm.survival(min3.base.prog.cen[,3], min3.base.prog.cen[,2], min3.base.prog.cen[,1],
            h=sd(min3.base.prog.cen[,3]), status.code=1)

But the xlim option does not work. Can anyone help me with this problem? Thanks a lot. Rachel
[R] Question on output of vectors from a for loop into a matrix
Hello R-help, I am a recent convert to R from SAS and I am having some difficulty with output of a for loop into a matrix. I have searched the help manuals and the archives, but I can't get my code to work. It is probably a syntax error that I am not spotting. I am trying to make a distance matrix by extracting the distances from each point to all others in a vector (the for loop). I can get them all to print and can save each individually to a file, but I cannot get them to be bound into a matrix. Here is the code that I have tried:

m <- length(Day.1.flower.coords$x) # 31 grid points
output.matix <- matrix(0.0, m, m)
for (i in 1:m) {
  dist <- spDistsN1(Day.1.coords.matrix, Day.1.coords.matrix[i,])
  dist <- as.vector(dist)
  output.matrix[i,] <- dist
  print(dist)
}

The error message that I get is: Error in output.matrix[i,] <- dist : incorrect number of subscripts on matrix

Thanks for your help.

Ryan
~~
Ryan D. Briscoe Runquist
Population Biology Graduate Group
University of California, Davis
[EMAIL PROTECTED]
Re: [R] Question on output of vectors from a for loop into a matrix
On Thursday 16 August 2007 15:35, Ryan Briscoe Runquist wrote: I am trying to make a distance matrix by extracting the distances from each point to all others in a vector (the for loop) ... but I cannot get them to be bound into a matrix.

Hi Ryan, for all UCD students you have the luxury of not using a loop! :) Would something like this work for the creation of a 'distance matrix'?

# example dataset: 2 x 2 grid
d <- expand.grid(x=1:2, y=1:2)

# an instructive plot
plot(d, type='n')
text(d, rownames(d))

# create distance object and convert to matrix:
m <- as.matrix(dist(d))
m
         1        2        3        4
1 0.000000 1.000000 1.000000 1.414214
2 1.000000 0.000000 1.414214 1.000000
3 1.000000 1.414214 0.000000 1.000000
4 1.414214 1.000000 1.000000 0.000000

# is that the kind of distance matrix you are looking for?
?dist

Distance Matrix Computation

Description: This function computes and returns the distance matrix computed by using the specified distance measure to compute the distances between the rows of a data matrix. [...]
cheers, Dylan
--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
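For completeness, the original loop also works once the object name is spelled consistently; a base-R sketch with made-up coordinates standing in for Day.1.coords.matrix, and plain Euclidean distance in place of sp's spDistsN1():

```r
set.seed(1)
coords <- cbind(x = runif(5), y = runif(5))   # stand-in for Day.1.coords.matrix
m <- nrow(coords)

output.matrix <- matrix(0, m, m)   # one consistent name, unlike the post's output.matix
for (i in 1:m) {
  ## Euclidean distances from point i to every point
  output.matrix[i, ] <- sqrt(colSums((t(coords) - coords[i, ])^2))
}

## agrees with the loop-free dist() version
all.equal(as.vector(output.matrix), as.vector(as.matrix(dist(coords))))
```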
[R] Question about unicode characters in tcltk
hello list, Can someone help me figure out why the following code doesn't work? I'm trying to put both Greek letters and subscripts into a tcltk menu. The code creates all the mu's, and the 1 and 2 subscripts, but it won't create the 0. Is there a certain set of characters whose unicode R won't recognize? Or am I entering the \u2080 incorrectly?

library(tcltk)
m <- tktoplevel()
frame1 <- tkframe(m)
frame2 <- tkframe(m)
frame3 <- tkframe(m)
entry1 <- tkentry(frame1, width=5, bg='white')
entry2 <- tkentry(frame2, width=5, bg='white')
entry3 <- tkentry(frame3, width=5, bg='white')
tkpack(tklabel(frame1, text='\u03bc\u2080'), side='left')
tkpack(tklabel(frame2, text='\u03bc\u2081'), side='left')
tkpack(tklabel(frame3, text='\u03bc\u2082'), side='left')
tkpack(frame1, entry1, side='top')
tkpack(frame2, entry2, side='top')
tkpack(frame3, entry3, side='top')

thanks -- Sam
Re: [R] Question about unicode characters in tcltk
R Help wrote:

hello list, Can someone help me figure out why the following code doesn't work? I'm trying to put both Greek letters and subscripts into a tcltk menu. The code creates all the mu's, and the 1 and 2 subscripts, but it won't create the 0. [...]

tkpack(tklabel(frame1, text='\u03bc\u2080'), side='left')
tkpack(tklabel(frame2, text='\u03bc\u2081'), side='left')
tkpack(tklabel(frame3, text='\u03bc\u2082'), side='left')

Odd, but I think not an R issue. I get weirdness in wish too. Try this:

% toplevel .a
.a
% label .a.b -text \u03bc\u2080 -font {Roman -10}
.a.b
% pack .a.b
% .a.b configure
{-activebackground [] {-text text Text {} μ₀} {-textvariable textVariable Variable {} {}} {-underline underline Underline -1 -1} {-width width Width 0 0} {-wraplength wrapLength WrapLength 0 0}
% .a.b configure -font {Helvetica -12 bold}
# the default; now shows \u2080
% .a.b configure -font {Roman -10}
# back to Roman, *still* shows \u2080 ???!!!

-- O__ Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918, ([EMAIL PROTECTED]), FAX: (+45) 35327907
[R] question regarding is.factor()
Dear all, please help with what must be a straightforward question which I can't answer. I add a column to my dataframe as a factor of an existing column, e.g. df[,5] <- factor(df[,2]), and can test that it is one with is.factor(). But if I did not know in advance what types the columns were, is there a function to tell me what they are? i.e. instead of is.factor(), is.matrix(), is.list(), a function more like what.is()?
Re: [R] question regarding is.factor()
typeof() for 'types'. However, factor is not a type but a class, so class() is probably what you want.

On Mon, 13 Aug 2007, Jabez Wilson wrote:

Dear all, please help with what must be a straightforward question which I can't answer.

But 'An Introduction to R' could.

I add a column to my dataframe as a factor of an existing column, e.g. df[,5] <- factor(df[,2]), and can test that it is one with is.factor(). But if I did not know in advance what types the columns were, is there a function to tell me what they are? i.e. instead of is.factor(), is.matrix(), is.list(), a function more like what.is()?

-- Brian D. Ripley, [EMAIL PROTECTED], Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/, University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595, 1 South Parks Road, Oxford OX1 3TG, UK
Re: [R] question regarding is.factor()
See ?class

-- Henrique Dallazuanna, Curitiba-Paraná-Brasil, 25°25'40"S 49°16'22"W

On 13/08/07, Jabez Wilson [EMAIL PROTECTED] wrote:

Dear all, please help with what must be a straightforward question which I can't answer. I add a column to my dataframe as a factor of an existing column, e.g. df[,5] <- factor(df[,2]), and can test that it is one with is.factor(). But if I did not know in advance what types the columns were, is there a function to tell me what they are? i.e. instead of is.factor(), is.matrix(), is.list(), a function more like what.is()?
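A small sketch of the distinction the replies draw between class() and typeof() (hypothetical two-row data frame; note in particular that a factor's class is "factor" while its underlying type is "integer"):

```r
df <- data.frame(x = c(1.5, 2.5), g = c("a", "b"), stringsAsFactors = FALSE)
df$f <- factor(df$g)   # add a factor column, as in the question

# class() answers "what kind of object is this?", per column:
sapply(df, class)    # x: "numeric",  g: "character", f: "factor"

# typeof() answers "how is it stored internally?":
sapply(df, typeof)   # x: "double",   g: "character", f: "integer"
```

So a single sapply over the data frame replaces guessing with is.factor(), is.numeric(), and so on, one predicate at a time.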
[R] question on glmmML compared to NLMIXED
Hello! Can anyone help me? I am using the posterior.mode from the result of glmmML. It appears to be different from the BLUE estimate of the RANDOM statement in PROC NLMIXED in SAS. Why is that? Thank you, Ronen
Re: [R] QUESTION ON R!!!!!!!!!!!1
I think I asked you yesterday on the [EMAIL PROTECTED] account to read the posting guide *before* posting to R-help, and I asked you not to SHOUT!!!

Uwe Ligges

lecastil wrote:

Good day. I work at a public entity that handles millions of records on many variables, spread across several topics. For our statistical analyses we use a statistical package that lets us pull the data directly from an ORACLE database and run the analysis, all in a short time. We would like to know whether R can do the same thing as the package we currently work with; that is, can R import data from a database with millions of records at full speed and then run statistical analyses on the imported data? I would appreciate a prompt reply, as we urgently need this information.

Luis Eduardo Castillo Méndez.
[R] QUESTION ON R!!!!!!!!!!!1
Good day. I work at a public entity that handles millions of records on many variables, spread across several topics. For our statistical analyses we use a statistical package that lets us pull the data directly from an ORACLE database and run the analysis, all in a short time. We would like to know whether R can do the same thing as the package we currently work with; that is, can R import data from a database with millions of records at full speed and then run statistical analyses on the imported data? I would appreciate a prompt reply, as we urgently need this information.

Luis Eduardo Castillo Méndez.
[R] Question regarding QT device
Hi, After a few modifications to the makefiles, I successfully compiled the Qt device (written by Deepayan Sarkar) for OS X 10.4.9 on a Powerbook. However, when loading it into R:

If I remove this line from zzz.R in qtutils/R

grDevices::deviceIsInteractive(QT)

and then install, library(qtutils) loads fine and the QT() call returns a QT window. However, if I switch to another application and then switch back to the R GUI, the menubar has disappeared.

If I do not remove the line grDevices::deviceIsInteractive(QT), the following error appears and qtutils does not load:

Error : 'deviceIsInteractive' is not an exported object from 'namespace:grDevices'
Error : .onLoad failed in 'loadNamespace' for 'qtutils'
Error: package/namespace load failed for 'qtutils'

Could anyone provide some pointers to get that deviceIsInteractive to work? Thanks for your time, Saptarshi

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
Re: [R] Question regarding QT device
grDevices::deviceIsInteractive is only in the unreleased R-devel version of R: which version are you using? Please do study the R posting guide: we ask for basic information for a good reason, and we ask for questions on packages (especially unreleased packages) to be sent to the maintainer.

On Sun, 5 Aug 2007, Saptarshi Guha wrote:

Hi, After a few modifications to the makefiles, I successfully compiled the Qt device (written by Deepayan Sarkar) for OS X 10.4.9 on a Powerbook. [...] If I do not remove the line grDevices::deviceIsInteractive(QT), the following error appears and qtutils does not load: Error : 'deviceIsInteractive' is not an exported object from 'namespace:grDevices' [...] Could anyone provide some pointers to get that deviceIsInteractive to work? Thanks for your time, Saptarshi

-- Brian D. Ripley, [EMAIL PROTECTED], Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/, University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595, 1 South Parks Road, Oxford OX1 3TG, UK
Re: [R] Question regarding QT device
On 8/5/07, Saptarshi Guha [EMAIL PROTECTED] wrote:

Hi, After a few modifications to the makefiles, I successfully compiled the Qt device (written by Deepayan Sarkar) for OS X 10.4.9 on a Powerbook.

Cool, can you send me the modifications? I haven't managed to compile qtutils on OS X yet (not that I've tried too hard).

However, when loading it into R: if I remove the line grDevices::deviceIsInteractive(QT) from zzz.R in qtutils/R and then install, library(qtutils) loads fine and the QT() call returns a QT window. However, if I switch to another application and then switch back to the R GUI, the menubar has disappeared.

I can't test this, but Qt is designed to run as the main application, so it's possible that it is overriding whatever the GUI is doing.

If I do not remove the line grDevices::deviceIsInteractive(QT), the following error appears and qtutils does not load:

Error : 'deviceIsInteractive' is not an exported object from 'namespace:grDevices'
Error : .onLoad failed in 'loadNamespace' for 'qtutils'
Error: package/namespace load failed for 'qtutils'

Could anyone provide some pointers to get that deviceIsInteractive to work?

What's your R version? Do you see this in 2.5.1?

-Deepayan
[R] question about logistic models (AIC)
Dear fellow R users, I have a question about the Akaike Information Criterion in the R output. Normally you would want it to be as small as possible, yet when I check the information in my study books, the AIC is usually displayed as a negative number; in the exercises it is given as -AIC. R displays it as a positive number. Does this mean that a large AIC gives a small -AIC, so the bigger the better? Kind regards, Tom.

Tom Willems, CODA-CERVA-VAR, Department of Virology, Epizootic Diseases Section, Groeselenberg 99, B-1180 Ukkel. Tel.: +32 2 3790522, Fax: +32 2 3790666, E-mail: [EMAIL PROTECTED], www.var.fgov.be
Re: [R] question about logistic models (AIC)
Tom: That's a good question. AIC, as I'm sure you know, is usually calculated as 2k - 2ln(L), where k is the number of free parameters in the model and L is the likelihood of the model. The goal of AIC, and the reason one normally tries to select a model with minimal AIC, is to minimize the Kullback-Leibler distance between your data's density and the true density as defined by the model. More concisely, the AIC is an index of the amount of information about your data that is lost when your model is used to describe it. To get back to your question, I can't say without a little more information why the AICs you're referring to are negative (perhaps it's an issue of using the log-likelihood instead of the negative log-likelihood), but my first instinct is that it doesn't matter: I would go with the AIC that is closest to zero. I hope that helps.

Kyle H. Ambert, Graduate Student, Dept. Behavioral Neuroscience, Oregon Health & Science University, [EMAIL PROTECTED]

On Aug 3, 2007, at 8:41 AM, Tom Willems wrote:

Dear fellow R users, I have a question about the Akaike Information Criterion in the R output. Normally you would want it to be as small as possible, yet when I check the information in my study books, the AIC is usually displayed as a negative number; in the exercises it is given as -AIC. R displays it as a positive number. Does this mean that a large AIC gives a small -AIC, so the bigger the better? Kind regards, Tom.
Re: [R] question about logistic models (AIC)
Tom Willems <Tom.Willems at var.fgov.be> writes:

I have a question about the Akaike Information Criterion in the R output. Normally you would want it to be as small as possible, yet when I check the information in my study books, the AIC is usually displayed as a negative number. [...]

I don't know about your books, but I can confirm that smaller AIC is better. AIC is usually positive (the likelihood is between 0 and 1, therefore the log-likelihood is negative, therefore -2*log(L) + 2k is positive), although continuous probability densities or neglected normalization coefficients can lead to negative AICs; but smaller (more negative, if AIC < 0) is still better.

Ben Bolker
Re: [R] question about logistic models (AIC)
On 03-Aug-07 16:42:33, Kyle wrote:

Tom: That's a good question. AIC, as I'm sure you know, is usually calculated as 2k - 2ln(L), where k is the number of free parameters in the model and L is the likelihood of the model. [...] my first instinct is that it doesn't matter: I would go with the AIC that is closest to zero. I hope that helps.

That could be wrong! Don't forget that ln(L) is indeterminate to within an additive constant (for given data), so one man's negative AIC could be another man's positive AIC; but if each compared their AICs across different models (the same set of models for each), then they should get the same *difference* of AIC. The correct approach is to compare AICs on the basis of the algebraic difference: AIC1 - AIC2 when comparing Model1 with Model2. If this is positive, then Model2 is better than Model1 (on the AIC criterion). Taking the AIC that is closest to zero would be the wrong way round for negative AICs.

Ted.

On Aug 3, 2007, at 8:41 AM, Tom Willems wrote:

Dear fellow R users, I have a question about the Akaike Information Criterion in the R output. [...]

E-Mail: (Ted Harding) [EMAIL PROTECTED], Fax-to-email: +44 (0)870 094 0861. Date: 03-Aug-07, Time: 18:31:51 -- XFMail
Re: [R] question about logistic models (AIC)
Ah. Very good point. I stand corrected.

---Kyle.

[EMAIL PROTECTED] 08/03/07 10:32 AM:

That could be wrong! Don't forget that ln(L) is indeterminate to within an additive constant (for given data), so one man's negative AIC could be another man's positive AIC; but if each compared their AICs across different models (the same set of models for each), then they should get the same *difference* of AIC. The correct approach is to compare AICs on the basis of the algebraic difference: AIC1 - AIC2 when comparing Model1 with Model2. If this is positive, then Model2 is better than Model1 (on the AIC criterion). Taking the AIC that is closest to zero would be the wrong way round for negative AICs. [...]
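Ted's point, that only AIC *differences* are meaningful, can be checked numerically. A hedged sketch (simulated data; the models m0/m1 are hypothetical) that computes AIC by hand from the log-likelihood and compares two nested logistic models:

```r
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 + 1.2 * x))

m0 <- glm(y ~ 1, family = binomial)   # intercept-only model
m1 <- glm(y ~ x, family = binomial)   # includes the real predictor

# AIC by hand: -2 * log-likelihood + 2 * (number of free parameters)
ll <- logLik(m1)
aic_by_hand <- -2 * as.numeric(ll) + 2 * attr(ll, "df")
all.equal(aic_by_hand, AIC(m1))   # TRUE

# Only the difference matters: a positive AIC(m0) - AIC(m1) favours m1,
# regardless of whether the individual AICs are positive or negative.
delta <- AIC(m0) - AIC(m1)
```

Adding any constant to both log-likelihoods shifts both AICs by the same amount and leaves delta unchanged, which is exactly why "closest to zero" is not a usable rule.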
Re: [R] question on using gl1ce from lasso2 package
G'day Li,

On Wed, 25 Jul 2007 00:50:43 -0400, Li Li [EMAIL PROTECTED] wrote:

I tried several settings by using family=gaussian in gl1ce, but none of them works. [...]

gl1ce(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data=iris, family=gaussian())
Error in eval(expr, envir, enclos) : object 'etastart' not found

Does anyone have experience with this function? Any help will be appreciated,

Actually,

gl1ce(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data=iris, family=gaussian)

should work; there is no need to say `family=gaussian()'. However, omitting the brackets leads to an even more obscure error message, and using traceback() makes me believe that the function `family()' must have been changed since this code was ported from S-PLUS to R. The version with brackets does not seem to work because the function that tries to determine initial values for the iterative process of fitting a GLM tries to access an object called `etastart', which exists in the list of formal parameters of glm() but is not a formal parameter of gl1ce(). I am not sure whether this problem always existed or is also new due to changes in glm() and accompanying functions. (Time to re-work the whole lasso2 package, I guess.) A way to solve your problem is to issue the following commands:

etastart <- NULL
gl1ce(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data=iris, family=gaussian())

However, since you are using the gaussian family, why not use l1ce() directly? The following command leads to the same output:

l1ce(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data=iris, absolute=TRUE)

Hope this helps.

Cheers, Berwin

=== Full address: Berwin A Turlach, Dept of Statistics and Applied Probability, Faculty of Science, National University of Singapore, 6 Science Drive 2, Blk S16, Level 7, Singapore 117546. Tel.: +65 6515 4416 (secr), +65 6515 6650 (self). FAX: +65 6872 3919. e-mail: [EMAIL PROTECTED], http://www.stat.nus.edu.sg/~statba
[R] question on using gl1ce from lasso2 package
Hi, I tried several settings using family=gaussian in gl1ce, but none of them works, even in a case where glm works. Here is the error message I got:

glm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data=iris, family=gaussian())   # works
gl1ce(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data=iris, family=gaussian())
Error in eval(expr, envir, enclos) : object 'etastart' not found

Does anyone have experience with this function? Any help will be appreciated,

Li
[R] Question about cmdscale function
Hi. I know that matrices of distances between places work fine with cmdscale. However, what about matrices such as:

   A  B  C  D  E
A  0  1 23 12  9
B  1  0 10 12  3
C 23 10  0 23  4
D 12 12 23  0 21
E  9  3  4 21  0

i.e. matrices which do not represent physical distances between places (the values above would not make sense as real distances) but other statistics instead?

Thanks, Andy
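cmdscale() should accept such a matrix as long as it is a symmetric dissimilarity matrix with a zero diagonal; the entries need not be physical distances, and negative eigenvalues in the output flag non-Euclidean input. A hedged sketch with hypothetical values in the spirit of the question:

```r
# Hypothetical symmetric dissimilarity matrix (not real distances:
# e.g. A-C = 23 exceeds A-E + E-C = 13, violating the triangle inequality)
m <- matrix(c( 0,  1, 23, 12,  9,
               1,  0, 10, 12,  3,
              23, 10,  0, 23,  4,
              12, 12, 23,  0, 21,
               9,  3,  4, 21,  0),
            nrow = 5, byrow = TRUE,
            dimnames = list(LETTERS[1:5], LETTERS[1:5]))

fit <- cmdscale(m, k = 2, eig = TRUE)
fit$points   # low-dimensional configuration of the five objects
fit$eig      # negative eigenvalues indicate non-Euclidean dissimilarities
```

If the dissimilarities are strongly non-Euclidean, non-metric alternatives such as isoMDS() in package MASS may be more appropriate.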
[R] Question about genalg in R
Hi, I am looking at genalg as a starting point for a GA implementation in R. I have a couple of questions regarding this package and GA implementation in general. If you could provide some hints on these questions, I would really appreciate it. Thank you in advance.

1. I found two methods for performing optimization based on the type of chromosome encoding, i.e. binary and float. I'm wondering, if I have a representation that mixes the two, is there any other method or function that I can use? If there is none in genalg, is there such an implementation in another package?

2. This is a question on multiobjective optimization based on GA. I'm wondering if there is any implementation of it in R or a known package that I can use?

Thank you.

- adschai
[R] question about ar1 time series
Hello everybody, I recently wrote a program to generate AR1 time series; here is the code:

# By Jomopo. June 2007, Leioa, Vizcaya
# This program creates the AR1 synthetic series (one by one),
# where the mean is zero, but the variance of the AR1 series and
# the AR1 coefficient can be changed. If the variance of the AR1 series = 1, it is standardized!
# Final version for the AR1 time series program
# Mon Jul 16 11:58:03 CEST 2007  Checked again at the R prompt, and it's OK!

# Creating the synthetic AR1 series, where the white noise
# has mean = 0 and var = sigmaz_c = stand_dev^2, which can be any value;
# if sigmaz_c = 1 then this white noise is a Gaussian noise.
# |rho1| < 1 (rho1 is called alpha in other textbooks; in fact 0 < rho1 < 1) so that
# the system can be stationary.
# var_serie is the variance of the series.

cat("\n Hello, this is creat_AR1_synt_ser.R. \n These are the input parameters: synt_series(ar1_length, rho1, ...), where rho1 is the correlation coef.\n")

ar1 <- function(x, rho1, af) {
  return(x * rho1 + runif(1, -af, af))
}

# Spin-up for the AR1 series. For this case, 100 iterations are enough.
spinup <- function(x0, rho1, af) {
  xt <- x0
  for (i in 1:100) {
    xtp1 <- ar1(xt, rho1, af)
    xt <- xtp1
  }
  return(xt)
}

# ar1_length is the number of data points in the AR1 series,
# rho1 is the correlation coefficient,
# var_serie is the variance of the series.
synt_series <- function(ar1_length, rho1, var_serie) {
  if (var_serie <= 0 || rho1 <= -1 || rho1 >= 1)
    stop("The variance of the series should be > 0, and the rho1 parameter should be in (-1, 1) so that the series can be stationary; be careful with this, bye.\n")
  syntdata <- rep(0, ar1_length)
  # af = adjustment factor, so that the variance of the uniform random numbers
  # equals the variance of the white noise (check the manual of runif)
  af <- sqrt(3 * var_serie * (1 - rho1) * (1 + rho1))
  x0 <- runif(1, -af, af)
  syntdata[1] <- spinup(x0, rho1, af)
  for (i in 2:ar1_length) {
    syntdata[i] <- ar1(syntdata[i - 1], rho1, af)
  }
  return(syntdata)
}

I would like some suggestions and hints. Thanks a lot for your help!
-- Josué Mosés Polanco Martínez. Alternative e-mail: [EMAIL PROTECTED]. "It is a wasted day unless you have learned something new and made someone smile" - Mark Weingartz.
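One quick sanity check on such a generator (a hedged, self-contained sketch, independent of the code above): for a long AR(1) series with coefficient rho and uniform innovations scaled so the stationary variance is 1 (the same role the af factor plays above, since var(Uniform(-af, af)) = af^2/3), the sample variance should be near 1 and the lag-1 autocorrelation near rho. arima.sim() offers a built-in comparison, although with Gaussian rather than uniform innovations:

```r
set.seed(123)
rho <- 0.6
n <- 50000

# uniform innovations with variance (1 - rho^2), so the stationary
# variance of the AR(1) process is (1 - rho^2) / (1 - rho^2) ... = 1
af <- sqrt(3 * (1 - rho^2))
z <- numeric(n)
z[1] <- runif(1, -af, af)
for (t in 2:n) z[t] <- rho * z[t - 1] + runif(1, -af, af)

var(z)                        # near 1
acf(z, plot = FALSE)$acf[2]   # lag-1 autocorrelation, near rho = 0.6

# reference implementation from stats, with Gaussian innovations
z2 <- arima.sim(model = list(ar = rho), n = n, sd = sqrt(1 - rho^2))
```

This kind of check would also catch a mis-typed comparison operator in the stationarity guard, since rho outside (-1, 1) makes the sample variance explode rather than settle near 1.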
Re: [R] Question about acceptance rejection sampling - NOT an R question
Not a strict proof, but think of it this way: the likelihood of getting a particular value of x has two parts. First, x has to be generated from h; the likelihood of this happening is h(x). Second, the point has to be accepted, which happens with conditional probability f(x)/(c*h(x)). If we multiply, we get h(x) * f(x)/(c*h(x)), and the two h(x)'s cancel out, leaving the likelihood of getting x as f(x)/c. The /c just indicates that approximately a fraction 1 - 1/c of the points will be rejected and thrown out, and the final normalized distribution is f(x), which was the goal. Hope this helps,

-- Gregory (Greg) L. Snow Ph.D., Statistical Data Center, Intermountain Healthcare, [EMAIL PROTECTED], (801) 408-8111

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Leeds, Mark (IED). Sent: Friday, July 13, 2007 2:45 PM. To: r-help@stat.math.ethz.ch. Subject: [R] Question about acceptance rejection sampling - NOT an R question

This is not related to R, but I was hoping that someone could help me. I am reading the paper "Understanding the Metropolis-Hastings Algorithm" by Chib and Greenberg, The American Statistician, 1995, Vol. 49, No. 4. Right at the beginning they explain the algorithm for basic acceptance-rejection sampling, in which you want to simulate from a density f(x) but it's not easy, and you are able to generate from another density h(x). The assumption is that there exists some c such that f(x) <= c*h(x) for all x. They clearly explain the steps as follows:

1) generate Z from h(x);
2) generate u from a Uniform(0,1);
3) if u is less than or equal to f(Z)/(c*h(Z)), then return Z as the generated RV; otherwise reject it and try again.

I think that, since f(Z)/(c*h(Z)) is between 0 and 1, u can be compared with it. But I don't understand why the generated and accepted Z's have the same density as f(x)? If someone knows where there is a proof of this, or if it's reasonable to explain, please feel free to explain it. The authors evidently consider it too trivial, because they don't.
The reason I ask is because, if I don't understand this, then I definitely won't understand the rest of the paper, because it gets much more complicated. I am willing to track down the proof but I don't know where to look. Thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
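Greg's two-part argument is easy to check numerically. Below is a minimal accept-reject sketch in R; the target f, the envelope h and the constant cee are illustrative choices (a standard normal target sampled through a N(0, 2^2) envelope), not anything from the original paper.

```r
# Accept-reject sampling sketch.
# Target f = dnorm, envelope h = dnorm(sd = 2).  The constant must satisfy
# f(x) <= cee * h(x) for all x; here max_x f(x)/h(x) is attained at 0 and
# equals dnorm(0)/dnorm(0, sd = 2) = 2, so cee = 2 works.
set.seed(1)
f   <- function(x) dnorm(x)
h   <- function(x) dnorm(x, sd = 2)
cee <- 2

rejsamp <- function(n) {
  out <- numeric(0)
  while (length(out) < n) {
    Z <- rnorm(n, sd = 2)                        # step 1: generate Z from h
    u <- runif(n)                                # step 2: generate u ~ U(0,1)
    out <- c(out, Z[u <= f(Z) / (cee * h(Z))])   # step 3: keep accepted Z's
  }
  out[1:n]
}

x <- rejsamp(10000)
mean(x)   # close to 0
sd(x)     # close to 1
```

About half of the proposals are rejected here (acceptance probability 1/cee = 1/2), matching Greg's remark that roughly 1 - 1/c of the points get thrown out.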
[R] Question about Network library
Hello, I'm a bit new to R and am trying to plot a network using the network package. Based on the documentation, if I do the following steps, things work fine: test = c(0,1,1,1,0,0,0,1,0) m = matrix(test, 3) g = network(m) plot(g) However, if I try to read in the above nine numbers (space-delimited) from a file into a table using read.table(), things don't work: test = read.table("test.dat", header=FALSE) m = matrix(test, 3) g = network(m) # doesn't work - g is not created. Can someone please offer a suggestion about what to do? I'm guessing that I maybe shouldn't be using read.table() for this, or maybe I need to do something to convert the table into something compatible with matrix() and/or network()? Thank you, Joe
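A likely fix (a sketch, assuming test.dat simply holds the nine space-delimited numbers): read.table() returns a data frame, not a numeric vector, so matrix() does not produce the numeric matrix that network() expects. scan() reads the raw numbers directly:

```r
library(network)   # assumes the network package is installed

# scan() returns a plain numeric vector, unlike read.table(), which
# returns a data frame that matrix()/network() cannot use directly.
test <- scan("test.dat")
m <- matrix(test, 3)
g <- network(m)
plot(g)

# Alternatively, flatten the data frame that read.table() produced:
# m <- matrix(unlist(read.table("test.dat", header = FALSE)), 3)
```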
Re: [R] question about ar1 time series
On 2007-07-16 at 09:50, Josué Polanco wrote: Hello everybody, I recently wrote a program to generate AR1 time series; here is the code:

# By Jomopo. June 2007, Leioa, Vizcaya
# This program creates the AR1 synthetic series (one by one),
# where the mean is zero, but the variance of the AR1 series and
# the AR1 coefficient can be changed. If var of the AR1 series = 1, it is standardized!
# Final version for the AR1 time series program
# Mon Jul 16 11:58:03 CEST 2007. Checked again at the R prompt, and it's OK!
# Creating the synthetic AR1 series... where the white noise
# has mean = 0, and var = sigmaz_c = stand_dev^2 can be any value;
# if sigmaz_c = 1 then this white noise is a Gaussian noise.
# rho1 (or alpha in other textbooks ;-)) must satisfy |rho1| < 1 (in fact -1 < rho1 < 1)
# so that the system can be stationary.
# Where var_serie is the variance of the series

cat("\n Hello, this is creat_AR1_synt_ser.R. \n These are the input parameters: synt_series(ar1_length, rho1, ...), where rho1 is the correlat. coef.\n")

ar1 <- function(x, rho1, af) {
  return(x*rho1 + runif(1, -af, af))
}

# Spin-up for the AR1 series. For this case this amount is enough
spinup <- function(x0, rho1, af) {
  xt <- x0
  for (i in 1:100) {
    xtp1 <- ar1(xt, rho1, af)
    xt <- xtp1
  }
  return(xt)
}

# Wherein ar1_length is the number of data in the AR1 series,
# rho1 is the correlation coef., and sigmaz_c is optional
synt_series <- function(ar1_length, rho1, var_serie) {
  if (var_serie <= 0 || rho1 <= -1 || rho1 >= 1)
    stop("The variance of the series should be > 0, and the rho1 parameter should be between (-1, 1) so that the series can be stationary; be careful with this, bye. \n")
  syntdata <- rep(0, ar1_length)
  # af = adjustment factor, i.e. so that the var of the uniform random numbers
  # equals the var of the white noise (check the manual of runif)
  af <- sqrt(3 * var_serie * (1 - rho1) * (1 + rho1))
  x0 <- runif(1, -af, af)
  syntdata[1] <- spinup(x0, rho1, af)
  for (i in 2:ar1_length) {
    syntdata[i] <- ar1(syntdata[i - 1], rho1, af)
  }
  return(syntdata)
}

I would like some suggestions and hints.
Here's one: look at arima.sim() and ease your life a lot. ;-) Thanks a lot for your help! -- Josué Mosés Polanco Martínez Correo-e alternativo [EMAIL PROTECTED] "It is a wasted day unless you have learned something new and made someone smile" -Mark Weingartz. --- Vincent Goulet, Associate Professor École d'actuariat Université Laval, Québec [EMAIL PROTECTED] http://vgoulet.act.ulaval.ca
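For comparison, the arima.sim() route Vincent suggests might look like the sketch below (parameter names mirror the original script; note that arima.sim() uses Gaussian innovations by default, and the innovation s.d. is scaled so the stationary series variance equals var_serie, since for an AR(1) var = sigma^2/(1 - rho1^2)):

```r
set.seed(42)
rho1      <- 0.7
var_serie <- 1
n         <- 1000

# innovation s.d. giving a stationary AR(1) with variance var_serie
sd_innov <- sqrt(var_serie * (1 - rho1^2))

x <- arima.sim(model = list(ar = rho1), n = n, sd = sd_innov)

var(x)                        # close to var_serie
acf(x, plot = FALSE)$acf[2]   # close to rho1
```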
[R] Question about acceptance rejection sampling - NOT R question
This is not related to R but I was hoping that someone could help me. I am reading the Understanding the Metropolis-Hastings Algorithm paper from The American Statistician by Chib and Greenberg, 1995, Vol 49, No 4. Right at the beginning they explain the algorithm for basic acceptance-rejection sampling, in which you want to simulate from a density f(x) but it's not easy, and you are able to generate from another density called h(x). The assumption is that there exists some c such that f(x) <= c*h(x) for all x. They clearly explain the steps as follows: 1) generate Z from h(x); 2) generate u from a Uniform(0,1); 3) if u is less than or equal to f(Z)/(c*h(Z)), then return Z as the generated RV; otherwise reject it and try again. I think that, since f(Z)/(c*h(Z)) lies in (0,1), u can be compared with f(Z)/(c*h(Z)). But I don't understand why the generated and accepted Z's have the same density as f(x)? Does someone know where there is a proof of this, or if it's reasonable to explain, please feel free to explain it. The authors apparently consider it too trivial to include, because they don't. The reason I ask is because, if I don't understand this, then I definitely won't understand the rest of the paper, because it gets much more complicated. I am willing to track down the proof but I don't know where to look. Thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}
Re: [R] Question about acceptance rejection sampling - NOT R question
On Fri, 13 Jul 2007, Leeds, Mark (IED) wrote: This is not related to R but I was hoping that someone could help me. I am reading the Understanding the Metropolis-Hastings Algorithm paper from The American Statistician by Chib and Greenberg, 1995, Vol 49, No 4. Right at the beginning they explain the algorithm for basic acceptance-rejection sampling, in which you want to simulate from a density f(x) but it's not easy, and you are able to generate from another density called h(x). The assumption is that there exists some c such that f(x) <= c*h(x) for all x. They clearly explain the steps as follows: 1) generate Z from h(x); 2) generate u from a Uniform(0,1); 3) if u is less than or equal to f(Z)/(c*h(Z)), then return Z as the generated RV; otherwise reject it and try again. I think that, since f(Z)/(c*h(Z)) lies in (0,1), u can be compared with f(Z)/(c*h(Z)). But I don't understand why the generated and accepted Z's have the same density as f(x)? Does someone know where there is a proof of this, or if it's reasonable to explain, please feel free to explain it.

The original reference to J. von Neumann's work, which no doubt has a proof, is in http://en.wikipedia.org/wiki/Rejection_sampling along with a suggestive graph. Here is another graph that may give your intuition a boost:

f <- function(x) dnorm(x)/2 + dnorm(x, sd = 0.1)/2
h <- dnorm
Z <- rnorm(100)
u <- runif(100)
cee <- f(0)/h(0)   # as is obvious
u.lt.f.over.ch <- u < f(Z)/cee/h(Z)
cutpts <- seq(-5, 5, by = 0.1)
midpts <- head(cutpts, n = -1)/2 + tail(cutpts, n = -1)/2
tab <- table(!u.lt.f.over.ch, cut(Z, cutpts))
bp <- barplot(tab)
lines(bp, f(midpts) * sum(u.lt.f.over.ch) * 0.1, col = 'red', lwd = 2)

Of course, there is some discreteness in this plot. If you have a 1280x1024 or wider screen or want to zoom in on a pdf(), you might try 0.025 or even 0.0125 in place of 0.1. Turn this graph on its side and think about what u.lt.f.over.ch is doing conditionally on Z, i.e. look at one bar and think about why it is split where it is.
HTH, Chuck

The authors apparently consider it too trivial to include, because they don't. The reason I ask is because, if I don't understand this, then I definitely won't understand the rest of the paper, because it gets much more complicated. I am willing to track down the proof but I don't know where to look. Thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
[R] question about gamm models
Dear R-Users, I have a question concerning mixed models in R. The strata of my model are the counties of Germany. The differences between these counties should be modelled as realizations of a normally distributed random variable X. Moreover, the model contains a 0/1 variable A that enters as a fixed effect. The only special feature that should additionally be in the model is the following: in a usual mixed model the (constant) variance of X will be chosen in an optimal way, but I want to fit 2 constant variances, one for the subset {A=0} and the other one for the subset {A=1}. Nevertheless, it should be one and the same random variable X. I know that it is possible to fit a model with two independent random variables X1 and X2 for the subsets {A=0} and {A=1} respectively. But I want it to be the same! Equivalently, the correlation between X1 and X2 should be 1. Can anybody help me in this respect? Yours sincerely Matthias an der Heiden
Re: [R] Question for svm function in e1071
thank you. This is definitely doable. - adschai

-----Original Message----- From: David Meyer Date: Friday, July 6, 2007 6:34 am Subject: Re: [R] Question for svm function in e1071 To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch

Adschai: The function is written in C++, so debugging the source code of the R svm() function will not really help. What you can do to make the C code more verbose is the following:
- get the sources of e1071
- in the src/ directory, look up the svm.cpp file
- in line 37, there is: #if 0 void info(char *fmt,...) [...] replace the first line by: #if 1
- build and install the package again.
Best David

Sorry that I have many questions today. I am using the svm function on about 180,000 points of training set. It takes a very long time to run. However, I would like it to spit out something to make sure that the run is not dead in between. Would you please suggest any way to do so?
Re: [R] Question for svm function in e1071
Adschai: The function is written in C++, so debugging the source code of the R svm() function will not really help. What you can do to make the C code more verbose is the following:
- get the sources of e1071
- in the src/ directory, look up the svm.cpp file
- in line 37, there is: #if 0 void info(char *fmt,...) [...] replace the first line by: #if 1
- build and install the package again.
Best David

Sorry that I have many questions today. I am using the svm function on about 180,000 points of training set. It takes a very long time to run. However, I would like it to spit out something to make sure that the run is not dead in between. Would you please suggest any way to do so?
Re: [R] Question for svm function in e1071
adschai: this isn't particularly helpful, but when I am using a function from a package called xxx that I have little knowledge about, I take the source as is and create my own function out of it, called my.xxx, and then put print statements inside it to see what's going on. This is probably an extremely kludgy way of dealing with that problem, but it is a way. I'm not R expert enough to know any other way, but I would be interested if you get a better private reply.

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Thursday, July 05, 2007 12:36 AM To: r-help@stat.math.ethz.ch Subject: [R] Question for svm function in e1071

Hi, Sorry that I have many questions today. I am using the svm function on about 180,000 points of training set. It takes a very long time to run. However, I would like it to spit out something to make sure that the run is not dead in between. Would you please suggest any way to do so? And is there any way to speed up the performance of this svm function? Thank you. - adschai

This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}
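Mark's wrap-and-print kludge, sketched below; my.svm and the messages are illustrative names, not part of e1071. For svm() itself this only brackets the whole call (per-iteration progress needs the svm.cpp edit described elsewhere in this thread), but it at least shows whether the run has returned.

```r
library(e1071)

# Wrap the packaged function instead of editing its source:
# print timestamps around the call so a long run is visibly alive/finished.
my.svm <- function(...) {
  cat("svm: started at", format(Sys.time()), "\n")
  fit <- svm(...)
  cat("svm: finished at", format(Sys.time()), "\n")
  fit
}

data(iris)
model <- my.svm(Species ~ ., data = iris)
```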
Re: [R] Question about framework to weighting different classes in SVM
Adschai: here is an example for class.weights (isn't it on the help page?):

library(e1071)
data(iris)
i2 <- iris
levels(i2$Species)[3] <- "versicolor"
summary(i2$Species)
wts <- 100 / table(i2$Species)
wts
m <- svm(Species ~ ., data = i2, class.weights = wts)

Cheers, David
[R] Question on Rmpi looping
Dear R list, In the course of learning to work with Rmpi, we are confused about a few points. The following simple program is based on some examples we retrieved from the web. Each slave is writing the same output line multiple times (a multiple equal to the number of slaves). In other words, the write statements are being executed a number of times equal to the number of slaves. I am trying to print out the slave number to a file (once), but it is printing it 4 times (since the number of slaves is 4). The code is as follows:

# Initialize the Rmpi environment
library(Rmpi)
# We are spawning four slaves
mpi.spawn.Rslaves(nslaves=4)
if (mpi.comm.size() != 5) {
    print("Please initialize an MPI cluster of at least 5 processors.")
    print("Then, try again")
    mpi.quit()
}
.Last <- function() {
    if (is.loaded("mpi_initialize")) {
        if (mpi.comm.size(1) > 0) {
            print("Please use mpi.close.Rslaves() to close slaves.")
            mpi.close.Rslaves()
        }
        print("Please use mpi.quit() to quit R")
        .Call("mpi_finalize")
    }
}
# Define the base directory
basedir <- "/home/user/directory/"
fcnfishtest <- function() {
    wrout <- paste(basedir, paste("processor", my_rank, sep=""), sep="")
    write(my_rank, wrout, append=TRUE)
}
## END OF SLAVES ##
# We're in the parent.
# Have each slave get its rank
mpi.bcast.cmd(my_rank <- mpi.comm.rank())
mpi.bcast.Robj2slave(basedir)
# Send the function to the slaves
mpi.bcast.Robj2slave(fcnfishtest)
# Call the function
x <- mpi.remote.exec(fcnfishtest())
x
# close slaves and exit
mpi.close.Rslaves()
mpi.quit(save = "no")
# end code

The output is as follows: file 1: 1 1 1 1; file 2: 2 2 2 2; file 3: 3 3 3 3; file 4: 4 4 4 4. We would like to call four slaves with output files like: file 1: 1; file 2: 2; file 3: 3; file 4: 4. Any help would be greatly appreciated. Thank you!
Kathy Gerber University of Virginia - ITC - Research Computing Support [EMAIL PROTECTED] 434-982-4986
[R] Question about framework to weighting different classes in SVM
Hi gurus, I have a doubt about multiclass classification SVM. The population in my data includes a couple of class labels that have a relatively small proportion of the entire population compared to other classes. I would like SVM to pay more attention to these classes. However, the question I am having here is: is there any systematic/theoretic framework to determine the weights for each class? My second question is directly related to R. I would like to use the class.weights attribute in the svm function. However, I'm a bit confused about how to use it from the description I got from ?svm. Below is the quote. 'a named vector of weights for the different classes, used for asymetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named.' Does the name of the vector have to match the levels in my factor used as target labels for my classification? Any simple example would be really appreciated. Thank you! - adschai
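On the second question: yes, the names of the weight vector correspond to the levels of the response factor. A minimal illustration using the iris data (the weight values here are arbitrary, for demonstration only):

```r
library(e1071)
data(iris)

# names(wts) must be (a subset of) levels(iris$Species);
# the heavier weight on versicolor makes errors on that class costlier.
wts <- c(setosa = 1, versicolor = 5, virginica = 1)
fit <- svm(Species ~ ., data = iris, class.weights = wts)
```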
[R] Question for svm function in e1071
Hi, Sorry that I have many questions today. I am using the svm function on about 180,000 points of training set. It takes a very long time to run. However, I would like it to spit out something to make sure that the run is not dead in between. Would you please suggest any way to do so? And is there any way to speed up the performance of this svm function? Thank you. - adschai
[R] Statistics Question not R question: competing risks and non-informative censoring
All, I am working with Emergency Department (ED) Length of Stay data. The ED visit can end in one of a variety of ways (admit, discharge, transfer, etc.). Initially, I have modeled the time to event by fitting a survival model to the time to the outcome of interest and treating all other outcomes as censoring. However, I recently came across the cmprsk package in R, which seems to be developed specifically for competing risks survival models. I have been able to gather that the key issue in deciding whether to model each outcome separately or to use something like the cmprsk package is to determine whether or not the censoring is non-informative. I have read up a little about informative vs. non-informative censoring, but remain confused. Is there a standard approach to determining whether or not censoring is informative? Can anyone suggest some good (approachable/non-technical) references on the subject? best, Spencer
[R] Question about PCA with prcomp
Hello All, The basic premise of what I want to do is the following: I have 20 entities for which I have ~500 measurements each. So, I have a matrix of 20 rows by ~500 columns. The 20 entities fall into two classes: good and bad. I eventually would like to derive a model that would then be able to classify new entities as being in good territory or bad territory based upon my existing data set. I know that not all ~500 measurements are meaningful, so I thought the best place to begin would be to do a PCA in order to reduce the amount of data with which I have to work. I did this using the prcomp function and found that nearly 90% of the variance in the data is explained by PC1 and 2. So far, so good. I would now like to find out which of the original ~500 measurements contribute to PC1 and 2 and by how much. Any tips would be greatly appreciated! And apologies in advance if this turns out to be an idiotic question. james
Re: [R] Question about PCA with prcomp
Hi James, Have a look at Cadima et al.'s subselect package [Cadima worked with/was a student of Prof Jolliffe, one of _the_ experts on PCA; Jolliffe devotes part of a chapter to this question in his text (Principal Component Analysis, pub. Springer)]. Then you should look at psychometric stuff: a good place to start would be Professor Revelle's psych package. Best, Mark. James R. Graham wrote: Hello All, The basic premise of what I want to do is the following: I have 20 entities for which I have ~500 measurements each. So, I have a matrix of 20 rows by ~500 columns. The 20 entities fall into two classes: good and bad. I eventually would like to derive a model that would then be able to classify new entities as being in good territory or bad territory based upon my existing data set. I know that not all ~500 measurements are meaningful, so I thought the best place to begin would be to do a PCA in order to reduce the amount of data with which I have to work. I did this using the prcomp function and found that nearly 90% of the variance in the data is explained by PC1 and 2. So far, so good. I would now like to find out which of the original ~500 measurements contribute to PC1 and 2 and by how much. Any tips would be greatly appreciated! And apologies in advance if this turns out to be an idiotic question. james
Re: [R] Question about PCA with prcomp
Mark, What you are referring to deals with the selection of covariates, since PC doesn't do dimensionality reduction in the sense of covariate selection. But what Mark is asking for is to identify how much each data point contributes to individual PCs. I don't think that Mark's query makes much sense, unless he meant to ask: which individuals have high/low scores on PC1/PC2. Here are some comments that may be tangentially related to Mark's question: 1. If one is worried about a few data points contributing heavily to the estimation of PCs, then one can use robust PCA, for example, using robust covariance matrices. MASS has some tools for this. 2. The biplot for the first 2 PCs can give some insights 3. PCs, especially, the last few PCs, can be used to identify outliers. Hope this is helpful, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Difford Sent: Monday, July 02, 2007 1:55 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Question about PCA with prcomp Hi James, Have a look at Cadima et al.'s subselect package [Cadima worked with/was a student of Prof Jolliffe, one of _the_ experts on PCA; Jolliffe devotes part of a Chapter to this question in his text (Principal Component Analysis, pub. Springer)]. Then you should look at psychometric stuff: a good place to start would be Professor Revelle's psych package. BestR, Mark. James R. Graham wrote: Hello All, The basic premise of what I want to do is the following: I have 20 entities for which I have ~500 measurements each. So, I have a matrix of 20 rows by ~500 columns. The 20 entities fall into two classes: good and bad. 
I eventually would like to derive a model that would then be able to classify new entities as being in good territory or bad territory based upon my existing data set. I know that not all ~500 measurements are meaningful, so I thought the best place to begin would be to do a PCA in order to reduce the amount of data with which I have to work. I did this using the prcomp function and found that nearly 90% of the variance in the data is explained by PC1 and 2. So far, so good. I would now like to find out which of the original ~500 measurements contribute to PC1 and 2 and by how much. Any tips would be greatly appreciated! And apologies in advance if this turns out to be an idiotic question. james
Re: [R] Question about PCA with prcomp
On Mon, 02-Jul-2007 at 03:16PM -0400, Ravi Varadhan wrote: | Mark, | | What you are referring to deals with the selection of covariates, since PC | doesn't do dimensionality reduction in the sense of covariate selection. | But what Mark is asking for is to identify how much each data point | contributes to individual PCs. I don't think that Mark's query makes much | sense, unless he meant to ask: which individuals have high/low scores on | PC1/PC2. Here are some comments that may be tangentially related to Mark's | question: | | 1. If one is worried about a few data points contributing heavily to the | estimation of PCs, then one can use robust PCA, for example, using robust | covariance matrices. MASS has some tools for this. | 2. The biplot for the first 2 PCs can give some insights | 3. PCs, especially, the last few PCs, can be used to identify outliers. What is meant by last few PCs? -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_Middle minds discuss events (:_~*~_:)Small minds discuss people (_)-(_) . Anon ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
Re: [R] Question about PCA with prcomp
The PCs that are associated with the smaller eigenvalues. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: Patrick Connolly [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 4:23 PM To: Ravi Varadhan Cc: 'Mark Difford'; r-help@stat.math.ethz.ch Subject: Re: [R] Question about PCA with prcomp On Mon, 02-Jul-2007 at 03:16PM -0400, Ravi Varadhan wrote: | Mark, | | What you are referring to deals with the selection of covariates, since PC | doesn't do dimensionality reduction in the sense of covariate selection. | But what Mark is asking for is to identify how much each data point | contributes to individual PCs. I don't think that Mark's query makes much | sense, unless he meant to ask: which individuals have high/low scores on | PC1/PC2. Here are some comments that may be tangentially related to Mark's | question: | | 1. If one is worried about a few data points contributing heavily to the | estimation of PCs, then one can use robust PCA, for example, using robust | covariance matrices. MASS has some tools for this. | 2. The biplot for the first 2 PCs can give some insights | 3. PCs, especially, the last few PCs, can be used to identify outliers. What is meant by last few PCs? -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_Middle minds discuss events (:_~*~_:)Small minds discuss people (_)-(_) . Anon ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
Re: [R] Question about PCA with prcomp
Hi James, Ravi: James wrote: ... I have 20 entities for which I have ~500 measurements each. So, I have a matrix of 20 rows by ~500 columns. ... Perhaps I misread James' question, but I don't think so. As James described it, we have ~500 measurements made on 20 objects. A PCA on this [20 rows/observations by ~500 columns/descriptors/variables] should return ~500 eigenvalues. And each of these columns/descriptors/variables will have a loading on each PC. James wants to reduce his descriptors/measurements/variables to the most important (variance). A primitive way of doing this would be to examine the loadings on the first 2--3 PCs and choose those columns/descriptors/variables with the highest loadings, and throw away the rest. [He has already decided that he can throw away all but the first two PCs.] In fact, it would be a very good idea to do a coinertia analysis on the pre- and post-selected sets, and look at the RV value. If this is above [thumbsuck] 0.9, then you're doing very well (there's a good plot method for this in ade4, cf. coinertia). But see Cadima et al. (+refs for caution; and elsewhere) for more sophisticated methods of subsetting. Regards, Mark. Ravi Varadhan wrote: Mark, What you are referring to deals with the selection of covariates, since PC doesn't do dimensionality reduction in the sense of covariate selection. But what Mark is asking for is to identify how much each data point contributes to individual PCs. I don't think that Mark's query makes much sense, unless he meant to ask: which individuals have high/low scores on PC1/PC2. Here are some comments that may be tangentially related to Mark's question: 1. If one is worried about a few data points contributing heavily to the estimation of PCs, then one can use robust PCA, for example, using robust covariance matrices. MASS has some tools for this. 2. The biplot for the first 2 PCs can give some insights 3. PCs, especially, the last few PCs, can be used to identify outliers.
Hope this is helpful, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Difford Sent: Monday, July 02, 2007 1:55 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Question about PCA with prcomp Hi James, Have a look at Cadima et al.'s subselect package [Cadima worked with/was a student of Prof Jolliffe, one of _the_ experts on PCA; Jolliffe devotes part of a Chapter to this question in his text (Principal Component Analysis, pub. Springer)]. Then you should look at psychometric stuff: a good place to start would be Professor Revelle's psych package. BestR, Mark. James R. Graham wrote: Hello All, The basic premise of what I want to do is the following: I have 20 entities for which I have ~500 measurements each. So, I have a matrix of 20 rows by ~500 columns. The 20 entities fall into two classes: good and bad. I eventually would like to derive a model that would then be able to classify new entities as being in good territory or bad territory based upon my existing data set. I know that not all ~500 measurements are meaningful, so I thought the best place to begin would be to do a PCA in order to reduce the amount of data with which I have to work. I did this using the prcomp function and found that nearly 90% of the variance in the data is explained by PC1 and 2. So far, so good. I would now like to find out which of the original ~500 measurements contribute to PC1 and 2 and by how much. Any tips would be greatly appreciated! And apologies in advance if this turns out to be an idiotic question. 
james

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
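The loading-inspection step Mark describes can be sketched with prcomp, whose rotation component holds the loadings of the original variables on each PC. The data below are simulated (a 20 x 50 matrix as a stand-in for James's 20 x ~500 data); variable counts and names are illustrative only:

```r
set.seed(1)
m <- matrix(rnorm(20 * 50), nrow = 20)      # 20 entities, 50 measurements
pc <- prcomp(m, scale. = TRUE)
# pc$rotation holds the loading of each original variable on each PC
head(pc$rotation[, 1:2])
# indices of the variables with the largest absolute loadings on PC1
head(order(abs(pc$rotation[, 1]), decreasing = TRUE))
```

Selecting the top-loading columns this way is exactly the "primitive" subsetting described above; the subselect package offers more principled alternatives.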
Re: [R] Question about PCA with prcomp
...but with 500 variables and only 20 'entities' (observations) you will have 481 PCs with dead zero eigenvalues. How small is 'smaller' and how many is a few? Everyone who has responded to this seems to accept the idea that PCA is the way to go here, but that is not clear to me at all. There is a 2-sample structure in the 20 observations that you have. If you simply ignore that in doing your PCA you are making strong assumptions about sampling that would seem to me unlikely to be met. If you allow for the structure and project orthogonal to it then you are probably throwing the baby out with the bathwater - you want to choose variables which maximise separation between the 2 samples (and now you are up to 482 zero principal variances, if that matters...). I think this problem probably needs a bit of a re-think. Some variant on singular LDA, for example, may be a more useful way to think about it. Bill Venables. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan Sent: Monday, 2 July 2007 1:29 PM To: 'Patrick Connolly' Cc: r-help@stat.math.ethz.ch; 'Mark Difford' Subject: Re: [R] Question about PCA with prcomp The PCs that are associated with the smaller eigenvalues. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: Patrick Connolly [mailto:[EMAIL PROTECTED] Sent: Monday, July 02, 2007 4:23 PM To: Ravi Varadhan Cc: 'Mark Difford'; r-help@stat.math.ethz.ch Subject: Re: [R] Question about PCA with prcomp On Mon, 02-Jul-2007 at 03:16PM -0400, Ravi Varadhan wrote: | Mark, | | What you are referring to deals with the selection of covariates, | since PC | doesn't do dimensionality reduction in the sense of covariate selection. 
| But what Mark is asking for is to identify how much each data point | contributes to individual PCs. I don't think that Mark's query makes much | sense, unless he meant to ask: which individuals have high/low scores | on PC1/PC2. Here are some comments that may be tangentially related | to Mark's | question: | | 1. If one is worried about a few data points contributing heavily to | the estimation of PCs, then one can use robust PCA, for example, | using robust covariance matrices. MASS has some tools for this. | 2. The biplot for the first 2 PCs can give some insights 3. PCs, | especially, the last few PCs, can be used to identify outliers. What is meant by last few PCs? -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_Middle minds discuss events (:_~*~_:)Small minds discuss people (_)-(_) . Anon ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
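Bill's point about dead-zero eigenvalues follows from rank: with n observations on p > n variables, a (centered) PCA can yield at most n - 1 non-zero principal variances. A quick simulated check (dimensions chosen to match the thread's 20 x 500 example):

```r
set.seed(1)
m <- matrix(rnorm(20 * 500), nrow = 20)   # 20 observations, 500 variables
pc <- prcomp(m)
# centering the 20 rows leaves rank 19, so 19 non-zero principal variances
sum(pc$sdev > 1e-8)
```

That is where the "481 PCs with dead zero eigenvalues" figure comes from: 500 variables minus the 19 estimable directions.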
Re: [R] Question about PCA with prcomp
To all ..., Bill's lateral wisdom is almost certainly a better solution. So thanks for the advice (and everything else that went before it [Bill: apropos of termplot, what happened to tplot ?]). And I will [almost] desist from asking the obvious: and if there were 10 000 observations ? BestR, Mark.

Bill.Venables wrote: ...but with 500 variables and only 20 'entities' (observations) you will have 481 PCs with dead zero eigenvalues. ...
[R] Evaluating predictive power with no intercept-statistics question - not R question
I realize that the following has been talked about on this list many times before in some related way, but I am going to ask for help anyway because I still don't know what to do. Suppose I have no-intercept models such as the following:

Y = B*X_1 + error
Y = B*X_2 + error
Y = B*X_3 + error
Y = B*X_4 + error

and I run regressions on each (over the same sample of Y), and now I want to evaluate which X has the greatest predictive power. I'm fairly certain that R squared is not applicable because of the lack of an intercept, but I was wondering what is? Any references to this particular problem or suggestions are appreciated. I honestly believe that including an intercept is incorrect for my particular problem. Maybe I could put all the X's in one regression and use some kind of top-down selection or stepAIC algorithm, for example? Thanks. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}
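One way to compare such fits is via a criterion that does not depend on the intercept baseline, such as the residual standard error or AIC (note that summary.lm reports a differently-defined R-squared when the intercept is suppressed). A sketch on simulated data, not a full answer to the statistics question:

```r
set.seed(1)
x1 <- rnorm(100); x2 <- rnorm(100)
y <- 2 * x1 + rnorm(100)
fit1 <- lm(y ~ 0 + x1)    # "~ 0 + x" suppresses the intercept
fit2 <- lm(y ~ 0 + x2)
# residual standard error and AIC are comparable across the two fits
c(sigma1 = summary(fit1)$sigma, sigma2 = summary(fit2)$sigma)
AIC(fit1, fit2)
```

Here the model using x1 should show the smaller residual standard error and lower AIC, since y was generated from x1.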
[R] question
Hi, I have two trees and I want to compute the error of each tree and then compare them with t.test. I can't work out how to code this problem. Could you help me? Regards
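One possible reading of the question, sketched below: if both trees are evaluated on the same observations, collect a per-observation error from each model and run a paired t-test. The error vectors here are simulated placeholders, not output from any real tree fit:

```r
set.seed(42)
# hypothetical per-observation prediction errors from two tree models,
# both evaluated on the same 30 test observations
err_tree1 <- abs(rnorm(30, mean = 1.0))
err_tree2 <- abs(rnorm(30, mean = 1.2))
# paired = TRUE because the two error vectors come from the same cases
t.test(err_tree1, err_tree2, paired = TRUE)
```

With real trees (e.g. from rpart), the error vectors would come from comparing predict() output against the observed responses.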
[R] Question on package building
Hi dear R-Help members: When I was building my R package, using

$ R CMD build --binary pkg

it gave the following error:

** libs Error: package 'methods' was built for x86_64-unknown-linux-gnu Execution halted

The information about $R_HOME/library/methods/libs/methods.so is:

$ file methods.so
methods.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), not stripped

My machine is:

$ uname -a
Linux whinev 2.4.21-37.EL #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux

and I used the options below to build the R source:

../configure --prefix=/home/whinev/R --enable-R-shlib CC="gcc -m32" CXXFLAGS="-m32 -O2 -g" FFLAGS="-m32 -O2 -g" FCFLAGS="-m32 -O2 -g" LDFLAGS="-L/usr/local/lib" LIBnn=lib --with-x=no

Would you please tell me what the matter is? Thank you all! Whinev
[R] Question about lmer
Hello, We have a problem with function lmer. This is our code:

Get_values <- function(ff_count, fixed_factors, rf_count, random_factors, y_values) {
  SA <- matrix(as.array(c(fixed_factors, random_factors)), ncol=3)
  data <- as.data.frame(SA)
  y <- as.array(y_values)
  dd <- data.frame(SA)
  for(i in 1:(ff_count+rf_count)){
    dd[,i] <- as.factor(data[,i])
  }
  fit_full <- lmer(y ~ dd[,1]+dd[,2]+(1|dd[,3]), method="ML")
  fit_full
}
A <- c(0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1)
B <- c(0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1)
C <- c(0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1)
Y <- c(5,3,4,1,1,2,6,1,5,3,7,1,2,3,1,1,5,3,4,1,1,2,6,1,5,3,7,1,2,3,1,1)
r <- Get_values(2, c(A,B), 1, c(C), Y)
r

R output: Error in inherits(x, "factor") : object "dd" not found

Can this function work with a random array? Because this code is working:

D <- as.factor(data[,3])
fit_full <- lmer(y ~ dd[,1]+dd[,2]+(1|D), method="ML")

-- Truly yours, Julia mailto:[EMAIL PROTECTED]
Re: [R] Question about lmer
On 6/18/07, Julia Proudnikova [EMAIL PROTECTED] wrote: Hello, We have a problem with function lmer. ... Can this function work with a random array? ...

The full explanation of why lmer fails to find dd has to do with the way names are resolved in a call to model.frame. However, there may be a way to solve your problem by redesigning your function so you don't need to worry about what model.frame does. Why not pass the data as a data frame and pass the names of the fixed factors, random factors and response variable as character strings? Your current design of creating a matrix, then converting it to a data frame, then converting numeric variables back to factors is a bit convoluted. If you knew that you were only going to have one random factor you could generate the formula as

substitute(y ~ ff + (1|rf),
           list(y = as.name(y_name),
                ff = parse(text = paste(ff_names, collapse = "+"))[[1]],
                rf = as.name(rf_name)))

It gets a bit trickier with multiple random factors. Having said all this, it does appear that the call to model.frame inside lmer is getting the wrong environment from the formula and I will correct that. If you need more detail about the redesign I am suggesting, feel free to contact me off-list.
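Building the formula from character strings, as suggested above, can also be done with base R's paste()/as.formula(); a minimal sketch (the variable names y_name, ff_names, rf_name are illustrative, matching the redesign idea rather than any lme4 API):

```r
y_name   <- "Y"
ff_names <- c("A", "B")
rf_name  <- "C"
# right-hand side: fixed factors plus one random-intercept term
rhs  <- paste(c(ff_names, sprintf("(1 | %s)", rf_name)), collapse = " + ")
form <- as.formula(paste(y_name, "~", rhs))
form
# Y ~ A + B + (1 | C)
```

The resulting formula object can then be passed to lmer along with the data frame, so that model.frame can resolve the variable names in the data rather than inside the function body.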
Re: [R] [Not R question]: Better fit for order probit model
Thank you so much Robert. Please find the information below. The scale 1-10 are subjective physical condition ratings scored by inspection engineers at the site. 1-5 are in very bad condition (bridge closed down to seriously deteriorated). The rest, from 6-10, are categorized as good condition. However, the proportions of samples in my data are as follows. Bridges with a 1-5 scale cover about 2-3% of the total population. The majority of the bridges are in 7-8. Certainly, I have enough data for model estimation, but the mix of the proportions is not good. I attached the detail of the condition rating scale at the end of this message. As a result, my ordered probit model yields cutting points that cannot capture levels 1-5 well. Even in my in-sample population, the model cannot capture levels 2-5 at all. In other words, with the estimated cutting points for levels 1-5, I have zero bridges that belong to levels 2-5. Unfortunately, my objective is especially to analyze statistically what kind of design attributes lead to such a bad condition. So I would like my model to be able to capture bad-condition bridges as well as it can.

9 EXCELLENT CONDITION
8 VERY GOOD CONDITION - no problems noted.
7 GOOD CONDITION - some minor problems.
6 SATISFACTORY CONDITION - structural elements show some minor deterioration.
5 FAIR CONDITION - all primary structural elements are sound but may have minor section loss, cracking, spalling or scour.
4 POOR CONDITION - advanced section loss, deterioration, spalling or scour.
3 SERIOUS CONDITION - loss of section, deterioration of primary structural elements. Fatigue cracks in steel or shear cracks in concrete may be present.
2 CRITICAL CONDITION - advanced deterioration of primary structural elements. Fatigue cracks in steel or shear cracks in concrete may be present or scour may have removed substructure support. Unless closely monitored it may be necessary to close the bridge until corrective action is taken.
1 IMMINENT FAILURE CONDITION - major deterioration or section loss present in critical structural components or obvious vertical or horizontal movement affecting structure stability. Bridge is closed to traffic but corrective action may put it back in light service.
0 FAILED CONDITION - out of service; beyond corrective action.

- Original Message - From: Robert A LaBudde Date: Saturday, June 16, 2007 9:59 am Subject: Re: [R] [Not R question]: Better fit for order probit model To: R-help@stat.math.ethz.ch

At 03:17 AM 6/16/2007, adschai wrote: Thank you so much Robert. I haven't thought about the idea of clumping categories together. One of the reasons is because these categories are bridge condition rating scores. They indeed represent different meaning and serviceability conditions. They vary from 0-9. I have about 300,000 data points, in which the first 5 labels, i.e. 0-4, are bad-condition bridges and there are only less than 1000 instances in total. The worst case is, for example, score 0 (meaning the bridge is not operable): I have 60 instances. Score 1 I have about 100. I would appreciate if you could provide some opinion as to how you would make the ordered probit fit better in this case? Thank you so much in advance.

You certainly appear to have enough data to populate these categories. Your problems in getting a good fit may relate to other problems. You need to supply more information in order to say more. What are the definitions of each category? Is the ordering consistent, or are there really two different scales, one for bridges with essentially no problems, and another for those with serious damage? What evidence do you have that your fit is poor? What model are you fitting? Etc. Robert A. LaBudde, PhD, PAS, Dpl.
ACAFS e-mail: [EMAIL PROTECTED] Least Cost Formulations, Ltd. URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239 Fax: 757-467-2947 Vere scire est per causas scire
Re: [R] [Not R question]: Better fit for order probit model
At 01:29 PM 6/17/2007, adschai wrote: Thank you so much Robert. Please find the information below. ... snip

My guess is that you really have two distributions here: 1) bridges that are basically under proper supervision and repair (Categories 6-10), and 2) those that are not (Categories 1-5). These two classes would probably have dramatically different relations to the covariates you are using. So my recommendation would be to consider splitting your response into two classes (0/1), each with 5 subcategories. Keeping the two classes merged together in a single group is equivalent to merging two different distributions with a weighting factor. This may be causing a bimodal distribution which is giving you problems. You could try your modeling on each of the two classes separately before continuing with your full dataset modeling. You may learn something useful to help you with your problems. For the full model, you would need to include a full set of interaction terms with class. Robert A. LaBudde, PhD, PAS, Dpl.
ACAFS e-mail: [EMAIL PROTECTED] Least Cost Formulations, Ltd. URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239 Fax: 757-467-2947 Vere scire est per causas scire
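Robert's two-step suggestion could be sketched as follows. The data frame, variable names, and category proportions below are entirely simulated and illustrative, not from the poster's bridge dataset:

```r
set.seed(1)
# illustrative data: an ordered 0-9 rating and one covariate
ratings <- data.frame(
  rating = sample(0:9, 200, replace = TRUE,
                  prob = c(rep(0.01, 5), rep(0.19, 5))),
  age    = runif(200, 0, 80)
)
# step 1: a 0/1 class indicator (bad = ratings 0-4 vs good = 5-9)
ratings$bad <- as.integer(ratings$rating <= 4)
class_fit <- glm(bad ~ age, family = binomial, data = ratings)
# step 2 would then model the 5 ordered subcategories within each class
# separately, e.g. with MASS::polr on subset(ratings, bad == 1)
summary(class_fit)
```

Keeping both steps in one full model, as Robert notes, would instead require interaction terms between the class indicator and the covariates.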
Re: [R] [Not R question]: Better fit for order probit model
Thank you so much Robert. I haven't thought about the idea of clumping categories together. One of the reasons is because these categories are bridge condition rating scores. They indeed represent different meaning and serviceability conditions. They vary from 0-9. I have about 300,000 data points, in which the first 5 labels, i.e. 0-4, are bad-condition bridges and there are only less than 1000 instances in total. The worst case is, for example, score 0 (meaning the bridge is not operable): I have 60 instances. Score 1 I have about 100. I would appreciate if you could provide some opinion as to how you would make the ordered probit fit better in this case? Thank you so much in advance. - adschai

- Original Message - From: Robert A LaBudde Date: Friday, June 15, 2007 9:52 pm Subject: Re: [R] [Not R question]: Better fit for order probit model To: R-help@stat.math.ethz.ch

At 09:31 PM 6/15/2007, adschai wrote: I have a model which tries to fit a set of data with 10-level ordered responses. Somehow, in my data, the majority of the observations are from levels 6-10, leaving only about 1-5% of total observations contributing to levels 1-5. As a result, my model tends to perform badly on points that have a lower level than 6. I would like to ask if there's any way to circumvent this problem or not. I was thinking of the following ideas, but I am open to any suggestions if you could please. 1. Bootstrapping with a small size of samples each time. However, in each sample basket, I intentionally sample in such a way that there is a good mix between observations from each level. Then I have to do this many times. But I don't know how to obtain the true standard error of the estimated parameters after all bootstrapping has been done. Is it going to be simply the average of all standard errors estimated each time? 2. Weighting points with levels 1-6 more. But it's unclear to me how to put this weight back into the maximum likelihood when estimating parameters. It's unlike OLS where your objective is to minimize error or, if you'd like, a penalty function. But MLE is obviously not a penalty function. 3. Do step-wise regression. I will segment the data into two regions, first points with response less than 6 and the rest with those above 6. The first step is a binary regression to determine which of the two groups the point belongs to. Then in the second step, estimate an ordered probit model for each group separately. The question here is then, why am I choosing 6 as a cutting point instead of others? Any suggestions would be really appreciated. Thank you.

You could do the obvious, and lump categories such as 1-6 or 1-7 together to make a composite category. You don't mention the size of your dataset. If there are 10,000 data, you might live with a 1% category. If you only have 100 data, you have too many categories. Also, next time plan your study and training better so that your categories are fully utilized. And don't use so many categories. People have trouble even selecting responses on a 5-level scale. Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: [EMAIL PROTECTED] Least Cost Formulations, Ltd. URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239 Fax: 757-467-2947 Vere scire est per causas scire
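On the weighting question (item 2): case weights can enter the likelihood directly in, e.g., MASS::polr, which accepts a weights argument and a probit link. A hedged sketch on simulated data; the weight values and category proportions are arbitrary, for illustration only:

```r
library(MASS)   # for polr
set.seed(1)
d <- data.frame(
  score = factor(sample(1:4, 300, replace = TRUE,
                        prob = c(0.05, 0.05, 0.45, 0.45)),
                 levels = 1:4, ordered = TRUE),
  x = rnorm(300)
)
# upweight the rare low categories in the (weighted) likelihood
w <- ifelse(as.integer(d$score) <= 2, 5, 1)
fit <- polr(score ~ x, data = d, weights = w, method = "probit")
fit
```

Whether such upweighting is statistically appropriate is a separate question (it changes the implied sampling model), but mechanically this is how weights feed back into the MLE.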
Re: [R] question about formula for lm
Yet another solution:

with(X, lm(get(Ytext) ~ Xvar))

On 14-Jun-07, at 5:18 PM, Greg Snow wrote: Try: lm( formula( paste( Ytext, '~ Xvar' ) ), data=X) -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pedro Mardones Sent: Thursday, June 14, 2007 1:14 PM To: R-help@stat.math.ethz.ch Subject: [R] question about formula for lm

Dear all; Is there any way to make this work?:

.x <- rnorm(50,10,3)
.y <- .x + rnorm(50,0,1)
X <- data.frame(.x,.y)
colnames(X) <- c("Xvar","Yvar")
Ytext <- "Yvar"
lm(Ytext ~ Xvar, data=X)  # doesn't run
lm(Yvar ~ Xvar, data=X)   # does run

The main idea is to use Ytext as input in a function, so you just type "Yvar" and the model should fit. Also, I need to avoid the expression X$Yvar ~ X$Xvar. Thanks for any idea PM

-- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://myweb.dal.ca/mc973993 Public calendar: http://icalx.com/public/informavore/Public The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less. - Piet Hein
Re: [R] [Not R question]: Better fit for order probit model
At 03:17 AM 6/16/2007, adschai wrote: Thank you so much Robert. I haven't thought about the idea of clumping categories together. ...

You certainly appear to have enough data to populate these categories. Your problems in getting a good fit may relate to other problems. You need to supply more information in order to say more. What are the definitions of each category? Is the ordering consistent, or are there really two different scales, one for bridges with essentially no problems, and another for those with serious damage? What evidence do you have that your fit is poor? What model are you fitting? Etc. Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: [EMAIL PROTECTED] Least Cost Formulations, Ltd. URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239 Fax: 757-467-2947 Vere scire est per causas scire
Re: [R] question about formula for lm
On 6/14/07, Greg Snow [EMAIL PROTECTED] wrote: Try: lm( formula( paste( Ytext, '~ Xvar' ) ), data=X)

That type of construction is perilously close to parse(paste(...)) and we know what Thomas said about that (see fortune("parse")). A safer way of constructing a formula from names stored in a character variable is

substitute(foo ~ Xvar, list(foo = as.name(Ytext)))
Re: [R] question about formula for lm
A couple of easy ways are to create the calling sequence as a call or character string and then evaluate it:

eval(bquote(lm(.(as.name(Ytext)) ~ Xvar, X)))
eval(parse(text = paste("lm(", Ytext, "~ Xvar, X)")))

Note that these solutions both have the advantage over some of the prior solutions that the output from print.lm shows which variable was actually used after Call:

eval(bquote(lm(.(as.name(Ytext)) ~ Xvar, X)))
Call:
lm(formula = Yvar ~ Xvar, data = X)   --- This line comes out meaningfully!!!

Coefficients:
(Intercept)         Xvar
     0.3300       0.9729

On 6/14/07, Pedro Mardones [EMAIL PROTECTED] wrote: Dear all; Is there any way to make this work?: ...
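A further base-R option, not mentioned in the thread, is reformulate(), which builds the formula from character strings without any pasting or parsing. A sketch using Pedro's example data:

```r
set.seed(1)
X <- data.frame(Xvar = rnorm(50, 10, 3))
X$Yvar <- X$Xvar + rnorm(50, 0, 1)
Ytext <- "Yvar"
# reformulate(termlabels, response) constructs a formula object directly
form <- reformulate("Xvar", response = Ytext)   # Yvar ~ Xvar
fit <- lm(form, data = X)
coef(fit)
```

Like the substitute/bquote approaches, this keeps the resulting call readable, since a real formula object (not a character string) is passed to lm.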
[R] [Not R question]: Better fit for order probit model
Hi, I have a model which tries to fit a set of data with 10-level ordered responses. In my data, the majority of the observations are from levels 6-10, and only about 1-5% of the total observations fall in levels 1-5. As a result, my model tends to perform badly on points with a level lower than 6. I would like to ask if there's any way to circumvent this problem. I was thinking of the following ideas, but I am open to any other suggestions.

1. Bootstrapping with a small sample size each time. However, in each sample basket I would intentionally sample so that there is a good mix of observations from each level, and then repeat this many times. But I don't know how to obtain the true standard error of the estimated parameters after all the bootstrapping has been done. Is it simply the average of the standard errors estimated each time?

2. Weighting points with levels 1-5 more heavily. But it's unclear to me how to put this weight back into the maximum likelihood when estimating parameters. It's unlike OLS, where the objective is to minimize error or, if you'd like, a penalty function; the MLE objective is not a penalty function.

3. Do a step-wise regression. I would segment the data into two regions: points with response less than 6, and the rest with 6 or above. The first step is a binary regression to determine which of the two groups a point belongs to. Then, in the second step, estimate an ordered probit model for each group separately. The question here is why I would choose 6 as the cutting point instead of some other value.

Any suggestions would be really appreciated. Thank you.
- adschai
Re: [R] [Not R question]: Better fit for order probit model
At 09:31 PM 6/15/2007, adschai wrote:
> [original question snipped]

You could do the obvious and lump categories such as 1-6 or 1-7 together to make a composite category.

You don't mention the size of your dataset. If you have 10,000 data points, you might live with a 1% category. If you only have 100, you have too many categories.

Also, plan your next study and training better so that your categories are fully utilized. And don't use so many categories: people have trouble even selecting responses on a 5-level scale.

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS    e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.              URL: http://lcfltd.com/
824 Timberlake Drive                       Tel: 757-467-0954
Virginia Beach, VA 23464-3239              Fax: 757-467-2947

"Vere scire est per causas scire"
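The category-lumping suggestion can be sketched as follows. This is a hypothetical illustration on simulated data (not the poster's dataset), using polr() from MASS for the ordered probit fit; collapsing sparse levels is done by assigning duplicated level names, which merges factor levels in R:

```r
library(MASS)  # for polr()

set.seed(1)
# Hypothetical data: a 10-level ordered response concentrated in levels 6-10
x <- rnorm(500)
latent <- 6 + 1.5 * x + rnorm(500)
y <- factor(pmin(pmax(round(latent), 1), 10), levels = 1:10, ordered = TRUE)
table(y)  # lower levels are sparse, as in the original post

# Collapse the sparse levels 1-6 into one composite category
y2 <- y
levels(y2) <- c(rep("<=6", 6), "7", "8", "9", "10")

fit <- polr(y2 ~ x, method = "probit")
summary(fit)
```

The composite category now has enough observations to support stable cut-point estimates, at the cost of no longer distinguishing among the rare low levels.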
[R] question about formula for lm
Dear all;
Is there any way to make this work?:

.x <- rnorm(50, 10, 3)
.y <- .x + rnorm(50, 0, 1)
X <- data.frame(.x, .y)
colnames(X) <- c("Xvar", "Yvar")
Ytext <- "Yvar"
lm(Ytext ~ Xvar, data = X)  # doesn't run
lm(Yvar ~ Xvar, data = X)   # does run

The main idea is to use Ytext as input in a function, so you just type "Yvar" and the model should fit. Also, I need to avoid the expression X$Yvar ~ X$Xvar.
Thanks for any idea
PM
Re: [R] question about formula for lm
Why not use:

lm(X[[Ytext]] ~ Xvar, data = X)

ThPe

Pedro Mardones wrote:
> [original question snipped]
Re: [R] question about formula for lm
First check the value of Ytext. Try

Ytext <- X$Yvar

--- Pedro Mardones [EMAIL PROTECTED] wrote:
> [original question snipped]
Re: [R] question about formula for lm
Try:

lm( formula( paste( Ytext, '~ Xvar' ) ), data = X )

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: Pedro Mardones
Sent: Thursday, June 14, 2007 1:14 PM
To: R-help@stat.math.ethz.ch
Subject: [R] question about formula for lm

> [original question snipped]
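A related base-R route not mentioned in the thread is reformulate(), which assembles the formula from character vectors without hand-pasting a string; a minimal sketch using the thread's variable names on simulated data:

```r
set.seed(42)
X <- data.frame(Xvar = rnorm(50, 10, 3))
X$Yvar <- X$Xvar + rnorm(50, 0, 1)
Ytext <- "Yvar"

# reformulate(termlabels, response) builds response ~ terms
f <- reformulate("Xvar", response = Ytext)
f                  # Yvar ~ Xvar
fit <- lm(f, data = X)
coef(fit)
```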
[R] Question with nlm
Hi, I would really appreciate it if I could get some help here. I'm using nlm to minimize my negative log likelihood function. What I did is as follows.

My log likelihood function (it returns the negative log likelihood) with the 'gradient' attribute defined inside:

# ====== Method definition ======
logLikFunc3 <- function(sigma, object, totalTime) {
  y <- as.matrix([EMAIL PROTECTED]:totalTime,1]);
  x <- as.matrix([EMAIL PROTECTED]:totalTime,]);

  # compute necessary matrices
  M <- as.matrix([EMAIL PROTECTED]);
  P <- diag(sigma*sigma);
  A <- AMatrix(totalTime, M, [EMAIL PROTECTED]:totalTime,]);
  Q <- IMatrix(totalTime) + A %*% outerM(IMatrix(totalTime-1), P) %*% t(A);
  invQ <- solve(Q, IMatrix(dim(Q)[1]));
  xM <- matrix(rep(0, dim(M)[2]*totalTime), ncol=dim(M)[2], nrow=totalTime);
  for (i in 1:totalTime) {
    xM[i,] <- x[i,] %*% powerM(M, -totalTime+i);
  }
  tmp <- solve((t(xM) %*% invQ %*% xM), IMatrix(dim(xM)[2]));
  Bt <- (tmp %*% t(xM)) %*% (invQ %*% y);
  N <- IMatrix(totalTime) - (xM %*% tmp %*% t(xM) %*% invQ);
  sigma2 <- (1/totalTime) * t(y - xM %*% Bt) %*% invQ %*% (y - xM %*% Bt);

  # log likelihood function
  loglik <- -0.5*log(abs(det(diag(rep(sigma2, totalTime))))) -
            0.5*log(abs(det(Q))) -
            (0.5/sigma2) * (t(y - (xM %*% Bt)) %*% invQ %*% (y - (xM %*% Bt)));
  sgm <- sigma;

  # gradients eq. (4.16)
  gr <- function(sgm) {
    gradVecs <- c();
    # sgm <- c(sigma1, sigma2);
    sgm <- sgm*sgm;
    for (i in 1:length(sgm)) {
      Eij <- matrix(rep(0, length(sgm)^2), nrow=length(sgm), ncol=length(sgm));
      Eij[i,i] <- 1.0;
      # trace term
      term1 <- -sum(diag((invQ %*% A) %*% outerM(IMatrix(totalTime-1), Eij) %*% t(A)));
      # very long term
      term2 <- (1/totalTime) * solve((t(y) %*% t(N) %*% invQ %*% y), IMatrix(dim(y)[2]));
      term3 <- (t(y) %*% t(N) %*% invQ %*% A) %*% outerM(IMatrix(totalTime-1), Eij) %*%
               (t(A) %*% invQ %*% N %*% y);
      gradVecs <- -1 * c(gradVecs, term1 + (term2 %*% term3));
    } # end for
    print(paste("Gradient has length:", length(gradVecs)));
    return(gradVecs);
  }

  res <- -loglik;
  attr(res, "gradient") <- gradVecs;
  return(res);
}
# ====== end method definition ======

Then when I call nlm on this function, i.e.

nlm(f=logLikFunc3, p=as.numeric(c(1,1)), object=this, totalTime=200, print.level=2)

it complains that my analytic gradient returns a vector whose length differs from the number of my unknowns. I tried printing the length of the gradient vector that I return (as you can see in the code); it has the same length as my input parameter vector. Did I do anything wrong here?

Also, I would like to be able to put some constraints on this optimization. I tried constrOptim with:

ui <- diag(c(1,1));
ci <- matrix(rep(0,2), ncol=1, nrow=2);

using the same parameters passed to nlm above. constrOptim gives me an error that the initial value is in the infeasible region, which I don't quite understand, as my constraints simply say that the two parameters must be greater than zero and my assigned initial values are both 1, so it should be OK.

Any help would be really appreciated. Thank you.
- adschai
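No reply to the constrOptim question appears in this digest. For reference, constrOptim() requires the starting value to satisfy ui %*% theta - ci > 0 strictly, and it minimizes by default; the mechanics of a positivity constraint on two parameters can be sketched on a toy objective (a hypothetical stand-in, not the poster's likelihood):

```r
# Toy objective standing in for the negative log-likelihood (hypothetical);
# its unconstrained minimum is at c(2, 3), which is in the feasible interior
f <- function(theta) sum((theta - c(2, 3))^2)

# Constraints theta1 > 0 and theta2 > 0, encoded as ui %*% theta - ci >= 0;
# the starting value must satisfy them strictly, which c(1, 1) does
ui <- diag(2)
ci <- c(0, 0)

res <- constrOptim(theta = c(1, 1), f = f, grad = NULL, ui = ui, ci = ci)
res$par   # close to c(2, 3)
```

An alternative that avoids constrained optimization altogether is to reparameterize, e.g. optimize over log(sigma) with nlm() and exponentiate inside the objective, so positivity holds by construction.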
[R] question about data availale in .RData file using the biobase package
Hi all,

I am analyzing microarray data, and I have R workspace images (.RData format) as my source of the data. The data are in the Biobase package format, so I used some commands from the Biobase package manual and could write the data into Excel files. The data I am working on are cancer data. I could get the microarray information and recurrence information by using commands like

x <- pData(oncogene)
y <- exprs(oncogene)

I think the survival information should also be in the .RData file. How can I find out what information is available in the given file? Please let me know any commands that show what type of information is available in a given file from a Biobase package object.

Thank You
rama kanth
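Although no reply is recorded in this digest, the usual way to see what a .RData image contains is to load it into a fresh environment and list the objects; a sketch (the file name is hypothetical, and `oncogene` is the object name from the post above):

```r
e <- new.env()
loaded <- load("oncogene.RData", envir = e)  # hypothetical file name
loaded        # character vector of the object names stored in the image
sapply(loaded, function(nm) class(get(nm, envir = e)))

# For a Biobase ExpressionSet, the per-sample covariates (possibly
# including survival columns) live in the phenotype data; inspect with:
# str(pData(e$oncogene)); varLabels(e$oncogene)
```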
[R] Question on weighted Kaplan-Meier analysis of case-cohort design
I have a study best described as a retrospective case-cohort design: the cases were all the events in a given time span surveyed, and the controls (event-free during the follow-up period) were selected in a 2:1 ratio (2 controls per case). The sampling frequency for the controls was about 0.27, so I used a weight vector consisting of 1 for cases and 1/0.27 for controls in coxph to adjust for sampling bias. Using the same weights in a Kaplan-Meier analysis (survfit) gave very inaccurate survival curves (a much lower event rate than expected from the population). Are weights handled differently between coxph and survfit? How should I conduct a weighted Kaplan-Meier analysis (given that survfit doesn't accept a weighted Cox model) for such a design? Any explanations or suggestions are highly appreciated,

xiaojun
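No answer to this question appears in the digest. For reference, survfit() does accept case weights directly, so the inverse-sampling-probability weighting the poster describes can at least be expressed as follows. This is a mechanical sketch on simulated data with invented names; it illustrates the call, not a validated case-cohort estimator:

```r
library(survival)

set.seed(7)
n <- 300
time <- rexp(n, 0.1)
status <- rbinom(n, 1, 0.3)
# Inverse-probability weights: 1 for cases, 1/0.27 for sampled controls
w <- ifelse(status == 1, 1, 1 / 0.27)

d <- data.frame(time, status, w)
fit <- survfit(Surv(time, status) ~ 1, data = d, weights = w)
summary(fit, times = c(5, 10))
```

Note that weighting changes the effective risk sets in the Kaplan-Meier product limit, which is why weighted survfit curves can differ sharply from a weighted coxph baseline; diagnosing the poster's discrepancy would need the actual data.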
[R] Svar: Re: help with simple R-question
thanks, that works great!!

Just one more thing in the same area: what if the class is list instead of array; how can you name the first unrecognized column?

Rina

John Kane [EMAIL PROTECTED] 06/05/07 3:17

--- Rina Miehs [EMAIL PROTECTED] wrote:
> hello
> what do i write for R to recognize both columns? In the R-script below you can see that I use tapply to get my information out of my data, and then I need to use it as a dataframe with both columns! It is like R is using the first column as an observation number or something; how can I change that??

It is using the names of the variables as rownames. Try

n.antal1 <- names(antal1)
antal1 <- data.frame(n.antal1, antal1)

> antal1 <- tapply(l1$omlob1, l1$farid, length)
> antal1
 1437987     1100 10007995 10008295 10008792 10010203 10018703 10033401
       2        3        3        2        3        1        1        2
10048900 10050492 10055897 10076495 10081892 10094801 10100692 10101395
       3        1        3        3        6        2        5       20
10101495 10101595 10104692 10113592 10113697 10114297 10120797 10120897
       1        5        4        2        6       11        1        4
10121697 10121897 10121997 10133592 10142892 10142995 10146495 10150497
      16        3        6        1        1        6        4        4
10150692 10157092 10157292 10164792 10170892 10171795 10171895 10172300
       5        2        4        4        4        4        4        1
10175195 10187802 10192499 10192897 10198295 10200493 10201693 10211593
       1        2        2        3        5        1        3        5
> antal1 <- data.frame(antal1)
> antal1
         antal1
1437987       2
1100          3
10007995      3
10008295      2
10008792      3
10010203     NA
10018703     NA
10033401      2
10048900      3
10050492      1
10055897      3
10076495      3
10081892      6
10094801      2
10100692      5

> Thanks
> Rina
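The fix suggested above (promoting the tapply names to a real column) can be shown end to end on a small made-up dataset; the ids and counts here are invented for illustration:

```r
# Hypothetical litter data: one row per litter, keyed by boar id
l1 <- data.frame(
  farid  = c(1437987, 1437987, 10007995, 10008295, 10008295, 10008295),
  omlob1 = 1:6
)

antal1 <- tapply(l1$omlob1, l1$farid, length)
antal1   # a named array: the ids are *names*, not a column

# Promote the names to a real column so merge() can use them as a key
antal1 <- data.frame(farid = as.numeric(names(antal1)),
                     n     = as.vector(antal1))
antal1
```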
Re: [R] Svar: Re: help with simple R-question
--- Rina Miehs [EMAIL PROTECTED] wrote:
> thanks, that works great!!
> Just one more thing in the same area: what if the class is list instead of array; how can you name the first unrecognized column?

I am not sure that I understand the question. You don't really have an unrecognised column in the dataframe, but an array of rownames (I don't know exactly how they are stored). I think you can do the same thing as you did for the data.frame but, as I say, I am not sure I understand the question. Would you post a little example?

[earlier messages snipped]
Re: [R] Svar: Re: help with simple R-question
The left column is the boar id number, and the right is the random effect estimate. I need the numbers in the left column when I merge far1 with other data.frames based on the id numbers. When I use ranef, the output is of class list and R only sees the intercepts, but I need a data.frame with 'boar id' and 'niveau', two columns, not just one. For example:

> far1 <- ranef(resultat1)[1]
> far1
$farid
          (Intercept)
1437987  -3.775851e-03
1100     -3.220044e-03
10007995  1.848914e-02
10008295 -4.583903e-03
10008792 -9.518371e-03
10033401 -7.538132e-03
10048900  1.540309e-02

> far1 <- as.data.frame(far1)
> far1
          X.Intercept.
1437987  -3.775851e-03
1100     -3.220044e-03
10007995  1.848914e-02
10008295 -4.583903e-03
10008792 -9.518371e-03
10033401 -7.538132e-03
10048900  1.540309e-02

Thanks again
Rina

[earlier messages snipped]
[R] Odp: Svar: Re: help with simple R-question
[EMAIL PROTECTED] wrote on 06.06.2007 10:56:18:

> thanks, that works great!!
> Just one more thing in the same area: what if the class is list instead of array; how can you name the first unrecognized column?

Hi,

look in some intro manual and learn about R data structures. Matrix, array, vector, data.frame, list and some others have distinct features; they can sometimes be interchanged and sometimes not. To learn how objects are organised, see str(), class(), typeof() and mode().

Regards
Petr

[earlier messages snipped]
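Petr's inspection functions can be demonstrated on the two shapes that came up in this thread, a named vector and a one-element list; the objects below are illustrative stand-ins, not Rina's actual data:

```r
v <- c(a = 1, b = 2)                      # a named numeric vector (tapply output is similar)
l <- list(farid = data.frame(x = 1:2))    # a one-element list, like ranef(...)[1]

class(v);  typeof(v)   # "numeric", "double"
class(l);  typeof(l)   # "list", "list"
str(l)                 # reveals the data.frame nested inside the list
```

str() is usually the quickest way to see why "the first column" is really names or rownames rather than data.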
Re: [R] Svar: Re: help with simple R-question
I think that you have the same situation as before, though I have never used ranef(). The boar ids are acting as the row names and are not really part of the data.frame; it just looks like that when R prints the data.frame. Try

boars <- rownames(far1)
far1 <- cbind(boars, far1)

The results may look a bit funny, with two printed columns looking the same, but one will be a column in the data.frame and the other will be the row-names column. I hope :)

--- Rina Miehs [EMAIL PROTECTED] wrote:
> [earlier message with the ranef() output snipped]
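The rownames-to-column trick above, run on a small mock random-effects table (the values are invented, patterned on the output earlier in the thread):

```r
far1 <- data.frame(`(Intercept)` = c(-3.78e-03, -3.22e-03, 1.85e-02),
                   row.names = c("1437987", "10007995", "10008295"),
                   check.names = FALSE)

far1$farid <- as.numeric(rownames(far1))   # promote the ids to a real column
far1

# farid can now serve as a merge key against other data.frames:
other <- data.frame(farid = c(1437987, 10008295), weight = c(250, 310))
merge(far1, other, by = "farid")
```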
Re: [R] Svar: Re: help with simple R-question
Thanks, that helped in the right way :D Thanks a lot!

Rina

John Kane [EMAIL PROTECTED] 06/06/07 1:46

> [earlier messages snipped]
[R] Question on RandomForest in unsupervised mode
Hi, I attempted to run the randomForest() function on a dataset without predefined classes. According to the manual, running randomForest() without a response variable/class labels should make the function assume you are running in unsupervised mode. In this case, I understand that my data is all assigned to one class, while a second, synthetic dataset is made up and assigned to a second class. The online manual suggests that an oob misclassification error of ~40% or more in this two-class problem would indicate that the x-variables look like independent variables to random forests (and I assume that in this case the proximities obtained by randomForest would not be informative for clustering).

When I run randomForest() in unsupervised mode, my first problem is that I get NULL entries for the confusion matrix and the err.rate, but I suppose this is normal behaviour. My only information (apart from the proximities, of course) seems to be the votes, from which I can deduce whether the variables are meaningful or not. The second problem is that, in my case, all my observations seem to get about 20-40% of the votes for class 1 and the rest for class 2 (i.e. class 2 always wins). Assuming that class 1 was assigned to my original data, I'd say this is rather surprising. Initially I thought this was simply a problem of my data not being meaningful, but I repeated the forest with the iris example data and got more or less the same result. I did simply:

iris.urf <- randomForest(iris[,-5])
iris.urf$votes

and again most of the votes went to class 2, although here the vote percentages are slightly more balanced than with my data (approximately 40 to 60% most of the time). Has anyone got experience with unsupervised randomForest() in R who can explain why I'm observing this behaviour? In general, any hints about pitfalls regarding random forests in unsupervised mode would be very much appreciated.
Many thanks in advance, Irilenia

-
Irilenia (Irene) Nobeli
Randall Division of Cell and Molecular Biophysics
New Hunt's House (room 3.14)
King's College London, Guy's Campus
London, SE1 1UL
U.K.
[EMAIL PROTECTED]
+44 (0)207-8486329
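For reference, a minimal sketch of the unsupervised run on iris, assuming the randomForest package is installed. In this mode the original rows are labelled class 1 and a synthetic permuted copy class 2, and it is the proximities, not the raw votes, that are normally fed into clustering; the clustering step below is an illustrative choice, not something prescribed in the thread.

```r
library(randomForest)   # assumes the package is installed

set.seed(1)
## no response given, so randomForest runs unsupervised: real data vs. a
## synthetic second class; confusion and err.rate are NULL in this mode
iris.urf <- randomForest(iris[, -5], proximity = TRUE)

head(iris.urf$votes)    # per-observation vote fractions for the two classes

## cluster on 1 - proximity rather than interpreting the votes directly
hc <- hclust(as.dist(1 - iris.urf$proximity), method = "average")
```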
[R] Question: RMySQL bulk load/update one column, dbWriteTable()?
Hi, I have a question about using RMySQL to load one R vector into a table column. To be more specific, the table is already there and populated. Now I add a new column and want to populate it. Can some colleagues on this list teach me how to do this? I know how to write one R object/table into a MySQL table using dbWriteTable(), but in this situation I just want to write one column. Thanks much in advance. -- Waverley
Re: [R] Question: RMySQL bulk load/update one column, dbWriteTable()?
I have a question about using RMySQL to load one R vector into a table column. To be more specific, the table is already populated. Now I add a new column and want to populate it.

Okay, this is more of an SQL question now, but you could just use dbWriteTable and then do a multi-table update.

dbGetQuery(con, "select * from tmp")
  id name
1  1    A
2  2    B
3  3    C
4  4    D
5  5    E

dbSendQuery(con, "alter table tmp add column r2 float")

## calculate some statistic for all or some ids in the table
x <- data.frame(id = 1:5, r2 = c(.1, .4, .9, .4, .7))
dbWriteTable(con, "r2tmp", x)
dbSendQuery(con, "update tmp t, r2tmp r set t.r2 = r.r2 where t.id = r.id")

dbGetQuery(con, "select * from tmp")
  id name  r2
1  1    A 0.1
2  2    B 0.4
3  3    C 0.9
4  4    D 0.4
5  5    E 0.7

Chris
Re: [R] Question: RMySQL bulk load/update one column, dbWriteTable()?
Thanks Chris. I am trying almost the same solution, though dbWriteTable has failed for me. The problem with using update is that it is way TOO slow when the row size is ~20. That is why I hope I can still use the dbWriteTable way to add one column; dbWriteTable is very efficient and fast. The problem with dbWriteTable, as far as I know and as far as I have read, is that you have to load one data frame that covers all the columns of one table. What I want to do is bulk-load one column instead of ALL columns. Supposedly underneath dbWriteTable is "load data infile", which according to my reading should allow you to load a data file into one table column. Can someone help? Thanks.

On 6/6/07, Chris Stubben [EMAIL PROTECTED] wrote:
Okay, this is more of an SQL question now, but you could just use dbWriteTable and then do a multi-table update.

dbGetQuery(con, "select * from tmp")
  id name
1  1    A
2  2    B
3  3    C
4  4    D
5  5    E

dbSendQuery(con, "alter table tmp add column r2 float")

## calculate some statistic for all or some ids in the table
x <- data.frame(id = 1:5, r2 = c(.1, .4, .9, .4, .7))
dbWriteTable(con, "r2tmp", x)
dbSendQuery(con, "update tmp t, r2tmp r set t.r2 = r.r2 where t.id = r.id")

dbGetQuery(con, "select * from tmp")
  id name  r2
1  1    A 0.1
2  2    B 0.4
3  3    C 0.9
4  4    D 0.4
5  5    E 0.7

Chris

-- Waverley @ Palo Alto
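One possible sketch, under the assumption that the slowness comes from an unindexed join key rather than from dbWriteTable itself: bulk-load the new values into a small staging table, index it, and run the multi-table UPDATE once. Table and column names follow Chris's example; note that LOAD DATA INFILE can fill chosen columns of *new* rows, but it cannot update a column of existing rows, so some form of join update is still needed.

```r
## assumes an open RMySQL connection `con`, an existing table `tmp`
## with key column `id`, and a data frame x with columns id and r2
dbWriteTable(con, "r2tmp", x, row.names = FALSE)        # fast bulk load of the new values
dbSendQuery(con, "alter table r2tmp add index (id)")    # index the join key
dbSendQuery(con, "update tmp t, r2tmp r set t.r2 = r.r2 where t.id = r.id")
dbSendQuery(con, "drop table r2tmp")                    # clean up the staging table
```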
[R] Question about parse and expression
Dear R-users, In the following example, I would like to see my y label as: beta[3] * x[3] + epsilon (where beta and epsilon are replaced by their mathematical symbols). Please advise. Thanks. Nitin

i <- 3
ee <- expression(beta[i] * x[i] + epsilon)
xyplot(1:10 ~ 11:20, ylab = parse(text = ee))
Re: [R] Question about parse and expression
On Wed, 6 Jun 2007, Nitin Jain wrote:
Dear R-users, In the following example, I would like to see my y label as: beta[3] * x[3] + epsilon (where beta and epsilon are replaced by their mathematical symbols). Please advise.

?plotmath
?bquote
example(plotmath)

Thanks. Nitin

i <- 3
ee <- expression(beta[i] * x[i] + epsilon)
xyplot(1:10 ~ 11:20, ylab = parse(text = ee))

Charles C. Berry  (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]
UC San Diego  http://biostat.ucsd.edu/~cberry/
La Jolla, San Diego 92093-0901
Re: [R] Question about parse and expression
Try this:

library(lattice)
x <- 1:10
i <- 3
xlab <- as.expression(substitute(beta[i] * x[i] + epsilon, list(i = i)))
xyplot(x ~ x, xlab = xlab)

On 6/6/07, Nitin Jain [EMAIL PROTECTED] wrote:
Dear R-users, In the following example, I would like to see my y label as: beta[3] * x[3] + epsilon (where beta and epsilon are replaced by their mathematical symbols). Please advise. Thanks. Nitin

i <- 3
ee <- expression(beta[i] * x[i] + epsilon)
xyplot(1:10 ~ 11:20, ylab = parse(text = ee))
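An equivalent sketch using bquote(), which splices the current value of i into the expression via .( ):

```r
library(lattice)
i <- 3
## .(i) is replaced by the value 3 before plotmath renders the label
xyplot(1:10 ~ 11:20,
       ylab = as.expression(bquote(beta[.(i)] * x[.(i)] + epsilon)))
```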