date:20110124

Re: [R] Implementing step-wise linear regression

2011-01-24 Thread Tal Galili

Hello Troy.

A small question (which does not answer yours): why did you choose to do
it this way instead of using ?step or ?stepAIC (in the MASS package)?

Best,
Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Mon, Jan 24, 2011 at 3:47 AM, Troy S troysocks-tw...@yahoo.com wrote:

 Dear R fans,

 I am trying to do step-wise linear regression using the F-test to decide
 which variables to admit.  Ewout Steyerberg suggests using the F-test for
 this purpose.

 I first build a model with no variables using lm(y ~ 1), and then one with a
 single variable that is a strong predictor using lm(y ~ x).  When I call
 var.test on these two models, I do not get a significant p-value (0.07).  But
 a summary of the second model gives an F-test p-value that is very small.

 My questions are:

 Should I be using var.test to run the F-test to decide which variable to
 add next?

 What is the difference between the F-test run by var.test and summary.lm?

 Has step-wise model building using the F-test been programmed already?

 Thanks!

 Troy

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
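
As a hedged sketch of the distinction asked about above: var.test compares the variances of two samples (for lm objects, the two residual variances, treated as if they came from independent samples), whereas nested models are usually compared with the partial F-test from anova(). add1(..., test = "F") applies that test to each candidate term, which is one forward step of an F-based selection. The data and variable names below are simulated purely for illustration.

```r
# Simulated data (an assumption for illustration): y depends on x1 only.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 2 * dat$x1 + rnorm(n)

null_fit <- lm(y ~ 1, data = dat)
one_fit  <- lm(y ~ x1, data = dat)

# Partial F-test for nested models: does adding x1 reduce the residual
# sum of squares by more than chance would explain?
cmp <- anova(null_fit, one_fit)
p_anova <- cmp[["Pr(>F)"]][2]

# With a single predictor this is the same test as the overall F-test
# reported by summary.lm.
fs <- summary(one_fit)$fstatistic
p_summary <- unname(pf(fs["value"], fs["numdf"], fs["dendf"],
                       lower.tail = FALSE))

# One forward step: an F-test for each candidate term added to the null model.
candidates <- add1(null_fit, scope = ~ x1 + x2 + x3, test = "F")
print(candidates)
```

Note that step() and MASS::stepAIC() choose terms by AIC rather than by an F-test p-value, so they are related to, but not the same as, the procedure described in the question.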




[R] How to measure/rank “variable importance” when using rpart?

2011-01-24 Thread Tal Galili

Hello all,

When building a CART model (specifically classification tree) using rpart,
it is sometimes interesting to know what is the importance of the various
variables introduced to the model.

Thus, my question is: *What common measures exist for ranking/measuring the
importance of the variables participating in a CART model? And how can they
be computed using R (for example, with the rpart package)?*

For example, here is some dummy code, created so you can demonstrate your
solutions on it. The example is structured so that variables x1 and x2 are
clearly important, while (in some sense) x1 is more important than x2 (since
x1 applies to more cases, and thus has more influence on the structure of the
data, than x2).

set.seed(31431)
n <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
x4 <- rnorm(n)
x5 <- rnorm(n)
X <- data.frame(x1, x2, x3, x4, x5)

y <- sample(letters[1:4], n, TRUE)
y <- ifelse(X[, 2] < -1, "b", y)
y <- ifelse(X[, 1] < 0, "a", y)

require(rpart)
fit <- rpart(y ~ ., X)
plot(fit); text(fit)

info.gain.rpart(fit)  # your function, telling us how important each variable is

(references are always welcomed)
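
One possible starting point, sketched under the assumption of a reasonably recent rpart version: fitted rpart objects carry a variable.importance component, which sums the goodness-of-split improvements credited to each variable (including its use as a surrogate). The name importance_share below is only an illustrative label, and the direction of the two ifelse() comparisons is an assumption of this sketch.

```r
library(rpart)

# Same kind of dummy data as in the question.
set.seed(31431)
n <- 400
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                x4 = rnorm(n), x5 = rnorm(n))
y <- sample(letters[1:4], n, TRUE)
y <- ifelse(X$x2 < -1, "b", y)
y <- ifelse(X$x1 < 0, "a", y)

dat <- data.frame(y = factor(y), X)
fit <- rpart(y ~ ., data = dat, method = "class")

# Named numeric vector, largest first; present only if the tree has splits.
imp <- fit$variable.importance

# Rescale so the values sum to 1, which makes them easier to compare.
importance_share <- imp / sum(imp)
print(round(importance_share, 3))
```

Variables that are never used as a primary or surrogate split do not appear in the vector at all, so a missing name can be read as zero importance under this measure.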


Thanks!

Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--


Re: [R] Vectorization

2011-01-24 Thread Petr Savicky

On Sun, Jan 23, 2011 at 07:29:16PM -0800, eric wrote:
 
 Is there a way to vectorize this loop or a smarter way to do it ?
 
 y
  [1]  0.003990746 -0.037664639  0.005397999  0.010415496  0.003500676
  [6]  0.001691775  0.008170774  0.011961998 -0.016879531  0.007284486
 [11] -0.015083581 -0.006645958 -0.013153103  0.028148639 -0.005724317
 [16] -0.027408025  0.014767422 -0.001619691  0.018334730 -0.009747171
 
 x <- numeric(length(y))
 for (i in 1:length(y)) {
   x[i] <- ifelse(i == 1, 10000 * (1 + y[i]), (1 + y[i]) * x[i - 1])
 }
 
 x
  [1] 10039.907  9661.758  9713.912  9815.087  9849.447  9866.110  9946.724
  [8] 10065.706  9895.802  9967.888  9817.536  9752.289  9624.016  9894.919
 [15]  9838.278  9568.630  9709.934  9694.207  9871.948  9775.724
 
 Basically trying to see how the equity of an investment changes after each
 return period. Start with $10,000 and a series of returns over time. Figure
 out the equity after each time period (return).

Hello.

The loop computes a cumulative product. The initial value can be factored
out as a common multiplier, so z in the following should equal x up to
machine rounding error.

  y <- c(
     0.003990746, -0.037664639,  0.005397999,  0.010415496,  0.003500676,
     0.001691775,  0.008170774,  0.011961998, -0.016879531,  0.007284486,
    -0.015083581, -0.006645958, -0.013153103,  0.028148639, -0.005724317,
    -0.027408025,  0.014767422, -0.001619691,  0.018334730, -0.009747171)

  x <- numeric(length(y))
  for (i in 1:length(y)) {
    x[i] <- ifelse(i == 1, 10000 * (1 + y[i]), (1 + y[i]) * x[i - 1])
  }

  z <- 10000 * cumprod(1 + y)

  max(abs(x - z))
  # [1] 1.818989e-12

Petr Savicky.


Re: [R] sensitivity logical operators in R

2011-01-24 Thread Petr Savicky

On Sun, Jan 23, 2011 at 11:13:11PM +0100, Marc Jekel wrote:
 Hello R Fans,
 
 Another question for the community that really frightened me today. The 
 following logical comparison produces a false as output:
 
 t = sum((c(.7,.69,.68,.67,.66)-.5)*c(1,1,-1,-1,1))
 tt = sum((c(.7,.69,.68,.67,.66)-.5)*c(1,-1,1,1,-1))
 
 t == tt
 
 This is really strange behavior. Most likely this has something to do
 with how R represents numbers internally and the possible sensitivity of a
 computer? Does anyone know when this strange behavior occurs and how to
 fix it?

The number 0.7 has an infinite expansion in binary
  0.1011001100110011001100110011...
so it is rounded when stored in the standard numeric (double precision) data
type, which is used for the speed needed in complex computations. If you know
in advance that the result has at most 2 decimal places, then
round(, digits=2) yields the correct comparison

  round(t, 2) == round(tt, 2)
  # [1] TRUE

although the exact result, 0.2, is also not exactly representable: both sides
are rounded to the same representable number.

See also
  http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy
for other examples.
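
When the number of decimal places is not known in advance, a tolerance-based comparison is a common alternative to rounding. The sketch below uses base R's all.equal(), whose default tolerance is about 1.5e-8; the variable names a and b are chosen here only to avoid masking the base function t().

```r
a <- sum((c(.7, .69, .68, .67, .66) - .5) * c(1, 1, -1, -1, 1))
b <- sum((c(.7, .69, .68, .67, .66) - .5) * c(1, -1, 1, 1, -1))

# Exact comparison: sensitive to rounding in the last bits.
exactly_equal <- a == b

# Size of the discrepancy, for inspection.
discrepancy <- abs(a - b)

# Comparison with a tolerance; wrap in isTRUE() because all.equal()
# returns a character description rather than FALSE when values differ.
nearly_equal <- isTRUE(all.equal(a, b))

# The same idea with an explicit absolute tolerance.
within_tol <- abs(a - b) < 1e-9

print(c(exactly_equal = exactly_equal, nearly_equal = nearly_equal,
        within_tol = within_tol))
```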

Petr Savicky.


Re: [R] glitch in building R package

2011-01-24 Thread Uwe Ligges




On 22.01.2011 00:16, Horace Tso wrote:

I followed Alan Lenarcic's very helpful tutorial on building an R package for Windows
(XP), which can be found at
http://www.stat.columbia.edu/~gelman/stuff_for_blog/AlanRPackageTutorial.pdf.
The package involves a small DLL compiled from some very simple C++ code.


1. Although the tutorial was certainly very helpful at the time it was 
written, some parts are outdated these days. Please read Writing R 
Extensions and the R Installation and Administration manual.


2. You probably forgot to tell your package to do something that
corresponds to dyn.load, either in a .First.lib function or in a NAMESPACE
(useDynLib) directive.
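
A minimal sketch of what point 2 might look like, assuming the package is called FirstPack (as in the install log) and that its DLL carries the same name. The file placement is described in comments; treat it as an illustration rather than the exact fix for this package.

```r
# Option A: in the package's NAMESPACE file, a single directive asks R to
# load the compiled code whenever the package is loaded:
#
#   useDynLib(FirstPack)
#
# Option B: a load hook defined in one of the package's R files.
# .onLoad is the hook used together with a NAMESPACE; packages of the
# R 2.12 era without a NAMESPACE defined .First.lib with the same body.
.onLoad <- function(libname, pkgname) {
  # Looks for libs/FirstPack.dll (or .so) inside the installed package.
  library.dynam("FirstPack", pkgname, libname)
}
```

With either option in place, a call such as .C("DemoAutoCor", ..., PACKAGE = "FirstPack") should be able to find the symbol during lazy loading.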


Best,
Uwe Ligges







The build process seemed to work smoothly, until I installed. Then I got an error saying
the C function was not in the load table. This is rather mysterious because I've been
able to call this function from R with dyn.load("name.dll"). So the DLL is
working.

The install error says:

C:\R-test>R CMD INSTALL --build FirstPack_0.1.tar.gz
* installing to library 'c:/R/R-2.12.0/library'
* installing *source* package 'FirstPack' ...
** libs
cygwin warning:
   MS-DOS style path detected: c:/R/R-2.12.0/etc/i386/Makeconf
   Preferred POSIX equivalent is: /cygdrive/c/R/R-2.12.0/etc/i386/Makeconf
   CYGWIN environment variable option nodosfilewarning turns off this warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
g++ -Ic:/R/R-2.12.0/include -O2 -Wall  -c XDemo.cpp -o XDemo.o
g++ -Ic:/R/R-2.12.0/include -O2 -Wall  -c XDemo_main.cpp -o XDemo_main.o
g++ -shared -s -static-libgcc -o FirstPack.dll tmp.def XDemo.o XDemo_main.o -Lc:/R/R-2.12.0/bin/i386 -lR
installing to c:/R/R-2.12.0/library/FirstPack/libs/i386
** R
** data
Warning: empty 'data' directory
** preparing package for lazy loading
Error in .C("DemoAutoCor", OutVec = as.double(vector("numeric", OutLength)),  :
   C symbol name "DemoAutoCor" not in load table
ERROR: lazy loading failed for package 'FirstPack'
* removing 'c:/R/R-2.12.0/library/FirstPack'

Here is how I built the package. I have the directory structure as described in
Writing R Extensions and I issued the following command at the DOS prompt:

C:\R-test>R CMD build FirstPack
* checking for file 'FirstPack/DESCRIPTION' ... OK
* preparing 'FirstPack':
* checking DESCRIPTION meta-information ... OK
* cleaning src
cygwin warning:
   MS-DOS style path detected: C:/R-test/FirstPack_0.1.tar
   Preferred POSIX equivalent is: /cygdrive/c/R-test/FirstPack_0.1.tar
   CYGWIN environment variable option nodosfilewarning turns off this warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
cygwin warning:
   MS-DOS style path detected: C:/R-test/FirstPack_0.1.tar
   Preferred POSIX equivalent is: /cygdrive/c/R-test/FirstPack_0.1.tar
   CYGWIN environment variable option nodosfilewarning turns off this warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
Warning in readLines(ldpath) :
   incomplete final line found on 'FirstPack/DESCRIPTION'
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
WARNING: directory 'FirstPack/data' is empty
* building 'FirstPack_0.1.tar.gz'
cygwin warning:
   MS-DOS style path detected: C:/R-test/FirstPack_0.1.tar
   Preferred POSIX equivalent is: /cygdrive/c/R-test/FirstPack_0.1.tar
   CYGWIN environment variable option nodosfilewarning turns off this warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
cygwin warning:
   MS-DOS style path detected: C:/R-test/FirstPack_0.1.tar
   Preferred POSIX equivalent is: /cygdrive/c/R-test/FirstPack_0.1.tar
   CYGWIN environment variable option nodosfilewarning turns off this warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames

Thanks in advance.

H




[R] From two polynomials one multivariate

2011-01-24 Thread Alaios

Hello, I have a function that creates polynomials (i.e. legendre.polynomials). I
want to use it to create polynomials in the variable x and in the variable y.

legendre.polynomials(2)
[[1]]
1 

[[2]]
x 

[[3]]
-0.5 + 1.5*x^2 

Ideally I would get the same output but for another variable (e.g. y).

Then, having the two sets of polynomials in x and in y, I can create the
multivariate polynomial I want.

I checked the multipolynom package but it cannot do what I am looking for.

Best Regards
Alex
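
One way to think about this, sketched in base R and not using legendre.polynomials itself: a univariate polynomial is just a coefficient vector, so the "same polynomial in y" is the same coefficients evaluated at a different argument, and a bivariate term is the product P_i(x) * P_j(y). The helper names legendre_coefs, poly_eval and legendre_xy are hypothetical and defined here for illustration; the coefficients come from Bonnet's recurrence, (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x).

```r
# Coefficients (in increasing powers) of the Legendre polynomials P_0..P_n.
legendre_coefs <- function(n) {
  polys <- vector("list", n + 1)
  polys[[1]] <- 1                       # P_0 = 1
  if (n >= 1) polys[[2]] <- c(0, 1)     # P_1 = x
  if (n >= 2) {
    for (k in 1:(n - 1)) {
      x_pk <- c(0, polys[[k + 1]])      # x * P_k (shift powers up by one)
      pk1  <- c(polys[[k]], 0, 0)       # P_{k-1}, padded to the same length
      polys[[k + 2]] <- ((2 * k + 1) * x_pk - k * pk1) / (k + 1)
    }
  }
  polys
}

# Evaluate a coefficient vector at each element of v.
poly_eval <- function(coefs, v) {
  sapply(v, function(vi) sum(coefs * vi^(seq_along(coefs) - 1)))
}

# Bivariate product term P_i(x) * P_j(y), returned as a function of (x, y).
legendre_xy <- function(i, j) {
  p <- legendre_coefs(max(i, j))
  function(x, y) poly_eval(p[[i + 1]], x) * poly_eval(p[[j + 1]], y)
}

f <- legendre_xy(2, 1)   # P_2(x) * P_1(y) = (-0.5 + 1.5 * x^2) * y
f(0.5, 2)
```

If the objects returned by legendre.polynomials are polynom-package polynomials, converting them with as.function() and multiplying the resulting functions should give the same kind of product, although that route is not shown here.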


Re: [R] Passing in arguments into function

2011-01-24 Thread Ivan Calandra


Hi,

If you have the formula stored in a string, you could also use 
as.formula in your call to lm, like this:

form <- "x ~ y + z"
lm(as.formula(form))

HTH,
Ivan
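
A small self-contained sketch of the string-to-formula route, using the built-in mtcars data instead of the x, y, z of the question (the variable names form_string, fit_a, fit_b and fit_c are illustrative). It also shows reformulate(), which builds a formula from a character vector of term names.

```r
# Formula stored as a string, converted at the point of use.
form_string <- "mpg ~ hp + wt"
fit_a <- lm(as.formula(form_string), data = mtcars)

# The same model written with a literal formula, for comparison.
fit_b <- lm(mpg ~ hp + wt, data = mtcars)

# Building the formula from pieces, e.g. when the predictors arrive
# as a character vector.
predictors <- c("hp", "wt")
fit_c <- lm(reformulate(predictors, response = "mpg"), data = mtcars)

print(coef(fit_a))
```

All three fits estimate the same coefficients; the difference is only in how the formula object was produced.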


Le 1/23/2011 21:38, Joshua Wiley a écrit :

Hi Paul,

You need to pass the formula object, not a string.  If you have a
function that is passing one of its arguments down to lm(), just pass
the argument directly, no need to do anything special.  Here are some
examples using a built in dataset:

## wrapper function
foo <- function(fooform, ...) {
   summary(lm(formula = fooform, ...))
}

## seeing it in action
foo(mpg ~ hp * wt, data = mtcars)

## save a formula in an object
myform <- mpg ~ hp * wt

## pass the object to foo() which passes it down
foo(myform, data = mtcars)

## pass the formula object myform directly to lm()
summary(lm(myform, data = mtcars))

Does one of those answer your question or do what you want?

Hope this helps,

Josh

On Sun, Jan 23, 2011 at 8:46 AM, Paul Evans p.evan...@yahoo.com wrote:

Hi,

I had a function that looked like:

diff <- lm(x ~ y + z)

How can I pass the argument to the 'lm' function on the fly? E.g., if I pass it
in as a string (e.g. "x ~ y + z"), then the lm function treats it as a string
and not a proper argument.

many thanks










--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php


Re: [R] Gstat error message.

2011-01-24 Thread Uwe Ligges




On 23.01.2011 21:47, Kamina Chororoka wrote:

Hi,
I am a student at the University of Twente ( ITC).
I am using R packages for my data analysis, but for the last few weeks
I have been getting the following error message when trying to work on variograms or
kriging.


Error : .onLoad failed in loadNamespace() for 'gstat', details:
   call: fun(...)
   error: .Random.seed is not an integer vector but of type 'list'
Error: package/namespace load failed for 'gstat'



You have a .Random.seed in your workspace that is obviously not compatible
with gstat.


Hence type

 rm(.Random.seed)

and try again.
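
A slightly more defensive version of the same fix, as a sketch: it removes .Random.seed from the global environment only when the object is not a valid integer seed. The helper name drop_bad_seed is hypothetical and defined here for illustration.

```r
# Remove .Random.seed from `env` if it exists there and is not an integer
# vector. Returns TRUE if something was removed, FALSE otherwise.
drop_bad_seed <- function(env = globalenv()) {
  if (exists(".Random.seed", envir = env, inherits = FALSE) &&
      !is.integer(get(".Random.seed", envir = env))) {
    rm(list = ".Random.seed", envir = env)
    return(TRUE)
  }
  FALSE
}

drop_bad_seed()

# Optional: create a fresh, valid seed straight away.
set.seed(42)
```

If the stray object was restored from a saved workspace file, it may come back in the next session unless the workspace is saved again afterwards.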


Uwe Ligges






I have tried several options in vain. I have tried to reinstall and to load the
extension from the local drive but, again, in vain.

I would like to have technical assistance from your desk.

Looking forward to hearing from you soon.

Kamina Chororoka.




[R] how to slice a zoo object

2011-01-24 Thread Blair Sutton

Hi

Would anyone have any pointers on how to slice up a large zoo table? I
have the following structure:

> str(ZOO_OBJ)
 zoo [1:632, 1:83] 30.4 30.4 30.4 30.4 30.3 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:83] "COL1" "COL2" "COL3" "COL4" ...
 - attr(*, "index")= POSIXct[1:632], format: "2009-05-01 01:00:00" "2009-05-02 01:00:00" ...

and I would just like to take only arbitrary columns, i.e. another zoo
object with only the columns COL2, COL5, etc..

I've tried various syntactical combinations such as those for
data.frames and also tried manipulating the coredata(). What would be
nice would be the ability to do this by column names and not their
indexes.

Any help appreciated and thanks in advance,
Blair
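
For a matrix-like zoo object, column selection by name works with ordinary matrix-style indexing and returns another zoo object. A small sketch on made-up data (the object z and its three columns are assumptions standing in for ZOO_OBJ):

```r
library(zoo)

# A small stand-in for ZOO_OBJ: 5 daily observations, 3 named columns.
idx  <- as.POSIXct("2009-05-01 01:00:00", tz = "UTC") + 86400 * 0:4
vals <- matrix(1:15, nrow = 5,
               dimnames = list(NULL, paste0("COL", 1:3)))
z <- zoo(vals, order.by = idx)

# Select columns by name; the time index is carried along.
wanted <- c("COL1", "COL3")
z_sub <- z[, wanted]

print(z_sub)
```

Leaving the row position empty, as in z[, wanted], keeps every time point; a vector of names built programmatically can be used in place of the literal one.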


Re: [R] no font could be found for family Arial

2011-01-24 Thread Prof Brian Ripley

Please do read the posting guide: what OS, what version of R, and what
graphics device?

At a guess this was Mac OS X (in which case this was the wrong list) and you
need to repair your Mac OS fonts. There are threads on R-sig-mac about
that every couple of months, including this month.

On Sun, 23 Jan 2011, emmats wrote:

I was re-running some code that I hadn't run in a couple of months to make
barplots in R. I didn't change a single thing in the script, but the plots
wouldn't work this time around. The plot itself (the bars and axes) will
graph in the window, but no text appears. In the console it says I have a
number of errors, all of which say no font could be found for family
'Arial'.
I have not knowingly changed anything in R and I would like to be able to
make barplots with labels and titles again. Does anyone know how to fix
this?

Thank you!
--
View this message in context:
http://r.789695.n4.nabble.com/no-font-could-be-found-for-family-Arial-tp3233322p3233322.html
Sent from the R help mailing list archive at Nabble.com.


--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
