[R] Importing data

2008-01-06 Thread Simo Vundla
Hi,
I'm trying to import categorical data from SPSS to R using the script:
xxx <- spss.get("xxx.por", use.value.labels=TRUE) but unfortunately am getting 
an error message 'error reading portable-file dictionary'.

I have successfully imported data in the past. 

What could be the problem with this data?

Thanks

Simo







  


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] run setwd at the launch of R

2008-01-06 Thread bunny , lautloscrew.com
Dear all,

my R files (and the .csv files as well) are saved somewhere pretty
deep down my hard disk.
I therefore have to change the working directory every time I run R (I
run it on a PowerPC Mac), which is disgusting.
Using the setwd command at the beginning of an R script doesn't really
help, because I have to find that file by hand first.

I am looking for a possibility to run setwd during the launch process
of R, or straight after it ... any suggestions?

I would be very glad about good ideas or help!

thanks in advance

matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error .. missing value where TRUE/FALSE needed

2008-01-06 Thread jim holtman
When the error occurs, valueDiff is NA:

Error in if ((seedCount <= seedNumber) && (valueDiff > sup)) { :
  missing value where TRUE/FALSE needed
> valueDiff
[1] NA


Look at your loop; you are going through it 100 times, so on the last pass
you are trying to access fcsPar[k+1], which is the 101st entry and is
NA.  Your program has a bug in it.
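
A minimal sketch of one possible fix (not from the original reply; it reuses the
objects defined in the script quoted below, and the direction of the valueDiff
comparison is assumed from context):

# keep k within bounds so that fcsPar[k + 1] always exists,
# and guard the test so an NA difference can never reach if()
for (k in 1:(length(fcsPar) - 1)) {
  valueDiff <- abs(fcsPar[k] - fcsPar[k + 1])
  if (!is.na(valueDiff) && seedCount <= seedNumber && valueDiff > sup) {
    seeds[seedCount] <- fcsPar[k]
    seedCount <- seedCount + 1
  }
}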

On Jan 6, 2008 1:22 AM, Nicholas Crosbie [EMAIL PROTECTED] wrote:
 Can any explain the following error:

 Error in if ((seedCount <= seedNumber) && (valueDiff >
 sup)) { :
  missing value where TRUE/FALSE needed

 which I get upon running this script:

 seedNumber <- 10
 seeds <- array(dim = seedNumber)
 seedCount <- 1

 maxValue <- 100
 sup <- maxValue / 2

 fcsPar <- array(as.integer(rnorm(100, 50, 10)))

 while (seedCount <= seedNumber) {
  for(k in 1:100) {
    valueDiff <- abs(fcsPar[k] - fcsPar[k+1])
    if((seedCount <= seedNumber) && (valueDiff > sup)) {  # error here
      seeds[seedCount] <- fcsPar[k]
      seedCount <- seedCount + 1
    }
  }
  sup <- sup / 2
 }


 many thanks.





 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] run setwd at the launch of R

2008-01-06 Thread Mark Wardle
It may be disgusting, but I'm not sure how you expect R to know where to start up.

On my Mac, I keep all my scripts in a per-project working directory.

I therefore type

cd ~/Documents/ataxia

If you have multiple nested directories then why not create a
directory alias (soft-link) so it is easy to cd to? Or move the
relevant folders to a better place?

Alternatively, use the Mac OS X GUI, which has an option in
preferences about initial working directory.

Mark

On 06/01/2008, bunny , lautloscrew.com [EMAIL PROTECTED] wrote:
 Dear all,

 my R files (and the .csv files as well) are saved somewhere pretty
 deep down my hard disk.
 i have to chage to working directory therefore everytime i run R (i
 run it on powerPC mac), which is disgusting.
 using the setwd command at the beginning of an R script doesnt really
 help because i have to find this file first by hand.

 I am looking for possibility to run setwd during the launch process
 of R are straight after it ... any suggestions ?

 i would be very glad about good ideas or help !

 thanks in advance

 matthias

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Dr. Mark Wardle
Specialist registrar, Neurology
Cardiff, UK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] run setwd at the launch of R

2008-01-06 Thread bunny , lautloscrew.com
Thanks folks for all the help.
I just missed the part about where to set the initial starting directory. I'll
try Rprofile.

thanks so much
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to use R for Beta Negative Binomial

2008-01-06 Thread Nasser Abbasi
I think I should have posted this question here as well. I am posting my 
question here since it is R related. Please see below. I originally posted 
this to sci.stat.math


Nasser Abbasi [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]

 I think R documentation is a bit hard for me to sort out at this time.

 I was wondering if someone who knows R better than I do could please let
 me know the command syntax to find the mean of Beta Negative Binomial
 Distribution for the following parameters:

 n=3
 alpha=0.5
 beta=3

 Here is the documentation page for R which mentions this distribution

 http://rweb.stat.umn.edu/R/library/SuppDists/html/ghyper.html

 Using Mathematica, I get  (-18) for the mean and -150 for the variance,
 and wanted to verify this with R, since there is a negative sign which is
 confusing me.

 Mathematica says the formula for the mean is   n*beta/(alpha-1)  and that
 is why the negative sign comes up.
 alpha, beta, n can be any positive real numbers.

 If someone can just show me the R command for this, that will help, I have
 the R package SuppDists installed, I am just not sure how to use it for
 this distribution.

 thanks,
 Nasser


I thought I should show what I did; this is R 2.6.1:

> tghyper(a=-1, k=-1, N=5)   %I think this makes it do Beta Negative Binomial

and now I use the summary command, right?

> sghyper(3, .5, 3)

But I do not think this is correct. I tried a few other permutations. It is hard for me
to see how to set the parameters correctly for this distribution.

thanks,
Nasser

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Behavior of ordered factors in glm

2008-01-06 Thread Marc Schwartz
David Winsemius wrote:
 I have a variable which is roughly age categories in decades. In the 
 original data, it came in coded:
 str(xxx)
 'data.frame':   58271 obs. of  29 variables:
 $ issuecat   : Factor w/ 5 levels "0 - 39","40 - 49",..: 1 1  1 1...
 snip
 
 I then defined issuecat as ordered:
 xxx$issuecat <- as.ordered(xxx$issuecat)
 
 When I include issuecat in a glm model, the result makes me think I 
 have asked R for a linear+quadratic+cubic+quartic polynomial fit. The 
 results are not terribly surprising under that interpretation, but I 
 was hoping for only a linear term (which I was taught to call a test 
 of trend), at least as a starting point.
 
 age.mdl <- glm(actual~issuecat, data=xxx, family=poisson)
 summary(age.mdl)
 
 Call:
 glm(formula = actual ~ issuecat, family = poisson, data = xxx)
 
 Deviance Residuals: 
 Min   1Q   Median   3Q  Max  
 -0.3190  -0.2262  -0.1649  -0.1221   5.4776  
 
 Coefficients:
              Estimate Std. Error z value Pr(>|z|)
 (Intercept) -4.31321    0.04865 -88.665   <2e-16 ***
 issuecat.L   2.12717    0.13328  15.960   <2e-16 ***
 issuecat.Q  -0.06568    0.11842  -0.555    0.579
 issuecat.C   0.08838    0.09737   0.908    0.364
 issuecat^4  -0.02701    0.07786  -0.347    0.729 
 
 This also means my advice to a another poster this morning may have 
 been misleading. I have tried puzzling out what I don't understand by 
 looking at indices or searching in MASSv2, the Blue Book, Thompson's 
 application of R to Agresti's text, and the FAQ, so far without 
 success. What I would like to achieve is having the lowest age category 
 be a reference category (with the intercept being the log-rate) and 
 each succeeding age category  be incremented by 1. The linear estimate 
 would be the log(risk-ratio) for increasing ages. I don't want the 
 higher order polynomial estimates. Am I hoping for too much?



David,

What you are seeing is the impact of using ordered factors versus 
unordered factors.

Reading ?options, you will note:

contrasts:
the default contrasts used in model fitting such as with aov or lm. A 
character vector of length two, the first giving the function to be used 
with unordered factors and the second the function to be used with 
ordered factors. By default the elements are named c("unordered", 
"ordered"), but the names are unused.


The default in R (which is not the same as S-PLUS) is:

> options("contrasts")
$contrasts
 unordered   ordered
contr.treatment  contr.poly


Thus, note that when using ordered factors, the default handling of 
factors is contr.poly. Reading ?contrast, you will note:

   contr.poly returns contrasts based on orthogonal polynomials.


To show a quick and dirty example from ?glm:


counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)


# First, the default with outcome as an unordered factor:
> summary(glm(counts ~ outcome, family=poisson()))

Call:
glm(formula = counts ~ outcome, family = poisson())

Deviance Residuals:
 Min   1Q   Median   3Q  Max
-0.9666  -0.6712  -0.1696   0.8472   1.0494

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)    3.0445     0.1260  24.165   <2e-16 ***
outcome2      -0.4543     0.2022  -2.247   0.0246 *
outcome3      -0.2930     0.1927  -1.520   0.1285
...



# Now using outcome as an ordered factor:
> summary(glm(counts ~ as.ordered(outcome), family=poisson()))

Call:
glm(formula = counts ~ as.ordered(outcome), family = poisson())

Deviance Residuals:
 Min   1Q   Median   3Q  Max
-0.9666  -0.6712  -0.1696   0.8472   1.0494

Coefficients:
                       Estimate Std. Error z value Pr(>|z|)
(Intercept)              2.7954     0.0831  33.640   <2e-16 ***
as.ordered(outcome).L   -0.2072     0.1363  -1.520   0.1285
as.ordered(outcome).Q    0.2513     0.1512   1.662   0.0965 .
...


Unfortunately, MASSv2 is the only one of the four editions that I do not 
have for some reason. In MASSv4, this is covered starting on page 146. 
This is also covered in an Intro to R, in section 11.1.1 on contrasts.

For typical clinical applications, the default treatment contrasts are 
sufficient, whereby the first level of the factor is considered the 
reference level and all others are compared against it. Thus, using 
unordered factors is more common, at least in my experience and likely 
the etiology of the difference between S-PLUS and R in this regard.
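
To override the default handling for a given fit, a hedged sketch with the same toy
data, either per factor via C() or globally via options():

# keep the factor ordered, but ask for treatment contrasts in the model,
# so each level is compared against the first (reference) level
summary(glm(counts ~ C(as.ordered(outcome), contr.treatment),
            family = poisson()))

# or, globally, before fitting:
# options(contrasts = c("contr.treatment", "contr.treatment"))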

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cumulative sum of vector

2008-01-06 Thread Keith Jones
Hi,

Maybe I have not been looking in the right spot, but, I have not been 
able to fine a command to automatically calculate the running 
cumulative sum of a vector.  Is there such a command?

Example of current code:
> eig$values
[1] 678.365651   6.769697   2.853783
> prop <- eig$values/sum(eig$values)
> prop
[1] 0.986012163 0.009839832 0.004148005
> cum <- c(prop[1], sum(prop[1:2]), sum(prop[1:3]))
> cum
[1] 0.9860122 0.9958520 1.000

This works, but, if the length of the vector changes I have to 
manually change the code.

Thanks,

Keith Jones

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Importing data

2008-01-06 Thread Frank E Harrell Jr
Simo Vundla wrote:
 Hi,
 I'm trying to import categorical data from SPSS to R using the script:
  xxx <- spss.get("xxx.por", use.value.labels=TRUE) but unfortunately am getting 
 an error message 'error reading portable-file dictionary'.
 
 I have successfully imported data in the past. 
 
 What could be the problem with this data?
 
 Thanks
 
 Simo

First of all, follow the posting guide.  Second, state which package you 
are using (in this case Hmisc).

spss.get in Hmisc uses read.spss in the foreign package.  See the 
documentation of read.spss for more details. You will find there:

'read.spss' reads a file stored by the SPSS 'save' and 'export'
  commands and returns a list.

read.spss does not claim to be able to read SPSS .por files.

Frank
-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cumulative sum of vector

2008-01-06 Thread Gabor Csardi
Keith, are you looking for 'cumsum' ?

Gabor
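
For example, a minimal sketch with the numbers from the post quoted below:

prop <- eig$values / sum(eig$values)   # eig$values as in the original post
cumsum(prop)
# [1] 0.9860122 0.9958520 1.0000000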

On Sat, Jan 05, 2008 at 08:32:41AM -0600, Keith Jones wrote:
 Hi,
 
 Maybe I have not been looking in the right spot, but, I have not been 
 able to fine a command to automatically calculate the running 
 cumulative sum of a vector.  Is there such a command?
 
 Example of current code:
   eig$values
 [1] 678.365651   6.769697   2.853783
    prop <- eig$values/sum(eig$values)
   prop
 [1] 0.986012163 0.009839832 0.004148005
    cum <- c(prop[1], sum(prop[1:2]), sum(prop[1:3]))
   cum
 [1] 0.9860122 0.9958520 1.000
 
 This works, but, if the length of the vector changes I have to 
 manually change the code.
 
 Thanks,
 
 Keith Jones
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cumulative sum of vector

2008-01-06 Thread hadley wickham
On Jan 5, 2008 8:32 AM, Keith Jones [EMAIL PROTECTED] wrote:
 Hi,

 Maybe I have not been looking in the right spot, but, I have not been
 able to fine a command to automatically calculate the running
 cumulative sum of a vector.  Is there such a command?

Try
help.search("cumulative sum")

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to use R for Beta Negative Binomial

2008-01-06 Thread Duncan Murdoch
On 06/01/2008 9:36 AM, Nasser Abbasi wrote:
 I think I should have posted this question here as well. I am posting my 
 question here since it is R related. Please see below. I originally posted 
 this to sci.stat.math
 
 
 Nasser Abbasi [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
 I think R documentation is a bit hard for me to sort out at this time.

 I was wondering if someone who knows R better than I do could please let
 me know the command syntax to find the mean of Beta Negative Binomial
 Distribution for the following parameters:

 n=3
 alpha=0.5
 beta=3

 Here is the documenation page for R which mentions this distribution

 http://rweb.stat.umn.edu/R/library/SuppDists/html/ghyper.html

 Using Mathematica, I get  (-18) for the mean and -150 for the variance,
 and wanted to verify this with R, since there is a negative sign which is
 confusing me.

A variance cannot be negative, so clearly Mathematica has it wrong.

 Mathematica says the formula for the mean is   n*beta/(alpha-1)  and that
 is why the negative sign comes up.
 alpha, beta, n can be any positive real numbers.

 If someone can just show me the R command for this, that will help, I have
 the R package SuppDists installed, I am just not sure how to use it for
 this distribution.

 thanks,
 Nasser

 
 I thought I should show what I did, this is R 2.6.1:
 
  tghyper(a=-1, k=-1, N=5)   %I think this makes it do Beta Negative Binomail

It reports itself as

> tghyper(a=-1, k=-1, N=5)
[1] type = IV -- x = 0,1,2,...

which I believe indicates Beta-negative-binomial.

 
 and now I used summary command, right?
 
> sghyper(3, .5, 3)

Why did you change the parameters?  If you used the same ones as above, 
you get

> sghyper(a=-1, k=-1, N=5)
$title
[1] Generalized Hypergeometric

$a
[1] -1

$k
[1] -1

$N
[1] 5

$Mean
[1] 0.2

$Median
[1] 0

$Mode
[1] 0

$Variance
[1] 0.36

$SD
[1] 0.6

$ThirdCentralMoment
[1] 1.176

$FourthCentralMoment
[1] 8.9712

$PearsonsSkewness...mean.minus.mode.div.SD
[1] 0.333

$Skewness...sqrtB1
[1] 5.44

$Kurtosis...B2.minus.3
[1] 66.2

I don't know if those values are correct, but at least they aren't 
nonsensical like the ones you report from Mathematica.

Duncan Murdoch
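
As a quick hedged cross-check (not part of the original reply, and assuming
SuppDists' rghyper() accepts the same type IV parameters as tghyper()/sghyper()
above):

library(SuppDists)
set.seed(1)
x <- rghyper(1e5, a = -1, k = -1, N = 5)   # simulate from the same distribution
mean(x); var(x)                            # should be near the 0.2 and 0.36 above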

 
 But I do not think this is correct.Tried few other permitations. Hard for me
 to see how to set the parameters correctly for this distribution.
 
 thanks,
 Nasser
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Behavior of ordered factors in glm

2008-01-06 Thread David Winsemius

Thank you, Dr Ripley. After some false starts and consulting MASS2, 
ChambersHastie and the help files, this worked acceptably.

> xxx$issuecat2 <- C(xxx$issuecat2, poly, 1)
> attr(xxx$issuecat2, "contrasts")
 .L
0-39  -6.324555e-01
40-49 -3.162278e-01
50-59 -3.287978e-17
60-69  3.162278e-01
70+6.324555e-01

> exp.mdl <- glm(actual~gendercat+issuecat2+smokecat,
                 data=xxx, family=poisson, offset=expected)
> summary(exp.mdl)

Deviance Residuals: 
Min   1Q   Median   3Q  Max  
-0.5596  -0.2327  -0.1671  -0.1199   5.2386  

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)     -4.57125    0.06650 -68.743   <2e-16 ***
gendercatMale    0.29660    0.06426   4.615 3.92e-06 ***
issuecat2.L      2.09161    0.09354  22.360   <2e-16 ***
smokecatSmoker   0.22178    0.07870   2.818  0.00483 ** 
smokecatUnknown  0.02378    0.08607   0.276  0.78233 

The reference category is different, but the effect of a one category 
increase in age-decade on the log(rate) is (2.09*0.316) = 0.6604, which 
seems acceptable agreement with my earlier as.numeric(factor) estimate 
of 0.6614.

-- 
David Winsemius

Prof Brian Ripley [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]: 

 Further to Duncan's comments, you can control factor codings via 
 options(contrasts=), by setting contrasts() on the factor and via
 C(). This does enable you to code an ordered factor as a linear
 term, for example.
 
 The only place I know that this is discussed in any detail is in
 Bill Venables' account in MASS chapter 6.
 
 On Sat, 5 Jan 2008, Duncan Murdoch wrote:
 
 On 05/01/2008 7:16 PM, David Winsemius wrote:
 David Winsemius [EMAIL PROTECTED] wrote in
 news:[EMAIL PROTECTED]:

 I have a variable which is roughly age categories in decades. In
 the original data, it came in coded:
 str(xxx)
 'data.frame':   58271 obs. of  29 variables:
  $ issuecat   : Factor w/ 5 levels 0 - 39,40 - 49,..: 1 1  1
  1...
 snip

 I then defined issuecat as ordered:
 xxx$issuecat <- as.ordered(xxx$issuecat)
 When I include issuecat in a glm model, the result makes me think
 I have asked R for a linear+quadratic+cubic+quartic polynomial
 fit. The results are not terribly surprising under that
 interpretation, but I was hoping for only a linear term (which I
 was taught to call a test of trend), at least as a starting
 point. 

 age.mdl <- glm(actual~issuecat, data=xxx, family=poisson)
 summary(age.mdl)
 Call:
 glm(formula = actual ~ issuecat, family = poisson, data = xxx)

 Deviance Residuals:
 Min   1Q   Median   3Q  Max
 -0.3190  -0.2262  -0.1649  -0.1221   5.4776

 Coefficients:
  Estimate Std. Error z value Pr(>|z|)
 (Intercept) -4.31321    0.04865 -88.665   <2e-16 ***
 issuecat.L   2.12717    0.13328  15.960   <2e-16 ***
 issuecat.Q  -0.06568    0.11842  -0.555    0.579
 issuecat.C   0.08838    0.09737   0.908    0.364
 issuecat^4  -0.02701    0.07786  -0.347    0.729

 This also means my advice to a another poster this morning may
 have been misleading. I have tried puzzling out what I don't
 understand by looking at indices or searching in MASSv2, the Blue
 Book, Thompson's application of R to Agresti's text, and the FAQ,
 so far without success. What I would like to achieve is having
 the lowest age category be a reference category (with the
 intercept being the log-rate) and each succeeding age category 
 be incremented by 1. The linear estimate would be the
 log(risk-ratio) for increasing ages. I don't want the higher
 order polynomial estimates. Am I hoping for too much?


 I achieved what I needed by:

 xxx$agecat <- as.numeric(xxx$issuecat)
 xxx$agecat <- xxx$agecat - 1

 The results look quite sensible:
 exp.mdl <- glm(actual~gendercat+agecat+smokecat, data=xxx,
 family=poisson, offset=expected)
 summary(exp.mdl)

 Call:
 glm(formula = actual ~ gendercat + agecat + smokecat, family =
 poisson,
 data = xxx, offset = expected)

 Deviance Residuals:
 Min   1Q   Median   3Q  Max
 -0.5596  -0.2327  -0.1671  -0.1199   5.2386

 Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
 (Intercept)     -5.89410    0.11009 -53.539   <2e-16 ***
 gendercatMale    0.29660    0.06426   4.615 3.92e-06 ***
 agecat           0.66143    0.02958  22.360   <2e-16 ***
 smokecatSmoker   0.22178    0.07870   2.818  0.00483 **
 smokecatUnknown  0.02378    0.08607   0.276  0.78233

 I remain curious about how to correctly control ordered factors,
 or I should just simply avoid them.

 If you're using a factor, R generally assumes you mean each level
 is a different category, so you get levels-1 parameters.  If you
 don't want this, you shouldn't use a factor:  convert to a numeric
 scale, just as you did.

 Duncan Murdoch


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Can a dynamic graphic produced by rgl be saved?

2008-01-06 Thread Michael Kubovy
Dear r-helpers,

Can one save a dynamic graphic produced by rgl, e.g.:
open3d();   x <- sort(rnorm(1000));   y <- rnorm(1000);   z <-  
rnorm(1000) + atan2(x,y);   plot3d(x, y, z, col=rainbow(1000), size=2)
as a dynamic figure that can be embedded in a pdf?
_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] CSVSource in tm Package

2008-01-06 Thread Armin Goralczyk
Hello

I tried to use the CSVSource in the TextDocCol function in the tm package. But
a) data from several columns is concatenated in one entry and
b) data in a large text column is broken into several entries
I hoped that it would be possible to assign columns as metadata to one
entry with one specific column being the original text to analyze.

Here is an example from the vignette (the backslash in the output is
not in the original data):

> cars <- system.file("texts", "cars.csv", package = "tm");
> tdc <- TextDocCol(CSVSource(cars))
Read 5 items
> inspect(tdc)
A text document collection with 5 text documents

The metadata consists of 2 tag-value pairs and a data frame
Available tags are:
  create_date creator
Available variables in the data frame are:
  MetaID

[[1]]
[1] "1997,\"Ford\",\"Mustang\",\"3000.00\""

[[2]]
[1] "1999,\"Chevy\",\"Venture\",4900.00"

[[3]]
[1] "1996,\"Chrylser\",\"Cherokee\",\"4799.00\""

[[4]]
[1] "2005,\"Ferrari\",\"Modena\",\"80999.00\""

[[5]]
[1] "1973,\"Tank\",\"\",\"9900.00\""

Also I have a question about the best workflow for text mining/analysis:

My original data is in a mySQL table. Is it possible to import the
data directly into TextDocCol without creating an intermediate csv
file?
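
A heavily hedged sketch for the MySQL part only (driver, connection details, table
and column names are all hypothetical; how best to hand the resulting character
vector to tm without a CSV file depends on the tm version):

library(DBI)
library(RMySQL)
con  <- dbConnect(MySQL(), dbname = "mydb", user = "me", password = "secret")
docs <- dbGetQuery(con, "SELECT id, body FROM documents")   # 'body' holds the text
dbDisconnect(con)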

I am using

> R.Version()
$platform
[1] powerpc-apple-darwin8.10.1

$arch
[1] powerpc

$os
[1] darwin8.10.1

$system
[1] powerpc, darwin8.10.1

$status
[1] 

$major
[1] 2

$minor
[1] 6.1

$year
[1] 2007

$month
[1] 11

$day
[1] 26

$`svn rev`
[1] 43537

$language
[1] R

$version.string
[1] R version 2.6.1 (2007-11-26)

-- 
Armin Goralczyk, M.D.
--
Universitätsmedizin Göttingen
Abteilung Allgemein- und Viszeralchirurgie
Rudolf-Koch-Str. 40
39099 Göttingen
--
Dept. of General Surgery
University of Göttingen
Göttingen, Germany
--
http://www.gwdg.de/~agoralc
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Importing data

2008-01-06 Thread Rense Nieuwenhuis
Hi,

you might try to use the foreign-package, which contains the function  
read.spss. This works fine most of the time,

For a description of its usage, see the help-files or my own website:  
http://www.rensenieuwenhuis.nl/r-project/manual/basics/getting-data- 
into-r-2/

Remember, you'll need to install the foreign-package first.

Hope this helps,

Rense Nieuwenhuis
On Jan 6, 2008, at 12:46 , Simo Vundla wrote:

 Hi,
 I'm trying to import categorical data from SPSS to R using the script:
 xxx <- spss.get("xxx.por", use.value.labels=TRUE) but unfortunately  
 am getting an error message 'error reading portable-file dictionary'.

 I have successfully imported data in the past.

 What could be the problem with this data?

 Thanks

 Simo









 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to get residuals in factanal

2008-01-06 Thread Yijun Zhao
In the R factanal output, I can't find a function to give me the residuals e.

I manually got them using x - lambda1*f1 - lambda2*f2 - ... - lambdan*fn, but the e
I got are not uncorrelated with all the f's. 

What did I do wrong? Please help.

Yijun


  


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame manipulation - newbie question

2008-01-06 Thread Rense Nieuwenhuis
Hi,

you may want to use that apply / tapply function. Some find it a bit  
hard to grasp at first, but it will help you many times in many  
situations when you get the hang of it.

Maybe you can get some information on my site: http:// 
www.rensenieuwenhuis.nl/r-project/manual/basics/tables/
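
For instance, a minimal sketch with base R's aggregate(), assuming the data frame
and column names shown in the post quoted below:

# mean of the numeric columns for each k.idx / step.forwd / model combination
means <- aggregate(pt.knn[, c("prev", "value", "abs.error")],
                   by = list(k.idx = pt.knn$k.idx,
                             step  = pt.knn$step.forwd,
                             model = pt.knn$model),
                   FUN = mean)

# compare the mean absolute error of the two models
boxplot(abs.error ~ model, data = means)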


Hope this helps,

Rense Nieuwenhuis



On Jan 3, 2008, at 11:53 , José Augusto M. de Andrade Junior wrote:

 Hi all,

 Could someone please explain how can i efficientily query a data frame
 with several factors, as shown below:

 -- 
 ---
 Data frame: pt.knn
 -- 
 ---
 row | k.idx   |   step.forwd  |  pt.num |   model |   prev  |  value
 |  abs.error
 1  2000  1 lm  09
 10.5   1.5
 2  2000  2 lm  11
 10.5   1.5
 3  2011  1 lm  10
 12  2.0
 4  2011  2 lm  12
 12  2.0
 5  2022  1 lm  12
 12.1   0.1
 6  2022  2 lm  12
 12.1   0.1
 7  2000  1 rlm 10.1
 10.5   0.4
 8  2000  2 rlm 10.3
 10.5   0.2
 9  2011  1 rlm 11.6
 12  0.4
 102011  2 rlm 11.4
 12  0.6
 112022  1 rlm 11.8
 12.1   0.1
 122022  2 rlm 11.9
 12.1   0.2
 -- 
 

 k.idx, step.forwd, pt.num and model columns are FACTORS.
 prev, value, abs.error are numeric

 I need to take the mean value of the numeric columns  (prev, value and
 abs.error) for each k.idx and step.forwd and model. So: rows 1 and 2,
 3 and 4, 5 and 6,7 and 8, 9 and 10, 11 and 12 must be grouped
 together.

 Next, i need to plot a boxplot of the mean(abs.error) of each model
 for each k.idx.
 I need to compare the abs.error of the two models for each step and
 the mean overall abs.error of each model. And so on.

 I read the manuals, but the examples there are too simple. I know how
 to do this manipulation in a brute force manner, but i wish to learn
 how to work the right way with R.

 Could someone help me?
 Thanks in advance.

 José Augusto
 Undergraduate student
 University of São Paulo
 Business Administration Faculty

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Importing data

2008-01-06 Thread Prof Brian Ripley
On Sun, 6 Jan 2008, Rense Nieuwenhuis wrote:

 Hi,

 you might try to use the foreign-package, which contains the function
 read.spss. This works fine most of the time,

 For a description of its usage, see the help-files or my own website:
 http://www.rensenieuwenhuis.nl/r-project/manual/basics/getting-data-
 into-r-2/

 Remember, you'll need to install the foreign-package first.

You shouldn't have to: it is supposed to come with every installation of 
R, and be installed unless you specifically opt out.

Perhaps you meant 'first load the foreign package via library(foreign)'?

[Re: Frank Harrell's comment, many people use .por for SPSS export 
files; that is the extension used in package foreign's tests directory.
But the issue may well be that xxx.por is not an SPSS export file.]
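
For reference, a hedged sketch of the foreign-based import ("xxx.por" is the file
name from the post; the arguments are as documented in ?read.spss):

library(foreign)
xxx <- read.spss("xxx.por", use.value.labels = TRUE, to.data.frame = TRUE)
str(xxx)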


 Hope this helps,

 Rense Nieuwenhuis
 On Jan 6, 2008, at 12:46 , Simo Vundla wrote:

 Hi,
 I'm trying to import categorical data from SPSS to R using the script:
  xxx <- spss.get("xxx.por", use.value.labels=TRUE) but unfortunately
 am getting an error message 'error reading portable-file dictionary'.

 I have successfully imported data in the past.

 What could be the problem with this data?

 Thanks

 Simo


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can a dynamic graphic produced by rgl be saved?

2008-01-06 Thread Eric R.

?rgl.snapshot


Michael Kubovy wrote:
 
 Dear r-helpers,
 
 Can one save a dynamic graphic produced by rgl, e.g.:
 open3d();   x - sort(rnorm(1000));   y - rnorm(1000);   z -  
 rnorm(1000) + atan2(x,y);   plot3d(x, y, z, col=rainbow(1000), size=2)
 as a dynamic figure that can be embedded in a pdf?
 _
 Professor Michael Kubovy
 University of Virginia
 Department of Psychology
 USPS: P.O.Box 400400Charlottesville, VA 22904-4400
 Parcels:Room 102Gilmer Hall
  McCormick RoadCharlottesville, VA 22903
 Office:B011+1-434-982-4729
 Lab:B019+1-434-982-4751
 Fax:+1-434-982-4766
 WWW:http://www.people.virginia.edu/~mk9y/
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to get residuals in factanal

2008-01-06 Thread Prof Brian Ripley
On Sun, 6 Jan 2008, Yijun Zhao wrote:

 In R factanal output, I can't find a function to give me residuals e.

 I mannually got it by using x -lamda1*f1 -lamda2*f2  - ... -lamdan*fn, but 
 the e
 I got are not uncorrelated with all the f's.

 What did I do wrong? Please help.

What did you use for 'f'?  The factors ('scores') are latent quantities in 
factor analysis, and there is more than one way to predict them.  Most 
likely your assumption of uncorrelatedness is not correct for the 
residuals and scores as you computed them.
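
An illustrative sketch with simulated data (not from the thread) showing the point
above: the sample residuals are generally not exactly uncorrelated with
regression-type scores.

set.seed(1)
f <- matrix(rnorm(200 * 2), 200, 2)                  # latent factors
L <- matrix(runif(12, 0.3, 0.9), 6, 2)               # true loadings
X <- scale(f %*% t(L) + matrix(rnorm(200 * 6, sd = 0.5), 200, 6))

fa <- factanal(X, factors = 2, scores = "regression")
E  <- X - fa$scores %*% t(fa$loadings)               # "residuals" as in the post
round(cor(E, fa$scores), 2)                          # typically not all zero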

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Is there a R function for seasonal adjustment

2008-01-06 Thread tom soyer
Hi,

I just discovered decompose() and stl(), both are very nice! I am wondering
if R also has a function that calculates the seasonal index, or makes the
seasonal adjustment directly using the results generated from either
decompose() or stl(). It seems that there should be one, but I couldn't find
it. Does anyone know?
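
A minimal sketch of one common approach, subtracting the estimated seasonal
component (illustrated with the built-in monthly co2 series as stand-in data):

d   <- decompose(co2)
adj <- co2 - d$seasonal                        # seasonally adjusted series

s    <- stl(co2, s.window = "periodic")
adj2 <- co2 - s$time.series[, "seasonal"]      # same idea with stl()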

Thanks,

-- 
Tom

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cubic splines in package mgcv

2008-01-06 Thread Simon Wood
On Wednesday 26 December 2007 04:14, Kunio takezawa wrote:
 R-users
 E-mail: r-help@r-project.org
My understanding is that package mgcv is based on
 Generalized Additive Models: An Introduction with R (by Simon N. Wood).
 On page 126 of this book, eq. (3.4) looks like a quartic equation with respect
 to x, not a cubic equation. I am wondering if all routines which use
 cubic splines in mgcv are based on this quartic equation.
--- No, `mgcv' does not use the basis given on page 126. See sections 
4.1.2-4.1.8 of the same book for the bases used.

In my humble opinion, the '^4' in the first term
 of the second line of this equation should be '^3'.
--- Perhaps take a look at section 2.3.3 of Gu (2002) Smoothing Spline ANOVA 
for a bit more detail on this.


 K. Takezawa

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] run setwd at the launch of R

2008-01-06 Thread John Kane
I have not tried this, but once you know where the
(relatively permanent) working directory is, then
putting setwd("my.directory") in your .Rprofile should
work.
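
For example, a minimal sketch of such a startup file (the path is purely
hypothetical); R sources ~/.Rprofile when it starts, so the working directory is
set automatically:

# contents of ~/.Rprofile
setwd("~/Documents/deep/down/my/project")   # hypothetical project directory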
--- bunny , lautloscrew.com [EMAIL PROTECTED]
wrote:

 Dear all,
 
 my R files (and the .csv files as well) are saved
 somewhere pretty  
 deep down my hard disk.
 i have to chage to working directory therefore
 everytime i run R (i  
 run it on powerPC mac), which is disgusting.
 using the setwd command at the beginning of an R
 script doesnt really  
 help because i have to find this file first by hand.
 
 I am looking for possibility to run setwd during the
 launch process  
 of R are straight after it ... any suggestions ?
 
 i would be very glad about good ideas or help !
 
 thanks in advance
 
 matthias
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GLM results different from GAM results without smoothing terms

2008-01-06 Thread Simon Wood
On Thursday 03 January 2008 13:54, Prof Brian Ripley wrote:
 fit1 <- glm(factor(x1)~factor(Round)+x2, family=binomial(link=probit))
 fit2 <- gam(factor(x1)~factor(Round)+x2, family=binomial(link=probit))
  all.equal(fitted(fit1), fitted(fit2))

 [1] TRUE

 so the fits to the data are the same: your error was in over-interpreting
 the parameters in the presence on non-identifiability.

-- so coming back to the original question, mgcv::gam is using an SVD approach 
to rank deficiency in this case (so the minimum norm parameter vector is 
chosen amongst all those corresponding to the best fit), while glm is using a 
pivoted QR approach to rank deficiency, and effectively constraining 
redundant parameters to zero.

 On Thu, 3 Jan 2008, Daniel Malter wrote:
  Thanks much for your response. My apologies for not putting sample code
  in the first place. Here it comes:
 
  Round=rep(1:10,each=10)
  x1=rbinom(100,1,0.3)
  x2=rep(rnorm(10,0,1),each=10)
 
  summary(glm(factor(x1)~factor(Round)+x2,family=binomial(link=probit)))
 
  library(mgcv)
  summary(gam(factor(x1)~factor(Round)+x2,family=binomial(link=probit)))
 
  Cheers,
  Daniel
 
  -
  cuncta stricte discussurus
  -
 
  -Ursprüngliche Nachricht-
  Von: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
  Gesendet: Thursday, January 03, 2008 2:13 AM
  An: Daniel Malter
  Cc: [EMAIL PROTECTED]
  Betreff: Re: [R] GLM results different from GAM results without smoothing
  terms
 
  On Wed, 2 Jan 2008, Daniel Malter wrote:
  Hi, I am fitting two models, a generalized linear model and a
  generalized additive model, to the same data. The R-Help tells that A
  generalized additive model (GAM) is a generalized linear model (GLM)
  in which the linear predictor is given by a user specified sum of
  smooth functions of the covariates plus a conventional parametric
  component of the linear predictor. I am fitting the GAM without
  smooth functions and would have expected the parameter estimates to be
 
  equal to the GLM.
 
  I am fitting the following model:
 
  reg.glm=glm(YES~factor(RoundStart)+DEP+SPD+S.S+factor(LOST),
              family=binomial(link=probit))
  reg.gam=gam(YES~factor(RoundStart)+DEP+SPD+S.S+factor(LOST),
              family=binomial(link=probit))
 
  DEP, SPD, S.S, and LOST are invariant across the observations within
  the same RoundStart. Therefore, I would expect to get NAs for these
  parameter estimates.
 
  So your design matrix is rank-deficient and there is an identifiability
  problem.
 
  I get NAs in GLM, but I get estimates in GAM. Can anyone explain why
  that is?
 
  Because there is more than one way to handle rank deficiency.  There are
  two different 'gam' functions in contributed packages for R (and none in
  R itself), so we need more details: see the footer of this message. In
  glm() the NA estimates are treated as zero for computing predictions.
 
  Thanks much,
  Daniel
 
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to get residuals in factanal

2008-01-06 Thread Yijun Zhao
The factanal was called with 'varimax' rotation. The factor scores are 
uncorrelated. But the residuals I got by using

X - sum(loadings * factor scores)

are not uncorrelated with the factor scores. 

I thought the residuals should be independent to the factor scores as ?factanal 
says:

==
The factor analysis model is 

x = Lambda f + e

for a p-element row-vector x, a p x k matrix of loadings, a k-element vector 
of scores
and a p-element vector of errors. None of the components other than x is 
observed, but
the major restriction is that the scores be uncorrelated and of unit variance, 
and that
the errors be independent with variances Phi, the uniquenesses. 
===

Thank you.

Yijun

--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

 On Sun, 6 Jan 2008, Yijun Zhao wrote:
 
  In R factanal output, I can't find a function to give me residuals e.
 
  I mannually got it by using x -lamda1*f1 -lamda2*f2  - ... -lamdan*fn, but 
  the e
  I got are not uncorrelated with all the f's.
 
  What did I do wrong? Please help.
 
 What did you use for 'f'?  The factors ('scores') are latent quantities in 
 factor analysis, and there is more than one way to predict them.  Most 
 likely your assumption of uncorrelatedness is not correct for the 
 residuals and scores as you computed them.
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 



  


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can a dynamic graphic produced by rgl be saved?

2008-01-06 Thread Duncan Murdoch
On 06/01/2008 10:46 AM, Michael Kubovy wrote:
 Dear r-helpers,
 
 Can one save a dynamic graphic produced by rgl, e.g.:
 open3d();   x - sort(rnorm(1000));   y - rnorm(1000);   z -  
 rnorm(1000) + atan2(x,y);   plot3d(x, y, z, col=rainbow(1000), size=2)
 as a dynamic figure that can be embedded in a pdf?

rgl doesn't produce any format that remains dynamic.  You can produce 
bitmap or (with some limitations) vector format snapshots, and you can 
put multiple bitmaps together into a movie (see movie3d(), for example). 
  I don't know how to embed a movie into a pdf, but I assume it's possible.
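
A hedged sketch along those lines, using rgl's spin3d()/movie3d() helpers
(movie3d() relies on an external tool such as ImageMagick to assemble the frames,
which is assumed to be installed):

library(rgl)
open3d()
x <- sort(rnorm(1000)); y <- rnorm(1000); z <- rnorm(1000) + atan2(x, y)
plot3d(x, y, z, col = rainbow(1000), size = 2)

# record a 15-second rotation around the z axis and assemble it into a movie
movie3d(spin3d(axis = c(0, 0, 1), rpm = 4), duration = 15, movie = "spin")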

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GLMMs fitted with lmer (R) & glimmix (SAS)

2008-01-06 Thread Douglas Bates
On Jan 4, 2008 6:21 PM, Andrea Previtali [EMAIL PROTECTED] wrote:

 Sorry, I realized that somehow the message got truncated. Here is the
 remaining part of the SAS output:

 Solutions for Fixed Effects:

 Effect     DIST  DW  ELI  SEX  SEAS   Estimate   Std. Error     DF   t Value   Pr > |t|

 Intercept                              -4.6540      0.6878      17     -6.77     <.0001
 DIST*DW    0     0                      1.4641      0.4115    3077      3.56     0.0004
 DIST*DW    0     1                      1.1333      0.4028    3077      2.81     0.0049
 DIST*DW    1     0                      1.3456      0.3745    3077      3.59     0.0003
 DIST*DW    1     1                      0            .            .       .          .
 SEX*ELI              0    0             1.2633      0.4155    3077      3.04     0.0024
 SEX*ELI              0    1             0.6569      0.4140    3077      1.59     0.1126
 SEX*ELI              1    0             1.0728      0.4364    3077      2.46     0.0140
 SEX*ELI              1    1             0            .            .       .          .
 WT                                      0.00758      0.01912  3077      0.40     0.6918
 SEAS                            0       0.7839      0.1588    3077      4.94     <.0001
 SEAS                            1       0            .            .       .          .
 DEN                                    -0.01343      0.002588 3077     -5.19     <.0001

  Type III Tests of Fixed Effects

    Effect     NUM.DF   DEN.DF   F Value   Pr > F
    DIST*DW         3     3077      6.06   0.0004
    SEX*ELI         3     3077      6.30   0.0003
    WT              1     3077      0.16   0.6918
    SEAS            1     3077     24.37   <.0001
    DEN             1     3077     26.94   <.0001

At least on my mail reader the copies of the output ended up with
wrapped lines and, apparently, some changes in the spacing.  I enclose
two text files, glimmix.txt and glmer.txt, that are my reconstructions
of the originals.  Please let me know if I have not reconstructed them
correctly.  In particular, i don't think I got the first table of
Solutions for Fixed Effects: in the glimmix.txt file correct.  It
seems to mix t statistics and F statistics in ways that I don't
understand.

Another thing I don't understand is what the Pseudo-Likelihood is.
Perhaps it is what I would call the penalized weighted residual sum of
squares.  The likelihood reported by lmer and based on the binomial
distribution is very different.

If you want to compare coefficients I suggest using

options(contrasts = c("contr.SAS", "contr.poly"))

and assure that SEX, DIST, DW and ELI are factors, then call lmer.
This will ensure that the SEX, DIST, DW and ELI terms and their
interactions are represented by contrasts in which the last level is
the reference level (the SAS convention) as opposed to the first level
(the R convention).

Also, you may be confusing the S language formula terms with the SAS
formula terms.  In R the asterisk denotes crossing of terms and the :
is used for an interaction.  Thus SEX*ELI is equivalent to SEX + ELI +
SEX:ELI in R.  In SAS, it is the interaction that is written as
SEX*ELI.  I suggest that you change your SAS formula to include main
effects for SEX, ELI, DIST and DW.
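
As a hedged illustration only (hypothetical data frame dat, lmer interface as used
in the thread; not run here), these two calls fit the same model, because '*'
expands to the main effects plus the interaction:

lmer(SURV ~ SEX * ELI + (1 | SITE), data = dat, family = binomial)
lmer(SURV ~ SEX + ELI + SEX:ELI + (1 | SITE), data = dat, family = binomial)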
Generalized linear mixed model fit using PQL

Formula: SURV ~ SEX * ELI + DW * DIST + SEAS + DEN + WT + (1 | SITE)
Family: binomial(logit link)

 AIC  BIC logLik deviance
1539 1606 -758.7 1517

Random effects:
Groups NameVariance Std.Dev.
SITE   (Intercept)  0.27816  0.52741
number of obs: 3104, groups: SITE, 19


Estimated scale (compare to  1 )  0.9458749

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.144259   0.458672 -2.495 0.012606
SEX -0.606026   0.167289 -3.623 0.000292 ***
ELI -0.190757   0.219599 -0.869 0.385034
DW  -0.328796   0.175882 -1.869 0.061565 .
DIST-0.117745   0.374148 -0.315 0.752989
SEAS-0.784971   0.158748 -4.945 7.62e-07 ***
DEN -0.013381   0.002585 -5.176 2.27e-07 ***
WT   0.007735   0.019115  0.405 0.685732
SEX:ELI -0.466425   0.461596 -1.010 0.312274
DW:DIST -1.015454   0.404683 -2.509 0.012099 *
Model Information

 Variance Matrix Blocked BySite
 Estimation Technique:  Residual PL
 Degrees 

[R] aggregate.ts help

2008-01-06 Thread tom soyer
Hi,

I have a ts object with a frequency of 4, i.e., quarterly data, and I would
like to calculate the mean for each quarter. So for example:

> ts.data=ts(1:20,start=c(1984,2),frequency=4)
> ts.data
     Qtr1 Qtr2 Qtr3 Qtr4
1984         1    2    3
1985    4    5    6    7
1986    8    9   10   11
1987   12   13   14   15
1988   16   17   18   19
1989   20

If I do this manually, the mean for the 1st quarter would be
mean(c(4,8,12,16,20)), which is 12. But I am wondering if there is a R
function that could do this faster. I tried aggregate.ts but it didn't work:

> aggregate(ts.data,nfrequency=4,mean)
     Qtr1 Qtr2 Qtr3 Qtr4
1984         1    2    3
1985    4    5    6    7
1986    8    9   10   11
1987   12   13   14   15
1988   16   17   18   19
1989   20

Does anyone know what am I doing wrong?

-- 
Tom

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate.ts help

2008-01-06 Thread Gabor Grothendieck
On Jan 6, 2008 5:17 PM, tom soyer [EMAIL PROTECTED] wrote:
 Hi,

 I have a ts object with a frequency of 4, i.e., quarterly data, and I would
 like to calculate the mean for each quarter. So for example:

   ts.data=ts(1:20,start=c(1984,2),frequency=4)
   ts.data
      Qtr1 Qtr2 Qtr3 Qtr4
 1984         1    2    3
 1985    4    5    6    7
 1986    8    9   10   11
 1987   12   13   14   15
 1988   16   17   18   19
 1989   20

 If I do this manually, the mean for the 1st quarter would be
 mean(c(4,8,12,16,20)), which is 12. But I am wondering if there is a R
 function that could do this faster. I tried aggregate.ts but it didn't work:

   aggregate(ts.data,nfrequency=4,mean)
      Qtr1 Qtr2 Qtr3 Qtr4
 1984         1    2    3
 1985    4    5    6    7
 1986    8    9   10   11
 1987   12   13   14   15
 1988   16   17   18   19
 1989   20

 Does anyone know what am I doing wrong?

aggregate.ts aggregates to produce series of coarser granularity
which is not what you want.  You want the ordinary aggregate:

aggregate(c(ts.data), list(qtr = cycle(ts.data)), mean)

# or tapply:

tapply(ts.data, cycle(ts.data), mean)

See ?aggregate

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to get residuals in factanal

2008-01-06 Thread Yijun Zhao
P.S. I tried both the 'regression' and the 'Bartlett' ways to get the scores. In both 
cases, the scores are uncorrelated, but the errors are NOT uncorrelated with the 
scores, and are also NOT uncorrelated among themselves. 

What am I missing? factanal() is supposed to give independent error vectors, so at 
least they should be uncorrelated among themselves. 

Thank you in advance for the help.

Yijun

--- Yijun Zhao [EMAIL PROTECTED] wrote:

 The factanal was called with 'varimax' rotation. The factors scores are 
 uncorrelated.
 But
 the residuals I got by using
 
 X - sum(loadings*factors-scores) is not uncorrelated to the factor scores. 
 
 I thought the residuals should be independent to the factor scores as 
 ?factanal says:
 
 ==
 The factor analysis model is 
 
 x = Lambda f + e
 
 for a p¨Celement row-vector x, a p x k matrix of loadings, a k¨Celement 
 vector of
 scores
 and a p¨Celement vector of errors. None of the components other than x is 
 observed, but
 the major restriction is that the scores be uncorrelated and of unit 
 variance, and that
 the errors be independent with variances Phi, the uniquenesses. 
 ===
 
 Thank you.
 
 Yijun
 
 --- Prof Brian Ripley [EMAIL PROTECTED] wrote:
 
  On Sun, 6 Jan 2008, Yijun Zhao wrote:
  
   In R factanal output, I can't find a function to give me residuals e.
  
   I mannually got it by using x -lamda1*f1 -lamda2*f2  - ... -lamdan*fn, 
   but the e
   I got are not uncorrelated with all the f's.
  
   What did I do wrong? Please help.
  
  What did you use for 'f'?  The factors ('scores') are latent quantities in 
  factor analysis, and there is more than one way to predict them.  Most 
  likely your assumption of uncorrelatedness is not correct for the 
  residuals and scores as you computed them.
  
  -- 
  Brian D. Ripley,  [EMAIL PROTECTED]
  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of Oxford, Tel:  +44 1865 272861 (self)
  1 South Parks Road, +44 1865 272866 (PA)
  Oxford OX1 3TG, UKFax:  +44 1865 272595
  
 
 
 
  
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate.ts help

2008-01-06 Thread tom soyer
Thanks Gabor!!

On 1/6/08, Gabor Grothendieck [EMAIL PROTECTED] wrote:

 On Jan 6, 2008 5:17 PM, tom soyer [EMAIL PROTECTED] wrote:
  Hi,
 
  I have a ts object with a frequency of 4, i.e., quarterly data, and I
 would
  like to calculate the mean for each quarter. So for example:
 
   ts.data=ts(1:20,start=c(1984,2),frequency=4)
   ts.data
  Qtr1 Qtr2 Qtr3 Qtr4
  1984 123
  19854567
  198689   10   11
  1987   12   13   14   15
  1988   16   17   18   19
  1989   20
 
  If I do this manually, the mean for the 1st quarter would be
  mean(c(4,8,12,16,20)), which is 12. But I am wondering if there is a R
  function that could do this faster. I tried aggregate.ts but it didn't
 work:
 
   aggregate(ts.data,nfrequency=4,mean)
  Qtr1 Qtr2 Qtr3 Qtr4
  1984 123
  19854567
  198689   10   11
  1987   12   13   14   15
  1988   16   17   18   19
  1989   20
 
  Does anyone know what am I doing wrong?

 aggregate.ts aggregates to produce series of coarser granularity
 which is not what you want.  You want the ordinary aggregate:

 aggregate(c(ts.data), list(qtr = cycle(ts.data)), mean)

 # or tapply:

 tapply(ts.data, cycle(ts.data), mean)

 See ?aggregate




-- 
Tom

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] need help

2008-01-06 Thread Zakaria, Roslinazairimah - zakry001
Hi,

I'm Roslina, a PhD student at the University of South Australia in the
School of Maths and Stats. I used S-Plus before and have now started
using R. I used it to analyse rainfall data with julian dates. Is there a
similar function that you can suggest for me to use in R? Thank you so
much for your attention and help.

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] testing fixed effects in lmer

2008-01-06 Thread Achaz von Hardenberg
Dear all,
I am performing a binomial glmm analysis using the lmer function in  
the lme4 package (last release, just downloaded). I am using the  
Laplace method.

However, I am not sure about what I should do to test for the  
significance of fixed effects in the binomial case: is it correct to  
test a full model against a model from which I remove the fixed  
effect I want to test, using the anova(mod1.lmer, mod2.lmer) method,  
and then rely on the model with the lower AIC (or on the  
log-likelihood ratio test)?
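
A hedged sketch of that comparison (object and variable names are hypothetical,
using the lme4 interface current at the time):

library(lme4)
mod1.lmer <- lmer(y ~ treat + (1 | site), data = dat, family = binomial)  # full model
mod2.lmer <- lmer(y ~ 1     + (1 | site), data = dat, family = binomial)  # without treat
anova(mod2.lmer, mod1.lmer)   # likelihood-ratio comparison; also reports AIC/BIC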

Thanks in advance for your help!

best regards,
Achaz von Hardenberg
 

Centro Studi Fauna Alpina - Alpine Wildlife Research Centre
Servizio Sanitario e della Ricerca Scientifica
Parco Nazionale Gran Paradiso, Degioz, 11, 11010-Valsavarenche (Ao),  
Italy

E-mail: [EMAIL PROTECTED]
 [EMAIL PROTECTED]
Skype: achazhardenberg
  Tel.: +39.0165.905783
  Fax: +39.0165.905506
Mobile: +39.328.8736291
 






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can R solve this optimization problem?

2008-01-06 Thread Paul Smith
Dear All,

I am trying to solve the following maximization problem with R:

find x(t) (continuous) that maximizes the

integral of x(t) with t from 0 to 1,

subject to the constraints

dx/dt = u,

|u| <= 1,

x(0) = x(1) = 0.

The analytical solution can be obtained easily, but I am trying to
understand whether R is able to solve numerically problems like this
one. I have tried to find an approximate solution through
discretization of the objective function but with no success so far.

Thanks in advance,

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] need help

2008-01-06 Thread Charles C. Berry

Rosalina,

You should start by reading the Posting Guide - it has helpful advice on 
how to solve a problem yourself and how to craft postings to get good 
answers.

The Posting Guide says:

[some basics deleted]

Do your homework before posting: If it is clear that you have done basic 
background research, you are far more likely to get an informative 
response. See also Further Resources further down this page.

 * Do help.search(keyword) and apropos(keyword) with different
   keywords (type this at the R prompt).

[other helpful suggestions deleted]

-

and doing exactly that on my system yields:



Help files with alias or concept or title matching 'julian' using fuzzy 
matching:



weekdays(base)  Extract Parts of a POSIXt or Date 
Object
day.of.week(chron)  Convert between Julian and Calendar 
Dates
TimeDateCoercion(fCalendar) timeDate Class, Coercion and 
Transformation
date.ddmmmyy(survival)  Format a Julian date
date.mdy(survival)  Convert from Julian Dates to Month, 
Day, and Year
date.mmddyy(survival)   Format a Julian date
date.mmdd(survival) Format a Julian date
mdy.date(survival)  Convert to Julian Dates



Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
-


which should be enough to get you going.

Also, you will want to consult Rnews which had an informative article on 
handling dates a few years back.
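
For instance, with Date objects and julian() in base R (a tiny made-up
example; the dates and the origin are arbitrary):

dates <- as.Date(c("1971-01-01", "1971-02-15", "1972-01-01"))
julian(dates, origin = as.Date("1971-01-01"))
## [1]   0  45 365   (days since the origin; an "origin" attribute is also attached)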

HTH,

Chuck


On Mon, 7 Jan 2008, Zakaria, Roslinazairimah - zakry001 wrote:

 Hi,

 I'm Roslina, a PhD student at the University of South Australia, in the
 School of Maths and Stats. I used S-Plus before and have now started
 using R. I used it to analyse rainfall data with Julian dates. Is there
 any similar function that you can suggest for me to use in R? Thank you
 so much for your attention and help.




   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Avoiding FOR loops

2008-01-06 Thread dxc13

useR's,

I would like to know if there is a way to avoid using FOR loops to perform
the below calculation. 

Consider the following data:

 x
 [,1] [,2] [,3]
[1,]4   111
[2,]192
[3,]733
[4,]364
[5,]685

 xk
 Var1  Var2 Var3
1   -0.25  1.75  0.5
20.75  1.75  0.5
31.75  1.75  0.5
42.75  1.75  0.5
53.75  1.75  0.5
64.75  1.75  0.5
75.75  1.75  0.5
86.75  1.75  0.5
97.75  1.75  0.5
10  -0.25  2.75  0.5

Here, X is a matrix of 3 variables in which each is of size 5 and XK are
some values that correspond to each variable.  For each variable, I want to
do:

|Xi - xkj|   where i = 1 to 3 and j = 1 to 10

It looks as if a double FOR loop would work, but can the apply function
work?  Or some other function that is shorter than a FOR loop?  Thank you, I
hope this makes sense.

Derek


-- 
View this message in context: 
http://www.nabble.com/Avoiding-FOR-loops-tp14656517p14656517.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can R solve this optimization problem?

2008-01-06 Thread Duncan Murdoch
On 06/01/2008 7:55 PM, Paul Smith wrote:
 On Jan 7, 2008 12:18 AM, Duncan Murdoch [EMAIL PROTECTED] wrote:
 I am trying to solve the following maximization problem with R:

 find x(t) (continuous) that maximizes the

 integral of x(t) with t from 0 to 1,

 subject to the constraints

 dx/dt = u,

 |u| <= 1,

 x(0) = x(1) = 0.

 The analytical solution can be obtained easily, but I am trying to
 understand whether R is able to solve numerically problems like this
 one. I have tried to find an approximate solution through
 discretization of the objective function but with no success so far.
 R doesn't provide any way to do this directly.  If you really wanted to
 do it in R, you'd need to choose some finite dimensional parametrization
 of u (e.g. as a polynomial or spline, but the constraint on it would
 make the choice tricky:  maybe a linear spline?), then either evaluate
 the integral analytically or numerically to give your objective
 function.  Then there are some optimizers available, but in my
 experience they aren't very good on high dimensional problems:  so your
 solution would likely be quite crude.

 I'd guess you'd be better off in Matlab, Octave, Maple or Mathematica
 with a problem like this.
 
 Thanks, Duncan. I have placed a similar post in the Maxima list and
 another one in the Octave list. (I have never used splines; so I did
 not quite understand the method that you suggested to me.)

Linear splines are just piecewise linear functions.  An easy way to 
parametrize them is by their value at a sequence of locations; they 
interpolate linearly between there.

x would be piecewise quadratic, so its integral would be a sum of cubic 
terms.
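
To illustrate the general idea numerically (a crude sketch only: u is
taken piecewise constant rather than a linear spline, and the x(1) = 0
condition is handled with a penalty; all choices below are arbitrary):

n  <- 40
fn <- function(u) {
  x   <- c(0, cumsum(u)) / n                      # x at the grid points, x(0) = 0
  int <- sum(head(x, -1) + tail(x, -1)) / (2 * n) # trapezoid rule for integral of x
  -(int - 10 * x[n + 1]^2)                        # maximise, penalising x(1) != 0
}
res <- optim(rep(0, n), fn, method = "L-BFGS-B", lower = -1, upper = 1)
xs  <- c(0, cumsum(res$par)) / n
plot(seq(0, 1, length.out = n + 1), xs, type = "l")  # roughly the 'tent' solution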

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Avoiding FOR loops

2008-01-06 Thread Charilaos Skiadas
On Jan 6, 2008, at 7:55 PM, dxc13 wrote:


 useR's,

 I would like to know if there is a way to avoid using FOR loops to  
 perform
 the below calculation.

 Consider the following data:

snip
 Here, X is a matrix of 3 variables in which each is of size 5 and  
 XK are
 some values that correspond to each variable.  For each variable, I  
 want to
 do:

 |Xi - xkj|   where i = 1 to 3 and j = 1 to 10

That should be i=1 to 5 I take it?

If I understand what you want to do, then the outer function is the key:

lapply(1:3, function(i) { outer(x[,i], xk[,i], "-") } )

This should land you with a list of three 5x10 tables
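
For the absolute differences asked about, the same call can simply be
wrapped in abs().  A tiny self-contained version with made-up stand-ins
for x and xk:

x  <- matrix(1:15, nrow = 5)            # stands in for the posted x (5 x 3)
xk <- matrix(runif(30), nrow = 10)      # stands in for the posted xk (10 x 3)
res <- lapply(1:3, function(i) abs(outer(x[, i], xk[, i], "-")))
sapply(res, dim)                        # each element is a 5 x 10 matrix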

 It looks as if a double FOR loop would work, but can the apply  
 function
 work?  Or some other function that is shorter than a FOR loop?   
 Thank you, I
 hope this makes sense.

 Derek


Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can R solve this optimization problem?

2008-01-06 Thread Paul Smith
On Jan 7, 2008 1:04 AM, Duncan Murdoch [EMAIL PROTECTED] wrote:
  I am trying to solve the following maximization problem with R:
 
  find x(t) (continuous) that maximizes the
 
  integral of x(t) with t from 0 to 1,
 
  subject to the constraints
 
  dx/dt = u,
 
  |u| <= 1,
 
  x(0) = x(1) = 0.
 
  The analytical solution can be obtained easily, but I am trying to
  understand whether R is able to solve numerically problems like this
  one. I have tried to find an approximate solution through
  discretization of the objective function but with no success so far.
  R doesn't provide any way to do this directly.  If you really wanted to
  do it in R, you'd need to choose some finite dimensional parametrization
  of u (e.g. as a polynomial or spline, but the constraint on it would
  make the choice tricky:  maybe a linear spline?), then either evaluate
  the integral analytically or numerically to give your objective
  function.  Then there are some optimizers available, but in my
  experience they aren't very good on high dimensional problems:  so your
  solution would likely be quite crude.
 
  I'd guess you'd be better off in Matlab, Octave, Maple or Mathematica
  with a problem like this.
 
  Thanks, Duncan. I have placed a similar post in the Maxima list and
  another one in the Octave list. (I have never used splines; so I did
  not quite understand the method that you suggested to me.)

 Linear splines are just piecewise linear functions.  An easy way to
 parametrize them is by their value at a sequence of locations; they
 interpolate linearly between there.

 x would be piecewise quadratic, so its integral would be a sum of cubic
 terms.

Thanks, Duncan, for your explanation.

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can R solve this optimization problem?

2008-01-06 Thread Gabor Grothendieck
This can be discretized to a linear programming problem
so you can solve it with the lpSolve package.  Suppose
we have x0, x1, x2, ..., xn.  Our objective (up to a
multiple which does not matter) is:

Maximize: x1 + ... + xn

which is subject to the constraints:

-1/n <= x1 - x0 <= 1/n
-1/n <= x2 - x1 <= 1/n
...
-1/n <= xn - x[n-1] <= 1/n
and
x0 = xn = 0
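
A minimal sketch of this discretization with the lpSolve package (the
grid size and object names are just my choices; lpSolve's default
non-negativity bounds are harmless here because the optimal path is
non-negative):

library(lpSolve)
n   <- 20                                   # number of grid steps
nv  <- n + 1                                # variables x0, x1, ..., xn
obj <- c(0, rep(1, n))                      # maximise x1 + ... + xn

## |x_k - x_(k-1)| <= 1/n written as two inequalities per step
D <- matrix(0, nrow = 2 * n, ncol = nv)
for (k in 1:n) {
  D[2 * k - 1, c(k, k + 1)] <- c(-1,  1)    #   x_k - x_(k-1) <= 1/n
  D[2 * k,     c(k, k + 1)] <- c( 1, -1)    # -(x_k - x_(k-1)) <= 1/n
}
B <- matrix(0, nrow = 2, ncol = nv)         # boundary conditions x0 = xn = 0
B[1, 1] <- 1
B[2, nv] <- 1

sol <- lp("max", obj, rbind(D, B),
          c(rep("<=", 2 * n), "=", "="),
          c(rep(1 / n, 2 * n), 0, 0))
plot(seq(0, 1, length.out = nv), sol$solution, type = "l")  # the 'tent' solution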

On Jan 6, 2008 7:05 PM, Paul Smith [EMAIL PROTECTED] wrote:
 Dear All,

 I am trying to solve the following maximization problem with R:

 find x(t) (continuous) that maximizes the

 integral of x(t) with t from 0 to 1,

 subject to the constraints

 dx/dt = u,

 |u| <= 1,

 x(0) = x(1) = 0.

 The analytical solution can be obtained easily, but I am trying to
 understand whether R is able to solve numerically problems like this
 one. I have tried to find an approximate solution through
 discretization of the objective function but with no success so far.

 Thanks in advance,

 Paul

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame manipulation - newbie question

2008-01-06 Thread jim holtman
There are a number of different ways that you would have to manipulate
your data to do what you want.  It is useful to learn some of these
techniques.  Here, I think, is the set of actions that you want to
carry out.

 x <- read.table(textConnection("row  k.idx  step.forwd  pt.num  model  prev  value  abs.error
+ 1    200   0   1   lm    9     10.5   1.5
+ 2    200   0   2   lm   11     10.5   1.5
+ 3    201   1   1   lm   10     12     2.0
+ 4    201   1   2   lm   12     12     2.0
+ 5    202   2   1   lm   12     12.1   0.1
+ 6    202   2   2   lm   12     12.1   0.1
+ 7    200   0   1   rlm  10.1   10.5   0.4
+ 8    200   0   2   rlm  10.3   10.5   0.2
+ 9    201   1   1   rlm  11.6   12     0.4
+ 10   201   1   2   rlm  11.4   12     0.6
+ 11   202   2   1   rlm  11.8   12.1   0.1
+ 12   202   2   2   rlm  11.9   12.1   0.2"), header=TRUE)
 closeAllConnections()

 # split the data by the grouping factors
 x.split <- split(x, list(x$k.idx, x$step.forwd, x$model), drop=TRUE)
 x.split
$`200.0.lm`
  row k.idx step.forwd pt.num model prev value abs.error
1   1   200  0  1lm9  10.5   1.5
2   2   200  0  2lm   11  10.5   1.5

$`201.1.lm`
  row k.idx step.forwd pt.num model prev value abs.error
3   3   201  1  1lm   1012 2
4   4   201  1  2lm   1212 2

$`202.2.lm`
  row k.idx step.forwd pt.num model prev value abs.error
5   5   202  2  1lm   12  12.1   0.1
6   6   202  2  2lm   12  12.1   0.1

$`200.0.rlm`
  row k.idx step.forwd pt.num model prev value abs.error
7   7   200  0  1   rlm 10.1  10.5   0.4
8   8   200  0  2   rlm 10.3  10.5   0.2

$`201.1.rlm`
   row k.idx step.forwd pt.num model prev value abs.error
99   201  1  1   rlm 11.612   0.4
10  10   201  1  2   rlm 11.412   0.6

$`202.2.rlm`
   row k.idx step.forwd pt.num model prev value abs.error
11  11   202  2  1   rlm 11.8  12.1   0.1
12  12   202  2  2   rlm 11.9  12.1   0.2


 # now take the means of given columns
 x.mean <- lapply(x.split, function(.grp) colMeans(.grp[, c('prev', 'value', 'abs.error')]))

 # put back into a matrix
 (x.mean <- do.call(rbind, x.mean))
   prev value abs.error
200.0.lm  10.00  10.5  1.50
201.1.lm  11.00  12.0  2.00
202.2.lm  12.00  12.1  0.10
200.0.rlm 10.20  10.5  0.30
201.1.rlm 11.50  12.0  0.50
202.2.rlm 11.85  12.1  0.15

 #boxplot
 boxplot(abs.error ~ k.idx, data=x)

 # create a table with average of the abs.error for each 'model'
 cbind(x, abs.error.mean=ave(x$abs.error, x$model))
   row k.idx step.forwd pt.num model prev value abs.error abs.error.mean
11   200  0  1lm  9.0  10.5   1.5  1.200
22   200  0  2lm 11.0  10.5   1.5  1.200
33   201  1  1lm 10.0  12.0   2.0  1.200
44   201  1  2lm 12.0  12.0   2.0  1.200
55   202  2  1lm 12.0  12.1   0.1  1.200
66   202  2  2lm 12.0  12.1   0.1  1.200
77   200  0  1   rlm 10.1  10.5   0.4  0.317
88   200  0  2   rlm 10.3  10.5   0.2  0.317
99   201  1  1   rlm 11.6  12.0   0.4  0.317
10  10   201  1  2   rlm 11.4  12.0   0.6  0.317
11  11   202  2  1   rlm 11.8  12.1   0.1  0.317
12  12   202  2  2   rlm 11.9  12.1   0.2  0.317



On Jan 6, 2008 10:50 AM, Rense Nieuwenhuis [EMAIL PROTECTED] wrote:
 Hi,

 you may want to use the apply / tapply functions. Some find them a bit
 hard to grasp at first, but they will help you many times in many
 situations once you get the hang of them.

 Maybe you can get some information on my site:
 http://www.rensenieuwenhuis.nl/r-project/manual/basics/tables/


 Hope this helps,

 Rense Nieuwenhuis



 On Jan 3, 2008, at 11:53 , José Augusto M. de Andrade Junior wrote:

  Hi all,
 
  Could someone please explain how I can efficiently query a data frame
  with several factors, as shown below:
 
  --
  ---
  Data frame: pt.knn
  --
  ---
  row | k.idx  

Re: [R] Can R solve this optimization problem?

2008-01-06 Thread Paul Smith
On Jan 7, 2008 1:32 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 This can be discretized to a linear programming problem
 so you can solve it with the lpSolve package.  Suppose
 we have x0, x1, x2, ..., xn.  Our objective (up to a
 multiple which does not matter) is:

 Maximize: x1 + ... + xn

 which is subject to the constraints:

 -1/n <= x1 - x0 <= 1/n
 -1/n <= x2 - x1 <= 1/n
 ...
 -1/n <= xn - x[n-1] <= 1/n
 and
 x0 = xn = 0


 On Jan 6, 2008 7:05 PM, Paul Smith [EMAIL PROTECTED] wrote:
  Dear All,
 
  I am trying to solve the following maximization problem with R:
 
  find x(t) (continuous) that maximizes the
 
  integral of x(t) with t from 0 to 1,
 
  subject to the constraints
 
  dx/dt = u,
 
  |u| <= 1,
 
  x(0) = x(1) = 0.
 
  The analytical solution can be obtained easily, but I am trying to
  understand whether R is able to solve numerically problems like this
  one. I have tried to find an approximate solution through
  discretization of the objective function but with no success so far.

That is clever, Gabor! But suppose that the objective function is

integral of sin( x( t ) ) with t from 0 to 1

and consider the same constraints. Can your method be adapted to get
the solution?

Paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Installing R on ubuntu dapper

2008-01-06 Thread hadley wickham
I followed the instructions at
http://cran.r-project.org/bin/linux/ubuntu/README.html, but I'm
getting the following error:

~: sudo apt-get install r-base
Reading package lists... Done
Building dependency tree... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.

Since you only requested a single operation it is extremely likely that
the package is simply not installable and a bug report against
that package should be filed.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
  r-base: Depends: r-base-core (= 2.6.1-1dapper0) but it is not installable
  Depends: r-recommended (= 2.6.1-1dapper0) but it is not
going to be installed
E: Broken packages


Any help would be much appreciated.

Thanks,

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] need help

2008-01-06 Thread Zakaria, Roslinazairimah - zakry001
Hi,

I'm Roslina, a PhD student at the University of South Australia, in the
School of Maths and Stats. I used S-Plus before and have now started
using R. I used it to analyse rainfall data with Julian dates. Is there
any similar function that you can suggest for me to use in R? Thank you
so much for your attention and help.

Here is some of my code:

# dt1 - data
# mt1 - begin month, mt2 - end month, nn = year, n = no of years
# da  - days in the following month
# yr1 - year begin

# returns the rows of dt1 between the start and end Julian day numbers
# (julian() here is the S-PLUS style julian(month, day, year, origin))
define.date1 <- function(dt1, mt1, mt2, nn, da)
{  mt2 <- mt2 + 1
   start <- julian(mt1, 1, nn, origin = c(month = 1, day = 1, year = 1971)) + 1
   end   <- julian(mt2, 1, nn, origin = c(month = 1, day = 1, year = 1971)) + da
   a <- dt1[start:end, ]
   #am <- as.matrix(a[,5])
}

# stacks the selected rows for n consecutive years starting at yr1
seq.date1 <- function(dt1, mt1, mt2, n, yr1, da)
{  yr1 <- yr1 - 1
   for (i in 1:n)
   {  kp1 <- define.date1(dt1, mt1, mt2, yr1 + i, da)
      if (i == 1) kp2 <- kp1
      else kp2 <- rbind(kp2, kp1)
   }
   kp2
}


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rainbow function

2008-01-06 Thread Wang, Zhaoming (NIH/NCI) [C]
Hello
I'm using the rainbow function to generate 10 colors for a plot, and it is
difficult to tell neighboring colors apart. How can I make the colors more
distinct?
 
Thanks
Zhaoming 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rainbow function

2008-01-06 Thread jim holtman
Specify them exactly if there are only 10.
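
For example, a hand-picked set along those lines (these particular
colours are just one possibility):

my.cols <- c("black", "red", "green3", "blue", "cyan", "magenta",
             "orange", "purple", "brown", "grey50")
plot(1:10, col = my.cols, pch = 19, cex = 2)   # ten clearly distinguishable colours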

On Jan 6, 2008 10:55 PM, Wang, Zhaoming (NIH/NCI) [C]
[EMAIL PROTECTED] wrote:
 Hello
 I'm using the rainbow function to generate 10 colors for a plot, and it is
 difficult to tell neighboring colors apart. How can I make the colors more
 distinct?

 Thanks
 Zhaoming

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] compiling with mpicc for R

2008-01-06 Thread Erin Hodgess
Dear R People:

Hao Yu has a very nice package for mpi in R.

I'm trying to experiment on my own and am looking at 
building a shared library with objects from mpicc.

I tried to compile a .o object and then use R CMD SHLIB to
compile the shared library.

But I'm getting errors with the MPI_Init function, which is 
the first MPI function in the subroutine.

Any suggestions please?  (or maybe I should just leave well enough 
alone)

thanks,
Erin Hodgess
mailto: [EMAIL PROTECTED]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] numerical data frame

2008-01-06 Thread mohamed nur anisah
Dear All,
   
  I've successfully imported my synteny data into R using the scan command. Below 
are my results. My major problem is how to combine the column names with the 
data (splt); I have tried cbind but get a warning message. I have realized that 
the splt data has only 5 columns instead of 6. Please help me with this!
   
  I want my data to be numerical, with proper columns and column names, to 
replace CS with 1 and CSO with 0, and to remove all the punctuation and 
characters from the data.
   
  Attached herewith is my original data. Your kind help is highly appreciated, 
and thanks in advance.
   
  Cheers,
  Anisah
   
   
   
  1)for col names
   
  
nms<-scan("C:/Users/user/Documents/cfa-1.txt",sep="\t",nlines=1,skip=10,what=character(0))
Read 6 items
 nms
[1] CS(O) id (number of marker/anchor)
[2]  Location(s) on reference 
[3] CS(O) size
[4] CS(O) density on reference chromosome 
[5] Location(s) on tested 
[6] Breakpoints CS(O) locations (denstiy of marker/anchor)
  
2) my data
   
  
x<-scan("C:/Users/user/Documents/cfa-1.txt",sep="\n",skip=12,what=character(0))
Read 21 items
 splt<-strsplit(x,"\t")
 splt
[[1]]
[1] CS 1 (73): cfa1: [ 3251712 - 24126920 ]
[3]   20875208 3   
[5]  hsa18: [ 132170848 - 50139168 ]  ] 24126920, 24153560 [(8 )   
  [[2]]
[1] CS 2 (3): cfa1: [ 24153560 - 24265894 ]  
[3]   112334  27 
[5]  hsa18: [ 50105060 - 49934572 ]  ] 24265894, 24823786 [(7 )  
  [[3]]
[1] CSO 3.1 (6):  
 
[2]  cfa1: [ 24823786 - 27113036 ]
 
[3]   2289250 
 
[4]  3
 
[5]  hsa18: [ 48121156 - 46579500 ]- Decreasing order - ] 27113036, 27418228 [ 
(13)
  [[4]]
[1] CSO 3.2 (4):  
 
[2]  cfa1: [ 27418228 - 27578150 ]
 
[3]   159922  
 
[4]  25   
 
[5]  hsa18: [ 13872043 - 13208795 ]- Decreasing order - ] 27578150, 28055666 
[(9 ) 
  [[5]]
[1] CS 4 (4):  cfa1: [ 28055666 - 28835230 ]   
[3]   779564   5   
[5]  hsa6: [ 132311008 - 133132200 ]  ] 28835230, 29482792 [(7 )   
  [[6]]
[1] CS 5 (46): cfa1: [ 29482792 - 40120672 ]   
[3]   10637880 4   
[5]  hsa6: [ 133604208 - 146227152 ]  ] 40120672, 40539680 [(8 )   
  [[7]]
[1] CS 6 (9):  cfa1: [ 40539680 - 43339444 ]   
[3]   2799764  3   
[5]  hsa6: [ 146390608 - 149867328 ]  ] 43339444, 43390788 [(13 )  
  [[8]]
[1] CSO 7.1 (74): 
 
[2]  cfa1: [ 43390788 - 59714992 ]
 
[3]   16324204
 
[4]  5
 
[5]  hsa6: [ 149929104 - 169714432 ]- Increasing order -] 59714992, 59864308 [ 
(15)
  [[9]]
[1] CSO 7.2 (52): 
[2]  cfa1: [ 59864308 - 72417520 ]
[3]   12553212
[4]  4
[5]  hsa6: [ 116707976 - 131508152 ]- Increasing order -  
[6] ] 72417520, 73256040 [(7 )
  [[10]]
[1] CSO 8.1 (12):  
[2]  cfa1: [ 73256040 - 75192808 ] 
[3]   1936768  
[4]  6 
[5]  hsa9: [ 98441680 - 96360824 ]- Decreasing order - 
[6] ] 75192808, 75272528 [ 
[7]  (6 )  
  [[11]]
[1] CSO 8.2 (56):  
[2]  cfa1: [ 75272528 - 91881664 ] 
[3]   16609136 
[4]  3 
[5]  hsa9: [ 89530256 - 70341312 ]- Decreasing order - 
[6] ] 91881664, 92281272 [ 
[7]  (5 )  
  [[12]]
[1] CSO 8.3 (22):
[2]  cfa1: [ 92281272 - 96913624 ]