[R] prediction error for test set-cross validation

2009-03-11 Thread Mehmet U Ayvaci
Hi, 

 

I have a database of 2211 rows with 31 entries each and I manually split my
data into 10 folds for cross validation. I build logistic regression model
as:  

 

model - glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam + 

Rad +BrDn + BRDS + PrinFin+ SkRtr + NpRtr + SkThck +TrThkc +
SkLes + AxAdnp + ArcDst + MaDen + CaDt + MaMG + 

MaMrp + MaSh + SCTub + SCFoc + MaSz,
family=binomial(link=logit));

 

Where the  variables are taken from the trainSet of size 1989x31. The test
set is sized 222x31. Now my question is when I try to predict on the test
set it gives me the error:

 

 predict.glm(model, testSet, type =response)

Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) : 

  subscript out of bounds

 

It does fine on trainSet. so it is something about the testSet. On the other
hand, I realized that some independent variables say MaSz takes 3
different values in the trainset vs. 4 different ones in the testSet. I am
not sure if this is the cause.If so, what would be the remedy?

 

Since I can retrieve the coefficients of the logistic regression, I could
manually calculate response for each entry in the testSet. This could solve
my problem although burdensome. But, I don't know an easy way of doing it as
my logistic regression have 80+ coefficients.

 

 

Could somebody advise?

 

 

Thanks, 

M


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting a dataframe to a matrix

2009-03-11 Thread Simon Blomberg
xtabs is your friend:

xtabs(likes ~ color + name, data=dat)

   color
nameblue green red
  jake 1 1   0
  sally1 1   0
  tom  0 0   1

See ?xtabs for more info. Note that I changed the likes? column to
just likes. It is a bad idea to have question marks in variable names.

Simon.

On Wed, 2009-03-11 at 01:42 -0400, Jennifer Brea wrote:
 If I have a dataframe which is organized like this:
 
name color likes?
 1 sally   red0
 2 sally  blue1
 3 sally green1
 4  jake   red0
 5  jake  blue1
 6  jake green1
 7   tom   red1
 8   tom  blue0
 9   tom green0
 
 
 And I want to create a matrix in the form:
 
   red blue green
 sally   01 1
 jake01 1
 tom 10 0
 
 
 Are there any built-in commands that might help me do this?  Also, I 
 can't assume that there is an observation for every person-color.  In 
 other words, in the original dataset, there might be some colors for 
 which sally offered no opinion.  In some cases, this may be represented 
 by NA, in others, it may mean that no row exists for sally for that color.
 
 Thank you!
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
-- 
Simon Blomberg, BSc (Hons), PhD, MAppStat. 
Lecturer and Consultant Statistician 
School of Biological Sciences
The University of Queensland 
St. Lucia Queensland 4072 
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for 
an answer does not ensure that a reasonable answer can 
be extracted from a given body of data. - John Tukey.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting a dataframe to a matrix

2009-03-11 Thread markleeds
Hi: Below works but it's extremely ugly and overly complicated. i'm sure 
someone else will send you something better  and I'll be waiting also.
Also, the way I named the rows and columns works for below but it won't 
hold in the general case if you don't have nice ordered names like

you do below. So, use it until we something better.

DF - read.table(textConnection(name color likes
1 sally red 0
2 sally blue 1
3 sally green 1
4 jake red 0
5 jake blue  1
6 jake green 1
7 tom red 1
8 tom blue 0
9 tom green 0),header=TRUE,stringsAsFactors=FALSE)


mat - matrix(sapply(1:nrow(DF),function(.row) {
DF[.row,3]

}),nrow=length(unique(DF$name)),byrow=TRUE,dimnames=list(c(unique(DF$name)),c(unique(DF$color

print(mat)





On Wed, Mar 11, 2009 at 12:42 AM, Jennifer Brea wrote:


If I have a dataframe which is organized like this:

  name color likes?
1 sally   red0
2 sally  blue1
3 sally green1
4  jake   red0
5  jake  blue1
6  jake green1
7   tom   red1
8   tom  blue0
9   tom green0


And I want to create a matrix in the form:

 red blue green
sally   01 1
jake01 1
tom 10 0


Are there any built-in commands that might help me do this?  Also, I 
can't assume that there is an observation for every person-color.  In 
other words, in the original dataset, there might be some colors for 
which sally offered no opinion.  In some cases, this may be 
represented by NA, in others, it may mean that no row exists for sally 
for that color.


Thank you!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lsmeans in R

2009-03-11 Thread suman Duvvuru
I need help with calculating lsmeans (adjusted means) of different terms in
a linear model including the main effect and the interaction effect terms. I
use lm to run the linear models...I previously noted from literature that
that effects package can be used to generate lsmeans. But I tried to use
it but could not figure out which option to use to get means. If anyone can
give an example of how to get lsmeans using lm object, that will very
helpful.

Thanks,
SUman

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Barplot -density and multhist x-axis problem

2009-03-11 Thread Honorata Kaja Gajda
Thanks in advance for any help

I would like to plot a barplot with probability density instead of count
frequencies, as is possible in histograms (freq=FALSE). Is this possible?
I have tried the following code:

with(center,table(light))-light.table
barplot(light.table ,col=brewer, freq=FALSE, beside=T)

But  freq=FALSE doesn’t seem to work in the command of barplot.
The reason why I want to use a barplot is that I have two variables. I
have tried to use multhist instead, but when I plot the histogram the
x-axis labels are unevenly spaced. I would like to create a bar with 7
ticks on the x-axis (1,2,3..7), however the following tics are yielded in
the x-axis (1, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5). How can I regulate this?
I have used the following code:

l-list(m1=(center$light[center$site==1]),
m2=(center$light[center$site==2]))
multhist(l,breaks=c(1:7, by=1), freq=FALSE, main=Illumination in Plot
Center, ylab=Density, col =brewer)

I have tried names.arg= but this doesn’t work in multihist either.
Is there any way around this problem? Either by using barplot or multihist?

I am very grateful for any help
Honorata

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Re : Converting a dataframe to a matrix

2009-03-11 Thread justin bem
Something like this will work

m-matrix(df1$likes, nr=3,nc=3,byrow=T)
colnames(m)-unique(df1$color)
rowlnames(m)-unique(df1$name)

 Sincerly.
Justin BEM
BP 1917 Yaoundé Cameroun
Tél (237) 76043774






De : Jennifer Brea b...@fas.harvard.edu
À : r-help@r-project.org
Envoyé le : Mercredi, 11 Mars 2009, 6h42mn 29s
Objet : [R] Converting a dataframe to a matrix

If I have a dataframe which is organized like this:

  name color likes?
1 sally  red    0
2 sally  blue    1
3 sally green    1
4  jake  red    0
5  jake  blue    1
6  jake green    1
7  tom  red    1
8  tom  blue    0
9  tom green    0


And I want to create a matrix in the form:

    red blue green
sally  0    1    1
jake    0    1    1
tom    1    0    0


Are there any built-in commands that might help me do this?  Also, I can't 
assume that there is an observation for every person-color.  In other words, in 
the original dataset, there might be some colors for which sally offered no 
opinion.  In some cases, this may be represented by NA, in others, it may mean 
that no row exists for sally for that color.

Thank you!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] t-test, survey data

2009-03-11 Thread Marie Vandresse



 *How can I do a t-test with survey data?*


Thanks.

Marie


--








-

Le Bureau fédéral du Plan fête ses 50 ans en 2009.

Het Federaal Planbureau viert zijn 50-ste verjaardag in 2009.

The Federal Planning Bureau celebrates its 50-th anniversary in 2009.

-

Disclaimer: please see www.plan.be/disclaimer.html

Please, think before you print...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Re : t-test, survey data

2009-03-11 Thread justin bem
For survey data, the survey package provides methods for variance estimation.
 Justin BEM
BP 1917 Yaoundé
Tél (237) 99597295
(237) 22040246 





De : Marie Vandresse v...@plan.be
À : r-help@r-project.org
Envoyé le : Mercredi, 11 Mars 2009, 9h25mn 11s
Objet : [R] t-test, survey data



    *How can I do a t-test with survey data?*


Thanks.

Marie


-- 







-

Le Bureau fédéral du Plan fête ses 50 ans en 2009.

Het Federaal Planbureau viert zijn 50-ste verjaardag in 2009.

The Federal Planning Bureau celebrates its 50-th anniversary in 2009.

-

Disclaimer: please see www.plan.be/disclaimer.html

Please, think before you print...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lsmeans in R

2009-03-11 Thread Chuck Cleland
On 3/11/2009 2:45 AM, suman Duvvuru wrote:
 I need help with calculating lsmeans (adjusted means) of different terms in
 a linear model including the main effect and the interaction effect terms. I
 use lm to run the linear models...I previously noted from literature that
 that effects package can be used to generate lsmeans. But I tried to use
 it but could not figure out which option to use to get means. If anyone can
 give an example of how to get lsmeans using lm object, that will very
 helpful.

  This R-help thread from March 2007 should help:

http://finzi.psych.upenn.edu/R/Rhelp02/archive/95809.html

 Thanks,
 SUman
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] removing daylight savings in R

2009-03-11 Thread Prof Brian Ripley

On Wed, 11 Mar 2009, Rhiannon Marchant wrote:



  Hello,
  Thanks the Prof. Brian Ripley for his earlier response, however I still have
  a few issues with the times in R.
  I'd like to somehow set up R so it automatically has the timezone set to
  Western Australian standard time when I open up a workspace instead of
  Western Australian daylight savings time. Is this possible? I haven't had
  much luck in trying to find this.


Yes, see ?Sys.timezone (as described in my earlier reply).  According 
to Wikipedia that is not an actual timezone, but it seems you want 
TZ='GMT-8'.  As that is not a known timezone, you will get some 
messages when trying to print the time and (spuriously) GMT as the 
timezone names, but input conversion did work.


The best thing to do would be to create a timezone with the names you 
want and recreate the zoneinfo database: there are instructions on how 
to do so in the R sources.


Life would be a lot easier if you keep times in an official timezone 
or UTC.  I would suggest reading them as if in UTC (again, I explained 
that in my first reply) and shifting by 8 hours (subtract 8*3600): you 
can then print them in UTC or a real-world time.


[...]


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Repeated values

2009-03-11 Thread Tammy Ma

Hi,All.

How to make a program to delete repeated value?

a example

 c
 [1] 4 3 0 3 4 1 0 1 4 4 3 4 3 4

I want to get : 4 3 0 3 4 1 0 1 4 3 4 3 4

two 4 is being represented by one 4.

Thanks.

Tammy

_
More than messages–check out the rest of the Windows Live™.
http://www.microsoft.com/windows/windowslive/
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mean +/- SEM

2009-03-11 Thread Jim Lemon

Stefo Ratino wrote:

Hi all,

I am looking for a R function which unables me to plot mean +/- SEM. Is there such a function in R? 
  

Hi Stefo,
The dispersion function in the plotrix package (among many, many 
others in other packages) allows you to stick error bars onto almost 
anything.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] popular R packages

2009-03-11 Thread Jim Lemon

Christos Hatzis wrote:

Bioconductor already provides download stats for all packages...

http://bioconductor.org/packages/stats/bioc/affy.html
  

Maybe if we asked the Bioconductor people _really_ nicely

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Repeated values

2009-03-11 Thread Chuck Cleland
On 3/11/2009 5:08 AM, Tammy Ma wrote:
 Hi,All.
 
 How to make a program to delete repeated value?
 
 a example
 
 c
  [1] 4 3 0 3 4 1 0 1 4 4 3 4 3 4
 
 I want to get : 4 3 0 3 4 1 0 1 4 3 4 3 4
 
 two 4 is being represented by one 4.

x - c(4, 3, 0, 3, 4, 1, 0, 1, 4, 4, 3, 4, 3, 4)

rle(x)$values
 [1] 4 3 0 3 4 1 0 1 4 3 4 3 4

?rle

 Thanks.
 
 Tammy
 
 _
 More than messages–check out the rest of the Windows Live™.
 http://www.microsoft.com/windows/windowslive/
   [[alternative HTML version deleted]]
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Barplot -density and multhist x-axis problem

2009-03-11 Thread Jim Lemon

Honorata Kaja Gajda wrote:

Thanks in advance for any help

I would like to plot a barplot with probability density instead of count
frequencies, as is possible in histograms (freq=FALSE). Is this possible?
I have tried the following code:

with(center,table(light))-light.table
barplot(light.table ,col=brewer, freq=FALSE, beside=T)

But  freq=FALSE doesn’t seem to work in the command of barplot.
The reason why I want to use a barplot is that I have two variables. I
have tried to use multhist instead, but when I plot the histogram the
x-axis labels are unevenly spaced. I would like to create a bar with 7
ticks on the x-axis (1,2,3..7), however the following tics are yielded in
the x-axis (1, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5). How can I regulate this?
I have used the following code:

l-list(m1=(center$light[center$site==1]),
m2=(center$light[center$site==2]))
multhist(l,breaks=c(1:7, by=1), freq=FALSE, main=Illumination in Plot
Center, ylab=Density, col =brewer)

I have tried names.arg= but this doesn’t work in multihist either.
Is there any way around this problem? Either by using barplot or multihist?

  

Hi Honorata,
Have a look at the barp function in plotrix. The bars or groups of 
bars are centered on integer values. You can also save the return value 
of barplot as this gives you the positions of the bars.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prediction error for test set-cross validation

2009-03-11 Thread Uwe Ligges



Mehmet U Ayvaci wrote:
Hi, 

 


I have a database of 2211 rows with 31 entries each and I manually split my
data into 10 folds for cross validation. I build logistic regression model
as:  

 

model - glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam + 


Rad +BrDn + BRDS + PrinFin+ SkRtr + NpRtr + SkThck +TrThkc +
SkLes + AxAdnp + ArcDst + MaDen + CaDt + MaMG + 


MaMrp + MaSh + SCTub + SCFoc + MaSz,
family=binomial(link=logit));

 


Where the  variables are taken from the trainSet of size 1989x31. The test
set is sized 222x31. Now my question is when I try to predict on the test
set it gives me the error:

 


predict.glm(model, testSet, type =response)


Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) : 


  subscript out of bounds

 


It does fine on trainSet. so it is something about the testSet. On the other
hand, I realized that some independent variables say MaSz takes 3
different values in the trainset vs. 4 different ones in the testSet. I am
not sure if this is the cause.If so, what would be the remedy?

 


Since I can retrieve the coefficients of the logistic regression, I could
manually calculate response for each entry in the testSet. This could solve
my problem although burdensome. But, I don't know an easy way of doing it as
my logistic regression have 80+ coefficients.



Well, if MaSz takes 3 different values in the trainset vs. 4 different 
ones in the testSet, then you won't even be able to calculate it by 
hand, because you got no coefficients for the 4th level of that factor.
Either you need the data to estimate coefficients from or you cannot 
predict.


Uwe Ligges



 

 


Could somebody advise?

 

 

Thanks, 


M


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a directory for my data

2009-03-11 Thread Uwe Ligges
Hmmm, in fact I do not understand what you are going to do. Can you tell 
us what the result of your code should look like?


Uwe Ligges



miya wrote:

Hi everyone,
I am currently working with a very large data set. It is data collected a
few times a day, so there are repeated titles in the data set. I want to
assign an id number to each different title and enter this information in a
directory that I can access whenever I am working with the data.

An example of a data I am workin with follows:

1 title A
2 title C
3 title B
1 title A
2 title B
3 title C
1 title D
2 title A
3 title C
1 title B
2 title A
3 title C

What I have tried thus far is for loops.

r-matrix(x[,1])
t-matrix(x[,2])
for(i in 1:length(t)) {
for(k in 1:length(r)) {
z = matrix(c(i,t))
A = Article A
for(j in 1:length(z)){
if(B==Article A)

else{enter article in z}
}
}

This is of course giving me a lot of errors. I just have no idea where to go
with this. Does anyone have any ideas on where I can go from here to create
my directory?
I appreciate all your help.
Thank you in advance.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] trying to run odfWeave()

2009-03-11 Thread Lore M

Hi there !

I'm working with windows and R GUI and I'm trying to generate an automatic 
repport using odfWeave. 

I have taken the basic template available on line at : 
http://www.biostat.uzh.ch/services/templates.html
which is SampleOdf.odt  I've imported the package odfWeave and the 
corresponding library. And then I run : 

 file.in=E:/Tex/SampleOdf.odt
 file.out=E:/Tex/RepportOdf.odt
 odfWeave(file.in, file.out)
  Copying  E:/Tex/SampleOdf.odt 
  Setting wd to  C:\DOCUME~1\Lore\LOCALS~1\Temp\RtmpSjnx0I/odfWeave11100313419 
  Unzipping ODF file using unzip -o SampleOdf.odt 
Erreur dans odfWeave(file.in, file.out) : Error unzipping file
De plus : Warning message:
In system(zipCmd[2], invisible = TRUE) : unzip introuvable 

Is there someone who has an idea of what's the problem here ?

Then, I have a second question, which totally independent from the one above (I 
have an error in both cases...) : is RepportOdf.odt automatically 
generated/created or should I create the document before ? 

Thanks a lot everyone for your help !

Lo.

_
[[elided Hotmail spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RES: North Arrow (.png file) on a Map

2009-03-11 Thread Rodrigo Aluizio
Thanks a lot Yihui, It's perfect, in fact exactly identical the I have as a
png file in my computer.
Thank you so much.

Regards
Rodrigo

-Mensagem original-
De: Yihui Xie [mailto:xieyi...@gmail.com] 
Enviada em: quarta-feira, 11 de março de 2009 02:48
Para: Rodrigo Aluizio
Cc: R Help
Assunto: Re: [R] North Arrow (.png file) on a Map

Is this arrow satisfactory for you?

north.arrow = function(x, y, h) {
polygon(c(x, x, x + h/2), c(y - h, y, y - (1 + sqrt(3)/2) * h),
col = black, border = NA)
polygon(c(x, x + h/2, x, x - h/2), c(y - h, y - (1 + sqrt(3)/2) *
h, y, y - (1 + sqrt(3)/2) * h))
text(x, y, N, adj = c(0.5, 0), cex = 4)
}
plot(1, type = n, ylim = c(0, 1))
north.arrow(1, 0.8, 0.3)

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China



On Tue, Mar 10, 2009 at 7:21 PM, Rodrigo Aluizio r.alui...@gmail.com
wrote:
 Hi list.

 I would like to know how do I insert a North arrow, stored as a png file
in
 my computer, in a map? I found lots of post asking similar things, one of
 them mentioned the pixmap package.  The map was done using map() and
 shapefiles (the code is below). I’m using the pixmap () and addlogo()
 functions. Well I can import the png with pixmap() function (I guess, once
 there’s no error message), but I can’t put It on the map, I got an error
 message telling me that:



 “Error at t(x...@index[nrow(x...@index):1, , drop = FALSE]) :

  index out of limits”



 Well I tried changing coordinates but I always got the same result. How do
I
 do this correctly? Is there a better way?



 Thanks for the help and attention.



 Here is the complete map script:



 library(RODBC)

 library(maps)

 library(mapdata)

 library(maptools)

 library(pixmap)

 #Carregar Coordenadas e dados dos Pontos Amostrais

 Dados-odbcConnectExcel('Campos.xls',readOnly=T)

 Coord-sqlFetch(Dados,'CoordMed',colnames=F,rownames='Ponto')

 odbcClose(Dados)

 N-pixmap('Norte.png',nrow=166,ncol=113)

 # Carregar pontos e shapes


Batimetria-readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontol
 ogia/Campos/ShapeFiles/Batimetria_BC.shp')


Estados-readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontologi
 a/Campos/ShapeFiles/Estados_Sudeste.shp')


Faciologia-readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontol
 ogia/Campos/ShapeFiles/Faciologia_BC.shp')

 # Mapa com os Pontos da Bacia


postscript('MapaCampos.eps',paper='special',onefile=F,horizontal=F,width=3.5
 ,height=4.5,bg='white',pointsize=3)

 par(mar=c(3,2,2,0))

 map('worldHires','brazil',ylim=c(23.9,20.3),xlim=c(42.1,39.2),type='n')


plot(Faciologia,ylab='',xlab='',col=c('lightgreen','lightgreen','lightgreen'

,'lightgreen','lightgreen','lightgray','lightgray','lightgray','lightgray','

lightgray','lightgray','lightgray','lightgray','lightgray','lightgray','ligh

tgray','lightgray','lightgray','lightgray','lightgray','lightgray','lightyel
 low','lightyellow','lightyellow'),add=T,lwd=0.5,border=0)

 plot(Batimetria,ylab='',xlab='',col='darkgray',lty='solid',lwd=0.2,add=T)

 plot(Estados,ylab='',xlab='',lty='solid',add=T,lwd=0.8)


text(Coord$Longitude[Coord$Réplicas=='1'],Coord$Latitude[Coord$Réplicas=='1'
 ],rownames(Coord)[Coord$Réplicas=='1'],col='red',cex=0.5,font=2)


text(Coord$Longitude[Coord$Réplicas=='2'],Coord$Latitude[Coord$Réplicas=='2'
 ],rownames(Coord)[Coord$Réplicas=='2'],col='yellow',cex=0.5,font=2)


text(Coord$Longitude[Coord$Réplicas=='3'],Coord$Latitude[Coord$Réplicas=='3'
 ],rownames(Coord)[Coord$Réplicas=='3'],col='blue',cex=0.5,font=2)

 points(Coord$Longitude,Coord$Latitude-0.045,pch=20,cex=0.7)

 text(c(41.5,41.3),c(21.7,20.6),c('RJ','ES'))

 axis(1,xaxp=c(42.1,39.2,2),cex.axis=1)

 axis(2,yaxp=c(23.9,20.3,4),cex.axis=1)

 title(main='Bacia')


legend(40.2,23.5,c('Uma','Duas','Três'),pch=21,cex=1,pt.bg=c('red','yellow',
 'blue'),bty='n',pt.cex=2,pt.lwd=0.6,title='Réplicas')


legend(39.8,23.5,c('Areia','Calcário','Lama'),pch=21,cex=1,pt.bg=c('lightyel

low','lightgray','lightgreen'),bty='n',pt.cex=2,pt.lwd=0.6,title='Faciologia
 ')

 addlogo(N,px=c(40,39.8),py=c(21,20.8))

 dev.off()

 q('no')



 -

 MSc. Rodrigo Aluizio mailto:r.alui...@gmail.com

 Centro de Estudos do Mar/UFPR
 Laboratório de Micropaleontologia


        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

[R] Date frame

2009-03-11 Thread Tammy Ma

Hi, All,

How to make a data frame, each row of data frame store the different length of 
vector?

Thanks.
Tammy

_
Show them the way! Add maps and directions to your party invites. 
http://www.microsoft.com/windows/windowslive/products/events.aspx
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] chisq.test: decreasing p-value

2009-03-11 Thread soeren . vogel
A Likert scale may have produced counts of answers per category.  
According to theory I may expect equality over the categories. A  
statistical test shall reveal the actual equality in my sample.


When applying a chi square test with increasing number of repetitions  
(simulate.p.value) over a fixed sample, the p-value decreases  
dramatically (looks as if converge to zero).


(1) Why?
(2) (If this test is wrong), then which test can check what I want to  
check, that is: are the two distributions of frequencies (observed and  
expected) in principle the same?

(3) By the way, how to deal with low frequency cells?

r - c(10, 100, 500, 1000, 2000, 5000)
v - c(35, 40, 45, 45, 40, 35)
sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),  
rescale.p=T, simulate.p.value=T, B=x)$p.value })


Thank you, Sören


--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confidence/prediction interval for choice probabilities mlogit

2009-03-11 Thread Tryntsje Wesselius
Hi,

I have estimated a multinomial logit model with use of the
mlogit-package for my choice data.
I predicted the probabilities for each choice, using the formula:
pred1 - exp(B1*X1.1 + B2*X2.1 + B3*X3.1)
pred2 - exp(B1*X1.2 + B2*X2.2 + B3*X3.2)
pred3 - exp(B1*X1.3 + B2*X2.3 + B3*X3.3)
sumpred - pred1+pred2+pred3
pred1 - pred1/sumpred
pred2 - pred2/sumpred
pred3 - pred3/sumpred

Then for each choice I have a prediction of the choice probabilities.

Now I want to calculate a confidence interval of the choice
probablities. I'm not sure if it is called a confidence interval or a
prediction interval.

At the internet I haven't found anything useful. Most examples are
about the confidence interval of a certain regression coefficient, for
example the intercept. But I want to know the interval around my
choice probabilities.
The predict() function does not work with mlogit, I think.

Can anybody help me?

Kind regards,

Tryntsje

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pseudo-random numbers between two numbers

2009-03-11 Thread Duncan Murdoch

g...@ucalgary.ca wrote:

Please forget the last email I sent with the same subject.
=
I would like to generate pseudo-random numbers between two numbers using
R, up to a given distribution,
for instance, norm. That is something like rnorm(HowMany,Min,Max,mean,sd)
over rnorm(HowMany,mean,sd).
I am wondering if

qnorm(runif(HowMany, pnorm(Min,mean,sd), pnorm(Max,mean, sd)), mean, sd)

is good.



It depends on what Min and Max are.  If you get far out in the tails, 
rounding error will kill you.  For example, pnorm(x) is exactly 1 for x 
bigger than 10 or so, so this approach would fail if Min and Max were 
both bigger than 10.  The solution is to switch to lower=FALSE in the 
upper tail, and possibly switch to a log scale if you want to be really 
extreme.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread Peter Dalgaard
soeren.vo...@eawag.ch wrote:
 A Likert scale may have produced counts of answers per category.
 According to theory I may expect equality over the categories. A
 statistical test shall reveal the actual equality in my sample.
 
 When applying a chi square test with increasing number of repetitions
 (simulate.p.value) over a fixed sample, the p-value decreases
 dramatically (looks as if converge to zero).
 
 (1) Why?
 (2) (If this test is wrong), then which test can check what I want to
 check, that is: are the two distributions of frequencies (observed and
 expected) in principle the same?
 (3) By the way, how to deal with low frequency cells?
 
 r - c(10, 100, 500, 1000, 2000, 5000)
 v - c(35, 40, 45, 45, 40, 35)
 sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),
 rescale.p=T, simulate.p.value=T, B=x)$p.value })

This is a combination of user error and an infelicity in chisq.test.

You are sapply'ing over a list with one element, so essentially you are
doing

chisq.test(v, p=c(rep.int(40, 6)),
 rescale.p=T, simulate.p.value=T, B=r)$p.value

Now B is supposed to be a single integer, so the above cannot be
expected to do anything sensible, but you might have hoped for an error
message. Instead, it seems that you get the result of r[1] replications
divided by r+1:

 chisq.test(v, p=c(rep.int(40, 6)), rescale.p=T, simulate.p.value=T,
B=r)$p.value
[1] 0.636363636 0.069306931 0.013972056 0.006993007 0.003498251 0.001399720

 7/(r+1)
[1] 0.636363636 0.069306931 0.013972056 0.006993007 0.003498251 0.001399720

What you really wanted was

 sapply(r,function (x) { chisq.test(v, p=c(rep.int(40, 6)),
rescale.p=T, simulate.p.value=T, B=x)$p.value })
[1] 0.9090909 0.8118812 0.7964072 0.7672328 0.8025987 0.7932414



 Thank you, Sören
 
 
 --Sören Vogel, PhD-Student, Eawag, Dept. SIAM
 http://www.eawag.ch, http://sozmod.eawag.ch
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread David Winsemius


On Mar 11, 2009, at 6:36 AM, soeren.vo...@eawag.ch wrote:

A Likert scale may have produced counts of answers per category.  
According to theory I may expect equality over the categories. A  
statistical test shall reveal the actual equality in my sample.


When applying a chi square test with increasing number of  
repetitions (simulate.p.value) over a fixed sample, the p-value  
decreases dramatically (looks as if converge to zero).


(1) Why?


With low numbers of repetitions the test has low power, i.e, it may  
give you the wrong answer to the question: are those two vectors from  
the same distribution? As you increase in number, the simulated value  
approaches the truth.


(2) (If this test is wrong), then which test can check what I want  
to check, that is: are the two distributions of frequencies  
(observed and expected) in principle the same?


In principle they are not the same. Do you want a test that tells  
you they are?


(3) By the way, how to deal with low frequency cells?

r - c(10, 100, 500, 1000, 2000, 5000)
v - c(35, 40, 45, 45, 40, 35)
sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),  
rescale.p=T, simulate.p.value=T, B=x)$p.value })


Thank you, Sören


--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Making a prediction model from a plot

2009-03-11 Thread joe1985

Hello

I have plot looking like this:

http://www.nabble.com/file/p22453304/Index%2Bgraf.jpeg 

Where each continues line corresponds to a period og days, and each break on
the line corresponds to a shift in the index level. 

So what i thought about was, is it possible to make a model predicting the
index level depending on the length of period?

Here is a bit of my dataset:

 b[b$a.CHR_NR==97592,]
 a.CHR_NR a.SALMO_INDEX   a.x a.diffdato1 a.akkusum
149297592  20.6 62  3434
149397592  18.7 62  1448
149497592  18.1 62  1462
149597592  19.8 16  1616
149697592  18.1 139   7 7
149797592  18.1 139  1017
149897592  22.2 139   320
149997592  22.2 139  1131
150097592  23.2 139   738
150197592  23.2 139  2361
150297592  26.2 139   768
150397592  26.2 139  3098
150497592  15.2  139  10   108
150597592  15.2  139   7   115
150697592  10.8 139   4   119
150797592  10.8 139  20   139
150897592  15.2  62   6 6
150997592  15.2  62  2834
151097592  12.3   62   438
151197592  12.3   62  2462
151297592  14.222  1010
151397592  14.222  1222
151497592  31.6   140  3232
151597592  35.1   140  2456
151697592  30.0   140  3995
151797592  10.9140  11   106
151897592  11.8140  34   140
151997592  10.1329  2828
152097592  10.6329  3058
152197592  57.0 329  3088
152297592  47.3 329  27   115
152397592  30.4 329  28   143
152497592  19.3 329  21   164
152597592  28.8 329  39   203
152697592  20.4  329  23   226
152797592  35.1  329  26   252
152897592  21.3   329  32   284
152997592  16.6   329  45   329

Where a.CHR_NR is a particular herd (i have a lot of herds),  a.SALMO_INDEX
is the index level (the responds variable), a.x is the length of period (the
lines on the plot), a.diffdato1 is the days in the period and a.akkusum is
the cumulated sum of a.diffdato1 making the breaks on the lines.

Hope you can help me  
 
-- 
View this message in context: 
http://www.nabble.com/Making-a-prediction-model-from-a-plot-tp22453304p22453304.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date frame

2009-03-11 Thread jim holtman
It is called a list.

On Wed, Mar 11, 2009 at 6:32 AM, Tammy Ma metal_lical...@live.com wrote:

 Hi, All,

 How to make a data frame, each row of data frame store the different length 
 of vector?

 Thanks.
 Tammy

 _
 Show them the way! Add maps and directions to your party invites.
 http://www.microsoft.com/windows/windowslive/products/events.aspx
        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] odfWeave xtable still best packages to produce tables?

2009-03-11 Thread Jason Rupert

Just a quick question - I did a quick search of the R-search 
(http://search.r-project.org/nmz.html) [thanks to David Winsemius for 
recommending this neat R search engine] and R-seek (http://www.rseek.org/) 
trying to find a package(s) in order to produce publication quality tables.  

I found several references to odfWeave and xtable.  My question is, are those 
still best R packages to produce publication quality tables tables from R?

Thank you for any feedback and insights. 

P.S. I know I should be trying to Using Graphs Instead of Tables, but some 
customers (bosses) still like seeing tables.  Thanks again:)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] puzzled by math on date-time objects

2009-03-11 Thread Denis Chabot

Hi Phil,

Well thank you very much for this detailed explanation. It will help  
me when summarizing information over periods of time using either  
summarize (Hmisc) or summaryBy (doBy). Until now, doing so resulted in  
mean time for each group being transformed as a number of seconds,  
as you explain below. But both these functions do not put it back in a  
POSIX date-time object. I tried to do so by using as.POSIXct() but  
this failed because I did not provide a reference. From now on I'll  
try the structure command you used below.


Denis
Le 09-03-10 à 19:04, Phil Spector a écrit :


Denis -
  If you look inside of summary.POSIXct, you'll see the
following:

x - summary.default(unclass(object), digits = digits, ...)[1:6]

In other words, summary accepts the POSIX object, unclasses it
(resulting in a numeric value representing the number of seconds
since January 1, 1960), performs the operation, and then reassigns
the class.  You can do this basic trick yourself.  Suppose we have a  
vector of dates and want the median:


dates =  
as.POSIXct(c('2009-3-15','2009-2-19','2009-3-20','2009-2-18'))

median(dates)

Error in Summary.POSIXct(c(1235030400, 1237100400), na.rm = FALSE) :
 'sum' not defined for POSIXt objects

res = median(as.numeric(dates))
structure(res,class='POSIXct')

[1] 2009-03-02 23:30:00 PST

  I think it's clear that you can do any arithmetic operation on
dates this way, even if it doesn't make sense:


sum(dates)

Error in Summary.POSIXct(c(1237100400, 1235030400, 1237532400,
1234944000 :
 'sum' not defined for POSIXt objects

res = sum(as.numeric(dates))
structure(res,class='POSIXct')

[1] 2126-09-08 23:00:00 PDT

  I'm quite certain that median.POSIXct will be fixed pretty quickly,
but you can always unclass and reclass to do what you need.

- Phil






On Tue, 10 Mar 2009, Denis Chabot wrote:


Thanks Phil,

but how does summary() finds the median of the same type of object?  
I would have thought the algorithm used when the vector is even  
would  also require the SUM of the POSIX vector. I am glad of the  
solution you propose, but still puzzled a bit!


Denis
Le 09-03-10 à 12:39, Phil Spector a écrit :


Denis -
There is no median method for POSIX objects, although
there is a summary object.  Thus, when you pass a POSIX
object to median, it uses median.default, which contains
the following code:

 if (n%%2L == 1L)
sort(x, partial = half)[half]
 else sum(sort(x, partial = half + 0L:1L)[half + 0L:1L])/2
So when the length of your POSIX vector is odd, it works, but if  
it's even, it would need to take  the sum of a POSIX
object.  Of course, there is no sum method for POSIX objects,  
since it doesn't make sense.

Right now, it looks like your best bet for a summary of POSIX
objects is
summary(a)['Median']

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu
On Tue, 10 Mar 2009, Denis Chabot wrote:

Hi,
I don't understand the following. When I create a small  
artificial set of date information in class POSIXct, I can  
calculate the mean and the median:

a = as.POSIXct(Sys.time())
a = a + 60*0:10; a
[1] 2009-03-10 11:30:16 EDT 2009-03-10 11:31:16 EDT  
2009-03-10 11:32:16 EDT
[4] 2009-03-10 11:33:16 EDT 2009-03-10 11:34:16 EDT  
2009-03-10 11:35:16 EDT
[7] 2009-03-10 11:36:16 EDT 2009-03-10 11:37:16 EDT  
2009-03-10 11:38:16 EDT

[10] 2009-03-10 11:39:16 EDT 2009-03-10 11:40:16 EDT
median(a)
[1] 2009-03-10 11:35:16 EDT
mean(a)
[1] 2009-03-10 11:35:16 EDT
But for real data (for this post, a short subset is in object c)   
that I have converted into a POSIXct object, I cannot calculate  
the median with median(), though I do get it with summary():

c
[1] 2009-02-24 14:51:18 EST 2009-02-24 14:51:19 EST  
2009-02-24 14:51:19 EST
[4] 2009-02-24 14:51:20 EST 2009-02-24 14:51:20 EST  
2009-02-24 14:51:21 EST
[7] 2009-02-24 14:51:21 EST 2009-02-24 14:51:22 EST  
2009-02-24 14:51:22 EST

[10] 2009-02-24 14:51:22 EST
class(c)
[1] POSIXt  POSIXct
median(c)
Erreur dans Summary.POSIXct(c(1235505080.6, 1235505081.1), na.rm  
= FALSE) :

'sum' not defined for POSIXt objects
One difference is that in my own date-time series, some events  
are repeated (the original data contained fractions of seconds).  
But then, why can I get a median through summary()?

summary(c)
Min.   1st Qu. 
Median
2009-02-24 14:51:18 EST 2009-02-24 14:51:19 EST 2009-02-24  
14:51:20 EST
Mean   3rd  
Qu.  Max.
2009-02-24 14:51:20 EST 2009-02-24 14:51:21 EST 2009-02-24  
14:51:22 EST

Thanks in advance,
Denis Chabot
sessionInfo()
R version 2.8.1 Patched (2009-01-19 r47650)
i386-apple-darwin9.6.0
locale:

Re: [R] (no subject)

2009-03-11 Thread Uwe Ligges



Shuying Yang wrote:

Dear Members,

 

I have a question about using R2WinBUGS to obtain the WinBUGS results. 

 


By default, when R2WinBUGS returns summary stats, I got mean, sd, 2.5%, 25%, 
median, 75% and 97.5%.  Could anyone tell me how to modify the code to obtain 
5% and 95% summary results?



Hi,

please read the posting guide and use a sensible subject line (I almost 
missed your posting).


The used quantiles in the printed output are hardcoded, but you can 
calculate arbitrary quantiles from the data.


Ad an example, please consider the schools example from ?bugs, running 
the example results in an object called schools.sim. Now you can 
calculate arbitrary quantiles as e.g. in:


apply(schools.sim$sims.matrix, 2, quantile, c(0.05, 0.95))

Uwe Ligges

 


Many thanks

 


Alice

 

 


_
[[elided Hotmail spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lsmeans in R

2009-03-11 Thread John Fox
Dear Suman,

Chuck Cleland has already pointed you toward a reasonably complete
discussion of the topic (thank you Chuck). To update my contribution to that
discussion, the effects package now uses t-intervals for models with an
estimated dispersion parameter (such as linear models) and will create
displays for multinomial and proportional-odds logit models. (The latter
isn't relevant to your intended application, of course.)

Beyond that, I don't quite understand your question. Adjusted means are
fitted values, and this is what the effects package gives you for linear
models. Maybe you could explain in more detail what it is that you want and
why you think that you're not getting it.

Regards,
 John


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
 Behalf Of suman Duvvuru
 Sent: March-11-09 2:45 AM
 To: r-help@r-project.org
 Subject: [R] lsmeans in R
 
 I need help with calculating lsmeans (adjusted means) of different terms
in
 a linear model including the main effect and the interaction effect terms.
I
 use lm to run the linear models...I previously noted from literature that
 that effects package can be used to generate lsmeans. But I tried to use
 it but could not figure out which option to use to get means. If anyone
can
 give an example of how to get lsmeans using lm object, that will very
 helpful.
 
 Thanks,
 SUman
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date frame

2009-03-11 Thread Tammy Ma

I have produced the data frame(1) as:
 starting_timeEnding_time   
session_1: 14:36:1014:40:44
session_2: 14:40:47   14:41:47


for each session, I produced  different sequences as:
seq_1-c(4 ,3, 0 ,3 ,4 ,1, 0 ,1 ,4 ,3 ,4 ,3 ,4)
seq_2-c(3,4,1,2)
seq_3-Null


I want to put those to the 3rd column of data frame (1) corresponding the 
individual session.
How could I do so?

Originally, I want to create a data frame with each row contains different 
sequence and then put to 3rd column, which seems 
couldn't be done.

thanks.

Tammy


 Date: Wed, 11 Mar 2009 07:50:37 -0400
 Subject: Re: [R] Date frame
 From: jholt...@gmail.com
 To: metal_lical...@live.com
 CC: r-help@r-project.org
 
 It is called a list.
 
 On Wed, Mar 11, 2009 at 6:32 AM, Tammy Ma metal_lical...@live.com wrote:
 
  Hi, All,
 
  How to make a data frame, each row of data frame store the different length 
  of vector?
 
  Thanks.
  Tammy
 
  _
  Show them the way! Add maps and directions to your party invites.
  http://www.microsoft.com/windows/windowslive/products/events.aspx
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem that you are trying to solve?

_
Show them the way! Add maps and directions to your party invites. 
http://www.microsoft.com/windows/windowslive/products/events.aspx
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prediction error for test set-cross validation

2009-03-11 Thread Frank E Harrell Jr

Uwe Ligges wrote:



Mehmet U Ayvaci wrote:

Hi,
 

I have a database of 2211 rows with 31 entries each and I manually 
split my
data into 10 folds for cross validation. I build logistic regression 
model
as: 
 

model - glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam + 


Rad +BrDn + BRDS + PrinFin+ SkRtr + NpRtr + SkThck +TrThkc +
SkLes + AxAdnp + ArcDst + MaDen + CaDt + MaMG +
MaMrp + MaSh + SCTub + SCFoc + MaSz,
family=binomial(link=logit));

 

Where the  variables are taken from the trainSet of size 1989x31. The 
test

set is sized 222x31. Now my question is when I try to predict on the test
set it gives me the error:

 


predict.glm(model, testSet, type =response)


Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) :
  subscript out of bounds

 

It does fine on trainSet. so it is something about the testSet. On the 
other

hand, I realized that some independent variables say MaSz takes 3
different values in the trainset vs. 4 different ones in the testSet. 
I am

not sure if this is the cause.If so, what would be the remedy?

 


Since I can retrieve the coefficients of the logistic regression, I could
manually calculate response for each entry in the testSet. This could 
solve
my problem although burdensome. But, I don't know an easy way of doing 
it as

my logistic regression have 80+ coefficients.



Well, if MaSz takes 3 different values in the trainset vs. 4 different 
ones in the testSet, then you won't even be able to calculate it by 
hand, because you got no coefficients for the 4th level of that factor.
Either you need the data to estimate coefficients from or you cannot 
predict.


Uwe Ligges


And note that your test sample is far too small to yield reliable 
results.  You need to use resampling (e.g., bootstrap or 50-fold repeats 
of 10-fold cross-validation).  See the validate function in the Design 
package.  Note that validate does not implement the proportion 
classified correctly because this is an improper scoring rule with 
minimum information/lowest precision/lowest power.


Frank Harrell





 

 


Could somebody advise?

 

 


Thanks,
M


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] t-test, survey data

2009-03-11 Thread Thomas Lumley

On Wed, 11 Mar 2009, Marie Vandresse wrote:


*How can I do a t-test with survey data?*



The 'survey' package does analysis of survey data.  There isn't a t-test 
function; the easiest way to test for differences in means is with a linear 
regression model, using svyglm()

  -thomas



Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mean +/- SEM

2009-03-11 Thread Jorge Ivan Velez
Hi Stefo,
Perhaps this post might be useful:
http://www.nabble.com/error-bars-to22092367.html#a22092367


HTH,

Jorge


On Tue, Mar 10, 2009 at 6:39 PM, Stefo Ratino srat...@yahoo.com wrote:

 Hi all,

 I am looking for a R function which unables me to plot mean +/- SEM. Is
 there such a function in R?

 Many thanks,
 Stefo




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using napredict in prcomp

2009-03-11 Thread Alain Paquette

Thanks for the answer.
Yes indeed, napredict only puts NAs in the right places (ie where 
na.action was used). Anyway I realize that I'm much better off dealing 
with the missing values prior to the PCA, and R offers many options for 
that.

Thanks
Alain


Prof Brian Ripley a écrit :

On Tue, 10 Mar 2009, Alain Paquette wrote:


Hello all,

I wish to compute site scores using PCA (prcomp) on a matrix with 
missing values, for example:


  DrainSlopeOrgL
a41NA
b2.5396
c6845
d3912
e3164
...

Where a,b... are sites.
The command
pca-prcomp(~ Drain + Slope + OrgL, data = t, center = TRUE, scale = 
TRUE, 

na.action=na.exclude)
works great, and from

pca$x

I can get site scores, e.g.

 PC1  PC2  PC3
a  NA   NA   NA
b -2.10475208 -2.315128625 -0.885197753
c  5.01177388 -1.778786252 -0.193285051
d  0.28638602  0.298315086  0.386113799
e -0.58861254  0.089498632 -0.434951813
...

Easy enough...
But how do I use the napredict argument? Is it intended as an 
argument to be used in the prcomp line (as suggested in ?prcomp), or 
is it to be used by itself, to replace NAs in the above site score 
matrix (which is what I really want to do).


I don't see an 'napredict' argument.  The help says

   x: if 'retx' is true the value of the rotated data (the centred
  (and scaled if requested) data multiplied by the 'rotation'
  matrix) is returned.  Hence, 'cov(x)' is the diagonal matrix
  'diag(sdev^2)'.  For the formula method, 'napredict' is
  applied to handle the treatment of values omitted by the
  'na.action'.

'napredict' is a function (so please look up its help), and it says R 
applies it, not that you need to.  It is napredict which gives you the 
NAs in the correct places.




Thank you,
Alain

--
Alain Paquette, Ph.D.
alain.paque...@gmail.com

Centre d'étude de la forêt (CEF)
Université du Québec à Montréal
www.cef-cfr.ca

Projet TRIADE
www.projettriade.ca
alain.paque...@projettriade.ca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





--
Alain Paquette, Ph.D.
alain.paque...@gmail.com

Centre d'étude de la forêt (CEF)
Université du Québec à Montréal
www.cef-cfr.ca

Projet TRIADE
www.projettriade.ca
alain.paque...@projettriade.ca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to monthly,daily,yearly average

2009-03-11 Thread Qianfeng Li
Sorry, this is my first time to post. 

I have a big data set: first colume is date (ex: 2008-2-150, the second is time 
(10:30:00), and the following columes are variaty measurement data. Every 30 
min, I have one data. 

I want to find an effecient way to calculate the hourly, daily, monthly and 
yearly average, and plot them, and eventually use these average data to do 
further analysis. 

Thanks!
Jeff  


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cholesky Decomposition in R

2009-03-11 Thread Ravi Varadhan
You got an A+ on the homework, Doug! 

I got a C- for suggesting svd(), which clearly doesn't yield a lower (or
upper) triangular factorization.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html







-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Douglas Bates
Sent: Tuesday, March 10, 2009 10:14 PM
To: Manli Yan
Cc: r-help@r-project.org
Subject: Re: [R] Cholesky Decomposition in R

On Tue, Mar 10, 2009 at 4:33 PM, Manli Yan manliyanrh...@gmail.com wrote:
  Hi everyone:
  I try to use r to do the Cholesky Decomposition,which is A=LDL',so 
 far I only found how to decomposite A in to  LL' by using chol(A),the 
 function
 Cholesky(A) doesnt work,any one know other command to decomposte A in 
 to LDL'

  My r code is:
 library(Matrix)
 A=matrix(c(1,1,1,1,5,5,1,5,14),nrow=3)

 chol(A)
     [,1] [,2] [,3]
 [1,]    1    1    1
 [2,]    0    2    2
 [3,]    0    0    3

 Cholesky(A)
 Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function Cholesky, for 
 signature matrix

 whatz wrong???

The answer, surprisingly, is in the documentation, accessible as

?Cholesky

which says that the first argument has to be a sparse, symmetric matrix.
Because the Cholesky function is intended for sparse matrices it is not the
best approach.  The object returned is rather obscure

 Cholesky(as(A, dsCMatrix), LDL = TRUE)
'MatrixFactorization' of Formal class 'dCHMsimpl' [package Matrix] with 10
slots
  ..@ x   : num [1:6] 1 1 1 4 1 9
  ..@ p   : int [1:4] 0 3 5 6
  ..@ i   : int [1:6] 0 1 2 1 2 2
  ..@ nz  : int [1:3] 3 2 1
  ..@ nxt : int [1:5] 1 2 3 -1 0
  ..@ prv : int [1:5] 4 0 1 2 -1
  ..@ colcount: int [1:3] 3 2 1
  ..@ perm: int [1:3] 0 1 2
  ..@ type: int [1:4] 2 0 0 1
  ..@ Dim : int [1:2] 3 3

It turns out that the factorization you want is encoded in the 'x'
slot but not in an obvious way.  Even if you ask for an expansion

 expand(Cholesky(as(A, dsCMatrix), LDL = TRUE))
$P

[1,] | . .
[2,] . | .
[3,] . . |

$L

[1,] 1 . .
[2,] 1 2 .
[3,] 1 2 3

the result is converted from the LDL' factor to the LL' factor.

A better approach is to consider how the LDL' factorization is related to
the R'R form of the factorization returned by chol()

 ch - chol(A)
 dd - diag(ch)
 L - t(ch/dd)
 DD - dd^2
 L
 [,1] [,2] [,3]
[1,]100
[2,]110
[3,]111
 DD
[1] 1 4 9

This is all rather backwards in that the whole purpose of the LDL'
form of the factorization is to avoid taking square roots to get the
diagonal elements and to contend with positive semidefinite matrices.
In other words, the LDL' form avoids some of the possible problems of the
LL' form but not if you go through the LL' form to get to it.

I think the underlying reason that an LDL' form is not directly available in
R is because there is no Lapack subroutine for it.

Let me know what our grade on the homework is.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to monthly,daily,yearly average

2009-03-11 Thread Tammy Ma

Check as.Date

 Date: Wed, 11 Mar 2009 06:25:44 -0700
 From: qflic...@yahoo.com
 To: r-help@r-project.org
 Subject: [R] How to monthly,daily,yearly average
 
 Sorry, this is my first time to post. 
 
 I have a big data set: first colume is date (ex: 2008-2-150, the second is 
 time (10:30:00), and the following columes are variaty measurement data. 
 Every 30 min, I have one data. 
 
 I want to find an effecient way to calculate the hourly, daily, monthly and 
 yearly average, and plot them, and eventually use these average data to do 
 further analysis. 
 
 Thanks!
 Jeff  
 
 
   
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

_
News, entertainment and everything you care about at Live.com. Get it now!
http://www.live.com/getstarted.aspx
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Ggplot2: saving a grid with multiple plots

2009-03-11 Thread ONKELINX, Thierry
Hi all,

I have managed to create a figure on the screen with multiple plots in
it. Something like the example below. When I save that with ggsave(),
only the last plot gets saved (pPath in the example) instead of the
entire figure. Any suggestions how I can save this kind of figures
automated?

Thanks,

Thierry

library(ggplot2)
pPoint - qplot(unemploy/pop, psavert, data=economics)
pPath - qplot(unemploy/pop, psavert, data=economics, geom=path)
grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 1)))
print(pPoint, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
print(pPath, vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
ggsave(filename = test.pdf)



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread David Winsemius
Thanks to Peter Dalgaard for the correct answer. I misinterpreted what  
R was returning.



On Mar 11, 2009, at 7:32 AM, David Winsemius wrote:



On Mar 11, 2009, at 6:36 AM, soeren.vo...@eawag.ch wrote:

A Likert scale may have produced counts of answers per category.  
According to theory I may expect equality over the categories. A  
statistical test shall reveal the actual equality in my sample.


When applying a chi square test with increasing number of  
repetitions (simulate.p.value) over a fixed sample, the p-value  
decreases dramatically (looks as if converge to zero).


(1) Why?


With low numbers of repetitions the test has low power, i.e, it may  
give you the wrong answer to the question: are those two vectors  
from the same distribution? As you increase in number, the simulated  
value approaches the truth.


(2) (If this test is wrong), then which test can check what I want  
to check, that is: are the two distributions of frequencies  
(observed and expected) in principle the same?


In principle they are not the same. Do you want a test that tells  
you they are?


(3) By the way, how to deal with low frequency cells?

r - c(10, 100, 500, 1000, 2000, 5000)
v - c(35, 40, 45, 45, 40, 35)
sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),  
rescale.p=T, simulate.p.value=T, B=x)$p.value })





David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Forecasting with dlm

2009-03-11 Thread Michael Pearmain
Hi All,
I have a problem trying to forecast using the dlm package, can anyone offer
any advise?

I setup my problem as follows, (following the manual as much as possible)

data for example to run code

CostUSD - c(27.24031,32.97051, 38.72474, 22.78394, 28.58938, 49.85973,
42.93949, 35.92468)
library(dlm)

buildFun - function(x) {
dlmModPoly(1, dV = exp(x[1]), dW = exp(x[2]))
}
fit - dlmMLE(CostUSD, parm = c(0,0), build = buildFun)
fit$conv
dlmCostUSD - buildFun(fit$par)
V(dlmCostUSD)
W(dlmCostUSD)
#For comparison
StructTS(CostUSD, level)

CostUSDFilt - dlmFilter(CostUSD, dlmCostUSD)
CostUSDFore - dlmForecast(CostUSDFilt, nAhead = 1)

after which i return the error message:

Error in mod$m[lastObsIndex, ] : incorrect number of dimensions

Can anyone offer any insight to this problem?

Thanks in advance

Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] adding text and other elements to ggplot2 plots

2009-03-11 Thread levyofi

Hello,
I really like the interface and flexibility of the ggplots package. However,
I cannot find how to add text to a plot (like the 'text' and 'rect'
functions in the graphics package). I will appreciate any suggestions.
Thanks,
Ofir.
-- 
View this message in context: 
http://www.nabble.com/adding-text-and-other-elements-to-ggplot2-plots-tp22455395p22455395.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Is this a documentation bug? Spss dates import

2009-03-11 Thread Luca Braglia
Hello R-user

bug seekers are needed!
In order to perform these simple tasks you have to use a copy of SPSS
and obviously R.
The problem is that date conversion of data coming from SPSS
gives wrong results, if we follow ?as.POSIXct

## SPSS dates (R-help 2006-02-17)
 z - c(10485849600, 10477641600, 10561104000, 10562745600)
 as.Date(as.POSIXct(z, origin=1582-10-14, tz=GMT))


Note that ?as.POSIXct is coherent with the SPSS 'Programming and 
Data Management' guide (pag. 116):

'Internally, dates and date/times are stored as the number of seconds 
from October 14, 1582, and times are stored as the number of seconds 
from midnight.'

I think the SPSS 'Programming and Data Management' is not very clear:
times are stored as the number of seconds from midnight, but which midnight?
13th or 14th of October one? I think about 14th , so 
as.Date(as.POSIXct(z, origin=1582-10-14, tz=GMT))
has to be changed to
as.Date(as.POSIXct(z, origin=1582-10-15, tz=GMT))


Test:
-
Let's create a vector of dates in SPSS and save it in C:\\date.sav

DATA LIST / mydate (date).
BEGIN DATA.
01/01/1960
11/07/1955
25/11/1962
08/06/1959
28-01-2003
15,03,03
1/1/1997
01-JAN-1998
END DATA.
save outfile = C:\\date.sav /compressed.



Now we use R:

library(foreign)
test.df - read.spss(C://date.sav, to.data.frame=T)

now following ?as.POSIXct

test.df$newdate - as.Date(as.POSIXct(test.df$MYDATE , origin=1582-10-14))

But if you take a look at the vector test.df$newdate you got:

R.date = SPSS.date - 1 day

Please confirm this! I would like to be sure that i didn't made mistakes (I've
used SPSS 11 + R 2.8)


If you come back to data, changing 14 to 15

test.df$newdate2 - as.Date(as.POSIXct(test.df$MYDATE, origin=1582-10-15))

assures that

R.date = SPSS.date


 test.df
   MYDATEnewdate   newdate2
1 1190376 1959-12-31 1960-01-01
2 11762496000 1955-07-10 1955-07-11
3 11995257600 1962-11-24 1962-11-25
4 11885875200 1959-06-07 1959-06-08
5 13263091200 2003-01-27 2003-01-28
6 13267065600 2003-03-14 2003-03-15
7 13071456000 1996-12-31 1997-01-01
8 13102992000 1997-12-31 1998-01-01


You got that, do you?

It is needed to (eventually) ask clarifications
about SPSS to Raynald Levesque (the author of SPSS 'Programming and 
Data Management' guide) and to submit a bug for ?as.POSIXct

thank you
  Luca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sparse PCA in R

2009-03-11 Thread Kevin Wright
Also look at the pcaMethods package (on Bioconductor).

Kevin Wright


On Tue, Mar 10, 2009 at 2:59 PM, Christos Hatzis
christos.hat...@nuverabio.com wrote:
 Take a look at the elasticnet package.

 -Christos

 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of joris meys
 Sent: Tuesday, March 10, 2009 3:43 PM
 To: R-help Mailing List
 Subject: [R] Sparse PCA in R

 Dear all,

 I would like to perform a sparse PCA, but I didn't find any
 library offering me this in R. Is there one available, or do
 I have to write the functions myself?

 Kind regards
 Joris Meys

       [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cholesky Decomposition in R

2009-03-11 Thread Simon Wood

A - matrix(runif(25),5,5); A - A%*%t(A) ## Example +ve def matrix

U - chol(A)   ## start from factor given by chol
D - diag(U)   ## extract sqrt(D) 
L - t(U/D) ## get unit lower triangular factor
D - diag(D^2)  ## and diagonal
## So now A = LDL' 
range(A-L%*%D%*%t(L)) 

#best,
#Simon


On Tuesday 10 March 2009 21:33, Manli Yan wrote:
   Hi everyone:
   I try to use r to do the Cholesky Decomposition,which is A=LDL',so far I
 only found how to decomposite A in to  LL' by using chol(A),the function
 Cholesky(A) doesnt work,any one know other command to decomposte A in to
 LDL'

   My r code is:
 library(Matrix)
 A=matrix(c(1,1,1,1,5,5,1,5,14),nrow=3)

  chol(A)

  [,1] [,2] [,3]
 [1,]111
 [2,]022
 [3,]003

  Cholesky(A)

 Error in function (classes, fdef, mtable)  :
   unable to find an inherited method for function Cholesky, for signature
 matrix

 whatz wrong???
 thanks~

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented, minimal,
 self-contained, reproducible code.

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] odfWeave xtable still best packages to produce tables

2009-03-11 Thread Max Kuhn
Jason,

 I found several references to odfWeave and xtable.  My question is, are those 
 still best R packages to produce publication quality tables tables from R?

I think the file type you want (and how you will use the output)
should probably drive your choice. If you are going to pdf, then
Sweave/xtable/Hmisc:::latex are the best choices. If you are going to
a more editable file format (doc or odt), then consider odfWeave or
pdf/odf converter (which I think what Dr. Harrell does). HTML is
always an option, but you'll have to covert the document if that is
not your target format.

-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating an Excel file with multiple spreadsheets

2009-03-11 Thread Hans-Peter Suter
2009/3/9 Phil Spector spec...@stat.berkeley.edu:
 The xlsReadWrite package provides write.xls for Windows,

but it cannot write _multiple_ spreadsheets

-- 
Regards,
Hans-Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] LDL' Cholesky decomposition

2009-03-11 Thread Terry Therneau

  The gchol function in library(kinship) does an LDL decomposition.  An updated 
version has just recently been posted on Rforge, in the bdsmatrix library which 
is part of survival.
  
 temp - matrix(c(1,1,1,1,5,8,1,8,14), 3)
 gt - gchol(temp)

 as.matrix(gt) # L
 [,1] [,2] [,3]
[1,]1 0.000
[2,]1 1.000
[3,]1 1.751

 diag(gt) # D
[1] 1.00 4.00 0.75

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to monthly,daily,yearly average

2009-03-11 Thread Gabor Grothendieck
Here is an example.  See the three vignettes in zoo:
vignette(zoo)
and R News 4/1 for info on dates.

# assume this input

Lines - Date Time Value
01/01/08 00:00:00 1
01/01/08 00:30:00 2
01/01/08 01:00:00 3
01/01/08 01:30:00 4
01/01/08 02:00:00 5
01/01/08 02:30:00 6
01/01/08 03:00:00 7
01/01/08 03:30:00 8
01/01/08 04:00:00 9

# read in the data, convert it to zoo, aggregate it and plot

library(zoo)
library(chron)

# DF - read.table(myfile.dat, header = TRUE, as.is = TRUE)
DF - read.table(textConnection(Lines), header = TRUE, as.is = TRUE)

# next line may need to specify formats, if different
z - zoo(DF$Value, chron(DF$Date, DF$Time))

# hourly aggregation
z.hr - aggregate(z, trunc(time(z), 01:00:00), mean)
plot(z.hr, type = o)

# daily aggregation
z.day - aggregate(z, trunc, mean)
plot(z.day, type = o)

# monthly aggregation
z.mo - aggregate(z, as.yearmon, mean)
plot(z.mo, type = o)

# yearly aggregation
z.yr - aggregate(z.mo, floor, mean)
plot(z.yr, type = o)


On Wed, Mar 11, 2009 at 9:25 AM, Qianfeng Li qflic...@yahoo.com wrote:
 Sorry, this is my first time to post.

 I have a big data set: first colume is date (ex: 2008-2-150, the second is 
 time (10:30:00), and the following columes are variaty measurement data. 
 Every 30 min, I have one data.

 I want to find an effecient way to calculate the hourly, daily, monthly and 
 yearly average, and plot them, and eventually use these average data to do 
 further analysis.

 Thanks!
 Jeff



        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trying to run odfWeave()

2009-03-11 Thread Max Kuhn
  Unzipping ODF file using unzip -o SampleOdf.odt
 Erreur dans odfWeave(file.in, file.out) : Error unzipping file
 De plus : Warning message:
 In system(zipCmd[2], invisible = TRUE) : unzip introuvable

I get this question about twice a week.

Look at the help file. ?odfWeave says:

Since ODF files are compressed archives of files and directories, R
will need to zip and unzip the source file. While R has an unzip
utility, it does not have one for re-zipping files, so an external
application is needed. unzip and zip are free utilities located at

http://www.info-zip.org/;

 Then, I have a second question, which totally independent from the one above 
 (I have an error in both cases...) : is RepportOdf.odt automatically 
 generated/created or should I create the document before ?

You will need to create an odt document with the R code in it, then

   odfWeave(inFile, outFile)

will take the input odt file and produce an output odt file. Please see:

   http://www.r-project.org/conferences/useR-2007/program/presentations/kuhn.pdf

for an idea of how it works.

And read the posting guide.

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bias correction for random forests?

2009-03-11 Thread Zhou Fang

Hi,

Way back in 2004, an update to randomForest added an option 'corr.bias'. 
The explanation was a bit vague, but it turns out it improves RF's 
predictive fit with my data substantially. But I am having trouble 
understanding it.


Does anyone know what this 'bias correction' actually does? Or what the 
justification for it is? Or when it would be necessary? Is there a paper 
I can look at?


And is the feature likely to emerge from 'experimental' any time soon?

Zhou Fang

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Easy Recall to get ls(..., all.names=TRUE)?

2009-03-11 Thread Szumiloski, John
Dear useRs,

I have a utility function which is meant to be a clone of ls(), except
with the option all.names=TRUE.  Currently however, the function merely
consists of a copy of the source code of ls(), except the default value
of all.names is different.  That approach has the drawback of future
inconsistency if the code for ls() ever changes.  No comment on whether
that is likely or not; I would and do object to the current construction
in principle.

What I would like to do is rewrite the new function so that it does very
minimal processing of its arguments, and then calls ls() with the new
arguments, somewhat in the same spirit of the Recall() function.  

A challenge to me has been twofold, that this new ls() call has a
different search path, and that ls() itself has a good deal of lazy
evaluation (including possibly twice in one line, around line 11) in it.
Getting my new function to work with all permutations of arguments,
without merely copying the code from ls(), has been futile.  

I am sure I could manually enumerate the behavior for all permutations
of all the arguments to ls(), and code each case individually.  I am
also sure I will not waste my time doing that.  I was hoping, however,
that there was some simple trick to allow this easily, one that I have
missed.  Even an explanation of why this might be a fool's errand would
be quite helpful to my understanding of the intricacies of R evaluation.


Many thanks, 
John
John  Szumiloski,  Ph.D.

Senior Biometrician
Biometrics Research
WP53B-120
Merck Research Laboratories
P.O. Box 0004
West Point, PA 19486-0004

 (215) 652-7346 (PH)
 (215) 993-1835 (FAX)
 
Notice:  This e-mail message, together with any attachme...{{dropped:15}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] About the merge

2009-03-11 Thread Tammy Ma

hi, All.

 DateDtime Hour Min Second Rep App_dur
9  2006-02-22 14:36:11   14  36 11   4   1
10 2006-02-22 14:36:12   14  36 12   3  86
11 2006-02-22 14:37:38   14  37 38   0  58
14 2006-02-22 14:38:36   14  38 36   3   1
15 2006-02-22 14:38:37   14  38 37   4   1
16 2006-02-22 14:38:38   14  38 38   1   9
18 2006-02-22 14:38:47   14  38 47   0   3
20 2006-02-22 14:38:50   14  38 50   1   1
21 2006-02-22 14:38:51   14  38 51   4   2   ***
23 2006-02-22 14:38:53   14  38 53   4  39  ***
25 2006-02-22 14:39:32   14  39 32   3   1
26 2006-02-22 14:39:33   14  39 33   4   1
27 2006-02-22 14:39:34   14  39 34   3   8
28 2006-02-22 14:39:42   14  39 42   4  62

How to write a program to merge Rep with  consecutive equal numbers to one of 
that number and sum up App_dur??

on the above data frame, i want to merge
21 2006-02-22 14:38:51   14  38 51   4   2   ***

23 2006-02-22 14:38:53   14  38 53   4  39  ***

to 2006-02-22 14:38:51   14  38 51   4   41 

and represent original two *** by the new one like:
9  2006-02-22 14:36:11   14  36 11   4   1

10 2006-02-22 14:36:12   14  36 12   3  86

11 2006-02-22 14:37:38   14  37 38   0  58

14 2006-02-22 14:38:36   14  38 36   3   1

15 2006-02-22 14:38:37   14  38 37   4   1

16 2006-02-22 14:38:38   14  38 38   1   9

18 2006-02-22 14:38:47   14  38 47   0   3

20 2006-02-22 14:38:50   14  38 50   1   1
2006-02-22 14:38:51   14  38 51   4   41 ***
25 2006-02-22 14:39:32   14  39 32   3   1

26 2006-02-22 14:39:33   14  39 33   4   1

27 2006-02-22 14:39:34   14  39 34   3   8

28 2006-02-22 14:39:42   14  39 42   4  62

??
Thanks.

Tammy



_
Drag n’ drop—Get easy photo sharing with Windows Live™ Photos.

http://www.microsoft.com/windows/windowslive/products/photos.aspx
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Is there any difference between - and =

2009-03-11 Thread Sean Zhang
Dear R-helpers:

I have a question related to - and =.

I saw very experienced R programmers use = rather than - quite
consistently.
However, I heard from others that do not use = but always stick to - when
assigning valuese.

I personally like = because I was using Matabl, But, would like to receive
expert opinion to avoid potential trouble.

Many thanks in advance.

-Sean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread soeren . vogel
Thanks to Peter, David, and Michael! After having corrected the coding  
error, the p values converge to particular value, not necessarily  
zero. The whole story is, 634 respondents in 6 different areas marked  
their answer on a 7-step Likert scale (very bad, bad, ..., very good  
-- later recoded to 5 scale levels). The statistical question now is,  
do the answer's distributions (amount of goods, bads etc.) in either  
area differ from the mean answer-distribution calculated with  
summing up all goods, bads, etc. Anyway an omnibus chi square would  
not answer my question, and due to spurious significances I'd rather  
go back to my chi square book ;-) (for the interested, see http://sozmod.eawag.ch/files/file.Robj 
 for the entire table).


Thanks for your help

Sören

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed models fixed effects

2009-03-11 Thread Emma Stone

Dear All,

This may sound like a dumb question but I am trying to use a mixed model to 
determine the predictors of bat activity along hedges within 8 sites. So my 
response is continuous (bat passes) my predictors fixed effects are 
continuous (height metres), width (metres) etc and the random effect is 
site  - can you tell me if the fixed effects can be continuous as all the 
examples I have read show them as categorical, but this is not covered in 
any documents I can find.


Help!

Emma

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed models fixed effects

2009-03-11 Thread ONKELINX, Thierry
Hi Emma,

Continuous predictors are no problem at all. You can mix both continuous
and categorial predictors if needed. I suppose your response are counts
(the number of bats that passes)? In that case a generalised linear
mixed model is more appropriate. With the lme4 package you could try
something like this:

library(lme4)
Model - glmer(BatPasses ~ Width + Height + (1|Site), family = poisson)

HTH,

Thierry

PS There is a mailing list dedicated to mixed models: R-Sig-MixedModels


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens Emma Stone
Verzonden: woensdag 11 maart 2009 15:29
Aan: r-help@r-project.org
Onderwerp: Re: [R] Mixed models fixed effects

Dear All,

This may sound like a dumb question but I am trying to use a mixed model
to 
determine the predictors of bat activity along hedges within 8 sites. So
my 
response is continuous (bat passes) my predictors fixed effects are 
continuous (height metres), width (metres) etc and the random effect is 
site  - can you tell me if the fixed effects can be continuous as all
the 
examples I have read show them as categorical, but this is not covered
in 
any documents I can find.

Help!

Emma

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there any difference between - and =

2009-03-11 Thread Duncan Murdoch

On 3/11/2009 10:18 AM, Sean Zhang wrote:

Dear R-helpers:

I have a question related to - and =.

I saw very experienced R programmers use = rather than - quite
consistently.
However, I heard from others that do not use = but always stick to - when
assigning valuese.

I personally like = because I was using Matabl, But, would like to receive
expert opinion to avoid potential trouble.


Use - for assignment, and = for function arguments.  Then the 
difference between


 f( a = 3 )
 f( a - 3 )

is clear, and you won't be surprised that a gets changed in the second 
case.  If you use = for assignment, the two lines above will be written as


 f( a = 3 )
 f( ( a = 3 ) )

and it is very easy to miss the crucial difference between them.

Duncan Murdoch



Many thanks in advance.

-Sean

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with rfImpute (package randomForest)

2009-03-11 Thread lionel fugon

Hello everybody,

this is my first request about R so I am sorry if I send it to a bad mail or if 
I am not very clear.

So my problem is about the use of rfImpute from randomForest package. I am 
interested in imputations of missing values and I read that randomForest can 
make it. So i write the following code :

set.seed(100);
library(mlbench)
library(randomForest)
data(BreastCancer)
summary(BreastCancer)
data=BreastCancer[,-1]
data=data[!is.na(data[,Bare.nuclei]),]
summary(data)


is.factor(data$Cl.thickness)# OK


##selection of missing values##
x=1:nrow(data)
sample1=sample(x,70)
sample3=sample(x,70)
sample5=sample(x,70)


##replace by missing values#
data_missing=data
data_missing[sample1,1]=NA
data_missing[sample3,3]=NA
data_missing[sample5,5]=NA
summary(data_missing)


is.factor(data_missing$Cl.thickness)# OK


imputation by random forest
data_imputed - rfImpute(Class ~ .,data_missing,iter=5,ntree=1000)


is.factor(data_imputed$Cl.thickness)# Not OK



And as you can see, rfImpute change the type of one explanatory variable. 
Before imputation, it was a factor. After it becomes a quantitative variable. 
So I don't understand what it happens. Maybe I should add an option in 
rfImpute...
If someone could help me to understand.

Thank you very much




_

? Lancez-vous !

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to rotate the histogram

2009-03-11 Thread Abelian
Dear all
I want to combine two figures on one plot. First figure display the
data distribution. The x-axis of first figure is individual's ranking
and the y-axis of first figure is probability. However, the second
figure is a histogram about probability. In order to make the figure
more clearly, i want to put those two figure together. But i have a
problem that i can not rotate the histogram.Because i want put the x-
axis of histogram on the left side of first figure.
Can anyone provide some way to solve this problem.
Sinerely

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RServe

2009-03-11 Thread Maxl18

Dear all,

I'm trying to use R-functions from Java. I want to use the Package Rserve.
Everthing is installed and I can run my Java-testprogramm.

...
Rconnection r = new Rconnection();
double[] d = r.eval(rnorm(10)).asDoubleArray();
...

But now I get always arrays with ten 0.0, which is not correct

I suppose, that is is a problem from R, because at the fist time (last week)
I started RServe I got an icon in my taskbar which was something about a R
server. But I never got it since then...
I tried to reinstall R, but it didn't help.

Where is teh problem and what can I do?

Thanks a lot, Max
-- 
View this message in context: 
http://www.nabble.com/RServe-tp22452051p22452051.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combining math and variables in expression

2009-03-11 Thread socrates

I am trying to get the following line in a plot margin using mtext:

100% Area = 120.000 km^2

Where I intend that 100% Area = is text, 120.000 is a number that varies
according to different data, and km^2 should be a neat
km-with-superscript-2. 
The expression function fails me, since it apparently cannot coerce an
expression when a variable is involved. 
In the R help file there is an example using either bquote or substitute to
accomplish like things, but these are sufficiently different from my
objective and I am becoming more and more confused by every further example
I stumble on. 
This should be easy. Can anyone help me prevent spending anoter 2.5 hours on
the matter? Thanks.
-- 
View this message in context: 
http://www.nabble.com/Combining-math-and-variables-in-expression-tp22455877p22455877.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a directory for my data

2009-03-11 Thread miya

Sorry, I did forget to add what it is I want to get. I wish to get a result
like the following:

Legend:
ID   Title
1 Article A
2 Article B
3 Article C
4 Article D
...
i  Article N


-- 
View this message in context: 
http://www.nabble.com/Creating-a-directory-for-my-data-tp22448335p22455535.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding text and other elements to ggplot2 plots

2009-03-11 Thread hadley wickham
Have a look at the annotations section of
http://had.co.nz/ggplot2/book/toolbox.pdf

Hadley

On Wed, Mar 11, 2009 at 8:44 AM, levyofi levy...@post.tau.ac.il wrote:

 Hello,
 I really like the interface and flexibility of the ggplots package. However,
 I cannot find how to add text to a plot (like the 'text' and 'rect'
 functions in the graphics package). I will appreciate any suggestions.
 Thanks,
 Ofir.
 --
 View this message in context: 
 http://www.nabble.com/adding-text-and-other-elements-to-ggplot2-plots-tp22455395p22455395.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed models fixed effects

2009-03-11 Thread Simon Pickett

Also check out these pdfs
http://cran.r-project.org/other-docs.html

and try to get your hands on the bible
http://www.amazon.co.uk/R-Book-Michael-J-Crawley/dp/0470510242

Simon.






Hi Emma,

Continuous predictors are no problem at all. You can mix both continuous
and categorial predictors if needed. I suppose your response are counts
(the number of bats that passes)? In that case a generalised linear
mixed model is more appropriate. With the lme4 package you could try
something like this:

library(lme4)
Model - glmer(BatPasses ~ Width + Height + (1|Site), family = poisson)

HTH,

Thierry

PS There is a mailing list dedicated to mixed models: R-Sig-MixedModels


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens Emma Stone
Verzonden: woensdag 11 maart 2009 15:29
Aan: r-help@r-project.org
Onderwerp: Re: [R] Mixed models fixed effects

Dear All,

This may sound like a dumb question but I am trying to use a mixed model
to
determine the predictors of bat activity along hedges within 8 sites. So
my
response is continuous (bat passes) my predictors fixed effects are
continuous (height metres), width (metres) etc and the random effect is
site  - can you tell me if the fixed effects can be continuous as all
the
examples I have read show them as categorical, but this is not covered
in
any documents I can find.

Help!

Emma

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver 
weer
en binden het INBO onder geen enkel beding, zolang dit bericht niet 
bevestigd is

door een geldig ondertekend document. The views expressed in  this message
and any annex are purely those of the writer and may not be regarded as 
stating
an official position of INBO, as long as the message is not confirmed by a 
duly

signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RES: North Arrow (.png file) on a Map

2009-03-11 Thread Greg Snow
Another possibility is the my.symbols function in the TeachingDemos package.  
You can define a polygon of your arrow (or other symbol), then use my.symbols 
to add it to an existing graph, the advantage of my.symbols is that the size of 
the arrow is absolute, not in terms of the scale of the original plot.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Rodrigo Aluizio
 Sent: Wednesday, March 11, 2009 4:22 AM
 To: 'Yihui Xie'
 Cc: R Help
 Subject: [R] RES: North Arrow (.png file) on a Map
 
 Thanks a lot Yihui, It's perfect, in fact exactly identical the I have
 as a
 png file in my computer.
 Thank you so much.
 
 Regards
 Rodrigo
 
 -Mensagem original-
 De: Yihui Xie [mailto:xieyi...@gmail.com]
 Enviada em: quarta-feira, 11 de março de 2009 02:48
 Para: Rodrigo Aluizio
 Cc: R Help
 Assunto: Re: [R] North Arrow (.png file) on a Map
 
 Is this arrow satisfactory for you?
 
 north.arrow = function(x, y, h) {
 polygon(c(x, x, x + h/2), c(y - h, y, y - (1 + sqrt(3)/2) * h),
 col = black, border = NA)
 polygon(c(x, x + h/2, x, x - h/2), c(y - h, y - (1 + sqrt(3)/2) *
 h, y, y - (1 + sqrt(3)/2) * h))
 text(x, y, N, adj = c(0.5, 0), cex = 4)
 }
 plot(1, type = n, ylim = c(0, 1))
 north.arrow(1, 0.8, 0.3)
 
 Regards,
 Yihui
 --
 Yihui Xie xieyi...@gmail.com
 Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
 Mobile: +86-15810805877
 Homepage: http://www.yihui.name
 School of Statistics, Room 1037, Mingde Main Building,
 Renmin University of China, Beijing, 100872, China
 
 
 
 On Tue, Mar 10, 2009 at 7:21 PM, Rodrigo Aluizio r.alui...@gmail.com
 wrote:
  Hi list.
 
  I would like to know how do I insert a North arrow, stored as a png
 file
 in
  my computer, in a map? I found lots of post asking similar things,
 one of
  them mentioned the pixmap package.  The map was done using map() and
  shapefiles (the code is below). I'm using the pixmap () and addlogo()
  functions. Well I can import the png with pixmap() function (I guess,
 once
  there's no error message), but I can't put It on the map, I got an
 error
  message telling me that:
 
 
 
  Error at t(x...@index[nrow(x...@index):1, , drop = FALSE]) :
 
   index out of limits
 
 
 
  Well I tried changing coordinates but I always got the same result.
 How do
 I
  do this correctly? Is there a better way?
 
 
 
  Thanks for the help and attention.
 
 
 
  Here is the complete map script:
 
 
 
  library(RODBC)
 
  library(maps)
 
  library(mapdata)
 
  library(maptools)
 
  library(pixmap)
 
  #Carregar Coordenadas e dados dos Pontos Amostrais
 
  Dados-odbcConnectExcel('Campos.xls',readOnly=T)
 
  Coord-sqlFetch(Dados,'CoordMed',colnames=F,rownames='Ponto')
 
  odbcClose(Dados)
 
  N-pixmap('Norte.png',nrow=166,ncol=113)
 
  # Carregar pontos e shapes
 
 
 Batimetria-
 readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontol
  ogia/Campos/ShapeFiles/Batimetria_BC.shp')
 
 
 Estados-
 readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontologi
  a/Campos/ShapeFiles/Estados_Sudeste.shp')
 
 
 Faciologia-
 readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontol
  ogia/Campos/ShapeFiles/Faciologia_BC.shp')
 
  # Mapa com os Pontos da Bacia
 
 
 postscript('MapaCampos.eps',paper='special',onefile=F,horizontal=F,widt
 h=3.5
  ,height=4.5,bg='white',pointsize=3)
 
  par(mar=c(3,2,2,0))
 
 
 map('worldHires','brazil',ylim=c(23.9,20.3),xlim=c(42.1,39.2),type='n')
 
 
 plot(Faciologia,ylab='',xlab='',col=c('lightgreen','lightgreen','lightg
 reen'
 
 ,'lightgreen','lightgreen','lightgray','lightgray','lightgray','lightgr
 ay','
 
 lightgray','lightgray','lightgray','lightgray','lightgray','lightgray',
 'ligh
 
 tgray','lightgray','lightgray','lightgray','lightgray','lightgray','lig
 htyel
  low','lightyellow','lightyellow'),add=T,lwd=0.5,border=0)
 
 
 plot(Batimetria,ylab='',xlab='',col='darkgray',lty='solid',lwd=0.2,add=
 T)
 
  plot(Estados,ylab='',xlab='',lty='solid',add=T,lwd=0.8)
 
 
 text(Coord$Longitude[Coord$Réplicas=='1'],Coord$Latitude[Coord$Réplicas
 =='1'
  ],rownames(Coord)[Coord$Réplicas=='1'],col='red',cex=0.5,font=2)
 
 
 text(Coord$Longitude[Coord$Réplicas=='2'],Coord$Latitude[Coord$Réplicas
 =='2'
  ],rownames(Coord)[Coord$Réplicas=='2'],col='yellow',cex=0.5,font=2)
 
 
 text(Coord$Longitude[Coord$Réplicas=='3'],Coord$Latitude[Coord$Réplicas
 =='3'
  ],rownames(Coord)[Coord$Réplicas=='3'],col='blue',cex=0.5,font=2)
 
  points(Coord$Longitude,Coord$Latitude-0.045,pch=20,cex=0.7)
 
  text(c(41.5,41.3),c(21.7,20.6),c('RJ','ES'))
 
  axis(1,xaxp=c(42.1,39.2,2),cex.axis=1)
 
  axis(2,yaxp=c(23.9,20.3,4),cex.axis=1)
 
  title(main='Bacia')
 
 
 legend(40.2,23.5,c('Uma','Duas','Três'),pch=21,cex=1,pt.bg=c('red','yel
 low',
  'blue'),bty='n',pt.cex=2,pt.lwd=0.6,title='Réplicas')
 
 
 

[R] OT: Likelihood ratio for the randomization/permutation test?

2009-03-11 Thread Mike Lawrence
Hi guRus,

My discipline (experimental psychology) is gradually moving away from
Null Hypothesis Testing and towards measures of evidence. One measure
of evidence that has been popular of late is the likelihood ratio.
Glover  Dixon (2005) demonstrate the calculation of the likelihood
ratio from ANOVA tables, but I'm also interested in non-parametric
statistics and wonder if anyone has any ideas on how to compute a
likelihood ratio from a randomization test (aka. permutation test)?

Say one had two groups and were interested in whether the mean scores
of the two groups differ in a manner consistent with random chance or
in a manner consistent with a non-null effect of some manipulation
applied to the two groups. The randomization test addresses this by
randomly re-assigning the participants to the groups, re-computing the
difference between means, and repeating many times, yielding a
distribution of simulated difference scores that represents the
distribution expected by chance.

Within a Null Hypothesis Testing framework you then estimate the
probability of the null by observing the proportion of simulated
difference scores that are greater in magnitude than the observed
difference score. Any guesses on how to translate this into a
quantification of evidence?

Mike

-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tinyurl.com/mikes-public-calendar

~ Certainty is folly... I think. ~

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there any difference between - and =

2009-03-11 Thread David M Smith
On Wed, Mar 11, 2009 at 7:18 AM, Sean Zhang seane...@gmail.com wrote:
 Dear R-helpers:

 I have a question related to - and =.

 I saw very experienced R programmers use = rather than - quite
 consistently.
 However, I heard from others that do not use = but always stick to - when
 assigning valuese.

 I personally like = because I was using Matabl, But, would like to receive
 expert opinion to avoid potential trouble.

 Many thanks in advance.

 -Sean

The short answer is that - is used for assignment, and = is used to
associate function arguments with their values.  You can use = instead
of - for assignment too (in most contexts), but the converse isn't
true.

I've provided more detail about when you can and can't exchange the
two operators (and some of the history about the operators themselves)
in this blog post:

http://blog.revolution-computing.com/2008/12/use-equals-or-arrow-for-assignment.html

# David Smith

-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help on MLInterfaces

2009-03-11 Thread Martin Morgan
Hi Tulgan --

MLInterfaces is a Bioconductor package, so please ask on the Bioc
mailing list (the package maintainer is more likely to read that list)

  https://stat.ethz.ch/mailman/listinfo/bioconductor

Martin

Tul Gan tul...@ymail.com writes:

 Hi,
   I am trying to use MLearn in MLInterfaces package to do 
 randomforest, clustering, knn etc. How do I predict on a test set for which I 
 do not know the classes? My training set has two classes.

 Thanks,

 Tulgan


   __
 Yahoo! Canada Toolbar: Search from anywhere on the web, and bookmark your 
 favourite sites. Download it now at
 http://ca.toolbar.yahoo.com.
   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] compile under Ubuntu 8.10 (Ibex)

2009-03-11 Thread Terry Therneau

 I fired up a new machine last night and loaded Ubuntu 8.10 on it.  I then had 
to add in the usual compiler stuff which is not loaded by default: make, emacs, 
fortran, c++, etc.
 
 I'm having trouble compiling R 2.8 however.  I get a message that it cannot 
figure out how to link f77 and C.  I've pulled in gfortran.  A google search 
on fortran ubuntu 8.10 shows various problems with the gcc switch from 
fortran77 to fortran95 --- also true for R?
 
 Any hints are welcome, including don't use 8.10 (thought that will be an 
evening of work).
 
Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using a NAMESPACE or the Imports field in DESCRIPTION?

2009-03-11 Thread Wolfgang Koller

Hello list,

I am writing a package which builds on a function (foo) of another 
package. I only need that particular function. What is the state of 
the art to do this? Do I need a NAMESPACE file in my package? 
Currently I still don't have one. Or can I do this with the Imports 
field in my DESCRIPTION file? Reading chapter 1 of the Writing R 
Extensions manual I do not completely understand for what situations 
these two alternatives are intended for. The manual says that with 
the Imports field of DESCRIPTION I can import the name space of 
another package. What does that mean? What are the differences 
between the two alternatives? Are there consequences for how I can 
access function foo within the code of my package?


Thanks for some clarifications and recommendations!

Wolfgang Koller

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Two new R courses in April (US)

2009-03-11 Thread David M Smith
REvolution Computing is hosting two new R courses in New York City and
Chicago in April. Short descriptions are below, but further
information and registration details can be found in the links at:

http://blog.revolution-computing.com/2009/03/r-courses-from-revolution-april.html

April 1 (New York City): What's All The Fuss About R? An Introduction
to R Using Real-world Examples.
This one-day, hands-on basic R training provides an introduction to R
through the quantitative exploration of real-world problems. We’ll
begin by using R for exploratory data analysis and statistical
inference; specific topics include graphical methods (both base and
grid graphics) and hypothesis testing via parametric and
non-parametric methods. This introduction motivates a more formal
presentation of the R language, including types of objects,
reading/writing data, functions, loops, and control statements. Other
strengths of R will be addressed, including statistical modeling,
classes and methods, the package management system, and the interface
to C/C++ and Fortran.

April 23 (Chicago): High-Performance Computing in R.
High-performance computing (HPC) technology is more available and
affordable than ever before. Productively using HPC computing has
remained surprisingly difficult, particularly with high-level scripted
languages like R. Fortunately, that is changing. The last few years
have seen significant gains in high-quality HPC computing-related
packages for R including RMPI, SNOW, ParallelR and many more. This
one-day course will present an overview of available HPC technologies
for the R language, and will demonstrate each technology with simple
examples that can be used as starting points for more sophisticated
work. (Ideal for attendees of R/Finance 2009.)

# David Smith

-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining math and variables in expression

2009-03-11 Thread Greg Snow
 tmp - '120.000'
 mtext(bquote( '100% Area' == .(tmp)~km^2 ))

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of socrates
 Sent: Wednesday, March 11, 2009 8:08 AM
 To: r-help@r-project.org
 Subject: [R] Combining math and variables in expression
 
 
 I am trying to get the following line in a plot margin using mtext:
 
 100% Area = 120.000 km^2
 
 Where I intend that 100% Area = is text, 120.000 is a number that
 varies
 according to different data, and km^2 should be a neat
 km-with-superscript-2.
 The expression function fails me, since it apparently cannot coerce an
 expression when a variable is involved.
 In the R help file there is an example using either bquote or
 substitute to
 accomplish like things, but these are sufficiently different from my
 objective and I am becoming more and more confused by every further
 example
 I stumble on.
 This should be easy. Can anyone help me prevent spending anoter 2.5
 hours on
 the matter? Thanks.
 --
 View this message in context: http://www.nabble.com/Combining-math-and-
 variables-in-expression-tp22455877p22455877.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using a NAMESPACE or the Imports field in DESCRIPTION?

2009-03-11 Thread Martin Morgan

Hi Wolfgang --

Wolfgang Koller wrote:

Hello list,

I am writing a package which builds on a function (foo) of another 
package. I only need that particular function. What is the state of the 
art to do this? Do I need a NAMESPACE file in my package? Currently I 
still don't have one. Or can I do this with the Imports field in my 
DESCRIPTION file? Reading chapter 1 of the Writing R Extensions manual 
I do not completely understand for what situations these two 
alternatives are intended for. The manual says that with the Imports 
field of DESCRIPTION I can import the name space of another package. 
What does that mean? What are the differences between the two 
alternatives? Are there consequences for how I can access function foo 
within the code of my package?


It sounds like you want to add Imports: OtherPackage to DESCRIPTION, and

importFrom(OtherPackage, foo)

in you NAMESPACE file, in addition perhaps to

export(foo)

if you'd like users of your package to access foo. Using a NAMESPACE can 
be very beneficial, especially as your software grows in complexity -- 
it ensures ready access to the symbols you want (foo, in this case) 
without relying on the structure of the user search path. I don't think 
adding Imports to DESCRIPTION and adding a NAMESPACE should really be 
viewed as 'alternatives' -- add a NAMESPACE, and many of the packages 
that you had previously listed in Depends: likely belong more 
appropriately in Imports.


Martin


Thanks for some clarifications and recommendations!

Wolfgang Koller

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining math and variables in expression

2009-03-11 Thread David Winsemius

Dear socrates;

It did not take me 2.5 hours but it did take longer than I thought it  
would. I worked off the example given by Henrique Dallazuanna seen at:

http://finzi.psych.upenn.edu/R/Rhelp08/archive/120065.html

 This seems to work:

mtext(bquote(100~% ~ Area == ~ .(text.val)~ Km^2), side=3)

I cannot say the the syntactic rules are entirely clear. The proper  
placement of the tildes is a bit mysterious to me. I am wondering if a  
solution using substitute(expression(.)) might be more clear but my  
efforts in that direction were not successful.


--

David

On Mar 11, 2009, at 10:08 AM, socrates wrote:



I am trying to get the following line in a plot margin using mtext:

100% Area = 120.000 km^2

Where I intend that 100% Area = is text, 120.000 is a number that  
varies

according to different data, and km^2 should be a neat
km-with-superscript-2.
The expression function fails me, since it apparently cannot coerce an
expression when a variable is involved.
In the R help file there is an example using either bquote or  
substitute to

accomplish like things, but these are sufficiently different from my
objective and I am becoming more and more confused by every further  
example

I stumble on.
This should be easy. Can anyone help me prevent spending anoter 2.5  
hours on

the matter? Thanks.
--
View this message in context: 
http://www.nabble.com/Combining-math-and-variables-in-expression-tp22455877p22455877.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to check the model adequacy for a cox proportional hazard model fit?

2009-03-11 Thread li li
Anybody has any suggestion? Thanks!


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about datatypes/plotting issue

2009-03-11 Thread Oscar Bonilla

David,

I struggled with this for a while. I think the problem with the dates  
I have is that they are not specific dates, they are partial dates.  
A workaround for that that I got from someone else in the list was:


as.Date(paste(x$Date, '1'), '%B %Y %d')

to make them specific dates (the first of the month).

Cheers,

-Oscar

On Mar 10, 2009, at 7:58 PM, David Winsemius wrote:

You need to convert W$Date into a real date variable. At the moment  
it is just a character variable.


 str(W)
'data.frame':   265 obs. of  23 variables:
$ Date: Factor w/ 265 levels  ,April 1987,..: 1 90  
68 156 2 178 134 ...
$ AZ.Phoenix  : Factor w/ 236 levels ,100.00,100.43,..:  
236 1 1 1 1 1 1 1 1 1 ...
$ CA.Los.Angeles  : Factor w/ 260 levels 100.00,100.02,..: 260  
113 114 115 116 ...
$ CA.San.Diego: Factor w/ 261 levels 100.00,101.07,..: 261  
109 110 111 112 ...
$ CA.San.Francisco: Factor w/ 256 levels 100.00,102.70,..: 256  
108 109 110 111 ...

.(output trimmed)
.
.
?Date  # not the variable name, the R class name
?format.Date
?strptime

Unfortunately I seem to be at one of the many limits to my knowledge:
This code behaves in the manner I expected:

 format(Sys.time(), %a %b %d %X %Y %Z)
[1] Tue Mar 10 22:19:28 2009 EDT
 strptime(format(Sys.time(), %a %b %d %X %Y %Z), format=%a %b %d  
%X %Y %Z)

[1] 2009-03-10 22:20:04

Whereas this code does not:
 format(Sys.Date(), %B %Y)
[1] March 2009
 as.Date(format(Sys.Date(), %B %Y), %B %Y)
# would have assumed one was the inverse of the other, but ...

[1] NA

For some reason I cannot get the space delimited Month- combo to  
convert. I can getother  space delimited formats to work for input  
or output:

 as.Date(03 1998, %M %Y)
[1] 1998-03-10

 format(Sys.Date(), %B %Y)
[1] March 2009

Puzzled;
--
David Winsemius

On Mar 10, 2009, at 9:15 PM, Oscar Bonilla wrote:


Hi,

I am trying to plot the Case-Shiller index found at: 
http://www2.standardandpoors.com/spf/pdf/index/CSHomePrice_History_022445.xls

The way I'm importing it into R is as follows:

library(gdata)
	W - read.xls(http://www2.standardandpoors.com/spf/pdf/index/CSHomePrice_History_022445.xls 
, header=TRUE)

attach(W)

To give you and idea of what the data looks like:

 head(W)
 Date AZ.Phoenix CA.Los.Angeles CA.San.Diego CA.San.Francisco
1 PHXR   LXXR SDXR  
SFXR
2  January 1987 59.3354.67 
46.61
3 February 1987 59.6554.89 
46.87
4March 1987 59.9955.16 
47.32
5April 1987 60.8155.85 
47.69
6  May 1987 61.6756.35 
48.31
CO.Denver DC.Washington FL.Miami FL.Tampa GA.Atlanta IL.Chicago  
MA.Boston
1  DNXR  WDXR MIXR TPXR   ATXR
CHXR  BOXR
2 50.20 64.1168.5077.33  
53.55 70.04
3 49.96 64.7768.7677.93  
54.64 70.08
4 50.15 65.7169.2377.76  
54.80 70.00
5 50.55 66.4069.2077.56  
54.88 70.70
6 50.63 67.2769.4677.85  
55.43 71.51
MI.Detroit MN.Minneapolis NC.Charlotte NV.Las.Vegas NY.New.York  
OH.Cleveland
1   DEXR   MNXR CRXR LVXR 
NYXR CEXR
2  63.3966.36
74.4253.53
3  63.9467.03
75.4353.50
4  64.1767.34
76.2553.68
5  64.8167.88
77.3453.75
6  65.1867.90
79.1654.71

OR.Portland TX.Dallas WA.Seattle Composite.10 Composite.20
1POXR  DAXR   SEXR CSXR SPCS20R
2   41.05 62.82
3   41.28 63.39
4   41.06 63.87
5   40.96 64.57
6   41.24 65.56


Now on to the problem... if I just run

plot(CA.San.Francisco ~ Date)

I get:
pastedGraphic.png

Which I suspect is a problem because the Date column is not really  
a Date, it is a factor


 class(Date)
[1] factor

If I run:
plot(as.numeric(CA.San.Francisco), type=l)

I get:

pastedGraphic.png


which is wrong, as CA.San.Francisco has no such discontinuity.

 CA.San.Francisco
[1] SFXR   46.61  46.87  47.32  47.69  48.31  48.83  49.49  49.94   
50.69
[11] 51.33  51.80  52.03  52.24  52.64  53.19  54.19  56.09  58.22   
58.70
[21] 59.00  59.50  60.37  61.31  62.20  62.66  63.32  64.64  66.27   
67.77
[31] 69.26  70.27  71.36  72.31  72.95  73.25  73.02  72.87  72.95   
73.50
[41] 74.57  75.12  75.15  74.81  74.45  

Re: [R] Question about datatypes/plotting issue

2009-03-11 Thread Gabor Grothendieck
Another approach is to use yearmon. e.g.

 library(zoo)
 as.yearmon(January 1987, %B %Y)
[1] Jan 1987

Thus we can replace the w - line in my example code with:

w - zoo(as.matrix(W[-1]), as.yearmon(W[,1], %B %Y))

On Wed, Mar 11, 2009 at 12:48 PM, Oscar Bonilla oboni...@galileo.edu wrote:
 David,

 I struggled with this for a while. I think the problem with the dates I have
 is that they are not specific dates, they are partial dates. A workaround
 for that that I got from someone else in the list was:

 as.Date(paste(x$Date, '1'), '%B %Y %d')

 to make them specific dates (the first of the month).

 Cheers,

 -Oscar

 On Mar 10, 2009, at 7:58 PM, David Winsemius wrote:

 You need to convert W$Date into a real date variable. At the moment it is
 just a character variable.

  str(W)
 'data.frame':   265 obs. of  23 variables:
 $ Date            : Factor w/ 265 levels  ,April 1987,..: 1 90 68 156
 2 178 134 ...
 $ AZ.Phoenix      : Factor w/ 236 levels ,100.00,100.43,..: 236 1 1
 1 1 1 1 1 1 1 ...
 $ CA.Los.Angeles  : Factor w/ 260 levels 100.00,100.02,..: 260 113 114
 115 116 ...
 $ CA.San.Diego    : Factor w/ 261 levels 100.00,101.07,..: 261 109 110
 111 112 ...
 $ CA.San.Francisco: Factor w/ 256 levels 100.00,102.70,..: 256 108 109
 110 111 ...
 .(output trimmed)
 .
 .
 ?Date  # not the variable name, the R class name
 ?format.Date
 ?strptime

 Unfortunately I seem to be at one of the many limits to my knowledge:
 This code behaves in the manner I expected:

  format(Sys.time(), %a %b %d %X %Y %Z)
 [1] Tue Mar 10 22:19:28 2009 EDT
  strptime(format(Sys.time(), %a %b %d %X %Y %Z), format=%a %b %d %X %Y
  %Z)
 [1] 2009-03-10 22:20:04

 Whereas this code does not:
  format(Sys.Date(), %B %Y)
 [1] March 2009
  as.Date(format(Sys.Date(), %B %Y), %B %Y)
 # would have assumed one was the inverse of the other, but ...

 [1] NA

 For some reason I cannot get the space delimited Month- combo to
 convert. I can getother  space delimited formats to work for input or
 output:
  as.Date(03 1998, %M %Y)
 [1] 1998-03-10

  format(Sys.Date(), %B %Y)
 [1] March 2009

 Puzzled;
 --
 David Winsemius

 On Mar 10, 2009, at 9:15 PM, Oscar Bonilla wrote:

 Hi,

 I am trying to plot the Case-Shiller index found at:
 http://www2.standardandpoors.com/spf/pdf/index/CSHomePrice_History_022445.xls

 The way I'm importing it into R is as follows:

        library(gdata)
        W -
 read.xls(http://www2.standardandpoors.com/spf/pdf/index/CSHomePrice_History_022445.xls;,
 header=TRUE)
        attach(W)

 To give you and idea of what the data looks like:

  head(W)
         Date AZ.Phoenix CA.Los.Angeles CA.San.Diego CA.San.Francisco
 1                     PHXR           LXXR         SDXR             SFXR
 2  January 1987                     59.33        54.67            46.61
 3 February 1987                     59.65        54.89            46.87
 4    March 1987                     59.99        55.16            47.32
 5    April 1987                     60.81        55.85            47.69
 6      May 1987                     61.67        56.35            48.31
 CO.Denver DC.Washington FL.Miami FL.Tampa GA.Atlanta IL.Chicago MA.Boston
 1      DNXR          WDXR     MIXR     TPXR       ATXR       CHXR
  BOXR
 2     50.20         64.11    68.50    77.33                 53.55
 70.04
 3     49.96         64.77    68.76    77.93                 54.64
 70.08
 4     50.15         65.71    69.23    77.76                 54.80
 70.00
 5     50.55         66.40    69.20    77.56                 54.88
 70.70
 6     50.63         67.27    69.46    77.85                 55.43
 71.51
 MI.Detroit MN.Minneapolis NC.Charlotte NV.Las.Vegas NY.New.York
 OH.Cleveland
 1       DEXR           MNXR         CRXR         LVXR        NYXR
 CEXR
 2                                  63.39        66.36       74.42
  53.53
 3                                  63.94        67.03       75.43
  53.50
 4                                  64.17        67.34       76.25
  53.68
 5                                  64.81        67.88       77.34
  53.75
 6                                  65.18        67.90       79.16
  54.71
 OR.Portland TX.Dallas WA.Seattle Composite.10 Composite.20
 1        POXR      DAXR       SEXR         CSXR     SPCS20R
 2       41.05                             62.82
 3       41.28                             63.39
 4       41.06                             63.87
 5       40.96                             64.57
 6       41.24                             65.56


 Now on to the problem... if I just run

        plot(CA.San.Francisco ~ Date)

 I get:
 pastedGraphic.png

 Which I suspect is a problem because the Date column is not really a
 Date, it is a factor

         class(Date)
        [1] factor

 If I run:
        plot(as.numeric(CA.San.Francisco), type=l)

 I get:

 pastedGraphic.png


 which is wrong, as CA.San.Francisco has no such discontinuity.

  CA.San.Francisco
 [1] SFXR   46.61  46.87  47.32  47.69  48.31  48.83  49.49  49.94  

Re: [R] Mixed models fixed effects

2009-03-11 Thread Mark Difford

Hi Simon,

Carefull, someone is likely to tell you that the bible is Pinheiro, J.C.,
and Bates, D.M. (2000) Mixed-Effects Models in S and S-PLUS, Springer, and
that would be much closer to being correct. Others might mention something
by Searle. Nothing against Crawley, of course. But it usually is better to
get close to the source, and to the active researchers in the field. One
nice thing about the first reference (there are many others) is that Prof.
Bates is an active contributor to this list and to the SIG mixed-models list
(which he maintains):

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Check it out.

Regards, Mark.


Simon Pickett-4 wrote:
 
 Also check out these pdfs
 http://cran.r-project.org/other-docs.html
 
 and try to get your hands on the bible
 http://www.amazon.co.uk/R-Book-Michael-J-Crawley/dp/0470510242
 
 Simon.
 
 
 
 
 
 Hi Emma,

 Continuous predictors are no problem at all. You can mix both continuous
 and categorial predictors if needed. I suppose your response are counts
 (the number of bats that passes)? In that case a generalised linear
 mixed model is more appropriate. With the lme4 package you could try
 something like this:

 library(lme4)
 Model - glmer(BatPasses ~ Width + Height + (1|Site), family = poisson)

 HTH,

 Thierry

 PS There is a mailing list dedicated to mixed models: R-Sig-MixedModels
 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature
 and Forest
 Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
 methodology and quality assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 tel. + 32 54/436 185
 thierry.onkel...@inbo.be
 www.inbo.be

 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to
 say what the experiment died of.
 ~ Sir Ronald Aylmer Fisher

 The plural of anecdote is not data.
 ~ Roger Brinner

 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of
 data.
 ~ John Tukey

 -Oorspronkelijk bericht-
 Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 Namens Emma Stone
 Verzonden: woensdag 11 maart 2009 15:29
 Aan: r-help@r-project.org
 Onderwerp: Re: [R] Mixed models fixed effects

 Dear All,

 This may sound like a dumb question but I am trying to use a mixed model
 to
 determine the predictors of bat activity along hedges within 8 sites. So
 my
 response is continuous (bat passes) my predictors fixed effects are
 continuous (height metres), width (metres) etc and the random effect is
 site  - can you tell me if the fixed effects can be continuous as all
 the
 examples I have read show them as categorical, but this is not covered
 in
 any documents I can find.

 Help!

 Emma

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver 
 weer
 en binden het INBO onder geen enkel beding, zolang dit bericht niet 
 bevestigd is
 door een geldig ondertekend document. The views expressed in  this
 message
 and any annex are purely those of the writer and may not be regarded as 
 stating
 an official position of INBO, as long as the message is not confirmed by
 a 
 duly
 signed document.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Re%3A-Mixed-models-fixed-effects-tp22456368p22460248.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Comparing data frames and keeping non-matches

2009-03-11 Thread Pele

Hi R users,

I am trying to compare 2 data frames by  subject and match and save the no
matches to an object called nomatch, but I am getting unexpected results...
Can anyone tell me how to correct the code to get the expected results shown
in the last table?

Many thanks in advance for your any help!

library(reshape)

year - c(2100:2110)
x1 - c(F,T,T,F,F,F,T,F,T,T,F)
df1 - data.frame(cbind(year, x1)) 
df1$subject - c(1,1,1,2,2,3,3,3,3,4,4)
df1$match - 1; df1

df2 - data.frame(cbind(year, x1))
df2$subject - c(1:11)
df2$match - 1; df2
key - c(subject, match)

nomatch - subset(df2, is.element(df2[,key], df1[,key])==FALSE); nomatch
rm(list=ls())

Unexpected Results
year x1 subject match
1  2100  0   1 1
3  2102  1   3 1
5  2104  0   5 1
7  2106  1   7 1
9  2108  1   9 1
11 2110  0  11 1


Results I expected

   year x1 subject match
5  2104  0   5 1
6  2105  0   6 1
7  2106  1   7 1
8  2107  0   8 1
9  2108  1   9 1
10 2109  1  10 1
11 2110  0  11 1

-- 
View this message in context: 
http://www.nabble.com/Comparing-data-frames-and-keeping-non-matches-tp22460451p22460451.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] OT: Likelihood ratio for the randomization/permutation test?

2009-03-11 Thread Charles C. Berry

On Wed, 11 Mar 2009, Mike Lawrence wrote:


Hi guRus,

My discipline (experimental psychology) is gradually moving away from
Null Hypothesis Testing and towards measures of evidence. One measure
of evidence that has been popular of late is the likelihood ratio.
Glover  Dixon (2005) demonstrate the calculation of the likelihood
ratio from ANOVA tables, but I'm also interested in non-parametric
statistics and wonder if anyone has any ideas on how to compute a
likelihood ratio from a randomization test (aka. permutation test)?



You cannot get the likelihood ratio from just the null, you need an 
alternative. The alternative would have to provide different probabilities 
to the individual permutations than under the null I guess, so if you have 
a framework where this makes sense you are in business.


I suspect you might be aiming in the direction of empirical likelihood 
for which there is a literature - Google 'empirical likelihood'.


Also to turn this back to R, check out 'emplik' on CRAN.

HTH,

Chuck



Say one had two groups and were interested in whether the mean scores
of the two groups differ in a manner consistent with random chance or
in a manner consistent with a non-null effect of some manipulation
applied to the two groups. The randomization test addresses this by
randomly re-assigning the participants to the groups, re-computing the
difference between means, and repeating many times, yielding a
distribution of simulated difference scores that represents the
distribution expected by chance.

Within a Null Hypothesis Testing framework you then estimate the
probability of the null by observing the proportion of simulated
difference scores that are greater in magnitude than the observed
difference score. Any guesses on how to translate this into a
quantification of evidence?

Mike

--
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tinyurl.com/mikes-public-calendar

~ Certainty is folly... I think. ~

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] compile under Ubuntu 8.10 (Ibex)

2009-03-11 Thread Douglas Bates
On Wed, Mar 11, 2009 at 11:11 AM, Terry Therneau thern...@mayo.edu wrote:

  I fired up a new machine last night and loaded Ubuntu 8.10 on it.  I then had
 to add in the usual compiler stuff which is not loaded by default: make, 
 emacs,
 fortran, c++, etc.

  I'm having trouble compiling R 2.8 however.  I get a message that it cannot
 figure out how to link f77 and C.  I've pulled in gfortran.  A google search
 on fortran ubuntu 8.10 shows various problems with the gcc switch from
 fortran77 to fortran95 --- also true for R?

  Any hints are welcome, including don't use 8.10 (thought that will be an
 evening of work).

Read http://cran.us.r-project.org/bin/linux/ubuntu to learn how to add
the CRAN repository to your package search path then install the
r-base and r-base-dev packages.

If you want all the development packages to allow you to compile and
install development versions of R then run

sudo apt-get build-dep r-base

and all the required packages will be magically added.  If you want to
see exactly how Dirk has set up the compilation of R under Debian then
install the wajig package, run

cd /tmp
wajig build r-base

and sit back and enjoy.  Finally note that the mailing list
R-SIG-Debian is more suitable for questions related to Debian and
Debian-derived distributions like Ubuntu.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RES: North Arrow (.png file) on a Map

2009-03-11 Thread Kingsford Jones
Also, see

# install.packages('sp')
example('spplot', package='sp')


Kingsford Jones

On Wed, Mar 11, 2009 at 9:38 AM, Greg Snow greg.s...@imail.org wrote:
 Another possibility is the my.symbols function in the TeachingDemos package.  
 You can define a polygon of your arrow (or other symbol), then use my.symbols 
 to add it to an existing graph, the advantage of my.symbols is that the size 
 of the arrow is absolute, not in terms of the scale of the original plot.

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Rodrigo Aluizio
 Sent: Wednesday, March 11, 2009 4:22 AM
 To: 'Yihui Xie'
 Cc: R Help
 Subject: [R] RES: North Arrow (.png file) on a Map

 Thanks a lot Yihui, It's perfect, in fact exactly identical the I have
 as a
 png file in my computer.
 Thank you so much.

 Regards
 Rodrigo

 -Mensagem original-
 De: Yihui Xie [mailto:xieyi...@gmail.com]
 Enviada em: quarta-feira, 11 de março de 2009 02:48
 Para: Rodrigo Aluizio
 Cc: R Help
 Assunto: Re: [R] North Arrow (.png file) on a Map

 Is this arrow satisfactory for you?

 north.arrow = function(x, y, h) {
     polygon(c(x, x, x + h/2), c(y - h, y, y - (1 + sqrt(3)/2) * h),
 col = black, border = NA)
     polygon(c(x, x + h/2, x, x - h/2), c(y - h, y - (1 + sqrt(3)/2) *
 h, y, y - (1 + sqrt(3)/2) * h))
     text(x, y, N, adj = c(0.5, 0), cex = 4)
 }
 plot(1, type = n, ylim = c(0, 1))
 north.arrow(1, 0.8, 0.3)

 Regards,
 Yihui
 --
 Yihui Xie xieyi...@gmail.com
 Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
 Mobile: +86-15810805877
 Homepage: http://www.yihui.name
 School of Statistics, Room 1037, Mingde Main Building,
 Renmin University of China, Beijing, 100872, China



 On Tue, Mar 10, 2009 at 7:21 PM, Rodrigo Aluizio r.alui...@gmail.com
 wrote:
  Hi list.
 
  I would like to know how do I insert a North arrow, stored as a png
 file
 in
  my computer, in a map? I found lots of post asking similar things,
 one of
  them mentioned the pixmap package.  The map was done using map() and
  shapefiles (the code is below). I'm using the pixmap () and addlogo()
  functions. Well I can import the png with pixmap() function (I guess,
 once
  there's no error message), but I can't put It on the map, I got an
 error
  message telling me that:
 
 
 
  Error at t(x...@index[nrow(x...@index):1, , drop = FALSE]) :
 
   index out of limits
 
 
 
  Well I tried changing coordinates but I always got the same result.
 How do
 I
  do this correctly? Is there a better way?
 
 
 
  Thanks for the help and attention.
 
 
 
  Here is the complete map script:
 
 
 
  library(RODBC)
 
  library(maps)
 
  library(mapdata)
 
  library(maptools)
 
  library(pixmap)
 
  #Carregar Coordenadas e dados dos Pontos Amostrais
 
  Dados-odbcConnectExcel('Campos.xls',readOnly=T)
 
  Coord-sqlFetch(Dados,'CoordMed',colnames=F,rownames='Ponto')
 
  odbcClose(Dados)
 
  N-pixmap('Norte.png',nrow=166,ncol=113)
 
  # Carregar pontos e shapes
 
 
 Batimetria-
 readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontol
  ogia/Campos/ShapeFiles/Batimetria_BC.shp')
 
 
 Estados-
 readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontologi
  a/Campos/ShapeFiles/Estados_Sudeste.shp')
 
 
 Faciologia-
 readShapeSpatial('C:/Users/Rodrigo/Documents/UFPR/Micropaleontol
  ogia/Campos/ShapeFiles/Faciologia_BC.shp')
 
  # Mapa com os Pontos da Bacia
 
 
 postscript('MapaCampos.eps',paper='special',onefile=F,horizontal=F,widt
 h=3.5
  ,height=4.5,bg='white',pointsize=3)
 
  par(mar=c(3,2,2,0))
 
 
 map('worldHires','brazil',ylim=c(23.9,20.3),xlim=c(42.1,39.2),type='n')
 
 
 plot(Faciologia,ylab='',xlab='',col=c('lightgreen','lightgreen','lightg
 reen'
 
 ,'lightgreen','lightgreen','lightgray','lightgray','lightgray','lightgr
 ay','
 
 lightgray','lightgray','lightgray','lightgray','lightgray','lightgray',
 'ligh
 
 tgray','lightgray','lightgray','lightgray','lightgray','lightgray','lig
 htyel
  low','lightyellow','lightyellow'),add=T,lwd=0.5,border=0)
 
 
 plot(Batimetria,ylab='',xlab='',col='darkgray',lty='solid',lwd=0.2,add=
 T)
 
  plot(Estados,ylab='',xlab='',lty='solid',add=T,lwd=0.8)
 
 
 text(Coord$Longitude[Coord$Réplicas=='1'],Coord$Latitude[Coord$Réplicas
 =='1'
  ],rownames(Coord)[Coord$Réplicas=='1'],col='red',cex=0.5,font=2)
 
 
 text(Coord$Longitude[Coord$Réplicas=='2'],Coord$Latitude[Coord$Réplicas
 =='2'
  ],rownames(Coord)[Coord$Réplicas=='2'],col='yellow',cex=0.5,font=2)
 
 
 text(Coord$Longitude[Coord$Réplicas=='3'],Coord$Latitude[Coord$Réplicas
 =='3'
  ],rownames(Coord)[Coord$Réplicas=='3'],col='blue',cex=0.5,font=2)
 
  points(Coord$Longitude,Coord$Latitude-0.045,pch=20,cex=0.7)
 
  text(c(41.5,41.3),c(21.7,20.6),c('RJ','ES'))
 
  axis(1,xaxp=c(42.1,39.2,2),cex.axis=1)
 
  axis(2,yaxp=c(23.9,20.3,4),cex.axis=1)
 
  title(main='Bacia')
 
 
 

Re: [R] compile under Ubuntu 8.10 (Ibex) - Thanks

2009-03-11 Thread Terry Therneau
 Thanks for the good replies.
 
 1. build-essential -- this is likely the essential thing that I forgot.  I 
tend to leave running machines alone (for years at a time) so forget things 
like 
this.  My laptop is Fedore core 6 I think, when the prior laptop died.  I did 
remember bin-utils though.
 
  2. See the R Admin manual.  I read that pretty closely looking for hints.  
Perhaps the build-essential hint would be a good addition to the manual.
  
  3. 'r-sig-debian' for Ubuntu questions:  I never make that connection.
  
  4. Current binaries for R can be found at...  Also a good plan. But I may 
need to build R-devel (for the ShortRead package in bioconductor, to deal with 
Solexa sequencer data) so I wanted to work the bugs out.
  
  5. It worked for me  Useful too, since it told me it had to be something 
SIMPLE that I had overlooked.
  
  
   Thanks to Frank Harrell, Berwin Turlach, Chuck Taylor, Dirk Eddelbuettel, 
Roland Rau, Kjetil Halvorsen.
   
Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multilevel Modeling using glmmPQL

2009-03-11 Thread Howard Alper
 

  Hi,

 

  I'm trying to perform a power simulation for a simple multilevel
model, using the function glmmPQL in R version 2.8.1.  I want to extract
the p-value for the fixed-effects portion of the regression, but I'm
having trouble doing that.  I can extract the coefficients
(summary(fit)$coeff), and the covariance matrix (summary(fit)$varFix),
but I can't grab the p-value (or the t-statistic.)  Could someone
explain how to do this?

 

  Please send responses to hal...@health.nyc.gov.

 

  Thanks in advance,

 

  Howard

 



DISCLAIMER:\ Sample Disclaimer added in a VBScript.\ \  ...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ets() initialization

2009-03-11 Thread Yagci, Kezban

Hi,

I have been working on ets() for exponential smoothing.
R initialize the forecast itself to minimize the error, I want to 
set an initialization value for the first period(such as first period's acutal 
value).
However, I could not find how to do it. Would you please help me?

Regards,
Kezban YAgci
-- 
Kezban Yagci
Masters Student
Industrial and Systems Engineering
Georgia Institute of Technology

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multilevel Modeling using glmmPQL

2009-03-11 Thread Chuck Cleland
On 3/11/2009 11:29 AM, Howard Alper wrote:
   Hi,

   I'm trying to perform a power simulation for a simple multilevel
 model, using the function glmmPQL in R version 2.8.1.  I want to extract
 the p-value for the fixed-effects portion of the regression, but I'm
 having trouble doing that.  I can extract the coefficients
 (summary(fit)$coeff), and the covariance matrix (summary(fit)$varFix),
 but I can't grab the p-value (or the t-statistic.)  Could someone
 explain how to do this?
 
   Please send responses to hal...@health.nyc.gov.

library(MASS)

fm1 - glmmPQL(y ~ trt + I(week  2), random = ~ 1 | ID,
   family = binomial, data = bacteria)

summary(fm1)$tTable[,5]

  Look at str(summary(fm1)) .

   Thanks in advance,
 
   Howard
 
 DISCLAIMER:\ Sample Disclaimer added in a VBScript.\ \...{{dropped:6}}
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is this a documentation bug? Spss dates import

2009-03-11 Thread HBaize

Luca,
I ran your code using SPSS 17.0 and R 2.8.1.
I found that the correct date was returned in R using this code:

test.df$newdate - as.Date(as.POSIXct(test.df$mydate , origin=1582-10-14))  

Also note that SPSS 17.0 is case sensitive in storing variable names, so the
R code in your post needs to be changed from test.df$MYDATE to
test.df$mydate to match your use of lowercase in the SPSS syntax. 

Harold


Luca Braglia wrote:
 
 Hello R-user
 
 bug seekers are needed!
 In order to perform these simple tasks you have to use a copy of SPSS
 and obviously R
 

-- 
View this message in context: 
http://www.nabble.com/Is-this-a-documentation-bug--Spss-dates-import-tp22455442p22461929.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Iterations of random sampling

2009-03-11 Thread René Pineda

I have a univariate binary (1,0) 230,000 records, I need to make 20,000 
iterations of random sampling of a fixed size. Where I put the result of the 
sum of selected records for each repetition
 
 
Thank's
 
 
_

of your life

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Forecasting with dlm

2009-03-11 Thread Giovanni Petris

Hi Michael,

There was a bug in dlmForecast that I have fixed. While the new
version of the package makes its way to CRAN, you can source the
attached file, which contains the amended version of dlmForecast. 

Best,
Giovanni Petris

 Date: Wed, 11 Mar 2009 13:39:36 +
 From: Michael Pearmain mpearm...@google.com
 Sender: r-help-boun...@r-project.org
 Precedence: list
 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=google.com; s=beta;
   t=1236778779; bh=FXcHfXd155+Rn9gFIVOikSMaQWI=;
 DomainKey-Signature: a=rsa-sha1; s=beta; d=google.com; c=nofws; q=dns;
 
 Hi All,
 I have a problem trying to forecast using the dlm package, can anyone offer
 any advise?
 
 I setup my problem as follows, (following the manual as much as possible)
 
 data for example to run code
 
 CostUSD - c(27.24031,32.97051, 38.72474, 22.78394, 28.58938, 49.85973,
 42.93949, 35.92468)
 library(dlm)
 
 buildFun - function(x) {
 dlmModPoly(1, dV = exp(x[1]), dW = exp(x[2]))
 }
 fit - dlmMLE(CostUSD, parm = c(0,0), build = buildFun)
 fit$conv
 dlmCostUSD - buildFun(fit$par)
 V(dlmCostUSD)
 W(dlmCostUSD)
 #For comparison
 StructTS(CostUSD, level)
 
 CostUSDFilt - dlmFilter(CostUSD, dlmCostUSD)
 CostUSDFore - dlmForecast(CostUSDFilt, nAhead = 1)
 
 after which i return the error message:
 
 Error in mod$m[lastObsIndex, ] : incorrect number of dimensions
 
 Can anyone offer any insight to this problem?
 
 Thanks in advance
 
 Mike
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 

Giovanni Petris  gpet...@uark.edu
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there any difference between - and =

2009-03-11 Thread Wacek Kusnierczyk
Duncan Murdoch wrote:

 Use - for assignment, and = for function arguments.  Then the
 difference between

  f( a = 3 )
  f( a - 3 )

 is clear, and you won't be surprised that a gets changed in the second
 case.  If you use = for assignment, the two lines above will be
 written as

  f( a = 3 )
  f( ( a = 3 ) )

 and it is very easy to miss the crucial difference between them.

in fact, some recent posts show that things can go the other way round: 
people try to use - for function arguments.  i think the following is
the most secure way if one really really has to do assignment in a
function call:

f({a=3})

and if one keeps this convention, - can be dropped altogether.

vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Couple of Questions about Classification trees

2009-03-11 Thread Jen_mp3

So I have 2 sets of data - a training data set and a test data set. I've been
doing the analysis on the training data set and then using predict and
feeding the test data through that. There are 114 rows in the training data
and 117 in the test data and 1024 columns in both. It's actually the same
set of data split into two. The rows are made of 5 different numbers. They
do represent something but it would take too long to explain.

I want to try and find a classification rule for the 5 numbers in the rows
based on the columns so I created a classification tree and plotted that and
then pruned it. My question is how do you print the misclassification rate
at each node on the actual diagram of the classification tree. I can't seem
to get it up there. In my notes it uses gmistext() but I have a feeling that
it's for Splus rather than R as gmistext() doesn.t work for me either. 

Second question is when I try using the predict.tree to put the test data
into the tree and then plot it it comes up with a really weird and wrong
looking plot. Here is the code I'm using:
tree1 - tree(row~.,data=train)
pruned.tree - prune.tree(tree1, best = 5, method = misclass)
predict.tree1 - predict.tree(prune.tree, data = main)
plot(predict.tree);text(predict.tree)
I sort of don't get a classification tree, I get the x axis labelled 1, the
y axis labelled 2 and then about 4 small black rectangles scattered across
the plot. 

Thanks in Advance. 
-- 
View this message in context: 
http://www.nabble.com/Couple-of-Questions-about-Classification-trees-tp22461673p22461673.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Iterations of random sampling

2009-03-11 Thread Chuck Cleland
On 3/11/2009 3:15 PM, René Pineda wrote:
 I have a univariate binary (1,0) 230,000 records, I need to make 20,000 
 iterations of random sampling of a fixed size. Where I put the result of the 
 sum of selected records for each repetition

X - rbinom(23, 1, .5)

sample.sums - replicate(2, sum(sample(X, 10)))

 Thank's
  
 _
 
 of your life
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tracing SV3 methods

2009-03-11 Thread William Dunlap
I would like to put traces on the methods for an SV3 generic
and trace() doesn't seem to work.  E.g.,

   median(1:10)
  [1] 5.5
   trace(median.default)
   median(1:10) # expect Tracing median.default ...
  [1] 5.5

I can tell that median.default is getting called, since
a trace on sort() shows it in the traceback:

   trace(sort,Quote(cat(deparse(sys.calls()),sep=\n)))
  Tracing function sort in package base
  [1] sort
   median(1:10)
  Tracing sort(x, partial = half + 0L:1L) on entry 
  list(median(1:10), median.default(1:10), sort(x, 
  partial = half + 0L:1L), .doTrace(cat(deparse(
  sys.calls()), sep = \n), on entry),
  eval.parent(exprObj), eval(expr,p), eval(expr,
  envir, enclos))
  [1] 5.5

Is there a recommended way to do this?  (I'd like to put a
trace on all methods of a SV3 generic to see which get called
in various situations.)

This was in R 2.8.1 on Windows but I see the same results
in several versions of R.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reshape question.

2009-03-11 Thread rkevinburton
This hopefully is trivial. I am trying to reshape the data using the reshape 
package.

First I read in the data:

a2009 - read.csv(Total2009.dat, header = TRUE)

Then I trim it so that it only contains the columns that I have interested in:

m2009 - melt(a2009, id.var=c(DayOfYear,Category,SubCategory,Sku), 
measure.var=c(Quantity), na.rm=TRUE)

Then I start to formulate the data that I will process:

c2009 - cast(m2009, DayOfYear ~ variable | Category, sum)

Finally I aggregate the data:

t2009 - cast(m2009, DayOfYear ~ variable, sum)

My question is on the third step above (repeated here)

c2009 - cast(m2009, DayOfYear ~ variable | Category, sum)

This gets the data assocated with a unique 'Category' name. I want to get the 
data grouped by 'Category' and 'SubCategory'. The 'SubCategory' is not unique 
but the combination 'Category' and 'SubCategory' form a unique pair. What would 
be the formula that would give me the data grouped by Category AND SubCategory? 
Would it be as simple as:

c2009 - cast(m2009, DayOfYear ~ variable | Category  SubCategory, sum)

?

Thank you for your suggestions.

Kevin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pseudo-random numbers between two numbers

2009-03-11 Thread Ted Harding
Because of a mistake I made in copying code into email, there has
been confusion (expressed in some private emails). Hence the
corrected version is appended at the end.

On 11-Mar-09 00:05:26, Ted Harding wrote:
 I have modified my example to make it more convincing! See at end.
 
 On 10-Mar-09 23:39:17, Ted Harding wrote:
 On 10-Mar-09 23:01:45, g...@ucalgary.ca wrote:
 Please forget the last email I sent with the same subject.
 =
 I would like to generate pseudo-random numbers between two numbers
 using R, up to a given distribution, for instance, norm. That is
 something like
 rnorm(HowMany,Min,Max,mean,sd)
 over rnorm(HowMany,mean,sd).
 I am wondering if
 
 qnorm(runif(HowMany, pnorm(Min,mean,sd), pnorm(Max,mean, sd)),
   mean, sd)
 
 is good. Any idea? Thanks.
 -james
 
 It would certainly work as intended! For instance:
 
 truncnorm-function(HowMany,Min,Max,mean=0,sd=1){
   qnorm(runif(HowMany, pnorm(Min,mean,sd), pnorm(Max,mean, sd)),
 mean, sd)}
 Sample - truncnorm(1000,-1,2.5)
 hist(Sample,breaks=100)
 
 Hoping this helps,
 Ted.
 
 truncnorm-function(HowMany,Min,Max,mean=0,sd=1){
   qnorm(runif(HowMany, pnorm(Min,mean,sd), pnorm(Max,mean, sd)),
 mean, sd)}
 Sample - truncnorm(10,-1,2.5)
 hist(Sample,breaks=100)
 u0-(-0.975)+0.05*(0:69)
 yy-dnorm(u0)
 du-min(diff(H$breaks))
 Y - 10*yy*du/(pnorm(2.5)-pnorm(-1.0))
 lines(u0,Y)
 
 Ted.

Corrected version:
--

truncnorm-function(HowMany,Min,Max,mean=0,sd=1){
  qnorm(runif(HowMany, pnorm(Min,mean,sd), pnorm(Max,mean, sd)),
mean, sd)}
Sample - truncnorm(10,-1,2.5)
H - hist(Sample,breaks=100)
u0-(-0.975)+0.05*(0:69)
yy-dnorm(u0)
du-min(diff(H$breaks))
Y - 10*yy*du/(pnorm(2.5)-pnorm(-1.0))
lines(u0,Y)


Apologies for the error.
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 11-Mar-09   Time: 19:48:12
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] error.bars

2009-03-11 Thread Sueli Rodrigues

Hi, I'm trying to use the function error.bars, but the program don't
find it, and I dont't found any package with this function. Is there some
another functin to draw barplots with error bars?

Sueli Rodrigues

Eng. Agrônoma - UNESP
Mestranda - USP/ESALQ
PPG-Solos e Nutrição de Plantas
Fones (19)93442981
  (19)33719762

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About the merge

2009-03-11 Thread William Dunlap
You can assign a group number to each run of identical
values of 'Rep' with
firstInRun - with(data, c(TRUE, Rep[-1]!=Rep[-length(Rep)]))
group - cumsum(firstInRun)
where 'data' is your data.frame's name.  Once you've
assigned the runs to groups then you can use, e.g.,
sapply(split()) to compute the group sums
summed_App_dur - with(data, sapply(split(App_dur,group), sum))
and put them into the shortened data.frame with
newdata - data[firstInRun,]
newdata$App_dur - summed_App_dur 

There are various tricks for speeding up the sapply(split(),sum),
if that is a problem.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

-
 DateDtime Hour Min Second Rep App_dur
9  2006-02-22 14:36:11   14  36 11   4   1
10 2006-02-22 14:36:12   14  36 12   3  86
11 2006-02-22 14:37:38   14  37 38   0  58
14 2006-02-22 14:38:36   14  38 36   3   1
15 2006-02-22 14:38:37   14  38 37   4   1
16 2006-02-22 14:38:38   14  38 38   1   9
18 2006-02-22 14:38:47   14  38 47   0   3
20 2006-02-22 14:38:50   14  38 50   1   1
21 2006-02-22 14:38:51   14  38 51   4   2   ***
23 2006-02-22 14:38:53   14  38 53   4  39  ***
25 2006-02-22 14:39:32   14  39 32   3   1
26 2006-02-22 14:39:33   14  39 33   4   1
27 2006-02-22 14:39:34   14  39 34   3   8
28 2006-02-22 14:39:42   14  39 42   4  62

How to write a program to merge Rep with  consecutive equal numbers to
one of that number and sum up App_dur??

on the above data frame, i want to merge
21 2006-02-22 14:38:51   14  38 51   4   2   ***

23 2006-02-22 14:38:53   14  38 53   4  39  ***

to 2006-02-22 14:38:51   14  38 51   4   41 
...
Thanks.

Tammy

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Couple of Questions about Classification trees

2009-03-11 Thread Frank E Harrell Jr

Jen_mp3 wrote:

So I have 2 sets of data - a training data set and a test data set. I've been
doing the analysis on the training data set and then using predict and
feeding the test data through that. There are 114 rows in the training data
and 117 in the test data and 1024 columns in both. It's actually the same
set of data split into two. The rows are made of 5 different numbers. They
do represent something but it would take too long to explain.


Your sample size is too small by a factor of perhaps 100 for simple data 
splitting to provide stable results.  Then you have the problem of an 
improper scoring rule, i.e., one that when optimized gives the wrong answer.


Frank Harrell



I want to try and find a classification rule for the 5 numbers in the rows
based on the columns so I created a classification tree and plotted that and
then pruned it. My question is how do you print the misclassification rate
at each node on the actual diagram of the classification tree. I can't seem
to get it up there. In my notes it uses gmistext() but I have a feeling that
it's for Splus rather than R as gmistext() doesn.t work for me either. 


Second question is when I try using the predict.tree to put the test data
into the tree and then plot it it comes up with a really weird and wrong
looking plot. Here is the code I'm using:
tree1 - tree(row~.,data=train)
pruned.tree - prune.tree(tree1, best = 5, method = misclass)
predict.tree1 - predict.tree(prune.tree, data = main)
plot(predict.tree);text(predict.tree)
I sort of don't get a classification tree, I get the x axis labelled 1, the
y axis labelled 2 and then about 4 small black rectangles scattered across
the plot. 

Thanks in Advance. 



--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >