date:20121203



On Dec 3, 2012, at 3:34 PM, Audrey wrote:


res=names(dat);
get(res[ind],pos=dat) will retrieve dat$name



There are far less baroque ways of doing that (including dat$name and
dat[["name"]]).

Both dat$name and dat[["name"]] require you to know what "name" is. I was
looking for a way to retrieve a data frame column by name without actually
knowing what the data frame column name was. get() seems to do that, but I
am open to other options.


And I gave you some, but you failed to include context, an annoying  
habit of Nabble users. (In my opinion the Nabble interface is more  
hassle than it is worth.)  For instance, I wrote:




If you had a numeric object, 'ind' then

dat[ind] would retrieve a sub-list, with as many columns as there  
were items in 'ind' and would have class data.frame.








Thank you for your advice regarding [ vs. [[. Indeed, lower(dat[[ind]]) does
return the desired result. However, it seems that ind can only be a single
integer


That was what I wrote I believe.


(or evaluate to TRUE for only 1 column),


In some instances TRUE is coerced to 1.


I guess because "[[" is returning a vector.


No. You have failed to understand that "[[" can only return a single  
referenced object. It might be a vector or a list, but you cannot give  
it a vector with more than one element and expect satisfactory results.
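
To make the distinction concrete, here is a minimal sketch on a made-up data frame (the object and column names are illustrative, not taken from the thread). It shows "[[" returning one column for a name held in a variable, "[" returning a data frame for several columns, and one possible way to lower-case every character or factor column without knowing the names in advance.

```r
# Toy data frame; all names here are illustrative assumptions
dat <- data.frame(id  = 1:3,
                  grp = factor(c("A", "B", "A")),
                  lab = c("X", "Y", "Z"),
                  stringsAsFactors = FALSE)

nm <- names(dat)[2]           # column name held in a variable
col_by_name <- dat[[nm]]      # "[[" : exactly one column, returned as a vector
sub_df <- dat[c(2, 3)]        # "["  : any number of columns, returned as a data.frame

# Find the non-numeric columns, then lower-case them in place
is_text <- vapply(dat, function(v) is.character(v) || is.factor(v), logical(1))
dat[is_text] <- lapply(dat[is_text], function(v) {
  if (is.factor(v)) {
    levels(v) <- tolower(levels(v))   # change the labels, keep it a factor
    v
  } else {
    tolower(v)
  }
})
```

The logical index is_text goes with "[" rather than "[[" because it may select several columns at once.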




By "text" I mean anything that was non-numerical: character and factor
classes, in my case.

Then you should be clear. "text" is not a well defined term when  
referring to R objects.



View this message in context: 
http://r.789695.n4.nabble.com/Change-case-of-factor-in-data-frame-tp4651696p4651971.html
Sent from the R help mailing list archive at Nabble.com.


No. This borders on actionable fraud.  R-help is not Nabble. Do not  
believe what Nabble is saying. Do read the Rhelp Posting Guide.




--

David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Different results from random.Forest with test option and using predict function

2012-12-03 Thread Peter Langfelder

On Mon, Dec 3, 2012 at 3:30 PM, tdbuskirk  wrote:
>
> Hello R Gurus,
>
> I am perplexed by the different results I obtained when I ran code like
> this:
> set.seed(100)
> test1<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200)
> predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response")
>

Not sure about this since I haven't used predict.randomForest
extensively, but newdata usually contains predictors only, not the
response. Try using newdata = NewXs.

HTH,

Peter

> and this code:
> set.seed(100)
> test2<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200,
> xtest=NewXs, ytest=NewBinarY)
>
> The confusion matrices for the two forests I thought would be the same by
> virtue of the same seed settings, but they differ as do the predicted
> values
> as well as the votes.  At first I thought it was just the way ties were
> broken, so I changed the number of trees to an odd number so there are no
> ties anymore.
>
> Can anyone shed light on what I am hoping is a simple oversight?  I just
> can't figure out why the results of the predictions from these two forests
> applied to the NewBinaryYs and NewX data sets would not be the same.
>
> Thanks for any hints and help.
>
> Sincerely,
>
> Trent Buskirk
>
>
>


Re: [R] Periodicity of Weekly Zoo

2012-12-03 Thread Gabor Grothendieck

On Mon, Dec 3, 2012 at 8:30 PM, Andrew Freedman
 wrote:
> Hi List,
>
> I have weekly sales observations for several products drawn via ODBC.
> Source data is available at
> https://www.dropbox.com/s/78vxae5ic8tnutf/asr.csv.
>
> This is retail sales data, so will contain seasonality and trend
> information. I expect to see 52 or 53 observations per year, each
> observation occurring on the same day of the week (Saturday). Ultimately
> I'm looking to feed these series into forecasting models for demand
> planning.
>
> The data has issues with internal gaps, so while I've been able to
> create a ts that appears to respect the frequency and period, I suspect
> that a zoo is going to be a better data container. Unfortunately, I'm
> not understanding the use of zoo() to describe frequency/period/deltat.
>
> In the example below I use sales[,16] (aka $p) as it has several
> periods (data between 2004 and 2012). I've tried using frequency=52, =7
> and =1, but get the same result each time; every data point ends up in
> cycle 1 and I don't have the periodicity needed to find seasonality.
>
>> sales <- read.csv("asr.csv")
>> library(zoo)
>
> Attaching package: 'zoo'
>
> The following object(s) are masked from 'package:base':
>
> as.Date, as.Date.numeric
>
>> sales.zoo <- zoo(subset(sales, select=c(2:length(sales))), order.by=
> + sales$date_end, frequency = 52)
>> sales.zoo.i <- na.approx(sales.zoo) # interpolate internal NA values
>> frequency(sales.zoo.i) # 52, which seems right
> [1] 52
>> cycle(sales.zoo.i[1:20,16]) # everything is in the same cycle...
> 2004-08-14 2004-08-21 2004-08-28 2004-09-04 2004-09-11 2004-09-18
>  1  1  1  1  1  1
> 2004-09-25 2004-10-02 2004-10-09 2004-10-16 2004-10-23 2004-10-30
>  1  1  1  1  1  1
> 2004-11-06 2004-11-13 2004-11-20 2004-11-27 2004-12-04 2004-12-11
>  1  1  1  1  1  1
> 2004-12-18 2004-12-25 2005-01-01 2005-01-08 2005-01-15 2005-01-22
>  1  1  1  1  1  1
> 2005-01-29 2005-02-05 2005-02-12 2005-02-19 2005-02-26 2005-03-05
>  1  1  1  1  1  1
>>
>
> Doubtless it's some facile error that will make me feel sheepish, but
> I've been staring at this for a bit now and just getting nowhere. Any
> pointers would be greatly appreciated.
>

A complete cycle is always represented by 1 time unit so if you wanted
a complete cycle to be a year then you would need to represent time in
years and fractions of a year, not as "Date" class.  That is how "ts"
class works too.
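
As a small illustration of that point, the sketch below uses a plain "ts" object with made-up values (the start week and series length are assumptions chosen for the example). Time is measured in years and fractions of a year, so cycle() reports the week within the year and wraps when a new year starts.

```r
# 60 made-up weekly values starting in week 33 of 2004
x <- ts(seq_len(60), start = c(2004, 33), frequency = 52)

head(time(x), 3)    # years plus fractions of a year
head(cycle(x), 3)   # position within the year: 33, 34, 35
cycle(x)[21]        # 20 weeks later the cycle wraps around to 1
```

With a "Date" index one time unit is a day, so whole-number dates all fall at the start of a cycle, which is why every observation landed in cycle 1.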

Since weeks don't evenly divide years you will have to approximate
this in order to have a frequency of 52.  There are many ways to do
this, but below we drop week 00 in 53-week years so that every year
has 52 weeks (years with 52 weeks don't have a week 00):

z <- read.zoo("asr.csv", sep = ",", header = TRUE)

# drop week "00"
z0 <- z[ format(time(z), "%W") != "00" ]
t0 <- time(z0)

# convert time to year + fraction
time(z0) <- as.numeric(format(t0, "%Y")) +
  (as.numeric(format(t0, "%W")) - 1) / 52

# convert to zooreg class (almost regularly spaced)
zr <- as.zooreg(z0)
frequency(zr) # 52
head(cycle(zr))



--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Reading JSON files from R

2012-12-03 Thread Duncan Temple Lang


Hi m.dr.

  Reading data from MongoDB is no problem. So the RJSONIO or rjson
packages should work.

  Can you send me the sample file that is causing the problem, please?

 The error about a method looks like a potential oversight in the combinations
of inputs.

  Thanks
D.




On 12/3/12 7:30 PM, m.dr wrote:
> Hello All -
> 
> I am trying to use RJSONIO to read in some JSON files.
> 
> I was wondering if anyone could please comment on the level of complexity of
> the files it can be used to read, exports from or directly from NoSQL DBMS
> like MongoDB and such.
> 
> Also, I understand that in reading the JSON file RJSONIO will automatically
> create the necessary structures. However, I cannot seem to use it to read the
> file properly and get this error:
> 
> Error in function (classes, fdef, mtable)  : 
>   unable to find an inherited method for function "fromJSON", for signature
> "missing", "NULL"
> 
> The call I am making is:
> noSqlData <- fromJSON(file='data.json')
> 
> It is a small file - with 3 levels of nested records.
> 
> And if there were some links to some examples with a file and usage would be
> great. My JSON file validates - so do not believe there is anything wrong
> with the file.
> 
> Thanks for your help.
> 
> 
> 
> 
> 


Re: [R] Histogram plot help

2012-12-03 Thread David L Carlson

Does it work the way you want if you add prob=TRUE to the second and third
hist() commands and run do.call() before abline()?
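
A minimal sketch of that suggestion applied to the code quoted below. The seed, the null graphics device and the object name h are assumptions added so the example runs reproducibly without opening a window.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility
pdf(NULL)                         # null device; drop this line to see the plot
x <- rnorm(1000)
h <- hist(x, xlim = c(-4, 4), ylab = "Prevalence", prob = TRUE, las = 1)
lines(density(x), col = "black", lwd = 2)
usr <- par("usr")
clip(usr[1], -2, usr[3], usr[4])
hist(x, col = "red", prob = TRUE, add = TRUE)    # same density scale as the first call
clip(2, usr[2], usr[3], usr[4])
hist(x, col = "blue", prob = TRUE, add = TRUE)
do.call("clip", as.list(usr))     # reset clipping to the plot region first
abline(v = -1, lty = 1, lwd = 3, col = "black")  # now inside the clipping region
dev.off()
```

Without prob = TRUE the added histograms are drawn as counts on a density axis, so the colored bars run off the top of the plot.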

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of YAddo
> Sent: Monday, December 03, 2012 4:22 PM
> To: r-help@r-project.org
> Subject: [R] Histogram plot help
> 
> Dear All:
> 
> I plotted a histogram with Abline, clipping with color codes but i run
> into
> some problems.   The "abline' does not show up at all,  and when i
> request
> the 'prob=True' (to obtain the freqs), my clipped region colors the
> section
> of the graph instead of the plot only.
> 
> Is there any way i can get the y-axis figures  to be in whole numbers
> rather
> than decimals?
> 
> Many thanks for your help.
> YA
> 
> Here are the working codes i am tweaking.
> 
> Everything worked fine before i trying adding stuffs (prob=T, etc).
> 
> 
> x <- rnorm(1000)
> hist(x, xlim=c(-4,4),ylab="Prevalence",prob=T,lwd=3,las=1)
> lines(density(x),col="black",lwd=2)
> usr <- par("usr")
> clip(usr[1], -2, usr[3], usr[4])
> hist(x, col = 'red', add = TRUE)
> clip(2, usr[2], usr[3], usr[4])
> hist(x, col = 'blue', add = TRUE)
> abline(v=c(-1),lty=1,lwd=3,col="black")
> do.call("clip", as.list(usr))  # reset to plot region
> 
> 
> 
> 
> 


[R] Reading JSON files from R

2012-12-03 Thread m.dr

Hello All -

I am trying to use RJSONIO to read in some JSON files.

I was wondering if anyone could please comment on the level of complexity of
the files it can be used to read, exports from or directly from NoSQL DBMS
like MongoDB and such.

Also, I understand that in reading the JSON file RJSONIO will automatically
create the necessary structures. However, I cannot seem to use it to read the
file properly and get this error:

Error in function (classes, fdef, mtable)  : 
  unable to find an inherited method for function "fromJSON", for signature
"missing", "NULL"

The call I am making is:
noSqlData <- fromJSON(file='data.json')

It is a small file - with 3 levels of nested records.

And if there were some links to some examples with a file and usage would be
great. My JSON file validates - so do not believe there is anything wrong
with the file.

Thanks for your help.





--
View this message in context: 
http://r.789695.n4.nabble.com/Reading-JSON-files-from-R-tp4651976.html
Sent from the R help mailing list archive at Nabble.com.


[R] Different results from random.Forest with test option and using predict function

2012-12-03 Thread tdbuskirk

Hello R Gurus,

I am perplexed by the different results I obtained when I ran code like
this:
set.seed(100)
test1<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200)
predict(test1, newdata=cbind(NewBinaryY, NewXs), type="response")

and this code:
set.seed(100)
test2<-randomForest(BinaryY~., data=Xvars, trees=51, mtry=5, seed=200,
xtest=NewXs, ytest=NewBinarY)

The confusion matrices for the two forests I thought would be the same by
virtue of the same seed settings, but they differ as do the predicted values
as well as the votes.  At first I thought it was just the way ties were
broken, so I changed the number of trees to an odd number so there are no
ties anymore.  

Can anyone shed light on what I am hoping is a simple oversight?  I just
can't figure out why the results of the predictions from these two forests
applied to the NewBinaryYs and NewX data sets would not be the same.

Thanks for any hints and help.

Sincerely,

Trent Buskirk



--
View this message in context: 
http://r.789695.n4.nabble.com/Different-results-from-random-Forest-with-test-option-and-using-predict-function-tp4651970.html
Sent from the R help mailing list archive at Nabble.com.


[R] Resampling Help Needed

2012-12-03 Thread KoopaTrooper

I am using package ks() to build 3D representations of bird territories and
calculate territory volume from spatial data (simply x, y, and z
coordinates). What I want to do is determine at what sample size (#
locations collected) does the territory volume stop increasing. This should
give me an idea of the number of points needed for future seasons. 

So I have a couple of birds each with 200 spatial locations (x,y,z). I want
to run the following code (see below), but have R calculate territory size
100 times with 10 random points (no replacement), 100 times with 20 random
points, 100 times with 30 random points, etc. I can figure out how to do
this manually (i.e. create 100 individual files with 10 random points, 20
random points, etc.) but I figure there must be a way to make my life
easier. Any help would be appreciated. Even pointing me in the correct
direction would be a big help. Thanks!

Nathan

#read data files (.csv's with 200 rows of x,y,z coordinates)
a<-read.csv("A_PW_ASY_M_LII_2011.csv")

#calls the plug-in bandwidth estimator
Ha <- Hpi(a) 

#sets min/max grid size for each dimension
minX<-min(a$X)-25
minY<-min(a$Y)-25
minZ<-0

maxX<-max(a$X)+25
maxY<-max(a$Y)+25
maxZ<-max(a$Z)+5

#creates kernel utilization distribution
fhata <- kde(x=a, H=Ha, binned=FALSE, xmin=c(minX,minY,minZ),
xmax=c(maxX,maxY,maxZ)) 

#calculates territory volume at 95% isopleth
Vol95<-contourSizes(fhata, cont=95)
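
One way the repeated subsampling described above could be organised, as a rough sketch only. The coordinates are simulated, and territory_size() is a hypothetical stand-in (a bounding-box volume) so that the example runs in base R. In practice its body would be replaced by the Hpi(), kde() and contourSizes() steps shown above.

```r
set.seed(1)
# Simulated stand-in for one bird's 200 locations (made-up values)
a <- data.frame(X = rnorm(200, 0, 30),
                Y = rnorm(200, 0, 30),
                Z = runif(200, 0, 20))

# Hypothetical placeholder for the kernel-density volume calculation:
# here simply the volume of the bounding box of the sampled points
territory_size <- function(pts) {
  prod(vapply(pts, function(v) diff(range(v)), numeric(1)))
}

sizes <- seq(10, 200, by = 10)   # sample sizes to try
n_rep <- 100                     # replicates per sample size

# One column per sample size, one row per replicate;
# sample() draws rows without replacement by default
vol <- sapply(sizes, function(n) {
  replicate(n_rep, territory_size(a[sample(nrow(a), n), , drop = FALSE]))
})
colnames(vol) <- sizes

mean_vol <- colMeans(vol)        # average size at each sample size
```

Plotting mean_vol against sizes would then show where the curve levels off, with no need to create separate files for each subsample.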



--
View this message in context: 
http://r.789695.n4.nabble.com/Resampling-Help-Needed-tp4651973.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Change case of factor in data frame

2012-12-03 Thread Audrey

> res=names(dat);
> get(res[ind],pos=dat) will retrieve dat$name


There are far less baroque ways of doing that (including dat$name and
dat[["name"]]).

Both dat$name and dat[["name"]] require you to know what "name" is. I was
looking for a way to retrieve a data frame column by name without actually
knowing what the data frame column name was. get() seems to do that, but I
am open to other options. 

Thank you for your advice regarding [ vs. [[. Indeed, lower(dat[[ind]]) does
return the desired result. However, it seems that ind can only be a single
integer (or evaluate to TRUE for only 1 column), I guess because "[[" is
returning a vector. 

By "text" I mean anything that was non-numerical: character and factor
classes, in my case. 



--
View this message in context: 
http://r.789695.n4.nabble.com/Change-case-of-factor-in-data-frame-tp4651696p4651971.html
Sent from the R help mailing list archive at Nabble.com.


[R] Creating Venn-like intersections for multiple data sets: Vennerable Package help

2012-12-03 Thread MLiebers

I was wondering if you could lend me some advice in using the Vennerable
package.  I am having trouble creating the right sort of Venn function
output.   I have "TRUE" and "FALSE"/NA
str(mutset)
'data.frame':   2310 obs. of  3 variables:
 $ TestResult.1  : chr  NA NA NA NA ...
 $ TestResult.2   :chr  NA NA NA NA...
 $ TestResult.3   : chr  NA NA NA NA ...

I always get something like this:
> Vstem<-Venn(mutset)
>Vstem
A Venn object on 3 sets named
TestResult.1,TestResult.2,TestResult.3 
000 100 010 110 001 101 011 111 
  0  0 0  0 0  0 0  2

I have many more of these sets actually, and intermittent "TRUE" values
nestled in the NA's.  We are interested in the overlap of these "TRUE"
values in a Venn diagram sense.  For instance if $ TestResult.1 and $
TestResult.3 shared 7 instances of both being "TRUE" where $ TestResult.2
was NA, then I would hope to see:

> Vstem
A Venn object on 3 sets named
TestResult.1,TestResult.2,TestResult.3 

000 100 010 110 001 101 011 111 
  X  X X XX  7 XX

I was wondering if you knew what was going wrong.  Thanks so much.



--
View this message in context: 
http://r.789695.n4.nabble.com/Creating-Venn-like-intersections-for-multiple-data-sets-Vennerable-Package-help-tp4651972.html
Sent from the R help mailing list archive at Nabble.com.


[R] Periodicity of Weekly Zoo

2012-12-03 Thread Andrew Freedman

Hi List,

I have weekly sales observations for several products drawn via ODBC.
Source data is available at
https://www.dropbox.com/s/78vxae5ic8tnutf/asr.csv.

This is retail sales data, so will contain seasonality and trend
information. I expect to see 52 or 53 observations per year, each
observation occurring on the same day of the week (Saturday). Ultimately
I'm looking to feed these series into forecasting models for demand
planning.

The data has issues with internal gaps, so while I've been able to
create a ts that appears to respect the frequency and period, I suspect
that a zoo is going to be a better data container. Unfortunately, I'm
not understanding the use of zoo() to describe frequency/period/deltat.

In the example below I use sales[,16] (aka $p) as it has several
periods (data between 2004 and 2012). I've tried using frequency=52, =7
and =1, but get the same result each time; every data point ends up in
cycle 1 and I don't have the periodicity needed to find seasonality.

> sales <- read.csv("asr.csv")
> library(zoo)

Attaching package: 'zoo'

The following object(s) are masked from 'package:base':

as.Date, as.Date.numeric

> sales.zoo <- zoo(subset(sales, select=c(2:length(sales))), order.by=
+ sales$date_end, frequency = 52)
> sales.zoo.i <- na.approx(sales.zoo) # interpolate internal NA values
> frequency(sales.zoo.i) # 52, which seems right
[1] 52
> cycle(sales.zoo.i[1:20,16]) # everything is in the same cycle...
2004-08-14 2004-08-21 2004-08-28 2004-09-04 2004-09-11 2004-09-18 
 1  1  1  1  1  1 
2004-09-25 2004-10-02 2004-10-09 2004-10-16 2004-10-23 2004-10-30 
 1  1  1  1  1  1 
2004-11-06 2004-11-13 2004-11-20 2004-11-27 2004-12-04 2004-12-11 
 1  1  1  1  1  1 
2004-12-18 2004-12-25 2005-01-01 2005-01-08 2005-01-15 2005-01-22 
 1  1  1  1  1  1 
2005-01-29 2005-02-05 2005-02-12 2005-02-19 2005-02-26 2005-03-05 
 1  1  1  1  1  1 
> 

Doubtless it's some facile error that will make me feel sheepish, but
I've been staring at this for a bit now and just getting nowhere. Any
pointers would be greatly appreciated.

Thanks,

Andrew


Re: [R] r function definition

2012-12-03 Thread Robert Baer


On 12/3/2012 2:30 PM, qq wrote:

I am a very new R user. I am trying to write functions and debug functions.
One problem for me is that I always need to copy the whole function body and
resubmit it to the R console every time I change even one line of the function.
Because I have a long algorithm function, copying and pasting is very tedious
for me. I assume that if I save the function files, R should be able to just use
the new function body since it is a scripting language. Can somebody let me
know their best practice for using R functions?

You might be at the point where a development environment is useful.
There are multiple choices, but I'm quite fond of RStudio.

http://www.rstudio.com/ide/download/
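
With or without an IDE, base R's source() reloads function definitions from a saved file. The sketch below is only an illustration: the function name area and the use of a temporary file are assumptions, and normally the file would be your own .R script edited in an editor.

```r
# Write a function definition to a script file (a temp file stands in for your .R file)
f <- tempfile(fileext = ".R")
writeLines("area <- function(r) pi * r^2", f)
source(f)                 # defines area() in the workspace
area(1)

# After editing the file, source() it again to pick up the new body
writeLines("area <- function(r, scale = 1) scale * pi * r^2", f)
source(f)
area(1, scale = 2)
```

Only the changed file needs to be re-sourced, so there is no copying and pasting of function bodies.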

Rob




--
View this message in context: 
http://r.789695.n4.nabble.com/r-function-definition-tp4651943.html
Sent from the R help mailing list archive at Nabble.com.




--
__
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA


Re: [R] Histogram plot help


Hello,

I can't say I understand your graph but as for the abline not showing 
up, it's outside the clipped region so it shouldn't. If you want it to 
show up, in the previous line, and after the hist() call, include


clip(2, -2, usr[3], usr[4])

As for the decimals, those are normal; have you seen the data you're
plotting? To have whole numbers, use the arguments ylim and yaxt = "n" in
the first call to hist and axis(2, ...) afterward. Something like


hist(x, xlim=c(-4,4),ylab="Prevalence",prob=T,lwd=3,las=1, ylim = c(0, 
3), yaxt="n")

axis(2, at = 0:3)


Oh, and you forgot prob = TRUE in the other calls to hist().

Hope this helps,

Rui Barradas

Em 03-12-2012 22:21, YAddo escreveu:

Dear All:

I plotted a histogram with Abline, clipping with color codes but i run into
some problems.   The "abline' does not show up at all,  and when i request
the 'prob=True' (to obtain the freqs), my clipped region colors the section
of the graph instead of the plot only.

Is there any way i can get the y-axis figures  to be in whole numbers rather
than decimals?

Many thanks for your help.
YA

Here are the working codes i am tweaking.

Everything worked fine before i trying adding stuffs (prob=T, etc).


x <- rnorm(1000)
hist(x, xlim=c(-4,4),ylab="Prevalence",prob=T,lwd=3,las=1)
lines(density(x),col="black",lwd=2)
usr <- par("usr")
clip(usr[1], -2, usr[3], usr[4])
hist(x, col = 'red', add = TRUE)
clip(2, usr[2], usr[3], usr[4])
hist(x, col = 'blue', add = TRUE)
abline(v=c(-1),lty=1,lwd=3,col="black")
do.call("clip", as.list(usr))  # reset to plot region








Re: [R] .Rd vs. .R, matrix multiplication

2012-12-03 Thread Duncan Murdoch


On 12-12-03 5:00 PM, Christian Hoffmann wrote:

Hi,

I find it cumbersome that I have to use \%*\% in .Rd files vs. %*% in
.R files. R CMD check will refuse %*% in .Rd files. I would like to have
%*% in .Rd files to be able to execute expressions with matrix
multiplication from .Rd files directly, but ESS (version 5.13) would
refuse to execute this execution.

What can be done here?



That sounds like a problem that ESS could solve.  The rules for what 
needs escaping in .Rd files are well defined, and ESS should know what 
kind of file it is editing, so it should be fixable.


Allowing raw %*% in a .Rd code block would mean that invisible comments 
were impossible (since Rd comments start with %).  They are not 
extremely common, but common enough that it would inconvenience a lot of 
package writers to make that change.  Seems like a small ESS fix is easier.


Duncan Murdoch


Re: [R] How to read SPSS file in R

2012-12-03 Thread Mark Lamias

The following works just fine for me, using the built-in SPSS dataset 
nhis2000_subset.sav.  

library(foreign)
a <- read.spss("C:\\Program Files\\IBM\\SPSS\\Statistics\\21\\Samples\\English\\nhis2000_subset.sav",
               to.data.frame = TRUE)
 names(a)
 [1] "STRATUM"  "PSU"  "WTFA_SA"  "SEX"  "AGE_P"    "REGION"   
"SMKNOW"   "VITANY"   "VITMUL"   "HERBSUPP" "VIGFREQW"
[12] "MODFREQW" "STRFREQW" "DESIREWT" "MOVE1"    "LIFT" "age_cat"

You do not need to specify a comma delimiter on your read.spss statement since 
the file is already in a native SPSS dataset format -- not a CSV file.

--Mark Lamias






 From: F86 
To: r-help@r-project.org 
Sent: Monday, December 3, 2012 10:55 AM
Subject: [R] How to read SPSS file in R

Dear R-users, 

I have some trouble with a .sav file. How can we R users
read SPSS files? I know that it is possible.


I tried the following: 


> library(foreign)
> Corp<-read.spss("/Users/kama/Analysis/Corporation.sav", header=TRUE,
> sep=",")
Error in read.spss("/Users/kama/Analysis/Corporation.sav", header = TRUE,  : 
  unused argument(s) (header = TRUE, sep = ",")
> Corp<-read.spss("/Users/kama/Analysis/Corporation.sav")
re-encoding from UTF-8

Any suggestions please? 

Regards, 
Faradj
Stockholm University



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-read-SPSS-file-in-R-tp4651896.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] How to rename the columns of as.table

2012-12-03 Thread Hard Core

Thanks man ... perfect ... thank you very much ;)

you are the best



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-rename-the-columns-of-as-table-tp4651806p4651966.html
Sent from the R help mailing list archive at Nabble.com.


[R] Histogram plot help

2012-12-03 Thread YAddo

Dear All:

I plotted a histogram with abline() and color-coded clipping, but I ran into
some problems.   The abline does not show up at all, and when I request
prob=TRUE (to obtain the freqs), my clipped region colors the section
of the graph instead of the plot only.

Is there any way i can get the y-axis figures  to be in whole numbers rather
than decimals? 

Many thanks for your help.
YA

Here is the working code I am tweaking.

Everything worked fine before I tried adding things (prob=T, etc.).


x <- rnorm(1000)
hist(x, xlim=c(-4,4),ylab="Prevalence",prob=T,lwd=3,las=1)
lines(density(x),col="black",lwd=2)
usr <- par("usr")
clip(usr[1], -2, usr[3], usr[4])
hist(x, col = 'red', add = TRUE)
clip(2, usr[2], usr[3], usr[4])
hist(x, col = 'blue', add = TRUE)
abline(v=c(-1),lty=1,lwd=3,col="black")
do.call("clip", as.list(usr))  # reset to plot region





--
View this message in context: 
http://r.789695.n4.nabble.com/Histogram-plot-help-tp4651958.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Chi-squared test when observed near expected



On Dec 3, 2012, at 1:40 PM, Troy S wrote:


Dear UseRs,

I'm running a chi-squared test where the expected matrix is the same as the
observed, after rounding.


  ... after rounding you say?


R reports a X-squared of zero with a p value of
one.  I can justify this because any other result will deviate at least as
much from the expected because what we observe is the expected, after
rounding.  But the formula for X-squared, sum (O-E)^2/E gives a positive
value.


 If O==E that sum would be identically 0 if the conditions stated
held ...  which they do NOT for the case below.




 What is the reason for X-Squared being zero in this case?

Troy


trial<-as.table(matrix(c(26,16,13,7),ncol=2))
x<-chisq.test(trial)
x




data:  trial
X-squared = 0, df = 1, p-value = 1


x$expected

A B
A 26.41935 12.580645
B 15.58065  7.419355


x$statistic

  X-squared
5.596653e-31

(x$observed-x$expected)^2/x$expected

   A   B
A 0.006656426 0.013978495
B 0.011286983 0.023702665

sum((x$observed-x$expected)^2/x$expected)

[1] 0.05562457
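
A likely explanation, which the thread itself does not spell out: for 2 by 2 tables chisq.test() applies a continuity correction by default (correct = TRUE), subtracting one half from each |O - E| but never more than the difference itself. Here every |O - E| is below 0.5, so the corrected statistic collapses to zero. The sketch below reuses the table from the post. The object names are illustrative.

```r
trial <- as.table(matrix(c(26, 16, 13, 7), ncol = 2))

with_correction    <- chisq.test(trial)                   # default: correct = TRUE
without_correction <- chisq.test(trial, correct = FALSE)

E <- with_correction$expected
abs(trial - E)                         # every |O - E| is about 0.42, below 0.5
unname(with_correction$statistic)      # effectively zero (floating-point noise)
unname(without_correction$statistic)   # equals sum((O - E)^2 / E), about 0.0556
```

The uncorrected call reproduces the hand computation of 0.05562457 shown above.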






David Winsemius, MD
Alameda, CA, USA


[R] Speeding reading of a large file

2012-12-03 Thread Fisher Dennis

Colleagues,  

This past week, I asked the following question:

I have a file that looks like this:

TABLE NO.  1
 PTID        TIME        AMT         FORM        PERIOD      IPRED  
 CWRES       EVID        CP          PRED        RES         WRES
  2.0010E+03  3.9375E-01  5.E+03  2.E+00  0.E+00  
0.E+00  0.E+00  1.E+00  0.E+00  0.E+00 0.E+00  
0.E+00
  2.0010E+03  8.9583E-01  5.E+03  2.E+00  0.E+00  
3.3389E+00  0.E+00  1.E+00  0.E+00  3.5321E+00 0.E+00  
0.E+00
  2.0010E+03  1.4583E+00  5.E+03  2.E+00  0.E+00  
5.8164E+00  0.E+00  1.E+00  0.E+00  5.9300E+00 0.E+00  
0.E+00
  2.0010E+03  1.9167E+00  5.E+03  2.E+00  0.E+00  
8.3633E+00  0.E+00  1.E+00  0.E+00  8.7011E+00 0.E+00  
0.E+00
  2.0010E+03  2.4167E+00  5.E+03  2.E+00  0.E+00  
1.0092E+01  0.E+00  1.E+00  0.E+00  1.0324E+01 0.E+00  
0.E+00
  2.0010E+03  2.9375E+00  5.E+03  2.E+00  0.E+00  
1.1490E+01  0.E+00  1.E+00  0.E+00  1.1688E+01 0.E+00  
0.E+00
  2.0010E+03  3.4167E+00  5.E+03  2.E+00  0.E+00  
1.2940E+01  0.E+00  1.E+00  0.E+00  1.3236E+01 0.E+00  
0.E+00
  2.0010E+03  4.4583E+00  5.E+03  2.E+00  0.E+00  
1.1267E+01  0.E+00  1.E+00  0.E+00  1.1324E+01 0.E+00  
0.E+00

The file is reasonably large (> 10^6 lines) and the two line header is 
repeated periodically in the file.
I need to read this file in as a data frame.  Note that the number of 
columns, the column headers, and the number of replicates of the headers are 
not known in advance.

I received a number of replies, many of them quite useful.  Of these, one beat 
out all the others in my benchmarking using files ranging from 10^5 to 10^6 
lines.
That version, provided by Jim Holtman, was:
x   <- read.table(FILE, as.is = TRUE, skip=1, fill=TRUE, 
header = TRUE)
x[] <- lapply(x, as.numeric)
x   <- x[!is.na(x[,1]), ]
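
For anyone who wants to try the approach on a small case first, here is a minimal sketch. The three-column file written below (ID, TIME, DV) is made up for illustration and is not the real data; only the read.table / as.numeric / indexing steps come from the code above. suppressWarnings() is my addition, to quiet the expected "NAs introduced by coercion" messages.

```r
# Write a tiny stand-in file with a two-line header that repeats mid-file.
# Column names and values are hypothetical.
FILE <- tempfile()
writeLines(c(
  "TABLE NO.  1",
  " ID TIME DV",
  "  1.0000E+00  0.0000E+00  1.5000E+00",
  "  1.0000E+00  1.0000E+00  2.5000E+00",
  "TABLE NO.  1",
  " ID TIME DV",
  "  2.0000E+00  0.0000E+00  3.5000E+00"
), FILE)

# Skip the first "TABLE" line, take column names from the second line,
# and read everything else (including repeated headers) as character.
x <- read.table(FILE, as.is = TRUE, skip = 1, fill = TRUE, header = TRUE)

# Repeated header lines become NA when coerced to numeric ...
x[] <- suppressWarnings(lapply(x, as.numeric))

# ... so dropping rows with NA in the first column removes them.
x <- x[!is.na(x[, 1]), ]
str(x)
```

The design choice is to let read.table do a single pass and clean up afterwards, rather than pre-filtering the text with readLines and re-reading it.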

Other versions involved readLines, followed by edits, followed by cat (or 
write) to a temp file, then read.table again.  
The overhead with invoking readLines, write/cat, and read.table was 
substantially larger than the strategy of read.table / as.numeric / indexing

Thanks for the input from many folks.

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com


Re: [R] Confidence bands with function survplot



On Dec 3, 2012, at 1:19 PM, Tian3507 wrote:



Dear all,

I am trying to plot KM curves with confidence bands with function  
survplot under package rms.


However, the following codes do not seem to work. The KM curves are  
produced, but the confidence bands are not there.


Any insights? Thanks in advance.

library(rms)
# data generation
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('male','female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- 'Year'
dd <- datadist(age, sex)
options(datadist='dd')
S <- Surv(dt,e)
tte<-data.frame(dt,e,age,sex)
# male group KM curve with bands
f <- survfit(Surv(dt,e)~1, data=tte,subset=sex=='male')
survplot(f,conf="bands", conf.int=0.95)


Reading the "manual" would seem to be the first step:

" conf.int
. This argument is ignored for fits from survfit, which must have  
previously specified confidence interval specifications."


Although I then noted that your code (and the obvious shift of  
conf.int as an argument to survfit) _both_ created the same confidence  
bands on my machine, so we appear to have different versions (or OS or  
machines)  running. Reading the help page for survfit leaves me a bit  
puzzled since the default for conf.int is 0.95, so I am unable to  
explain why you did not get the confidence bands.


OS= Mac 10.6.8; R= 2.15.2;  rms_3.5-0  Hmisc_3.9-3

--

David.




# notice no confidence band created
# female group KM curve with bands
g <- survfit(Surv(dt,e)~1, data=tte,subset=sex!='male')
survplot(g,conf="bands", conf.int=0.95,col=3, add=T)


--

David Winsemius, MD
Alameda, CA, USA


Re: [R] Chi-squared test when observed near expected

2012-12-03 Thread Ted Harding

On 03-Dec-2012 21:40:35 Troy S wrote:
> Dear UseRs,
> I'm running a chi-squared test where the expected matrix is the same
> as the observed, after rounding. R reports a X-squared of zero with
> a p value of one. I can justify this because any other result will
> deviate at least as much from the expected because what we observe
> is the expected, after rounding. But the formula for X-squared,
> sum (O-E)^2/E gives a positive value. What is the reason for X-Squared
> being zero in this case?
> 
> Troy
> 
>> trial<-as.table(matrix(c(26,16,13,7),ncol=2))
>> x<-chisq.test(trial)
>> x
> 
> data:  trial
> X-squared = 0, df = 1, p-value = 1
> 
>> x$expected
>          A         B
> A 26.41935 12.580645
> B 15.58065  7.419355
>>
>> x$statistic
>    X-squared
> 5.596653e-31
>> (x$observed-x$expected)^2/x$expected
>             A           B
> A 0.006656426 0.013978495
> B 0.011286983 0.023702665
>> sum((x$observed-x$expected)^2/x$expected)
> [1] 0.05562457

The reason is that (by default, see ?chisq.test ) the statistic
is calculated using the "continuity correction" (1/2 is subtracted
from each abs(O-E) difference, or the whole difference when abs(O-E)
is smaller than 1/2). The default setting in chisq.test()
is "correct = TRUE". Try it with "correct = FALSE":

  x0<-chisq.test(trial,correct=FALSE)
  x0
  #  Pearson's Chi-squared test
  # data:  trial 
  # X-squared = 0.0556, df = 1, p-value = 0.8136

which agrees with your calculation of

  sum((x$observed-x$expected)^2/x$expected)
  # [1] 0.05562457
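
To make the zero concrete, here is a small sketch that recomputes both statistics by hand. It assumes the correction works as described on the ?chisq.test help page (one half is subtracted from every |O - E|, but never more than the differences themselves); the variable names are mine.

```r
trial <- as.table(matrix(c(26, 16, 13, 7), ncol = 2))

# Expected counts under independence: row total * column total / grand total
E <- outer(rowSums(trial), colSums(trial)) / sum(trial)

# Absolute deviations; in a 2x2 table they are the same in every cell
d <- abs(trial - E)

# Uncorrected Pearson statistic
pearson <- sum(d^2 / E)

# Yates correction: subtract 1/2, capped at the size of the deviation
yates <- min(0.5, d)
corrected <- sum((d - yates)^2 / E)

pearson    # matches chisq.test(trial, correct = FALSE)
corrected  # zero up to floating-point error, as in chisq.test(trial)
```

Here every |O - E| is below 1/2, so the whole deviation is removed by the correction and the statistic collapses to (numerically) zero.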

Hoping this helps,
Ted.

-
E-Mail: (Ted Harding) 
Date: 03-Dec-2012  Time: 22:44:14
This message was sent by XFMail


Re: [R] Chi-squared test when observed near expected

2012-12-03 Thread William Dunlap

> > sum((x$observed-x$expected)^2/x$expected)
> [1] 0.05562457

Read about Yates' continuity correction - your formula does not use it
and chisq.test does unless you suppress it:

  > chisq.test(trial)  
  
  Pearson's Chi-squared test with Yates' continuity correction

  data:  trial 
  X-squared = 0, df = 1, p-value = 1
  
  > chisq.test(trial, correct=FALSE)

  Pearson's Chi-squared test

  data:  trial 
  X-squared = 0.0556, df = 1, p-value = 0.8136


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Troy S
> Sent: Monday, December 03, 2012 1:41 PM
> To: r-help@r-project.org
> Subject: [R] Chi-squared test when observed near expected
> 
> Dear UseRs,
> 
> I'm running a chi-squared test where the expected matrix is the same as the
> observed, after rounding.  R reports a X-squared of zero with a p value of
> one.  I can justify this because any other result will deviate at least as
> much from the expected because what we observe is the expected, after
> rounding.  But the formula for X-squared, sum (O-E)^2/E gives a positive
> value.  What is the reason for X-Squared being zero in this case?
> 
> Troy
> 
> > trial<-as.table(matrix(c(26,16,13,7),ncol=2))
> > x<-chisq.test(trial)
> > x
> 
> 
> 
> data:  trial
> X-squared = 0, df = 1, p-value = 1
> 
> > x$expected
>          A         B
> A 26.41935 12.580645
> B 15.58065  7.419355
> >
> > x$statistic
>    X-squared
> 5.596653e-31
> > (x$observed-x$expected)^2/x$expected
>             A           B
> A 0.006656426 0.013978495
> B 0.011286983 0.023702665
> > sum((x$observed-x$expected)^2/x$expected)
> [1] 0.05562457
> >
> 


Re: [R] r function definition

2012-12-03 Thread Jeff Newmiller

?source
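
A minimal sketch of the workflow that ?source points to, assuming the function lives in its own file. The temporary file and the function name my_algorithm are hypothetical stand-ins; in practice you would edit the file in an editor rather than with writeLines().

```r
# Keep the function definition in a file instead of pasting it at the console.
f <- tempfile(fileext = ".R")
writeLines("my_algorithm <- function(x) x + 1", f)

# Load (or reload) every definition in the file with one call.
source(f)
first_result <- my_algorithm(1)

# After changing a line in the file, just source it again;
# the new definition replaces the old one.
writeLines("my_algorithm <- function(x) x + 2", f)
source(f)
second_result <- my_algorithm(1)
```

For debugging, debug(my_algorithm) or a browser() call inside the function body can be combined with the same edit-and-source cycle.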
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

qq  wrote:

>I am a very new R user. I am trying to write functions and debug
>functions.
>One problem for me is that I need to always copy the whole function
>body and
>resubmit to R console every time I changed even one line of the
>function.
>Because I have long algorithm function, copying and pasting is very
>tedious
>for me. I assume if I save the function files, R should be able to just
>use
>the new function body since it is a scripting language. Can somebody
>let me
>know his best practice of using R function?
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/r-function-definition-tp4651943.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] discrepancy in fisher exact test between R and wiki formula

2012-12-03 Thread Ted Harding

On 03-Dec-2012 21:22:28 JiangMei wrote:
> Hi All. Sorry to bother you. I have a question about fisher exact test.
> 
> I counted the presence of gene mutation in two groups of samples.
> My data is as follows
>          Presence   Absence
> GroupA   4          6
> GroupB   5          11
> 
> When using the formula of fisher exact test provided by wiki
> (http://en.wikipedia.org/wiki/Fisher%27s_exact_test), the p-value is 0.29.
> 
> But when calculated by R, the p-value is 0.69. My code is shown below
> counts<-c(4,5,6,11)
> data<-matrix(counts,nrow=2)
> fisher.test(data)
> 
> Why did I get two different numbers? Is there anything wrong with my R codes?
> 
> Wish your help! Thanks very much! I really appreciate it.

The reason is that the formula given in Wikipedia is for one particular
set of values (a,b,c,d). In your case, a=4, b=6, c=5, d=11 and the
Wikipedia formula for p gives the probability of (a,b,c,d) = (4,6,5,11).

However, this is not the P-value for the test. For a two-sided
alternative (see ?fisher.test ) the P-value is the sum of all such
probabilities for values of (a,b,c,d) such that a+b = 10, c+d = 16,
a+c = 9, b+d = 17 AND the probability p is less than or equal to
the probability of (4,6,5,11). So it includes the case that has been
observed and (in general) others, so will be greater (0.69) than the
value (0.29) given by the formula.

The default alternative for R's fisher.test() is "two-sided".
If you look at ?fisher.test() you will see:

  Two-sided tests are based on the probabilities of the tables,
  and take as 'more extreme' all tables with probabilities less
  than or equal to that of the observed table, the p-value being
  the sum of such probabilities.
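
The two numbers can be reproduced by hand with dhyper(). This is only a sketch of the rule quoted above, not the actual fisher.test() source; the variable names are mine, and the small relative tolerance is an assumption added to guard against floating-point ties.

```r
tab <- matrix(c(4, 5, 6, 11), nrow = 2)

m <- sum(tab[1, ])   # row 1 total (GroupA): 10
n <- sum(tab[2, ])   # row 2 total (GroupB): 16
k <- sum(tab[, 1])   # column 1 total (Presence): 9
a_obs <- tab[1, 1]   # observed top-left cell: 4

# Probability of the observed table alone (the Wikipedia formula), about 0.29
p_obs <- dhyper(a_obs, m, n, k)

# All tables with the same margins, indexed by their top-left cell
support <- max(0, k - n):min(k, m)
probs <- dhyper(support, m, n, k)

# Two-sided P-value: sum over tables no more probable than the observed one
p_two_sided <- sum(probs[probs <= p_obs * (1 + 1e-7)])

p_obs
p_two_sided
fisher.test(tab)$p.value
```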

I hope this helps.
Ted.

-
E-Mail: (Ted Harding) 
Date: 03-Dec-2012  Time: 22:24:00
This message was sent by XFMail


[R] Nested ANCOVA question

2012-12-03 Thread Sean Bignami

Hello R experts,
I am having a difficult time figuring out how to perform and interpret an 
ANCOVA of my nested experimental data and would love any suggestions that you 
might have.

Here is the deal:

1) I have twelve tanks of fish (1-12), each with a bunch of fish in them
2) I have three treatments (1-3); 4 tanks per treatment. (each tank only has 
one treatment applied to it)
3) I sampled multiple fish from each tank (1-3) and would like to nest my tanks 
within each treatment (i.e. four tanks nested in treatment 1, four tanks in 
treatment 2 and four tanks in treatment 3) in order to account for any 
additional variance due to random tank effects within a treatment. 
4) The dependent variable in this case is the AREA of an anatomical structure, 
which is proportional to body length (the covariate). 
5) Here is a simplified example of my data-frame structure and the code to 
generate it. I have re-run the following analysis on this example data set for 
this email:

treatment<-c(rep(1:3,each=12))
tank<-c(rep(1:12,each=3))
fish<-c(rep(1:3, each=1, times=12))
length<-c(runif(36,15,25))
area<-c(length[1:12]*10+10,length[13:24]*10+20,length[25:36]*10)
 
fishdata<-data.frame(treatment,tank,fish,length,area);fishdata

#   treatment tank fish   length     area
#1          1    1    1 16.71511 177.1511
#2          1    1    2 21.95281 229.5281
#3          1    1    3 20.39821 213.9821
#4          1    2    1 23.67241 246.7241
#5          1    2    2 17.35663 183.5663
#6          1    2    3 21.61087 226.1087
#7          1    3    1 21.20769 222.0769
#8          1    3    2 20.14224 211.4224
#9          1    3    3 18.50520 195.0520
#10         1    4    1 21.57603 225.7603
#11         1    4    2 20.27112 212.7112
#12         1    4    3 22.78739 237.8739
#13         2    5    1 19.80727 218.0727
#14         2    5    2 15.11602 171.1602
#...and so on...

plot(length,area)

I have tried the following code to run the ANCOVA with the nested design. Area 
is the dependent, treatment is a fixed factor, length is a covariate of area, 
and tank is nested within treatment. 

QUESTION 1. Have I written this code correctly? I am a little confused about 
whether it should be tank/treatment or treatment/tank. From what I understand, 
this part is testing the equality of the slopes for each treatment group (the 
interaction term). 

RSslopes<-aov(area~length*treatment+Error(tank/treatment),data=fishdata)
summary(RSslopes)

#Error: tank
#       Df Sum Sq Mean Sq
#length  1  753.6   753.6
#
#Error: tank:treatment
#       Df Sum Sq Mean Sq
#length  1   4633    4633
#
#Error: Within
#                 Df Sum Sq Mean Sq  F value   Pr(>F)
#length            1  25282   25282 2254.451  < 2e-16 ***
#treatment         1    327     327   29.192 7.47e-06 ***
#length:treatment  1      5       5    0.461    0.502
#Residuals        30    336      11
#---
#Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



These results show no significant difference in slopes (P=0.502), so I 
interpret that as conforming to the assumption of equal slopes in order to run 
the actual ANCOVA and test for a treatment effect. 
QUESTION 2. How do I test to see if there is a tank effect at this point? I'm 
not sure which Mean Sq value to test over the residuals, or if I should wait to 
do that until the next test?

This next part removes the interaction term to just test the treatment effect 
while removing the effect of length. right?

RSancova<-aov(area~length+treatment+Error(tank/treatment),data=fishdata)
summary(RSancova)

#Error: tank
#       Df Sum Sq Mean Sq
#length  1  753.6   753.6
#
#Error: tank:treatment
#       Df Sum Sq Mean Sq
#length  1   4633    4633
#
#Error: Within
#          Df Sum Sq Mean Sq F value  Pr(>F)
#length     1  25282   25282 2294.34 < 2e-16 ***
#treatment  1    327     327   29.71 5.9e-06 ***
#Residuals 31    342      11
#---
#Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


So I think this tells me that I do have a treatment effect...but I still would 
like to know how to arrange the Mean Sq for the F-test for tank effect. 

Now I test to see if removing the interaction term significantly affects the 
fit of the model...but this part doesn't work with a nested design!! (It does 
work if I do the exact same thing without the nested term.)
 I get the following:


anova(RSslopes,RSancova) 
#Error in UseMethod("anova") : 
#  no applicable method for 'anova' applied to an object of class "c('aovlist', 
'listof')"

This has something to do with the nested design putting out an aov list, I 
think.

Now I really lose it, because I know how to perform a Tukey HSD test to 
compare each combination of treatments if I had a regular non-nested ANCOVA, 
but when I try it with these tests it can't run...

 HSD.test(RSancova,"treatment")
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = 
stringsAsFactors) : 
  cannot coerce class 'c("aovlist", "l

[R] Chi-squared test when observed near expected

2012-12-03 Thread Troy S

Dear UseRs,

I'm running a chi-squared test where the expected matrix is the same as the
observed, after rounding.  R reports a X-squared of zero with a p value of
one.  I can justify this because any other result will deviate at least as
much from the expected because what we observe is the expected, after
rounding.  But the formula for X-squared, sum (O-E)^2/E gives a positive
value.  What is the reason for X-Squared being zero in this case?

Troy

> trial<-as.table(matrix(c(26,16,13,7),ncol=2))
> x<-chisq.test(trial)
> x



data:  trial
X-squared = 0, df = 1, p-value = 1

> x$expected
         A         B
A 26.41935 12.580645
B 15.58065  7.419355
>
> x$statistic
   X-squared
5.596653e-31
> (x$observed-x$expected)^2/x$expected
            A           B
A 0.006656426 0.013978495
B 0.011286983 0.023702665
> sum((x$observed-x$expected)^2/x$expected)
[1] 0.05562457
>


[R] Forest plot

2012-12-03 Thread Min Dong

Hi, I am a novice in R. It will be greatly appreciated if someone
can advise me with the following questions.

1) How to highlight reference range in forest plot? For example, if 1.5-2
is the reference range, I would like to have all the area between 1.5-2 to
be highlighted (such as in grey color).

2) If I have hundreds of objects, how to set the output plot to multiple
columns? For example, I have 500 objects included, and I want to have
objects 1-100 to the left (column #1), 101-200 in column #2 (next to
column #1), etc. How to do it?

Thank you very much!

Mindy


[R] discrepancy in fisher exact test between R and wiki formula

2012-12-03 Thread JiangMei


Hi All. Sorry to bother you. I have a question about fisher exact test.

I counted the presence of gene mutation in two groups of samples. My data is as 
follows
         Presence   Absence
GroupA   4          6
GroupB   5          11

When using the formula of fisher exact test provided by wiki 
(http://en.wikipedia.org/wiki/Fisher%27s_exact_test), the p-value is 0.29.

But when calculated by R, the p-value is 0.69. My code is shown below
counts<-c(4,5,6,11)
data<-matrix(counts,nrow=2)
fisher.test(data)

Why did I get two different numbers? Is there anything wrong with my R codes?

Wish your help! Thanks very much! I really appreciate it.


  

[R] r function definition

2012-12-03 Thread qq

I am a very new R user. I am trying to write functions and debug functions.
One problem for me is that I need to always copy the whole function body and
resubmit to R console every time I changed even one line of the function.
Because I have long algorithm function, copying and pasting is very tedious
for me. I assume if I save the function files, R should be able to just use
the new function body since it is a scripting language. Can somebody let me
know his best practice of using R function?



--
View this message in context: 
http://r.789695.n4.nabble.com/r-function-definition-tp4651943.html
Sent from the R help mailing list archive at Nabble.com.


[R] Confidence bands with function survplot

2012-12-03 Thread Tian3507


Dear all,
 
I am trying to plot KM curves with confidence bands with function survplot 
under package rms.
 
However, the following codes do not seem to work. The KM curves are produced, 
but the confidence bands are not there. 
 
Any insights? Thanks in advance.
 
library(rms)
# data generation
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('male','female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- 'Year'
dd <- datadist(age, sex)
options(datadist='dd')
S <- Surv(dt,e)
tte<-data.frame(dt,e,age,sex)
# male group KM curve with bands
f <- survfit(Surv(dt,e)~1, data=tte,subset=sex=='male')
survplot(f,conf="bands", conf.int=0.95)
# notice no confidence band created
# female group KM curve with bands
g <- survfit(Surv(dt,e)~1, data=tte,subset=sex!='male')
survplot(g,conf="bands", conf.int=0.95,col=3, add=T)
 
Best regards, 
Hong
  

[R] .Rd vs. .R, matrix multiplication

2012-12-03 Thread Christian Hoffmann


Hi,

I find it cumbersome that I have to use \%*\% in .Rd files vs. %*% in 
.R files. R CMD check will refuse %*% in .Rd files. I would like to have 
%*% in .Rd files to be able to execute expressions with matrix 
multiplication from .Rd files directly, but ESS (version 5.13) would 
refuse to execute such an expression.


What can be done here?

TIA

Christian

--
Christian W. Hoffmann,
CH - 8915 Hausen am Albis, Switzerland
Rigiblickstrasse 15 b, Tel.+41-44-7640853
c-w.hoffm...@sunrise.ch,
christ...@echoffmann.ch,
www.echoffmann.ch


[R] CTM and survival analysis with heterogeneity

2012-12-03 Thread Sebastián Daza

Hello R experts,

I wonder if there is any package to estimate  this kind of models in R:

Multi-state Multi-spell Survival Models with Heterogeneity

One of the most powerful programs for survival models is CTM, the
Continuous Time Model, developed for NIH at NORC under the direction
of James J. Heckman. Generalizing the competing risks model, CTM
allows for transitions between any number of states, repeated spells
within a state, time varying covariates, person and state specific
heterogeneity, and arbitrary duration dependence specified in a highly
flexible hazard model. Although in all cases the baseline hazard is
fully parametric, it can be specified as a sufficiently rich function
of time to capture a very wide set of duration dependencies.

Any references?
Thanks!

-- 
Sebastián Daza


Re: [R] Using multicores in R

2012-12-03 Thread Jim Porzak

Moriah,

Since you are doing nested loops, Rcpp may be an easy speed-up. Follow
all the links here
http://blog.revolutionanalytics.com/2012/11/hadleys-guide-to-high-performance-r-with-rcpp.html
for details.

HTH,
Jim Porzak
Minted.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF: www.meetup.com/R-Users/

On Mon, Dec 3, 2012 at 2:14 AM, moriah  wrote:
> Hi,
>
> I have an R script which is time consuming because it has two nested loops
> in it of at least 5000 iterations each, I have tried to use the multicore
> package but it doesn't seem to improve the elapsed time of the script (a
> shorter script for example) and I can't use the mclapply because of technical
> reasons.
>
> I was wondering how can I make my script use more cores and memory because I
> am running it on a server and it is a shame that it uses only one core.
>
> Thanks!
> Moriah
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Using-multicores-in-R-tp4651808.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Solving a multinomial gompertz partial differential equation in r

2012-12-03 Thread Thomas Petzoldt


On 12/3/2012 9:12 AM, Suzen, Mehmet wrote:

Hi Brandon,

You can try ReacTran package:

cran.r-project.org/web/packages/ReacTran/vignettes/PDE.pdf

Best,
-m


ReacTran is for managing the transport in reactive transport models; it 
relies on package deSolve, which contains the ODE/PDE solvers, so I would 
recommend starting with deSolve first.


Papers, slides and tutorials with full source code can be found at:

http://desolve.r-forge.r-project.org

... and an overview about packages dealing with differential equations 
can be found in the Taskview:


http://cran.r-project.org/web/views/DifferentialEquations.html


Hope it helps

Thomas Petzoldt


Re: [R] Change case of factor in data frame



On Dec 2, 2012, at 12:46 PM, Audrey wrote:

I am trying to write a function to change the case of all of the  
text in a

data frame to lower case.


Define what you mean by "text".



 I do not have foreknowledge of the data frame
names or the data types of each column.

It seems that if one references the data frame by index, then it  
returns

class "data.frame" but if it is referenced by name, it returns class
"factor" or whatever the column actually is:
dat[1] returns class "data.frame" but
dat$name returns class "factor"


Yes. That is correct.


The problem is that, when one applies the tolower() and toupper()  
functions,

dat$name will return a column of lower/uppercase returns (now class
"character") but dat[1] will return an array of factor level indexes.


It returns an integer vector with attributes: class = factor and levels.

Specifying dat[1] as.character during the function call does not  
work (e.g.

tolower(as.character(dat[1]))).


Because dat[1] is a list, not a vector.
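
A small illustration of the difference, using a made-up two-column data frame (the names dat, name, n and nm are only for this sketch):

```r
dat <- data.frame(name = factor(c("Ann", "Bob")), n = 1:2)

class(dat[1])         # "data.frame": still a (one-column) list
class(dat[[1]])       # "factor": the column itself
class(dat[["name"]])  # "factor": same column, selected by name

# "[[" also accepts a name held in a variable, which "$" does not
nm <- names(dat)[1]
identical(dat[[nm]], dat$name)

# as.character() and tolower() behave as hoped on the extracted column
lowered <- tolower(as.character(dat[[nm]]))
lowered
```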




So, I would loop over the column names, but I can not figure out how  
to

generically call them:
tst=names(dat);
dat$(tst[1]) returns an error
dat[tst[1]] returns class data.frame again

Thank you in advance!

change_case<-function(dat,wcase){
 # change case
 res=sapply(dat,class); # get classes
 ind<-res=='character';
 dat[ind]<-switch(wcase,
  'lower'=tolower(dat[ind]),
  'upper'=toupper(dat[ind])
 )
 rm(ind);
 ind<-res=='factor';
 dat[ind]<-switch(wcase,
  'lower'=factor(tolower(as.character(dat[ind]))),
  'upper'=factor(toupper(as.character(dat[ind])))
 )
 return(dat);
}


You probably need to study the help("[") page and learn the difference
between "[" and "[[". Changing to "[[" in a couple of places would  
probably allow success.


--

David Winsemius, MD
Alameda, CA, USA


Re: [R] Change case of factor in data frame



On Dec 3, 2012, at 9:47 AM, Audrey wrote:


Ok, it seems that the function to get generic field names is get()



Er, not really.


res=names(dat);
get(res[ind],pos=dat) will retrieve dat$name



There are far less baroque ways of doing that (including dat$name and  
dat[["name"]]). Read:


?"["

If you had a numeric object, 'ind' then

dat[ind] would retrieve a sub-list, with as many columns as there were  
items in 'ind' and would have class data.frame.


If `ind` is numeric with one element only, then

dat[[ind]] <- tolower( as.character (dat[[ind]]  ) )
# should succeed, at least to the extent of returning a vector of a
# different class ... if that was what you wanted to change.


It would have been faster to change just the levels attributes for  
factor columns if you were willing to retain the class of that column  
as a factor.


levels(dat[[1]]) <- toupper( levels(dat[[1]]) )
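
Putting those pieces together, one possible rewrite of the change_case() function from the original post is sketched below. It is not the poster's code: it loops over column names with "[[", changes character columns directly, and changes only the levels of factor columns so that they stay factors. The example data frame is made up.

```r
change_case <- function(dat, wcase = c("lower", "upper")) {
  wcase <- match.arg(wcase)
  conv <- switch(wcase, lower = tolower, upper = toupper)
  for (nm in names(dat)) {
    col <- dat[[nm]]                      # "[[" returns the column itself
    if (is.character(col)) {
      dat[[nm]] <- conv(col)
    } else if (is.factor(col)) {
      levels(dat[[nm]]) <- conv(levels(col))  # column remains a factor
    }
    # all other column types are left untouched
  }
  dat
}

dat <- data.frame(name = factor(c("Ann", "BOB")),
                  n    = 1:2,
                  note = c("Hi", "Yo"),
                  stringsAsFactors = FALSE)
out <- change_case(dat, "lower")
str(out)
```

Note that levels differing only in case (say "Bob" and "BOB") would be merged into one level by this approach.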


--
David Winsemius, MD
Alameda, CA, USA


[R] How do I make R randomForest model size smaller?

2012-12-03 Thread John Foreman

I've been training randomForest models on 7 million rows of data (41
features). Here's an example call:

myModel <- randomForest(RESPONSE~., data=mydata, ntree=50, maxnodes=30)

I thought surely with only 50 trees and 30 terminal nodes that the memory
footprint of "myModel" would be small. But it's 65 megs in a dump file. The
object seems to be holding all sorts of predicted, actual, and vote data
from the training process.

What if I just want the forest and that's it? I want a tiny dump file that
I can load later to make predictions off of quickly. I feel like the forest
by itself shouldn't be all that large...

Anyone know how to strip this sucker down to just something I can make
predictions off of going forward?


Re: [R] How to pass a vector of characters with Rscript through commandline

2012-12-03 Thread MacQueen, Don

Use the commandArgs() function to get the parameters into R, then parse
them.
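
A minimal sketch of what that could look like, assuming the values are passed as a single comma-separated argument (that convention, and the helper name parse_vec, are my own choices, not something Rscript requires):

```r
# Hypothetical script.R, invoked as:  Rscript script.R 04,08,12,14

# Split one comma-separated argument into a character vector.
parse_vec <- function(arg) strsplit(arg, ",", fixed = TRUE)[[1]]

# trailingOnly = TRUE drops R's own options and keeps only the user arguments.
args <- commandArgs(trailingOnly = TRUE)
each <- if (length(args) > 0) parse_vec(args[1]) else character(0)
print(each)
```

Passing the values as separate arguments (Rscript script.R 04 08 12 14) would also work; in that case args is already the character vector and no splitting is needed.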

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 12/3/12 8:44 AM, "narge...@yahoo.com"  wrote:

>Hi,
>
>I have an R script and I want to it to accept a vector of characters as
>the input parameters in command line with the command Rscript.
>So for example I have this vector : each <- c("04","08","12","14") and I
>want to do this: Rscript script.R each
>How can I pass it?
>
>Thanks 

Re: [R] R-Forge not building packages?

2012-12-03 Thread Spencer Graves

  If I'm not mistaken, R-forge is run by volunteers who have real 
jobs otherwise.  The system occasionally grinds to a halt without them 
noticing.  So far, each time that happens, they restart something, but 
they have yet to identify and fix the root cause of why it stops.



  Spencer


On 12/3/2012 10:34 AM, Ulrich Staudinger wrote:

Thanks for all the feedback, I have written a message to R-Forge help ...


On Mon, Dec 3, 2012 at 7:21 PM, Yihui Xie  wrote:

I will not be surprised if it takes longer than a week to build a
package on R-Forge, although I have no idea why it has to be
incredibly slow (do they have Sys.sleep(7*24*60*60) somewhere in the
code?). I do have found an alternative solution, though, which is to
spend these days on teaching more R users (especially Windows users)
how to install R packages via R CMD INSTALL or
devtools::install_github(). That is usually a lot faster than R-Forge.

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Mon, Dec 3, 2012 at 12:05 PM, Ulrich Staudinger
 wrote:

Hi there,

I am waiting since days for my package to be built on R-Forge.
https://r-forge.r-project.org/R/?group_id=1518

R-Forge says:
  Version: 0.2 | Last change: 2012-11-27 21:37:05+01 | Rev.: 32
Build status: Building

But I am already at revision 37 and R-Forge doesn't move since 6 days

Can anyone help?

Thanks
Ulrich



--
Ulrich Staudinger

P: +41 79 702 05 95
E: ustaudin...@activequant.com

http://www.activequant.com





--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com


Re: [R] How to read SPSS file in R

2012-12-03 Thread S Ellison



> I tried the following: 
> 
> 
>> library(foreign)
>> Corp<-read.spss("/Users/kama/Analysis/Corporation.sav", header=TRUE,
>> sep=",")
> Error in read.spss("/Users/kama/Analysis/Corporation.sav", header = TRUE,  : 
> unused argument(s) (header = TRUE, sep = ",")
>> Corp<-read.spss("/Users/kama/Analysis/Corporation.sav")
> re-encoding from UTF-8
> 
> Any suggestions please?

At the risk of being glib, Yes. 
i) Read the error message.
ii) Read the help file for ?read.spss 

Then think about why supplying two unnecessary arguments might cause an error 
message that says "unused arguments..." and what the help file tells you about 
UTF-8 encoding. 
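
For reference, a sketch of a call without the unused arguments. The wrapper name read_sav is hypothetical; to.data.frame is a documented argument of read.spss(), and the "re-encoding from UTF-8" line in the question is an informational message rather than an error.

```r
library(foreign)

# Hypothetical wrapper; pass the path of whatever .sav file you have.
read_sav <- function(path) {
  # to.data.frame = TRUE returns a data frame instead of the default list.
  # read.spss() has no 'header' or 'sep' arguments.
  read.spss(path, to.data.frame = TRUE)
}

# Corp <- read_sav("/Users/kama/Analysis/Corporation.sav")
```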


Re: [R] Using multicores in R

2012-12-03 Thread Spencer Graves

  1.  Have you looked at CRAN Task View: High-Performance and 
Parallel Computing with R 
(http://cran.r-project.org/web/views/HighPerformanceComputing.html)?



  2.  Have you tried the "compiler" package?  If I understand 
correctly, R is a two-stage interpreter, first translating what we know 
as R into byte code, which is then interpreted by a byte code 
interpreter.  If my memory is correct, this approach can cut the compute 
time by a factor of 100.



  3.  Have you reviewed the section on "Profiling R code for speed" 
in the "Writing R Extensions" manual that becomes available after 
help.start()?  The profiling tools discussed there help identify the 
portion of more complex code that takes the most time.  The standard 
advice then is to experiment with writing the most time consuming 
portion several different ways.  I've seen many examples where writing 
what appears to be the same thing in R several different ways identifies 
one that is easily 10 and maybe 100 or 1000 times faster than the 
slowest alternative tried.



  4.  Have you tried using the "sos" package to search for other 
functions and packages in R that may already have good code doing some 
of the things you want to do?  The "findFn" function in "sos" searches 
the "functions" subset of the "RSiteSearch" database and returns the 
result sorted by package.  There are also "union" and 
"writeFindFn2xls" functions to make it easy to manipulate and evaluate 
the results, described in a vignette. It's the best literature search I 
know for anything statistical: If I don't find it there, it's OK to look 
someplace else. [Caveat:  I'm the lead author of "sos", so I'm biased.]
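
To illustrate point 3, here is a small sketch that writes the same computation two ways and times both. The functions and the problem size are made up for the example; actual timings depend on the machine.

```r
n <- 200
x <- as.numeric(seq_len(n))

# Version 1: two nested loops filling a matrix element by element.
loop_version <- function(x) {
  out <- matrix(0, length(x), length(x))
  for (i in seq_along(x)) {
    for (j in seq_along(x)) {
      out[i, j] <- x[i] * x[j]
    }
  }
  out
}

# Version 2: the same computation, vectorised with outer().
vec_version <- function(x) outer(x, x)

t_loop <- system.time(m1 <- loop_version(x))
t_vec  <- system.time(m2 <- vec_version(x))

# Both versions must agree before their timings are worth comparing.
stopifnot(isTRUE(all.equal(m1, m2)))
print(c(loop = t_loop[["elapsed"]], vectorised = t_vec[["elapsed"]]))
```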



  Best Wishes,
  Spencer


On 12/3/2012 6:24 AM, Steve Lianoglou wrote:

And also:

On Monday, December 3, 2012, Uwe Ligges wrote:



On 03.12.2012 11:14, moriah wrote:


Hi,

I have an R script which is time consuming because it has two nested loops
in it of at least 5000 iterations each. I have tried to use the multicore
package but it doesn't seem to improve the elapsed time of the script (a
shorter script, for example), and I can't use mclapply for technical
reasons.


Errr, but otherwise multicore does not have an effect ...

See package "parallel" that offers various functions for parallel
computations. We cannot help much more if you do not tell us what the
technical reasons are why mclapply() does not work.


If the work you are doing within each iteration of the loop is trivial, you
will likely even see a decrease in performance if you try to parallelize it.

Without more info from you regarding your problem, there's little we can do
to help, tho.
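
For orientation, a minimal sketch of what a call through the "parallel" package can look like. The function work() is a placeholder for whatever one iteration of the loop does, and the core count is an assumption.

```r
library(parallel)

# Hypothetical per-iteration work; replace with the body of the inner loop.
work <- function(i) sum(sqrt(seq_len(i * 1000)))

# Forking is unavailable on Windows, where mclapply() requires mc.cores = 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

res_par <- mclapply(1:20, work, mc.cores = n_cores)
res_seq <- lapply(1:20, work)

# The parallel result should match the sequential one exactly.
stopifnot(identical(res_par, res_seq))
```

If work() is as cheap as it is here, the overhead of starting worker processes can easily outweigh any gain.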

  -Steve






--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



Re: [R] Daily Time Series, patterns.

2012-12-03 Thread arun

Hi,
In addition, you can use ?dayOfWeek() from library(timeDate)
set.seed(5)
quantity<-sample(c(120:220,NA),699,replace=TRUE)
Date=seq(as.Date("2011-01-01"),len=699,by="1 day")
dat3<-data.frame(Date=Date,quantity=quantity)
 nrow(dat3)
#[1] 699
library(timeDate)
dat3$tSeq<-timeSequence(dat3$Date[1],dat3$Date[699])
dat4<-dat3[isWeekday(dat3$tSeq),]

 dat4$DayWeek<-dayOfWeek(dat4$tSeq)
 head(dat4)
#          Date quantity       tSeq DayWeek
#3  2011-01-03      213 2011-01-03     Mon
#4  2011-01-04      149 2011-01-04     Tue
#5  2011-01-05      130 2011-01-05     Wed
#6  2011-01-06      191 2011-01-06     Thu
#7  2011-01-07      173 2011-01-07     Fri
#10 2011-01-10      131 2011-01-10     Mon
library(zoo)
z<-zooreg(dat4[,2],frequency=5)
plot(stl(na.approx(z),"per")) 

A.K.

- Original Message -
From: mekbatub 
To: r-help@r-project.org
Cc: 
Sent: Monday, December 3, 2012 8:14 AM
Subject: Re: [R] Daily Time Series, patterns.

Hi Arun, thanks again, I think we are close.
The way you gave me looks good, but I still have one problem; look at this:

Lets say, we have data like this:

>head(dat3)

      Date       quantity
2012-03-05 65.16
2012-03-06 70.67
2012-03-08 63.66
2012-03-09 70.05
2012-03-12 61.59
2012-03-13 58.98

Then we have:
>z <- zooreg(dat3[,2], frequency = 5)
>z
 1(1)   1(2)   1(3)   1(4)   1(5)   2(1)
65.16  70.67  63.66  70.05  61.59  58.98

This is for dates:
      1(1)       1(2)       1(3)       1(4)       1(5)       2(1)
2012-03-05 2012-03-06 2012-03-08 2012-03-09 2012-03-12 2012-03-13

Since there were no releases on 2012-03-07 (the warehouse was closed)

We should have:
1(1)     1(2)     *1(4)     1(5)     1(1)     2(2) *    
65.16  70.67  63.66  70.05  61.59   58.98

So in that case I can't use "frequency=5". I am trying to figure out how to
assign the correct weekday number to each quantity.

I am wondering whether my way of thinking is correct. Maybe I should fill in
those missing values as you suggested last time, but the problem is that they
are not really missing values: if the warehouse was closed, there were not
supposed to be any releases that day, so there is really nothing to fill in.
I am wondering whether it is possible at all to do this in R the way I am
trying to, without data continuity.

What do you think about that?
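
One possible way to attach the weekday number to each quantity, sketched in base R with the six rows shown above. The names wday, weekdays_only and full are made up for the example, and padding closed days with NA is only one option.

```r
dat <- data.frame(
  Date = as.Date(c("2012-03-05", "2012-03-06", "2012-03-08",
                   "2012-03-09", "2012-03-12", "2012-03-13")),
  quantity = c(65.16, 70.67, 63.66, 70.05, 61.59, 58.98)
)

# ISO weekday number taken from the date itself: 1 = Monday ... 7 = Sunday.
dat$wday <- as.integer(format(dat$Date, "%u"))

# Optional: a complete Monday-to-Friday calendar, with NA where nothing was released.
all_days <- seq(min(dat$Date), max(dat$Date), by = "day")
weekdays_only <- all_days[as.integer(format(all_days, "%u")) <= 5]
full <- merge(data.frame(Date = weekdays_only), dat, all.x = TRUE)
full$wday <- as.integer(format(full$Date, "%u"))
```

With the padded calendar, every block of five rows is one working week, so a frequency of 5 lines up with the weekday again.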



--
View this message in context: 
http://r.789695.n4.nabble.com/Daily-Time-Series-patterns-tp4651569p4651836.html
Sent from the R help mailing list archive at Nabble.com.


[R] switch() vs eval() for choosing pre-existing alternatives?

2012-12-03 Thread Mauricio Cornejo

Can anyone explain (perhaps with an example) why the R Language Definitions 
(Version 2.15.2, 2012-10-26, DRAFT) says the following in section 3.2.6?


"To choose from a list of alternatives that already exists switch() may not be 
the best way to select one for evaluation. It is often better to use eval() and 
the subset operator, [[, directly via eval(x[[condition]])."

I can't quite figure out the reasoning behind the recommendation or how to make 
good use of it.
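
One possible reading of that passage, sketched below, is that when the alternatives are already stored in a list, indexing the list avoids spelling every branch out again inside switch(). This is an interpretation, not an example taken from the manual; the list and its element names are made up.

```r
x <- c(1, 2, 10)

# Alternatives that already exist as a list of unevaluated expressions.
alternatives <- list(mean   = quote(mean(x)),
                     median = quote(median(x)),
                     range  = quote(diff(range(x))))

condition <- "median"

# Select one element with [[ and evaluate only that one.
via_eval <- eval(alternatives[[condition]])

# The switch() equivalent has to list every alternative again.
via_switch <- switch(condition,
                     mean   = mean(x),
                     median = median(x),
                     range  = diff(range(x)))

stopifnot(identical(via_eval, via_switch))
```

The list version also accepts a numeric condition, for example eval(alternatives[[3]]).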

Many thanks,
Mauricio

Re: [R] Change case of factor in data frame

2012-12-03 Thread Audrey

Ok, it seems that the function to get generic field names is get()

res=names(dat);
get(res[ind],pos=dat) will retrieve dat$name
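
For comparison, a short sketch showing that the get() call and plain indexing return the same column. The data frame is made up for the example.

```r
dat <- data.frame(name = factor(c("a", "b")), value = c(1.5, 2.5))  # hypothetical
res <- names(dat)
ind <- 1

by_get   <- get(res[ind], pos = dat)  # the approach above
by_index <- dat[[ind]]                # by position
by_name  <- dat[[res[ind]]]           # by a name held in a variable

stopifnot(identical(by_get, by_index), identical(by_index, by_name))

# Several columns at once need single brackets, which return a data frame.
sub_df <- dat[res[1:2]]
```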



--
View this message in context: 
http://r.789695.n4.nabble.com/Change-case-of-factor-in-data-frame-tp4651696p4651919.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] match and substitute two variables

2012-12-03 Thread arun

Hi,
try this:
 paste(code,gsub("\\d+","",name)[match(code,gsub("\\D+","",name))],sep=" ")
#[1] "101001  Alta" "1032  Media"  "102  Bassa"   "101001  Alta" "102  Bassa"  
#[6] "1032  Media" 
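
The output above has two spaces between code and label, because removing the digits leaves the leading blank in place. A variant that returns the labels unchanged, assuming every label starts with its code followed by a space (the name key is made up for the example):

```r
code <- c("101001", "1032", "102", "101001", "102", "1032")
name <- c("101001 Alta", "102 Bassa", "1032 Media")

# Key = everything before the first space in each label.
key <- sub(" .*$", "", name)

# Look each code up among the keys and return the full label.
code.new <- name[match(code, key)]
print(code.new)
```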

A.K.



- Original Message -
From: irene 
To: r-help@r-project.org
Cc: 
Sent: Monday, December 3, 2012 10:32 AM
Subject: [R] match and substitute two variables

Hello, 
I have two variables (of different length and from two different data
frames):

code<- c("101001",  "1032", "102", "101001", "102", "1032");
name<- c("101001 Alta", "102 Bassa", "1032 Media");

and I would like to substitute the first variable with the  second variable
according to their shared numerical part, thus obtaining the following
result:

code.new
"101001 Alta" "1032 Media" "102 Bassa" "101001 Alta" "102 Bassa" "1032 Media"

I tried using: <- sapply(code, gsub, pattern="\\d+", replacement=name) but
the replacement cannot be of length more than one, thus my output is only
"101001 Alta" "101001 Alta"... I am not sure how to get the right answer...

Thank you!



--
View this message in context: 
http://r.789695.n4.nabble.com/match-and-substitute-two-variables-tp4651893.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] match and substitute two variables

2012-12-03 Thread irene

It works perfectly, thank you!



--
View this message in context: 
http://r.789695.n4.nabble.com/match-and-substitute-two-variables-tp4651893p4651906.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Excluding all missing values with dcast ("reshape2" package)

2012-12-03 Thread arun

Hi,
?na.omit()
library(reshape2)
dat1<-read.table(text="
Tool Step_Number
A 1 
A 2 
A 3
A 3 
B 1 
B 2
B 2 
B 3
B NA
NA 3 
",sep="",header=TRUE,stringsAsFactors=FALSE)   
 dcast(na.omit(dat1),Tool~Step_Number,length)
#Using Step_Number as value column: use value.var to override.
 # Tool 1 2 3
#1    A 1 1 2
#2    B 1 2 1
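
If reshape2 is not needed for anything else, the same counts can be sketched with base table(), which leaves out NA values by default. The data frame repeats the ten rows above.

```r
dat1 <- data.frame(
  Tool        = c("A", "A", "A", "A", "B", "B", "B", "B", "B", NA),
  Step_Number = c(1, 2, 3, 3, 1, 2, 2, 3, NA, 3)
)

# useNA = "no" is the default, so rows with NA in either variable are not counted.
tab <- table(Tool = dat1$Tool, Step_Number = dat1$Step_Number)
print(tab)
```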

A.K.





- Original Message -
From: "michael.laviole...@dhhs.state.nh.us" 

To: r-help@r-project.org
Cc: 
Sent: Monday, December 3, 2012 10:07 AM
Subject: [R] Excluding all missing values with dcast ("reshape2" package)


Hello--I'm doing a simple crosstab using dcast:

rawfreq <- dcast(nh11brfs, race3~CHCCOPD, length)

with the results

               race3 Yes   No NA
1 White non-Hispanic 446 5473 21
2 Other non-Hispanic  29  211  0
3           Hispanic   6   81  1
4                 10   83  1

How would I modify this call to exclude all missing values; that is, to
obtain

               race3 Yes   No
1 White non-Hispanic 446 5473
2 Other non-Hispanic  29  211
3           Hispanic   6   81

Apologies if this has come up before, and thanks.

-M.L.


[R] How to pass a vector of characters with Rscript through commandline

2012-12-03 Thread narges_s

Hi,

I have an R script and I want it to accept a vector of characters as the 
input parameters on the command line with the command Rscript.
So for example I have this vector : each <- c("04","08","12","14") and I want 
to do this: Rscript script.R each 
How can I pass it?

Thanks 

[R] How to read SPSS file in R

2012-12-03 Thread F86

Dear R-users, 

I have some trouble with a .sav file. How can we R users
read SPSS files? I know that it is possible.


I tried the following: 


> library(foreign)
> Corp<-read.spss("/Users/kama/Analysis/Corporation.sav", header=TRUE,
> sep=",")
Error in read.spss("/Users/kama/Analysis/Corporation.sav", header = TRUE,  : 
  unused argument(s) (header = TRUE, sep = ",")
> Corp<-read.spss("/Users/kama/Analysis/Corporation.sav")
re-encoding from UTF-8

Any suggestions please? 

Regards, 
Faradj
Stockholm University



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-read-SPSS-file-in-R-tp4651896.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] R-Forge not building packages?

2012-12-03 Thread Ulrich Staudinger

Thanks for all the feedback, I have written a message to R-Forge help ...


On Mon, Dec 3, 2012 at 7:21 PM, Yihui Xie  wrote:
> I will not be surprised if it takes longer than a week to build a
> package on R-Forge, although I have no idea why it has to be
> incredibly slow (do they have Sys.sleep(7*24*60*60) somewhere in the
> code?). I do have found an alternative solution, though, which is to
> spend these days on teaching more R users (especially Windows users)
> how to install R packages via R CMD INSTALL or
> devtools::install_github(). That is usually a lot faster than R-Forge.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Phone: 515-294-2465 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
>
>
> On Mon, Dec 3, 2012 at 12:05 PM, Ulrich Staudinger
>  wrote:
>> Hi there,
>>
>> I am waiting since days for my package to be built on R-Forge.
>> https://r-forge.r-project.org/R/?group_id=1518
>>
>> R-Forge says:
>>  Version: 0.2 | Last change: 2012-11-27 21:37:05+01 | Rev.: 32
>> Build status: Building
>>
>> But I am already at revision 37 and R-Forge doesn't move since 6 days
>>
>> Can anyone help?
>>
>> Thanks
>> Ulrich
>>
>>
>>
>> --
>> Ulrich Staudinger
>>
>> P: +41 79 702 05 95
>> E: ustaudin...@activequant.com
>>
>> http://www.activequant.com
>>
>



-- 
Ulrich Staudinger

P: +41 79 702 05 95
E: ustaudin...@activequant.com

http://www.activequant.com

AQ-R user? Join our mailing list:
http://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/aqr-user


Re: [R] Non-quadratic plots



On Dec 3, 2012, at 9:29 AM, Jessica Streicher wrote:

Nevermind, found i can at least do it with pdf() and consorts, so  
i'll get it right in my images, and thats the main point.


On 03.12.2012, at 18:06, Jessica Streicher wrote:

I'd like to make plots that do not have the quadratic layout, like  
having a plot that is twice as wide as it is high (without  
distorting everything).

Sadly i wasn't able to find anything in par that does that.
Best would be with plot(), but i'd use the ggplot package as well  
if necessary.




When posting it is requested that you include your OS and R version.
In graphics output questions it is common that OS info might be
relevant. I'm reading here from the help pages on a Mac in R version
2.15.2.  The height and width arguments control the setup of the
quartz() device (and, I suspect, of the corresponding windows() device).
Other output devices also have height and width parameters, as you
note.  In base graphics there is an 'asp' (aspect ratio) argument that
fixes the y/x scaling of the plotted data. In Lattice there are
layout.heights and layout.widths settings that can be set through
`lattice.options`.
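
As a sketch of the height/width approach with a file device (the output file name is made up for the example):

```r
out <- file.path(tempdir(), "wide_plot.pdf")  # hypothetical output file

# Device dimensions are in inches: 8 wide by 4 high gives a 2:1 page.
pdf(out, width = 8, height = 4)
plot(1:10, (1:10)^2, main = "Twice as wide as it is high")
dev.off()
```

For an on-screen window, dev.new(width = 8, height = 4) passes the same two arguments on to the default device.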


--
David Winsemius, MD
Alameda, CA, USA


Re: [R] R-Forge not building packages?

2012-12-03 Thread Yihui Xie

I will not be surprised if it takes longer than a week to build a
package on R-Forge, although I have no idea why it has to be
incredibly slow (do they have Sys.sleep(7*24*60*60) somewhere in the
code?). I have found an alternative solution, though, which is to
spend these days on teaching more R users (especially Windows users)
how to install R packages via R CMD INSTALL or
devtools::install_github(). That is usually a lot faster than R-Forge.

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Mon, Dec 3, 2012 at 12:05 PM, Ulrich Staudinger
 wrote:
> Hi there,
>
> I am waiting since days for my package to be built on R-Forge.
> https://r-forge.r-project.org/R/?group_id=1518
>
> R-Forge says:
>  Version: 0.2 | Last change: 2012-11-27 21:37:05+01 | Rev.: 32
> Build status: Building
>
> But I am already at revision 37 and R-Forge doesn't move since 6 days
>
> Can anyone help?
>
> Thanks
> Ulrich
>
>
>
> --
> Ulrich Staudinger
>
> P: +41 79 702 05 95
> E: ustaudin...@activequant.com
>
> http://www.activequant.com
>


Re: [R] R-Forge not building packages?




On 03.12.2012 19:05, Ulrich Staudinger wrote:

Hi there,

I am waiting since days for my package to be built on R-Forge.
https://r-forge.r-project.org/R/?group_id=1518

R-Forge says:
  Version: 0.2 | Last change: 2012-11-27 21:37:05+01 | Rev.: 32
Build status: Building

But I am already at revision 37 and R-Forge doesn't move since 6 days

Can anyone help?

Thanks
Ulrich





[failed to CC R-help in my first response]

The R-forge page says

"If you experience any problems or need help you can submit a support 
request to the R-Forge team or write an email to r-fo...@r-project.org."


There is nothing readers of R-help can do, actually.

Best,
Uwe Ligges


[R] R-Forge not building packages?

2012-12-03 Thread Ulrich Staudinger

Hi there,

I have been waiting for days for my package to be built on R-Forge.
https://r-forge.r-project.org/R/?group_id=1518

R-Forge says:
 Version: 0.2 | Last change: 2012-11-27 21:37:05+01 | Rev.: 32
Build status: Building

But I am already at revision 37 and R-Forge has not moved for 6 days.

Can anyone help?

Thanks
Ulrich



-- 
Ulrich Staudinger

P: +41 79 702 05 95
E: ustaudin...@activequant.com

http://www.activequant.com

AQ-R user? Join our mailing list:
http://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/aqr-user


Re: [R] Non-quadratic plots

2012-12-03 Thread Jessica Streicher

Never mind, I found I can at least do it with pdf() and the related devices, so 
I'll get it right in my images, and that's the main point.

On 03.12.2012, at 18:06, Jessica Streicher wrote:

> I'd like to make plots that do not have the quadratic layout, like having a 
> plot that is twice as wide as it is high (without distorting everything). 
> Sadly i wasn't able to find anything in par that does that.
> Best would be with plot(), but i'd use the ggplot package as well if 
> necessary.
> 
> thanks, Jessica

Re: [R] Problem with figures

2012-12-03 Thread Yihui Xie

I'm using the latest version of TeXLive under Ubuntu
(http://packages.ubuntu.com/quantal/tex/texlive) so I did not realize
the version of standalone can be a problem. My version is also 1.1a,
but I do not have any problems if I remove the preview option from
your example. Sounds like there is a subtle difference somewhere
between your Debian TeXLive and my Ubuntu TeXLive...

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Mon, Dec 3, 2012 at 9:29 AM, Shige Song  wrote:
> All right. I did some more digging. It turns out that the real problem is
> that the version of the "standalone" LaTeX package installed on my Debian
> system is 1.1a. The easiest fix is to replace the three files
> "standalone.cfg", "standalone.cls", and "standalone.sty" with the most
> recent version from CTAN, which is 1.1b. Now I use
> "\documentclass[preview]{standalone}" in my Rnw file to get the desired
> results.
>


[R] Non-quadratic plots

2012-12-03 Thread Jessica Streicher

I'd like to make plots that do not have the quadratic (square) layout, for 
example a plot that is twice as wide as it is high (without distorting everything). 
Sadly I wasn't able to find anything in par that does that.
Best would be with plot(), but I'd use the ggplot package as well if necessary.

thanks, Jessica

Re: [R] Excluding all missing values with dcast ("reshape2" package)

2012-12-03 Thread Jessica Streicher

This is not a reproducible example ;)

Anyway, dcast has an argument:
drop
should missing combinations dropped or kept?

does that not do what you want?

On 03.12.2012, at 16:07, michael.laviole...@dhhs.state.nh.us wrote:

> 
> Hello--I'm doing a simple crosstab using dcast:
> 
> rawfreq <- dcast(nh11brfs, race3~CHCCOPD, length)
> 
> with the results
> 
>                race3 Yes   No NA
> 1 White non-Hispanic 446 5473 21
> 2 Other non-Hispanic  29  211  0
> 3           Hispanic   6   81  1
> 4                 10   83  1
> 
> How would I modify this call to exclude all missing values; that is, to
> obtain
> 
>                race3 Yes   No
> 1 White non-Hispanic 446 5473
> 2 Other non-Hispanic  29  211
> 3           Hispanic   6   81
> 
> Apologies if this has come up before, and thanks.
> 
> -M.L.
> 

Re: [R] How to calculate the spatial correlation of several files?

Ok, so I was mistaken when I thought I had made a mistake. Change to 
"rows" again:


for (.f in seq_along(dir1)){
results[[.f]]<- cor(file1[.f, ] ,file2[.f, ])
}
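
To make the row/column distinction concrete, here is a sketch with small simulated matrices standing in for the real files (the sizes and the names per_file and per_pixel are assumptions for the example). With one row per file, indexing rows gives one correlation per pair of files, while indexing columns gives one correlation per grid cell.

```r
set.seed(1)
n_files  <- 5    # stands in for the 365 daily files
n_pixels <- 12   # stands in for the 360 * 720 grid cells

# One row per file, one column per pixel, as produced by do.call(rbind, ...).
file1 <- matrix(rnorm(n_files * n_pixels), nrow = n_files)
file2 <- file1 + matrix(rnorm(n_files * n_pixels, sd = 0.5), nrow = n_files)

# Row-wise: one correlation per pair of files.
per_file <- vapply(seq_len(n_files),
                   function(i) cor(file1[i, ], file2[i, ]), numeric(1))

# Column-wise: one correlation per pixel across files, i.e. a correlation map.
per_pixel <- vapply(seq_len(n_pixels),
                    function(j) cor(file1[, j], file2[, j]), numeric(1))
```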


Rui Barradas
Em 03-12-2012 16:26, Jonsson escreveu:

Thanks
you meant it should be:
file1=do.call(rbind, lapply(dir1, readBin, integer(), size = 2, n = 360 *
720,
   signed = T))
file2=do.call(rbind, lapply(dir2, readBin, integer(), size = 2, n = 360 *
720,
   signed = T))
Please see the error

for (.f in seq_along(dir1)){

+   results[[.f]]<- cor(file1[, .f] ,file2[, .f])
+ }
Error in file2[, .f] : incorrect number of dimensions



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-the-spatial-correlation-of-several-files-tp4651888p4651901.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Reading PDF files

2012-12-03 Thread jose romero

Hello:

Apart from readPDF in the tm package, you can use the pdf to text converter 
command in linux, which is "pdftotext".  Say "file.pdf" is your file, from R 
you'd use:

system("pdftotext file.pdf -layout")

This invokes the pdftotext command from within R and creates a file called 
"file.txt" with the converted pdf, which you'd have to read into R.  The 
-layout option is so the conversion to text is as similar as possible to the 
original layout of the pdf file.
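
A sketch of the whole round trip, including reading the text back into R. The helper name pdf_to_lines is made up, and the code assumes the pdftotext utility is installed and on the PATH.

```r
pdf_to_lines <- function(pdf, txt = sub("\\.pdf$", ".txt", pdf)) {
  # Assumes the pdftotext utility is installed and on the PATH.
  if (!nzchar(Sys.which("pdftotext"))) {
    stop("pdftotext not found on PATH")
  }
  # -layout keeps the text close to the original page layout.
  status <- system2("pdftotext", c("-layout", shQuote(pdf), shQuote(txt)))
  if (status != 0) {
    stop("pdftotext failed with exit status ", status)
  }
  readLines(txt, warn = FALSE)
}

# lines <- pdf_to_lines("file.pdf")
```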

Regards,

jose loreto romero palma

Re: [R] How to calculate the spatial correlation of several files?

2012-12-03 Thread Jonsson

Thanks
you meant it should be:
file1=do.call(rbind, lapply(dir1, readBin, integer(), size = 2, n = 360 *
720, 
  signed = T)) 
file2=do.call(rbind, lapply(dir2, readBin, integer(), size = 2, n = 360 *
720, 
  signed = T))
Please see the error
> for (.f in seq_along(dir1)){ 
+   results[[.f]]<- cor(file1[, .f] ,file2[, .f]) 
+ } 
Error in file2[, .f] : incorrect number of dimensions



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-the-spatial-correlation-of-several-files-tp4651888p4651901.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] How to calculate the spatial correlation of several files?


Hello,

Sorry, made a mistake.
Em 03-12-2012 16:12, Rui Barradas escreveu:

Hello,

Inline.
Em 03-12-2012 15:15, Jonsson escreveu:

 dir1 <- list.files("C:\\Users\\aalyaari\\Desktop\\cor", "*.bin",
                    full.names = TRUE)
 dir2 <- list.files("C:\\Users\\aalyaari\\Desktop\\cor2", "*.bin",
                    full.names = TRUE)
 results <- list()
 # read in the 365 files as a vector of numbers for dir1
 for (.files in dir1){
   file1 <- do.call(rbind, (lapply(.files, readBin, integer(), size = 2,
                                   n = 360 * 720, signed = T)))}
 # read in the 365 files as a vector of numbers for dir2
 for (.files in dir2){
   file2 <- do.call(rbind, (lapply(.files, readBin, integer(), size = 2,
                                   n = 360 * 720, signed = T)))}
 # Now each file in both directories is a vector.


Are you sure? Shouldn't your file reading routines be

do.call(rbind, lapply(dir1, readBin, integer(), size = 2, n = 360 * 
720, signed = T))
do.call(rbind, lapply(dir2, readBin, integer(), size = 2, n = 360 * 
720, signed = T))



to lapply readBin to each file in dir1/dir2?
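
A small self-contained check of that pattern, using a temporary file of 2-byte integers in place of the real data (the values are made up):

```r
tmp <- tempfile(fileext = ".bin")
vals <- c(1L, -2L, 300L, -32768L, 32767L)

# Write the integers as 2-byte values, then read them back the same way.
writeBin(vals, tmp, size = 2)
back <- readBin(tmp, integer(), n = length(vals), size = 2, signed = TRUE)
stopifnot(identical(back, vals))

# Applying readBin over a vector of file names and stacking the results
# gives a matrix with one row per file.
mat <- do.call(rbind, lapply(c(tmp, tmp), readBin, what = integer(),
                             n = length(vals), size = 2, signed = TRUE))
dim(mat)
```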

Anyway, to correlate the first row in file1 to the first row in file2, 
etc, try


Here it should be "column", not "row".

Rui Barradas


for (.f in seq_along(dir1)){
results[[.f]]<- cor(file1[, .f] ,file2[, .f])
}


Hope this helps,

Rui Barradas
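
A minimal sketch of the column-by-column correlation discussed above, using small simulated matrices in place of the real binary files. The names m1, m2, n_files and n_pix are illustrative only and do not come from the original post. It assumes each matrix has one row per file and one column per grid cell, which is the shape do.call(rbind, lapply(...)) would produce. Under that assumption, a full map would presumably need the loop to run over all columns rather than over seq_along(dir1).

```r
# Simulated stand-ins for the two stacks: rows = files, columns = grid cells
set.seed(42)
n_files <- 10
n_pix   <- 6
m1 <- matrix(rnorm(n_files * n_pix), nrow = n_files)
m2 <- m1 + matrix(rnorm(n_files * n_pix, sd = 0.5), nrow = n_files)

# Both inputs must be matrices of the same shape before indexing by column
stopifnot(is.matrix(m1), is.matrix(m2), identical(dim(m1), dim(m2)))

# One correlation per column (grid cell), computed across rows (files)
pix_cor <- vapply(seq_len(ncol(m1)),
                  function(j) cor(m1[, j], m2[, j]),
                  numeric(1))
print(round(pix_cor, 2))
```

Checking is.matrix() and dim() first is a cheap way to catch the "incorrect number of dimensions" situation before the loop starts.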


I am not sure how to tell R to correlate the first column in dir1 to the
corresponding column from dir2. We will finally get only one spatial
correlation map. I tried this:

 # calculate the correlation so we will get a correlation map
 for (.files in seq_along(dir1)){
   results[[length(results) + 1L]] <- cor(file1, file2)
 }

I got this error:
Error in cor(file1, file2) : allocMatrix: too many elements specified





--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-the-spatial-correlation-of-several-files-tp4651888.html

Sent from the R help mailing list archive at Nabble.com.


Re: [R] How to calculate the spatial correlation of several files?


Hello,

Inline.
On 03-12-2012 15:15, Jonsson wrote:

 dir1 <- list.files("C:\\Users\\aalyaari\\Desktop\\cor", "*.bin",
                    full.names = TRUE)
 dir2 <- list.files("C:\\Users\\aalyaari\\Desktop\\cor2", "*.bin",
                    full.names = TRUE)
 results <- list()
 for (.files in dir1){   # read in the 365 files as a vector of numbers for dir1
   file1 <- do.call(rbind, (lapply(.files, readBin, integer(), size = 2,
                                   n = 360 * 720, signed = T)))}
 for (.files in dir2){   # read in the 365 files as a vector of numbers for dir2
   file2 <- do.call(rbind, (lapply(.files, readBin, integer(), size = 2,
                                   n = 360 * 720, signed = T)))}
 # Now each file in both directories is a vector.


Are you sure? Shouldn't your file reading routines be

do.call(rbind, lapply(dir1, readBin, integer(), size = 2, n = 360 * 720, 
signed = T))
do.call(rbind, lapply(dir2, readBin, integer(), size = 2, n = 360 * 720, 
signed = T))



to lapply readBin to each file in dir1/dir2?

Anyway, to correlate the first row in file1 to the first row in file2, 
etc, try


for (.f in seq_along(dir1)){
results[[.f]]<- cor(file1[, .f] ,file2[, .f])
}


Hope this helps,

Rui Barradas


I am not sure how to tell R to correlate the first column in dir1 to the
corresponding column from dir2. We will finally get only one spatial
correlation map. I tried this:

 # calculate the correlation so we will get a correlation map
 for (.files in seq_along(dir1)){
   results[[length(results) + 1L]] <- cor(file1, file2)
 }

I got this error:
Error in cor(file1, file2) : allocMatrix: too many elements specified





--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-the-spatial-correlation-of-several-files-tp4651888.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] match and substitute two variables


Hello,

Try the following.

for(nm in name){
code[grep(gsub("[ [:alpha:]]+", "", nm), code)] <- nm
}
code


Hope this helps,

Rui Barradas
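
An alternative sketch using exact lookup with match() instead of pattern matching. It assumes every entry of name starts with its numeric code followed by a space. The helper name key is illustrative and not from the original posts. Exact matching avoids the case where one code is a substring of another.

```r
code <- c("101001", "1032", "102", "101001", "102", "1032")
name <- c("101001 Alta", "102 Bassa", "1032 Media")

# Strip everything from the first space onward to recover the numeric key
key <- sub(" .*$", "", name)

# Look up each code in the keys and pull the corresponding full label
code.new <- name[match(code, key)]
print(code.new)
```

Codes with no counterpart in name come back as NA, which makes unmatched entries easy to spot.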
On 03-12-2012 15:32, irene wrote:

Hello,
I have two variables (of different length and from two different data
frames):

code<- c("101001",  "1032", "102", "101001", "102", "1032");
name<- c("101001 Alta", "102 Bassa", "1032 Media");

and I would like to substitute the first variable with the  second variable
according to their shared numerical part, thus obtaining the following
result:

code.new
"101001 Alta" "1032 Media" "102 Bassa" "101001 Alta" "102 Bassa" "1032 Media"

I tried using: <- sapply(code, gsub, pattern="\\d+", replacement=name) but
the replacement cannot be of length more than one, thus my output is only
"101001 Alta" "101001 Alta"... I am not sure how to get the right answer...

Thank you!



--
View this message in context: 
http://r.789695.n4.nabble.com/match-and-substitute-two-variables-tp4651893.html
Sent from the R help mailing list archive at Nabble.com.


[R] match and substitute two variables

2012-12-03 Thread irene

Hello, 
I have two variables (of different length and from two different data
frames):

code<- c("101001",  "1032", "102", "101001", "102", "1032");
name<- c("101001 Alta", "102 Bassa", "1032 Media");

and I would like to substitute the first variable with the  second variable
according to their shared numerical part, thus obtaining the following
result:

code.new
"101001 Alta" "1032 Media" "102 Bassa" "101001 Alta" "102 Bassa" "1032 Media"

I tried using: <- sapply(code, gsub, pattern="\\d+", replacement=name) but
the replacement cannot be of length more than one, thus my output is only
"101001 Alta" "101001 Alta"... I am not sure how to get the right answer...

Thank you!



--
View this message in context: 
http://r.789695.n4.nabble.com/match-and-substitute-two-variables-tp4651893.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] multiline text() with different cex sizes




On 03.12.2012 16:13, Michael Friendly wrote:

Well, no one answered this back in August, and now I have the same
problem with another figure.

Again, is there some way to position several lines of text within
a plot written at different cex sizes so that the relative spacing of
the lines remains fixed when the plot is resized or written to
a device?

That is, I'd like to have such caption text treated as a fixed block
which, under rescaling, would expand or contract uniformly.



Not in base R graphics, but in package "grid".

Best,
Uwe Ligges
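
A rough sketch of what the grid approach might look like, offered as an assumption rather than as the method Uwe had in mind. Font sizes and offsets are given in points, so the spacing between the lines stays fixed in absolute terms when the device is resized. The block does not scale with the device, and combining it with a base graphics plot is not shown here. The offsets (20 and 70 points) and the variable names are illustrative.

```r
library(grid)

# Off-screen device so the sketch runs anywhere; drop pdf()/dev.off() to draw on screen
pdf(NULL)
grid.newpage()

# Anchor the block 20 points below the top edge; offsets are absolute units
top      <- unit(1, "npc") - unit(20, "points")
second_y <- top - unit(70, "points")   # room for two 25-point lines plus a gap

grid.text("Specimen of a\nChart of Biography",
          x = unit(0.1, "npc"), y = top, just = c("left", "top"),
          gp = gpar(fontsize = 25, fontface = "bold.italic"))
grid.text("of Milestones Authors",
          x = unit(0.1, "npc"), y = second_y, just = c("left", "top"),
          gp = gpar(fontsize = 15, fontface = "bold.italic"))

invisible(dev.off())
```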


-Michael


On 8/22/2012 2:58 PM, Michael Friendly wrote:

This question has probably been asked & answered before, but I can't
find it: how to print a multiline figure caption in a plot, where
different lines have different fonts and font sizes, and so that the
lines of text are spaced in a reasonable way.

Here's a simple example, where I have to keep tweaking the y coordinate
and the cex combos, while I also manually adjust the aspect ratio of the
plot. But once I get it looking OK on the screen, when I print to a PDF
device, the lines of text move around.

op <- par(mar=c(4,4,1,1)+.01)   # tight bounding box
plot(1:79, type="n",
 xlab="Year", ylab="", cex.lab=1.5,
 xlim=c(1500, 2000))

text(1610, 75, "Specimen of a\nChart of Biography", cex=2.5, font=4)
text(1610, 69, "of Milestones Authors", cex=1.5, font=4)
par(op)

In my real application, I would also add other lines of text in a
smaller font size.







Re: [R] Problem with figures

2012-12-03 Thread Shige Song

All right. I did some more digging. It turns out that the real problem is
that the version of the "standalone" LaTeX package installed on my Debian
system is 1.1a. The easiest fix is to replace the three files
"standalone.cfg", "standalone.cls", and "standalone.sty" with the most
recent version from CTAN, which is 1.1b. Now I use
"\documentclass[preview]{standalone}" in my Rnw file to get the desired
results.


On Sun, Dec 2, 2012 at 11:10 PM, Yihui Xie  wrote:

> fig=TRUE is irrelevant here, and knitr does not need fig=TRUE at all
> (plots are automatically recorded).
>
> The real problem is the [preview] option; remove it and you are all set.
>
> Next time if you have problems with tikzDevice, you can take a look at
> the LaTeX log to know what exactly is wrong, e.g. in this case you
> will see
>
> ! LaTeX Error: Option clash for package preview.
>
> The log file is at figure/fig1.log by default in your case.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Phone: 515-294-2465 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
>
>
> On Sun, Dec 2, 2012 at 5:36 PM, Shige Song  wrote:
> > Easiest way: copy and paste the code into Rstudio and hit "compile pdf".
> > From the command line, I believe you can do "knit2pdf example.Rnw".
> >
> > Shige
> >
> >
> > On Sun, Dec 2, 2012 at 6:12 PM, Duncan Murdoch wrote:
> >
> >> On 12-12-02 5:42 PM, Shige Song wrote:
> >>
> >>> I am having problem making ggplot2, tikzDevice, and knitr working
> >>> together.
> >>> I used a very simple example:
> >>>
> >>
> >> I don't use knitr so I can't really help, but you didn't tell us how you
> >> passed this file to knitr, so maybe nobody can.  However, if you were
> >> using Sweave, you would need to mention that the code chunk produces a
> >> figure (using "fig=TRUE" in the <<>>= header).
> >>
> >> Duncan Murdoch
> >>
> >>  ---example.Rnw-----
> >>> \documentclass[preview]{standalone}
> >>>
> >>> \begin{document}
> >>>
> >>> \begin{figure}
> >>> <>=
> >>> library(ggplot2)
> >>> qplot(displ, hwy, data = mpg, colour = factor(cyl))
> >>> @
> >>> \end{figure}
> >>>
> >>> \end{document}
> >>> ---------
> >>> I got "... !  ==> Fatal error occurred, no output PDF file produced!
> >>> label: fig1 (with options)
> >>> List of 3
> >>>   $ eval: logi TRUE
> >>>   $ echo: logi FALSE
> >>>   $ dev : chr "tikz"
> >>>
> >>> Error in process_file(text, output) :
> >>>Quitting from lines 6-8: (test_Rnw.Rnw) Error in
> >>> getMetricsFromLatex(TeXMetrics) :
> >>> TeX was unable to calculate metrics for the following string
> >>> or character:
> >>>
> >>>  hwy
> >>>
> >>> Common reasons for failure include:
> >>>* The string contains a character which is special to LaTeX unless
> >>>  escaped properly, such as % or $.
> >>>* The string makes use of LaTeX commands provided by a package and
> >>>  the tikzDevice was not told to load the package.
> >>>
> >>> The contents of the LaTeX log of the aborted run have been printed above,
> >>> it may contain additional details as to why the metric calculation failed.
> >>>
> >>> Calls: knit -> process_file
> >>>
> >>> Execution halted"
> >>>
> >>> Best,
> >>> Shige
> >>>
>


Re: [R] cubic spline

2012-12-03 Thread Martin Maechler

> Ben Bolker 
> on Sat, 1 Dec 2012 21:49:47 + writes:

> Martin Maechler  stat.math.ethz.ch> writes:
> [snip]

>> but definitely *no* need to use a function from an extra
>> CRAN package .. as someone else ``erroneously'' suggested.
>> 
>> Note that spline() and splinefun() together with approx()
>> and approxfun() are among the several hundred functions
>> that were already part of "pre-alpha" R, i.e., before R
>> had a version number or *any* packages ...  and yes, the
>> README then started with the two lines
>> 
>> | R Source Code (Tue Jun 20 14:33:47 NZST 1995) |
>> Copyright 1993, 1994, 1995 by Robert Gentleman and Ross
>> Ihaka
>> 
>> and it would be *really* *really* great if people did not
>> add stuff to their packages that has been part of R for
>> longer than they have even heard of R.
>> 
>> Martin Maechler, ETH Zurich

>   To be fair, the 'fields' package has a pretty long
> history too -- I think it may have been ported from an
> S-PLUS 'package' (or whatever the correct terminology is)
> that existed quite a while ago.

>  I think it was the FUNFITS module. From
> http://lib.stat.cmu.edu/S/:

> funfits

> FUNFITS is a comprehensive S-Plus module for fitting
> functions and nonlinear time series, including
> multivariate splines, Kriging and neural networks.
> Contributed by Doug Nychka (nyc...@ucar.edu). [25/Apr/96]
> [24/Mar/97][24/Sep/99] (3 kbytes). The actual compressed
> tar file is available as funfits23.tar.gz in the S
> collection. Access this file via FTP, or the WWW, but not
> e-mail. (596k).  Older version available at funfits.tar.Z

>   A quick look at funfits.tar.Z suggests that 'splint'
> existed in that version, in 1996 -- so respectably old.

Good point, Ben, thank you!

and of course Hans Borcher's one is even more relevant to the
original question.

Martin
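
For readers who have not used the base functions mentioned above, a short sketch of splinefun() and approxfun() from package stats. The data are made up for illustration.

```r
# Made-up data: y = x^2 sampled at six knots
x <- c(0, 1, 2, 3, 4, 5)
y <- x^2

# splinefun() returns a function that interpolates with a cubic spline
f <- splinefun(x, y, method = "natural")
print(f(2.5))

# approxfun() returns a function that interpolates linearly
g <- approxfun(x, y)
print(g(2.5))   # halfway between (2, 4) and (3, 9)

# spline() evaluates the interpolating spline on a grid instead of returning a function
s <- spline(x, y, n = 11)
str(s)
```

Both returned functions pass through the original points; they differ only in how they fill the gaps between them.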


[R] How to calculate the spatial correlation of several files?

2012-12-03 Thread Jonsson

dir1 <- list.files("C:\\Users\\aalyaari\\Desktop\\cor", "*.bin",
                   full.names = TRUE)
dir2 <- list.files("C:\\Users\\aalyaari\\Desktop\\cor2", "*.bin",
                   full.names = TRUE)
results <- list()
for (.files in dir1){   # read in the 365 files as a vector of numbers for dir1
  file1 <- do.call(rbind, (lapply(.files, readBin, integer(), size = 2,
                                  n = 360 * 720, signed = T)))}
for (.files in dir2){   # read in the 365 files as a vector of numbers for dir2
  file2 <- do.call(rbind, (lapply(.files, readBin, integer(), size = 2,
                                  n = 360 * 720, signed = T)))}
# Now each file in both directories is a vector.

I am not sure how to tell R to correlate the first column in dir1 to the
corresponding column from dir2. We will finally get only one spatial
correlation map. I tried this:

# calculate the correlation so we will get a correlation map
for (.files in seq_along(dir1)){
  results[[length(results) + 1L]] <- cor(file1, file2)
}

I got this error:
Error in cor(file1, file2) : allocMatrix: too many elements specified





--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-the-spatial-correlation-of-several-files-tp4651888.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] multiline text() with different cex sizes

2012-12-03 Thread Michael Friendly

Well, no one answered this back in August, and now I have the same 
problem with another figure.


Again, is there some way to position several lines of text within
a plot written at different cex sizes so that the relative spacing of 
the lines remains fixed when the plot is resized or written to

a device?

That is, I'd like to have such caption text treated as a fixed block
which, under rescaling, would expand or contract uniformly.

-Michael


On 8/22/2012 2:58 PM, Michael Friendly wrote:

This question has probably been asked & answered before, but I can't
find it: how to print a multiline figure caption in a plot, where
different lines have different fonts and font sizes, and so that the
lines of text are spaced in a reasonable way.

Here's a simple example, where I have to keep tweaking the y coordinate
and the cex combos, while I also manually adjust the aspect ratio of the
plot. But once I get it looking OK on the screen, when I print to a PDF
device, the lines of text move around.

op <- par(mar=c(4,4,1,1)+.01)   # tight bounding box
plot(1:79, type="n",
 xlab="Year", ylab="", cex.lab=1.5,
 xlim=c(1500, 2000))

text(1610, 75, "Specimen of a\nChart of Biography", cex=2.5, font=4)
text(1610, 69, "of Milestones Authors", cex=1.5, font=4)
par(op)

In my real application, I would also add other lines of text in a
smaller font size.




--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


[R] Excluding all missing values with dcast ("reshape2" package)

2012-12-03 Thread Michael . Laviolette


Hello--I'm doing a simple crosstab using dcast:

rawfreq <- dcast(nh11brfs, race3~CHCCOPD, length)

with the results

               race3 Yes   No NA
1 White non-Hispanic 446 5473 21
2 Other non-Hispanic  29  211  0
3           Hispanic   6   81  1
4               <NA>  10   83  1

How would I modify this call to exclude all missing values; that is, to
obtain

               race3 Yes   No
1 White non-Hispanic 446 5473
2 Other non-Hispanic  29  211
3           Hispanic   6   81

Apologies if this has come up before, and thanks.

-M.L.
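
One possible approach, sketched in base R so that it runs without reshape2: drop the rows where either variable is missing before tabulating. The data frame df below is a made-up stand-in for nh11brfs, and clean and keep are illustrative names. Whether dcast has a built-in option for this is not addressed here.

```r
# Made-up stand-in for the survey data, with NAs in both variables
df <- data.frame(
  race3   = factor(c("White", "White", "Other", NA, "Hispanic", "White")),
  CHCCOPD = factor(c("Yes",   "No",    NA,      "No", "No",     "Yes"))
)

# Keep only rows where both crosstab variables are observed
keep  <- complete.cases(df[, c("race3", "CHCCOPD")])
clean <- droplevels(df[keep, ])

# Base R crosstab of the filtered data
print(xtabs(~ race3 + CHCCOPD, data = clean))

# With reshape2 installed, the same filtered frame could be passed on instead:
# reshape2::dcast(clean, race3 ~ CHCCOPD, length)
```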


Re: [R] Calculation of extremely low p-values (in lm)


Hello,

It's easy to see what's going on by reading the sources. Being open
source is one of the strong points of R: we know exactly how the values
are computed. A reviewer might like to have an explanation of what R does.
The OP could check Friedrich Leisch's "Creating R Packages: A
Tutorial"; its running example on S3 classes is precisely the linear
model. The relevant functions and the way to call them are as follows.
Note that the p-values are computed using the distribution function,
pt(), which gives the area under the density, and that the returned
values are multiplied by two, since the test is two-sided. I've edited
the code a bit to give another way of computing the p-values. The
results are the same as those of R's summary.lm() in package stats,
and the code is easy to follow.



linmodEst <- function(x, y){
    ## compute QR-decomposition of x
    qx <- qr(x)
    ## compute (x'x)^(-1) x'y
    coef <- solve.qr(qx, y)
    ## degrees of freedom and standard deviation of residuals
    df <- nrow(x) - ncol(x)
    sigma2 <- sum((y - x %*% coef)^2)/df
    ## compute sigma^2 * (x'x)^-1
    vcov <- sigma2 * chol2inv(qx$qr)
    colnames(vcov) <- rownames(vcov) <- colnames(x)
    list(coefficients = coef,
         vcov = vcov,
         sigma = sqrt(sigma2),
         df = df)
}

summary.linmod <- function(object, ...){
    se <- sqrt(diag(object$vcov))
    tval <- coef(object) / se
    TAB <- cbind(Estimate = coef(object),
                 StdErr = se,
                 t.value = tval,
                 p.value = 2*pt(-abs(tval), df=object$df),
                 p.value2 = 2*pt(abs(tval), df=object$df, lower.tail = FALSE))
    res <- list(call=object$call,
                coefficients=TAB)
    #class(res) <- "summary.linmod"
    res
}


mod <- linmodEst(cbind(Const = 1, x_var), y_var)
summary.linmod(mod)


Hope this helps,

Rui Barradas


On 03-12-2012 14:26, Robert Baer wrote:

On 12/3/2012 6:20 AM, Sindri wrote:

Dear R-users

Please excuse me if this topic has been covered before, but I was unable to
find anything relevant by searching.

I am currently doing a comparison of two biological variables that have a
highly significant linear relationship. I know that the p-value of linear
regression is not so interesting in itself, but this particular value does
raise a question.

How does R calculate (extremely low) p-values for linear regression?

For my data I got a p-value on the order of 10^-9 and a reviewer commented
on this. I tried to run the same analysis in both SAS and Sigmastat to be
sure that I was doing it right, but both these programs only return a
p-value of p < 0.0001.
Since I am unable to reproduce my results in another statistics program, it
would be nice to be able to explain this unusually low p-value to the
reviewers.

This is a matter of understanding that the p-value is an area under
a probability density curve. R is simply printing out the actual area
in a tail of some distribution. The other statistical programs assume
that you are using the p-value to compare against a cutoff alpha value
that is (in most fields) never set much below 0.001. If p < alpha, the
"hypothesis test crowd" would choose to reject the null hypothesis, so
the other statistics programs take the attitude: "why provide more
detail?" R chooses to give you the actual number and let you do what
you will with it. You could probably benefit from reviewing hypothesis
testing in a basic statistics book if this is not clear.

Note that 10^-9 is indeed less than 0.0001, so the programs don't
disagree. R just provides more detail.




This "problem" can be illustrated with the following made-up data:

x_var<-c(0.149,0.178,0.3474,0.167,0.121,0.182,0.176,0.448,0.091,0.083,0.090,0.407,0.378,0.132,0.227,0.172,0.088,0.392,0.425,0.150,0.319,0.190,0.171,0.290,0.214,0.431,0.193) 



y_var<-c(0.918,0.394,0.131,0.9084,0.916,0.934,0.928,0.279,0.830,0.927,0.964,0.323,0.097,0.914,0.614,0.790,0.984,0.530,0.207,0.858,0.408,0.919,0.869,0.347,0.834,0.276,0.940) 



fit<-lm(y_var~x_var)


summary(fit)

Call:
lm(formula = y_var ~ x_var)

Residuals:
  Min   1Q   Median   3Q  Max
-0.39152 -0.06027  0.00933  0.10024  0.22711

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.18696    0.06394  18.562 3.90e-16 ***
x_var       -2.25529    0.24788  -9.098 2.08e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1503 on 25 degrees of freedom
Multiple R-squared: 0.768,     Adjusted R-squared: 0.7588
F-statistic: 82.78 on 1 and 25 DF,  p-value: 2.083e-09


With kind regards,
Sindri Traustason



-
-
Sindri Traustason
Glostrup Hospital Ophthalmology Research Dept.
Copenhagen, Denmark

--
View this message in context: 
http://r.789695.n4.nabble.com/Calculation-of-extremely-low-p-values-in-lm-tp4651823.html

Sent from the R help mailing list archive at Nabble.com.


[R] help on interpreting nonlinear regresson modelling results

2012-12-03 Thread Andras Farkas

Dear All,
 
I am wondering if you know of a good online resource on interpreting the
parameter estimate statistics in the output of a nonlinear regression
model. I am specifically having a hard time finding information on
the interpretation of the t value and Pr > |t| in layman's terms.

Any direction or help would be greatly appreciated,
 
thanks,
 
Andras

Re: [R] xlsx file read in R

2012-12-03 Thread R. Michael Weylandt

On Mon, Dec 3, 2012 at 1:27 PM, Nico Met  wrote:
>  Dear all,
>
> this is the code I used to read the .xlsx file but get an error message
>
> library(rJava)
> library(XLConnectJars)
>  library(XLConnect)
>  file<-readWorksheetFromFile("all_in.xlsx", sheet = 1)
>
> Error: NoClassDefFoundError (Java): Could not initialize class
> org.apache.poi.POIXMLDocument
>
> No idea where it went wrong
>
> Regards
>
> Nico
>

Can you run the examples from some of the package help pages or is the
whole thing not working?

Michael


Re: [R] Help R CMD check

2012-12-03 Thread Duncan Murdoch


On 03/12/2012 8:46 AM, Monica Rinaldi wrote:

Hello,

I'm trying to build a new package, and when I run the following command

R CMD check new

R returns the following error:

* checking whether package 'new' can be installed ... ERROR

* install options are ' --no-html'

Errore: 1:8: unexpected symbol
1: export R
   ^
Esecuzione interrotta



You probably have more information in the check log file, but that looks 
like a syntax error in the NAMESPACE file.


Duncan Murdoch
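
To illustrate the kind of syntax error meant here: a NAMESPACE file is read with R's own parser, so each directive has to be a valid R call. The sketch below parses the text from the error message and a well-formed directive. The function name myfun is hypothetical, and what the poster's NAMESPACE was meant to export is not known.

```r
# Helper: TRUE if the text parses as R code, FALSE otherwise
parses <- function(txt) {
  tryCatch({ parse(text = txt); TRUE }, error = function(e) FALSE)
}

bad  <- "export R"        # text shown in the error message
good <- "export(myfun)"   # well-formed directive; 'myfun' is a hypothetical name

bad_ok  <- parses(bad)
good_ok <- parses(good)
print(c(bad = bad_ok, good = good_ok))
```

The "1:8" in the error message is the line and column where parsing failed, which is the position of the R after "export ".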


Re: [R] Calculation of extremely low p-values (in lm)

2012-12-03 Thread Robert Baer

On 12/3/2012 6:20 AM, Sindri wrote:

Dear R-users

Please excuse me if this topic has been covered before, but I was unable to
find anything relevant by searching

I am currently doing a comparison of two biological variables that have a
highly significant linear relationship. I know that the p-value of linear
regression is not so interesting in itself, but this particular value does
raise a question.

How does R calculate (extremely low) p-values for linear regression?

For my data I got a p-value on the order of 10^-9 and a reviewer commented
on this. I tried to run the same analysis in both SAS and Sigmastat to be
sure that I was doing it right, but both these programs only return a
p-value of p < 0.0001
Since I am unable to reproduce my results in another statistics program, it
would be nice to be able to explain this unusually low p-value to the
reviewers.

This is a matter of understanding that the p-value is an area under
a probability density curve. R is simply printing out the actual area
in a tail of some distribution. The other statistical programs assume
that you are using the p-value to compare against a cutoff alpha value
that is (in most fields) never set much below 0.001. If p < alpha, the
"hypothesis test crowd" would choose to reject the null hypothesis, so
the other statistics programs take the attitude: "why provide more
detail?" R chooses to give you the actual number and let you do what
you will with it. You could probably benefit from reviewing hypothesis
testing in a basic statistics book if this is not clear.

Note that 10^-9 is indeed less than 0.0001, so the programs don't
disagree. R just provides more detail.
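
A small sketch of the tail-area calculation, using the t value and degrees of freedom reported for x_var in the summary output of this thread. The last line shows how a thresholded display such as "p < 0.0001" can be produced from the same number in R with format.pval(). That SAS or Sigmastat work this way internally is an assumption.

```r
# t value and residual degrees of freedom taken from the summary output
t_val <- -9.098
df    <- 25

# Two-sided p-value: twice the area in one tail of the t distribution
p <- 2 * pt(-abs(t_val), df = df)
print(p)

# The same number displayed against a reporting threshold of 0.0001
print(format.pval(p, eps = 1e-4))
```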

This "problem" can be illustrated with the following made-up data:

x_var<-c(0.149,0.178,0.3474,0.167,0.121,0.182,0.176,0.448,0.091,0.083,0.090,0.407,0.378,0.132,0.227,0.172,0.088,0.392,0.425,0.150,0.319,0.190,0.171,0.290,0.214,0.431,0.193)

y_var<-c(0.918,0.394,0.131,0.9084,0.916,0.934,0.928,0.279,0.830,0.927,0.964,0.323,0.097,0.914,0.614,0.790,0.984,0.530,0.207,0.858,0.408,0.919,0.869,0.347,0.834,0.276,0.940)

fit<-lm(y_var~x_var)

summary(fit)

Call:
lm(formula = y_var ~ x_var)

Residuals:
     Min       1Q   Median       3Q      Max
-0.39152 -0.06027  0.00933  0.10024  0.22711

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.18696    0.06394  18.562 3.90e-16 ***
x_var       -2.25529    0.24788  -9.098 2.08e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1503 on 25 degrees of freedom
Multiple R-squared: 0.768, Adjusted R-squared: 0.7588
F-statistic: 82.78 on 1 and 25 DF, p-value: 2.083e-09

With kind regards,
Sindri Traustason

-
-
Sindri Traustason
Glostrup Hospital Ophthalmology Research Dept.
Copenhagen, Denmark

--
View this message in context:
http://r.789695.n4.nabble.com/Calculation-of-extremely-low-p-values-in-lm-tp4651823.html
Sent from the R help mailing list archive at Nabble.com.


--
__
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA

Re: [R] Using multicores in R

2012-12-03 Thread Steve Lianoglou

And also:

On Monday, December 3, 2012, Uwe Ligges wrote:

>
>
> On 03.12.2012 11:14, moriah wrote:
>
>> Hi,
>>
>> I have an R script which is time consuming because it has two nested loops
>> in it of at least 5000 iterations each. I have tried to use the multicore
>> package but it doesn't seem to improve the elapsed time of the script (or
>> of a shorter script, for example), and I can't use mclapply for technical
>> reasons.
>>
>
> Errr, but otherwise multicore does not have an effect ...
>
> See package "parallel" that offers various functions for parallel
> computations. We cannot help much more if you do not tell us what the
> technical reasons are why mclapply() does not work.

If the work you are doing within each iteration of the loop is trivial, you
will likely even see a decrease in performance if you try to parallelize it.

Without more info from you regarding your problem, there's little we can do
to help, tho.

 -Steve
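
A minimal sketch of parallelizing the outer loop with mclapply() from package parallel, assuming the iterations are independent of each other. The function inner() and the problem size are made up for illustration. On Windows, mclapply() cannot fork, so the sketch falls back to one core there.

```r
library(parallel)

n <- 200

# Made-up stand-in for the work done inside the inner loop for one outer index
inner <- function(i) sum(sqrt(seq_len(n) * i))

# Forking is not available on Windows, so use a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Sequential reference and parallel version over the outer index
seq_res <- vapply(seq_len(n), inner, numeric(1))
par_res <- unlist(mclapply(seq_len(n), inner, mc.cores = cores))

print(isTRUE(all.equal(seq_res, par_res)))
```

With work this small per iteration, the parallel version may well be slower than the sequential one, which is the point made above; wrapping both calls in system.time() shows the difference on a given machine.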

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


[R] Help R CMD check

2012-12-03 Thread Monica Rinaldi

Hello,

I'm trying to build a new package, and when I run the following command

R CMD check new

R returns the following error:

* checking whether package 'new' can be installed ... ERROR

* install options are ' --no-html'

Errore: 1:8: unexpected symbol
1: export R
  ^
Esecuzione interrotta



Thanks


Re: [R] Daily Time Series, patterns.

2012-12-03 Thread mekbatub

Hi Arun, thanks again, I think we are close.
The way you gave me looks good, but I still have one problem. Look at this:

Lets say, we have data like this:

>head(dat3)

  Date   quantity
2012-03-05 65.16
2012-03-06 70.67
2012-03-08 63.66
2012-03-09 70.05
2012-03-12 61.59
2012-03-13 58.98

Then we have:
> z <- zooreg(dat3[,2], frequency = 5)
> z
 1(1)  1(2)  1(3)  1(4)  1(5)  2(1)
65.16 70.67 63.66 70.05 61.59 58.98

These correspond to the dates:
      1(1)       1(2)       1(3)       1(4)       1(5)       2(1)
2012-03-05 2012-03-06 2012-03-08 2012-03-09 2012-03-12 2012-03-13

Since there were no releases on 2012-03-07 (the warehouse was closed),
we should have:
 1(1)  1(2)  1(4)  1(5)  2(1)  2(2)
65.16 70.67 63.66 70.05 61.59 58.98

So in that case I can't use "frequency=5". I am trying to figure out how to
assign the correct weekday number to each quantity.

I am wondering whether my way of thinking is correct. Maybe I should fill in
those missing values as you suggested last time, but the problem is that they
are not really missing values: if the warehouse was closed, there were not
supposed to be any releases that day, so there is nothing to fill in.
I am wondering whether it is possible at all to do this in R the way I am
trying to, without data continuity.

What do you think about that?
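
One way to get the week(weekday) labels without filling in the closed
days is to derive them from the dates themselves instead of from the
position in the series. The following is only a sketch in base R under
that assumption; the column names wday, week and label are made up for
the illustration, and weeks are counted from the Monday of the first
observation:

```r
dat3 <- data.frame(
  Date = as.Date(c("2012-03-05", "2012-03-06", "2012-03-08",
                   "2012-03-09", "2012-03-12", "2012-03-13")),
  quantity = c(65.16, 70.67, 63.66, 70.05, 61.59, 58.98)
)

# ISO weekday number: 1 = Monday, ..., 7 = Sunday
dat3$wday <- as.integer(format(dat3$Date, "%u"))

# Monday of the week each date falls in, then weeks counted from the first one
monday <- dat3$Date - (dat3$wday - 1L)
dat3$week <- as.integer(monday - min(monday)) %/% 7L + 1L

dat3$label <- paste0(dat3$week, "(", dat3$wday, ")")
print(dat3)
```

The weekday column can then be used as a factor in a model or for
grouping, so a day on which the warehouse was closed simply never
appears. If a time series object is needed, zoo(dat3$quantity, dat3$Date)
from the zoo package keeps the real dates as the index, gaps included.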



--
View this message in context: 
http://r.789695.n4.nabble.com/Daily-Time-Series-patterns-tp4651569p4651836.html
Sent from the R help mailing list archive at Nabble.com.

[R] Calculation of extremely low p-values (in lm)

2012-12-03 Thread Sindri

Dear R-users

Please excuse me if this topic has been covered before, but I was unable to
find anything relevant by searching

I am currently doing a comparison of two biological variables that have a
highly significant linear relationship.   I know that the p-value of linear
regression is not so interesting in itself, but this particular value does
raise a question.

How does R calculate (extremely low) p-values for linear regression?  

For my data I got a p-value on the order of 10^-9 and a reviewer commented
on this.  I tried to run the same analysis in both SAS and Sigmastat to be
sure that I was doing it right, but both these programs only return a
p-value of p < 0.0001
Since I am unable to reproduce my results in another statistics program, it
would be nice to be able to explain this unusually low p-value to the
reviewers.


This "problem" can be illustrated with the following made-up data:

x_var<-c(0.149,0.178,0.3474,0.167,0.121,0.182,0.176,0.448,0.091,0.083,0.090,0.407,0.378,0.132,0.227,0.172,0.088,0.392,0.425,0.150,0.319,0.190,0.171,0.290,0.214,0.431,0.193)

y_var<-c(0.918,0.394,0.131,0.9084,0.916,0.934,0.928,0.279,0.830,0.927,0.964,0.323,0.097,0.914,0.614,0.790,0.984,0.530,0.207,0.858,0.408,0.919,0.869,0.347,0.834,0.276,0.940)

fit<-lm(y_var~x_var)

> summary(fit)

Call:
lm(formula = y_var ~ x_var)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.39152 -0.06027  0.00933  0.10024  0.22711 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.18696    0.06394  18.562 3.90e-16 ***
x_var       -2.25529    0.24788  -9.098 2.08e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1503 on 25 degrees of freedom
Multiple R-squared: 0.768,  Adjusted R-squared: 0.7588 
F-statistic: 82.78 on 1 and 25 DF,  p-value: 2.083e-09 
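
As far as the arithmetic goes, the p-value that summary() prints for a
coefficient is the two-sided tail area of a t distribution with the
residual degrees of freedom, evaluated at the t statistic. Double
precision can represent values far smaller than 1e-9, so nothing special
is involved. The "p < 0.0001" from the other programs is most likely a
display format rather than a different result (an assumption; their
output is not shown here). A minimal sketch that recomputes the value by
hand from the data above:

```r
x_var <- c(0.149, 0.178, 0.3474, 0.167, 0.121, 0.182, 0.176, 0.448, 0.091,
           0.083, 0.090, 0.407, 0.378, 0.132, 0.227, 0.172, 0.088, 0.392,
           0.425, 0.150, 0.319, 0.190, 0.171, 0.290, 0.214, 0.431, 0.193)
y_var <- c(0.918, 0.394, 0.131, 0.9084, 0.916, 0.934, 0.928, 0.279, 0.830,
           0.927, 0.964, 0.323, 0.097, 0.914, 0.614, 0.790, 0.984, 0.530,
           0.207, 0.858, 0.408, 0.919, 0.869, 0.347, 0.834, 0.276, 0.940)

fit   <- lm(y_var ~ x_var)
coefs <- summary(fit)$coefficients

t_val  <- coefs["x_var", "t value"]   # slope estimate / its standard error
res_df <- df.residual(fit)            # n - 2 for a simple regression

# Two-sided p-value: twice the upper tail area beyond |t|
p_manual  <- 2 * pt(abs(t_val), df = res_df, lower.tail = FALSE)
p_summary <- coefs["x_var", "Pr(>|t|)"]

print(c(manual = p_manual, summary = p_summary))
print(.Machine$double.xmin)   # smallest normalized positive double
```

With a single predictor, the F test on 1 and 25 DF is the square of this
t test, which is why the last line of the summary shows the same p-value.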


With kind regards,
Sindri Traustason



-
-
Sindri Traustason
Glostrup Hospital Ophthalmology Research Dept.
Copenhagen, Denmark

--
View this message in context: 
http://r.789695.n4.nabble.com/Calculation-of-extremely-low-p-values-in-lm-tp4651823.html
Sent from the R help mailing list archive at Nabble.com.

[R] internal validation_logistic regression results

2012-12-03 Thread martina CHITTANI


Hi,

I'm developing a case-control study on genotyping data. I have fitted a
logistic regression with covariates on this dataset to pinpoint associations
between some SNPs and response to a drug treatment.
I do not have a sample for confirmation of the results, so I would like to
perform an internal validation with random subsampling of the dataset. I
believed that the DAAG package might be right for me, but I found that this
package performs cross-validation to select the most suitable model. Contact
me if you have any questions.

Thank you in advance.
Best regards,
Martina
  
[R] fitting a gamma frailty model (coxph)

2012-12-03 Thread Marco Munda

Dear all,

I have a data set with
6 clusters, each containing 48 (possibly censored, in which case
"event = 0") survival times. The "x" column contains a binary explanatory
variable. I try to describe that data with a gamma frailty model as follows:

library(survival)

mod <- coxph(Surv(time, event) ~
   x + frailty.gamma(cluster, eps=1e-10, method="em", sparse=0),
  outer.max=1000, iter.max=1,
  data=data)

Here is the error message:

Error in if (history[2, 3] < (history[1, 3] + 1)) theta <-
mean(history[1:2,  :
  missing value where TRUE/FALSE needed

Does anyone have an idea on how to debug?

Yours sincerely,
Marco

Re: [R] xlsx file read in R

2012-12-03 Thread Nico Met

Dear all,

This is the code I used to read the .xlsx file, but I get an error message:

library(rJava)
library(XLConnectJars)
library(XLConnect)
file <- readWorksheetFromFile("all_in.xlsx", sheet = 1)

Error: NoClassDefFoundError (Java): Could not initialize class
org.apache.poi.POIXMLDocument

I have no idea where it went wrong.

Regards

Nico

On Mon, Dec 3, 2012 at 2:07 PM, jim holtman  wrote:

> use the XLConnect package.
>
> On Mon, Dec 3, 2012 at 7:59 AM, Nico Met  wrote:
> > Dear all,
> >
> > How can I read .xlsx files in R
> >
> > Regards Nico
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>

Re: [R] xlsx file read in R

2012-12-03 Thread jim holtman

use the XLConnect package.

On Mon, Dec 3, 2012 at 7:59 AM, Nico Met  wrote:
> Dear all,
>
> How can I read .xlsx files in R
>
> Regards Nico

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

Re: [R] xlsx file read in R

2012-12-03 Thread R. Michael Weylandt

There are a couple of CRAN packages available -- my personal
recommendation is XLConnect.

MW

On Mon, Dec 3, 2012 at 12:59 PM, Nico Met  wrote:
> Dear all,
>
> How can I read .xlsx files in R
>
> Regards Nico

Re: [R] Using multicores in R




On 03.12.2012 11:14, moriah wrote:

Hi,

I have an R script which is time consuming because it has two nested loops
of at least 5000 iterations each. I have tried to use the multicore
package, but it doesn't seem to improve the elapsed time of the script (a
shorter script, for example), and I can't use mclapply for technical
reasons.


Errr, but otherwise multicore does not have an effect ...

See package "parallel", which offers various functions for parallel 
computations. We cannot help much more if you do not tell us what the 
technical reasons are why mclapply() does not work.
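
For reference, a minimal sketch of what moving the outer loop to
mclapply() from the "parallel" package can look like. The function
inner_work() is a hypothetical stand-in for the body of the loop, since
the original script is not shown, and the core count of 2 is arbitrary:

```r
library(parallel)

# Hypothetical stand-in for the work done in one outer iteration.
inner_work <- function(i) sum(sqrt(seq_len(i)))

# mclapply() relies on forking, which is not available on Windows;
# there it only accepts mc.cores = 1 and runs serially.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_result   <- lapply(1:200, inner_work)
parallel_result <- mclapply(1:200, inner_work, mc.cores = n_cores)

print(identical(serial_result, parallel_result))
```

Each iteration must be independent of the others for this to be valid.
On a server, detectCores() can be used to choose mc.cores, and
parLapply() with a cluster from makeCluster() is the alternative when
forking is not an option.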


Best,
Uwe Ligges





I was wondering how can I make my script use more cores and memory because I
am running it on a server and it is a shame that it uses only one core.





Thanks!
Moriah




--
View this message in context: 
http://r.789695.n4.nabble.com/Using-multicores-in-R-tp4651808.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] xlsx file read in R

2012-12-03 Thread John Kane

http://cran.r-project.org/doc/manuals/R-data.html

John Kane
Kingston ON Canada


> -Original Message-
> From: nicome...@gmail.com
> Sent: Mon, 3 Dec 2012 13:59:18 +0100
> To: r-help@r-project.org
> Subject: [R] xlsx file read in R
> 
> Dear all,
> 
> How can I read .xlsx files in R
> 
> Regards Nico


Re: [R] xlsx file read in R




On 03.12.2012 13:59, Nico Met wrote:

Dear all,

How can I read .xlsx files in R


See the Data Import/Export manual?

Uwe Ligges




Regards Nico

Re: [R] xlsx file read in R

2012-12-03 Thread Gabor Grothendieck

On Mon, Dec 3, 2012 at 7:59 AM, Nico Met  wrote:
> Dear all,
>
> How can I read .xlsx files in R
>

See:

http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: [R] Warning message: In scan(file, what, nmax...)




On 03.12.2012 11:30, F86 wrote:

Dear David,

Thank you for helping me.

I tried with ","

  Data1<-read.table("/Users/kama/Analysis/GDP10.csv",header=TRUE,sep=",")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
:
   line 116 did not have 2 elements


So please show us lines 115-120 of that file.

Uwe Ligges




i also tried:

Data1<-readLines("/Users/kama/Analysis/GDP10.csv",n=10)

Data1

  [1] "Country10;Year10;GDP" "Andorra;2010;41,138"  "Andorra;2009;44,591"
  [4] "Andorra;2008;49,981"  "Andorra;2007;48,431"  "Andorra;2006;43,541"
  [7] "Andorra;2005;40,821"  "Andorra;2004;38,381"  "Andorra;2003;32,55"
[10] "Andorra;2002;25,532"
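
The readLines() output above suggests that the fields are separated by
semicolons and that the comma is a decimal mark (a value such as "32,55"
points that way, but this is an assumption about the file). Under that
assumption, a minimal sketch of reading such data, using a few of the
lines shown above as inline text instead of the actual file:

```r
# A few of the lines shown above, standing in for GDP10.csv
csv_lines <- c("Country10;Year10;GDP",
               "Andorra;2010;41,138",
               "Andorra;2009;44,591",
               "Andorra;2003;32,55")

# sep = ";" splits the fields, dec = "," reads 41,138 as 41.138
gdp <- read.table(text = csv_lines, header = TRUE, sep = ";", dec = ",")

str(gdp)
```

For the file itself, the same arguments would be passed along with the
file path, and read.csv2() uses these two settings by default. A line
that still has the wrong number of fields would then have to be
inspected directly, as asked above.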









--
View this message in context: 
http://r.789695.n4.nabble.com/Warning-message-In-scan-file-what-nmax-tp4651689p4651809.html
Sent from the R help mailing list archive at Nabble.com.

[R] xlsx file read in R

2012-12-03 Thread Nico Met

Dear all,

How can I read .xlsx files in R

Regards Nico

Re: [R] error of installing/building an R package (PortfolioAnalytics) on Win 7

2012-12-03 Thread Joshua Ulrich

On Mon, Dec 3, 2012 at 2:13 AM, R. Michael Weylandt
 wrote:
> On Mon, Dec 3, 2012 at 12:42 AM, Jack Bryan  wrote:
>>
>> Thanks for your reply.
>>
>> On
>> http://cran.r-project.org/bin/windows/Rtools/
>>
>> Frozen
>> Rtools216.exe   R >2.15.1 to R 2.16.x  No
>> Frozen means "no available" ?
>>
>> So, PortfolioAnalytics cannot be used on Linux or Win until Mar. 2012 ?
>>
>> Are there substitutes ?
>>
>> Any help will be appreciated.
>>
>
> I'm not sure that's your problem -- I just downloaded Portfolio and
> PerformanceAnalytics and built them both from source with no problem.
> (So something's afoot with the r-forge build and I have a hunch it's a
> dependency on a more recent xts than is available on CRAN, but I could
> be wrong)
>
Your hunch is correct.  Performance Analytics on R-Forge requires the
latest xts from R-Forge.

> Can you install from the command line? svn checkout + R CMD install.
> I'd start with building xts from source (admittedly harder because it
> has c code unlike PA) but so goes dependency management.
>
> Since you're on Windows, I'd also just (re-)remind you to make sure
> your path has no spaces in it.
>
> Michael
>

Best,
--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com

Re: [R] Problem with figures

2012-12-03 Thread Shige Song

Thanks, Yihui. It turns out that getting rid of the "preview" option is not
enough and one must include "tikz" and "print" to make it work.

Shige


On Sun, Dec 2, 2012 at 11:10 PM, Yihui Xie  wrote:

> fig=TRUE is irrelevant here, and knitr does not need fig=TRUE at all
> (plots are automatically recorded).
>
> The real problem is the [preview] option; remove it and you are all set.
>
> Next time if you have problems with tikzDevice, you can take a look at
> the LaTeX log to know what exactly is wrong, e.g. in this case you
> will see
>
> ! LaTeX Error: Option clash for package preview.
>
> The log file is at figure/fig1.log by default in your case.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Phone: 515-294-2465 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
>
>
> On Sun, Dec 2, 2012 at 5:36 PM, Shige Song  wrote:
> > Easiest way: copy and paste the code into Rstudio and hit "compile pdf".
> > From the command line, I believe you can do "knit2pdf example.Rnw".
> >
> > Shige
> >
> >
> > On Sun, Dec 2, 2012 at 6:12 PM, Duncan Murdoch wrote:
> >
> >> On 12-12-02 5:42 PM, Shige Song wrote:
> >>
> >>> I am having problem making ggplot2, tikzDevice, and knitr working
> >>> together.
> >>> I used a very simple example:
> >>>
> >>
> >> I don't use knitr so I can't really help, but you didn't tell us how you
> >> passed this file to knitr, so maybe nobody can.  However, if you were
> using
> >> Sweave, you would need to mention that the code chunk produces a figure
> >> (using "fig=TRUE" in the <<>>= header).
> >>
> >> Duncan Murdoch
> >>
> >>  ---example.Rnw-----
> >>> \documentclass[preview]{standalone}
> >>>
> >>> \begin{document}
> >>>
> >>> \begin{figure}
> >>> <>=
> >>> library(ggplot2)
> >>> qplot(displ, hwy, data = mpg, colour = factor(cyl))
> >>> @
> >>> \end{figure}
> >>>
> >>> \end{document}
> >>> -----
> >>> I got "... !  ==> Fatal error occurred, no output PDF file produced!
> >>> label: fig1 (with options)
> >>> List of 3
> >>>   $ eval: logi TRUE
> >>>   $ echo: logi FALSE
> >>>   $ dev : chr "tikz"
> >>>
> >>> Error in process_file(text, output) :
> >>>Quitting from lines 6-8: (test_Rnw.Rnw) Error in
> >>> getMetricsFromLatex(TeXMetrics) :
> >>> TeX was unable to calculate metrics for the following string
> >>> or character:
> >>>
> >>>  hwy
> >>>
> >>> Common reasons for failure include:
> >>>* The string contains a character which is special to LaTeX unless
> >>>  escaped properly, such as % or $.
> >>>* The string makes use of LaTeX commands provided by a package and
> >>>  the tikzDevice was not told to load the package.
> >>>
> >>> The contents of the LaTeX log of the aborted run have been printed
> above,
> >>> it may contain additional details as to why the metric calculation
> failed.
> >>>
> >>> Calls: knit -> process_file
> >>>
> >>> Execution halted"
> >>>
> >>> Best,
> >>> Shige
> >>>
>

[R] Automatic reply

2012-12-03 Thread j . boutet

Hello,

I will be on leave until Thursday, December 6.

In case of emergency, you can reach me by phone at 06 46 34 81 03.

Regards,


--



Jérôme Boutet

Conservatoire d'espaces naturels de Picardie

1, place Ginkgo - village Oasis

80 044 AMIENS cedex 

tél : 03 22 89 84 24 



> From: r-help-requ...@r-project.org
> Subject: R-help Digest, Vol 118, Issue 3
> Date: Mon, 03 Dec 2012 12:00:08 +0100

Re: [R] [mgcv][gam] Manually defining my own knots?

2012-12-03 Thread Simon Wood


Andrew,

I think you mean

dumb.example2$coefficients = averaged.models
                   ~~~~~~~

currently your code adds a new vector 'coeff' to the gam object, rather 
than modifying 'coefficients'. (A good example of why forgiving 
languages like R are dangerous, and you should really write everything 
in something utterly unforgiving like C - only kidding).
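
The asymmetry is easy to see on a plain list, which is what a gam object
is underneath. A minimal sketch with a made-up list called model (not a
real fitted object): reading with $ uses partial matching on names,
while assigning with $ does not.

```r
# Made-up stand-in for a fitted model object
model <- list(coefficients = c(1, 2, 3))

# Reading: 'coeff' partially matches 'coefficients'
read_before <- model$coeff

# Assigning: no partial matching, so a new element 'coeff' is created
model$coeff <- c(9, 9, 9)

print(read_before)
print(names(model))
print(model$coefficients)   # unchanged by the assignment above
```

Functions such as plot.gam look up the full name, so they keep using the
untouched coefficients element.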


Simon

On 03/12/12 05:01, Andrew Crane-Droesch wrote:

Hi Simon,

Thanks for your help.  I've got another question if you don't mind -- is
it possible to "swap out" a set of coefficients of a gamObject in order
to change the results when that gamObject is plotted?  The (silly)
example below illustrates that this is possible with the Vp matrix.  But
it is not working for me as I'd like it to for the coefficients.

library(mgcv)
#Random data
x = runif(1000,0,1)
y = (log(x^2)+x^3)/sin(x)
dumb.knots = c(.1,.2,.3)
dumb.example1 = gam(y~s(x,k=3),knots=list(x=dumb.knots))
plot(dumb.example1)

x = runif(1000,0,1)
y = (log(x^2)+x^3)/sin(x)
dumb.knots = c(.1,.2,.3)
dumb.example2 = gam(y~s(x,k=3),knots=list(x=dumb.knots))
plot(dumb.example2)

cbind(dumb.example1$coeff,dumb.example2$coeff)

averaged.models=(dumb.example1$coeff+dumb.example2$coeff)/2
correc = matrix(5,3,3) # 5 is totally arbitrary, standing in for a proper MI correction
changed.vcv=correc+(dumb.example1$Vp+dumb.example2$Vp)/2

par(mfrow = c(1,2))
plot(dumb.example2,ylim=c(-500,200))
dumb.example2$coeff = averaged.models
dumb.example2$Vp = changed.vcv
plot(dumb.example2,ylim=c(-500,200))

The confidence bands expand but the location of the fit doesn't change!
What part of the gamObject controls the plot of the smooth?

On 12/02/2012 02:15 AM, Simon Wood wrote:

Hi Andrew,

mgcv matches the knots to the smooth arguments by name. If an element
of 'knots' has
no name it will be ignored. The following will do what you want...

dumb.example = gam(y~s(x,k=3),knots=list(x=dumb.knots))

best,
Simon

On 29/11/12 23:44, Andrew Crane-Droesch wrote:

Dear List,

I'm using GAMs in a multiple imputation project, and I want to be able
to combine the parameter estimates and covariance matrices from each
completed dataset's fitted model in the end.  In order to do this, I
need the knots to be uniform for each model with partially-imputed
data.  I want to specify these knots based on the quantiles of the
unique values of the non-missing original data, ignoring the NA's.  When
I fit the GAM with the imputed data included, I don't want mgcv to use
the data that it is supplied to figure out the knots, because this will
lead to un-comparable results when the many fitted models are combined.

Here is a caricatured example of what I want to do:

#Random data
x = runif(1000,0,1)
y = (log(x^2)+x^3)/sin(x)
example = gam(y~s(x))
plot(example)

#But I want to define my own knots
dumb.knots = c(.7,.8,.9)
dumb.example = gam(y~s(x,k=3),knots=list(dumb.knots))
plot(dumb.example)
dumb.example2 = gam(y~s(x,k=3))
plot(dumb.example2)

Dumb example 1 is the same as dumb example 2, but it shouldn't be.

Once I figure out how to do this, I'll take the fitted coefficients from
each model and average them, then take the vcv's from each model and
average them, and add a correction to account for within- and
between-imputation variability, then plug them into the
gamObject$coefficients vector and gamObject$Vp matrix, plot/summarize, and
have my result. Comments on whether or not this would be somehow incorrect
would be welcome as well.  Still have a lot to learn!

Thanks,
Andrew









--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603   http://people.bath.ac.uk/sw283

[R] Species scores on PCoA

2012-12-03 Thread avadhoot velankar

Dear members,

I am conducting principal coordinates analysis on a set of morphometry data.
I have 6 variables and 27 samples. I used almost all the packages for PCoA,
and plotted the ordination using orditkplot.

I want to know whether there is any way I can add variable scores to the
same plot. I want to check which variable contributes most to the variation.

Thank you in anticipation

Avadhoot

Re: [R] How to find the lenth of more than ten variables?

2012-12-03 Thread Jim Lemon


On 12/03/2012 03:27 PM, killerkarthick wrote:

Hi
I have one data set with 20 variables. I want to find the length of each
variable at a time. Please help me.
Thanks in advance


Hi killerkarthick,
This may do what you want:

unlist(lapply(my_data_set,length))

if the data set is a list.
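
A small worked version of the line above, using a made-up list called
my_data_set with components of different lengths:

```r
# Made-up example data: a list whose components differ in length
my_data_set <- list(a = 1:3, b = letters, c = numeric(0))

# One length per component, returned as a named integer vector
lens <- unlist(lapply(my_data_set, length))

# sapply() does the lapply() and the simplification in one step
lens_sapply <- sapply(my_data_set, length)

print(lens)
```

Recent versions of R also provide lengths(my_data_set), which gives the
same result.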

Jim

Re: [R] R beginner

2012-12-03 Thread avadhoot velankar

Hi Andrew,

Stick to one tutorial until you are confident enough. I found this one
very good: informative, easy to understand, and best of all you try
everything out on the sample files provided. A word of caution, though:
instead of using the Ctrl+R command in a script file, type the commands
manually in the console at first to learn the syntax.

http://www.unt.edu/rss/class/Jon/R_SC/

welcome to R

avadhoot

On Mon, Dec 3, 2012 at 10:34 AM, andrewcd  wrote:

> My $.02: re-do an analysis that you did in another software package in R,
> making sure that you get the same results.  A good way to learn any
> language.
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-beginner-tp4651719p4651759.html
> Sent from the R help mailing list archive at Nabble.com.
>
>

Re: [R] How to find the lenth of more than ten variables?




On 03.12.2012 05:27, killerkarthick wrote:

Hi
I have one data set with 20 variables. I want to find the length of each
variable at a time. Please help me.
Thanks in advance


Within a data.frame, all columns have the same length that is given by 
nrow(). Looks like you have to rephrase your question so that we 
understand it
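
To illustrate the point, and one possible reading of the question: if
what is wanted is the number of non-missing values per column (purely an
assumption about the poster's intent), that does differ between columns.
A minimal sketch with a made-up data frame df:

```r
# Made-up data frame with some missing values
df <- data.frame(x = c(1, NA, 3, 4),
                 y = c("a", "b", NA, NA))

# Every column of a data frame has the same length, equal to nrow()
col_lengths <- sapply(df, length)
n_rows <- nrow(df)

# Number of non-missing values in each column
non_missing <- colSums(!is.na(df))

print(col_lengths)
print(non_missing)
```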


Uwe Ligges






--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-find-the-lenth-of-more-than-ten-variables-tp4651751.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] How to make the cell background of a table informative?