date:20120715

[R] rgamma function

2012-07-15 Thread Chandler Zuo


Hi,

Has anyone encountered the problem of rgamma function in C? The 
following simplified program always dies for me, and I wonder if anyone 
can tell me the reason.


#include <Rmath.h>
#include <time.h>
#include <Rinternals.h>

SEXP generateGamma ()
{
srand(time(NULL));
return (rgamma(5000,1));
}

Has anyone encountered a similar problem before? Is there another way of 
generating Gamma random variable in C?


P.S. I have no problem compiling and loading this function in R.

Thanks for suggestions in advance!

--Chandler

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] set Rprofile.site,can't work

2012-07-15 Thread 水静流深

My system: Debian.
In the console:
nano   /home/tiger/R-2.15.1/etc/Rprofile.site


Here is my content:


.First <- function(){

cat("\nWelcome at", date(), "\n")
}

#
.Last <- function(){
cat("\nGoodbye at ", date(), "\n")
}


When I save it and reopen R,
why is there no
Welcome at Sun Jul 15 07:53:58 2012
in my R?

[R] Can't understand syntax

2012-07-15 Thread Charles Stangor

OK, I need help!!

I've been searching, but I don't understand the logic of some of this
dataframe addressing syntax.

What is this type of code called?

test[["v3"]][is.na(test[["v2"]])] <- 10  # choose the elements of column v3 where
# column v2 is NA and replace them with 10

and where is it documented?


The code below works for what I want to do (find the non-missing value in a
row), but why?

test <- read.table(text="
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA
", header=TRUE)

test[["result"]][!(is.na(test[["v1"]]))] <- test[["v1"]][!(is.na(test[["v1"]]))]
test[["result"]][!(is.na(test[["v2"]]))] <- test[["v2"]][!(is.na(test[["v2"]]))]
test[["result"]][!(is.na(test[["v3"]]))] <- test[["v3"]][!(is.na(test[["v3"]]))]

thanks!
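
A minimal sketch of what the lines above do, using the same toy data. `[[` pulls one column out of the data frame as a plain vector, a logical vector inside `[` selects elements, and putting the whole expression on the left of `<-` replaces only the selected elements. The names `col` and `keep` are illustrative and not part of the original post. The relevant help page is ?Extract, and "An Introduction to R" covers this under index vectors.

```r
test <- read.table(text = "
v1 v2 v3 result
3 NA NA NA
NA 3 NA NA
NA NA 3 NA
", header = TRUE)

for (v in c("v1", "v2", "v3")) {
  col  <- test[[v]]      # [[ ]] extracts one column as a vector
  keep <- !is.na(col)    # TRUE where that column is not missing
  # replace only the selected elements of 'result'
  test[["result"]][keep] <- col[keep]
}
test$result
```

Each pass through the loop copies the non-missing entries of one column into the matching rows of `result`, which is why the three hand-written lines find the non-missing value in each row.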


On Fri, Jul 13, 2012 at 6:41 AM, Rui Barradas ruipbarra...@sapo.pt wrote:

 Hello,

 Check the structure of what you have, df and newdf. You will see that in
 df dateTime is of class POSIXlt and in newDf newDateTime is of class
 POSIXct.

 Solution:

 [...]
 df$dateTime <- strptime(df$dateTime, "%m/%d/%Y %H:%M")
 df$dateTime <- as.POSIXct(df$dateTime)
 [...]

 Hope this helps,

 Rui Barradas
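
 A self-contained sketch of the fix described above, assuming the data from the original question. The `tz = "UTC"` argument and `stringsAsFactors = FALSE` are additions made here to keep the example reproducible; they are not part of Rui's reply.

```r
dateTime <- c("10/01/2005 0:00", "10/01/2005 0:20", "10/01/2005 0:40",
              "10/01/2005 1:00", "10/01/2005 1:20")
df <- data.frame(dateTime = dateTime, var1 = 1:5, var2 = seq(10, 50, by = 10),
                 stringsAsFactors = FALSE)

# strptime() returns POSIXlt; convert to POSIXct so both merge keys share a class
df$dateTime <- as.POSIXct(strptime(df$dateTime, "%m/%d/%Y %H:%M", tz = "UTC"))

newTime <- seq(min(df$dateTime), max(df$dateTime), by = 600)  # 600 s = 10 min
newDf <- merge(data.frame(newDateTime = newTime), df,
               by.x = "newDateTime", by.y = "dateTime", all.x = TRUE)
newDf <- newDf[order(newDf$newDateTime), ]
newDf
```

 With both key columns stored as POSIXct, the rows that exist in df keep their values and only the in-between timestamps are filled with NA.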

 Em 13-07-2012 10:24, vioravis escreveu:

 I have the following dataframe with the first column being of type
 datetime:

 dateTime <- c("10/01/2005 0:00",
 "10/01/2005 0:20",
 "10/01/2005 0:40",
 "10/01/2005 1:00",
 "10/01/2005 1:20")
 var1 <- c(1,2,3,4,5)
 var2 <- c(10,20,30,40,50)
 df <- data.frame(dateTime = dateTime, var1 = var1, var2 = var2)
 df$dateTime <- strptime(df$dateTime, "%m/%d/%Y %H:%M")

 I want to create 10 minute interval data as follows:

 minTime <- min(df$dateTime)
 maxTime <- max(df$dateTime)
 newTime <- seq(minTime,maxTime,600)
 newDf <- data.frame(newDateTime = newTime)
 newDf <- merge(newDf,df,by.x = "newDateTime",by.y = "dateTime",all.x =
 TRUE)

 The objective here is to create a data frame with values from df for the
 datetime in df and NA for the missing ones. However, I am getting the
 following data frame with both Var1 and Var2 having all NAs.

  newDf

newDateTime var1 var2
 1 2005-10-01 00:00:00   NA   NA
 2 2005-10-01 00:10:00   NA   NA
 3 2005-10-01 00:20:00   NA   NA
 4 2005-10-01 00:30:00   NA   NA
 5 2005-10-01 00:40:00   NA   NA
 6 2005-10-01 00:50:00   NA   NA
 7 2005-10-01 01:00:00   NA   NA
 8 2005-10-01 01:10:00   NA   NA
 9 2005-10-01 01:20:00   NA   NA

 Can someone help me on how to do the merge based on the two datetime
 columns?

 Thank you.

 Ravi






 --
 View this message in context: http://r.789695.n4.nabble.com/Merging-on-Datetime-Column-tp4636417.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Charles Stangor
Professor and Associate Chair


[R] ROC curves with ROCR

2012-07-15 Thread blerta

Hi, 

I don't really understand how ROCR works.  Here's another example with a
randomforest model: I have the training dataset(bank_training) and testing
dataset(bank_testing) and I ran a randomForest as below:

bankrf <- randomForest(y~., bank_training, mtry=4, ntree=2,
keep.forest=TRUE, importance=TRUE)
bankrf.pred <- predict(bankrf, bank_testing)
library(ROCR)
pred <- prediction(bankrf.pred$y, bank_testing$y)

Here I get the error that the prediction format is incorrect? Where is the
mistake?

Thanks in advance

--
View this message in context: 
http://r.789695.n4.nabble.com/ROC-curves-with-ROCR-tp4636435.html
Sent from the R help mailing list archive at Nabble.com.


[R] OT: Where's the new Tukey?

2012-07-15 Thread Larry White

I'm looking for a single book that provides a deep, yet readable
introduction to applied data analysis for general readers.

I'm looking for coverage on things like understanding randomness, natural
experiments, confounding, causality and correlation, data cleaning and
transforms, lagging, residuals, exploratory graphics, curve fitting,
descriptive stats. Preferably with examples/case studies that illustrate
the art and craft of data analysis. No proofs or heavy math.

What have you got?


[R] Loading in Large Dataset + variables via loop

2012-07-15 Thread cmc0605

Hello, I'm new to R with a (probably elementary) question.

Suppose I have a dataset called /A/ with /n/ locations, and each location
contains within it 3 time series of different variables (all of 100 years
length); each time series is of a weather variable (for each location there
is a temperature, precipitation, and pressure).  For instance, location 1
has a temperature1 time series, a precip1 time series, and a pressure1 time
series; location two has a temperature2, precip2, and pressure2
timeseries...That is, there are 100 rows, and (/n/*3)+1 columns.  The extra
column is the time.

I want to load in this dataset and declare a variable for each time series. 
The columns are in order of location, so it goes temp1, precip1,pressure1,
temp2,... and so forth in increasing column order.  There are always 100
rows.  Manually, I'd have to do:

temp1=A[,1]
precip1=A[,2]
pressure1=A[,3]
temp2=A[,4]
precip2=A[,5]
pressure2=A[,6]
temp3=A[,7]
and so forth.

The problem is that n is large, so I don't want to repeat this pattern forever.  I
figure I need a loop both for the variable name (i.e., the variable at a
particular location) and for the column it reads from.

Any help...?
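
One possible approach, sketched under assumptions: the data are stood in for by random numbers, `n` is set to 4, and the time column is taken to have been dropped already (as in the manual example above, where temp1 is column 1). The names `vars` and `series` are hypothetical. The idea is to name the columns from the repeating pattern and keep them together in a named list, which usually scales better than creating 3*n separate variables.

```r
n <- 4                                   # hypothetical number of locations
A <- as.data.frame(matrix(rnorm(100 * 3 * n), nrow = 100))  # stand-in data

vars <- c("temp", "precip", "pressure")
# temp1, precip1, pressure1, temp2, ... in column order
names(A) <- paste0(rep(vars, times = n), rep(seq_len(n), each = length(vars)))

series <- as.list(A)                     # one named element per time series
head(series[["precip2"]])

# If free-standing variables are really wanted, assign() can create them:
for (nm in names(A)) assign(nm, A[[nm]])
```

After this, `series[["temp3"]]` and the variable `temp3` both hold column 7 of A.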



--
View this message in context: 
http://r.789695.n4.nabble.com/Loading-in-Large-Dataset-variables-via-loop-tp4636501.html
Sent from the R help mailing list archive at Nabble.com.


[R] PLSR AND PCR ISSUES

2012-07-15 Thread kabir opeyemi

Dear all,
I am working on PCR and PLSR with the pls package, and my issue is finding the
command to extract components. Please help with a solution.
Thanks.


[R] Help for Fisher's exact test

2012-07-15 Thread Guanfeng Wang

Hi, R-help,
I have a set of RNA-seq data that I want to analyze with Fisher's
exact test in R. I want to test for significant differences in about
30, individuals between two different samples, but I have no idea how to use
R. Could you please give me some suggestions or scripts for
Fisher's exact test? Thank you very much.

Best,
Guanfeng Wang
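
A minimal sketch of a single Fisher's exact test with base R's fisher.test(). The counts are invented purely for illustration, and the 2x2 layout (reads for one gene versus all other reads, in each of two samples) is an assumption about how the data might be arranged.

```r
# Hypothetical counts for one gene: reads mapped to the gene vs. all other reads
counts <- matrix(c(12, 988,
                   30, 970),
                 nrow = 2,
                 dimnames = list(c("gene", "other"), c("sampleA", "sampleB")))

res <- fisher.test(counts)
res$p.value    # two-sided p-value
res$estimate   # conditional maximum likelihood estimate of the odds ratio
```

For many genes, the same call can be repeated per gene and the resulting p-values adjusted for multiple testing, for example with p.adjust(p, method = "BH").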


[R] GAM Chi-Square Difference Test

2012-07-15 Thread wshadish

We are using GAM in mgcv (Wood), relatively new users, and wonder if anyone
can advise us on a problem we are encountering as we analyze many short time
series datasets. For each dataset, we have four models, each with intercept,
predictor x (trend), z (treatment), and int (interaction between x and z).
Our models are 

Model 1: gama1.1 <- gam(y~x+z+int, family=quasipoisson) ## no smooths
Model 2: gama1.2 <- gam(y~x+z+s(int, bs="cr"), family=quasipoisson) ## smooth
the interaction
Model 3: gama1.3 <- gam(y~s(x, bs="cr")+z+int, family=quasipoisson) ## smooth
the trend
Model 4: gama1.4 <- gam(y~s(x, bs="cr")+z+s(int, bs="cr"),
family=quasipoisson) ## smooth trend and interaction

We have three questions. One question is simple. We occasionally obtain edf
=1 and Ref.df=1 for some smoothed predictors (x, int). Because Wood says
that edf can be interpreted roughly as functional form (quadratic, cubic
etc) + 1, this would imply x^0 functional form for the predictor, and that
doesn't make a lot of sense. Does such a result for edf and rdf indicate a
problem (e.g., collinearity) or any particular interpretation? 

The other two questions concern which model fits the data best. We do look
at the usual various fit statistics (R^2, Dev, etc), but our question
concerns using the anova function to do model comparisons, e.g., 

anova(gama2.1,gama2.2, test="Chisq").

1. Is there research on the power of the model comparison test? Anecdotally,
the test seems to reject the null even in cases that would appear to have
only small differences. These are not hugely long time series, ranging from
about 17 to about 49, so we would not have thought them to yield large
power. 

2. More important, in a few cases, we are getting a result that looks like
this: 

anova(gamb1.1,gamb1.2, test="Chisq")
Analysis of Deviance Table

Model 1: y ~ x + z + int
Model 2: y ~ x + z + s(int, bs = "cr")
  Resid. Df Resid. Dev         Df   Deviance P(>|Chi|)
1        30     36.713
2        30     36.713 1.1469e-05 1.0301e-05 6.767e-05 ***

We are inclined to think that the significance p value here is simply a
result of rounding error in the computation of the df difference and
deviance difference, and that we should treat this as indicating the models
are not different from each other. Has anyone experienced this before? Is
our interpretation reasonable?

Thanks to anyone who is able to offer advice. 

Will Shadish

--
View this message in context: 
http://r.789695.n4.nabble.com/GAM-Chi-Square-Difference-Test-tp4636523.html
Sent from the R help mailing list archive at Nabble.com.


[R] alternate tick labels and tick marks with lattice xyplot

2012-07-15 Thread Leah Marian

Hi,

I would like to use xyplot to create a figure. Unfortunately, I cannot find
documentation in xyplot to specify alternating the x-axis tick labels with
the x-axis tick marks. I can do this with the regular R plot function as
follows.


#A small version of my data looks like this
data <- data.frame(matrix(ncol=3,nrow=12))
data[,1] <- rep(c(1,2,3),c(4,4,4))
data[,2] <- rep(c(1,2,3,4),3)
data[,3] <- runif(12,0,1)
names(data) <- c("Chromosome", "BasePair", "Pvalue")
#using R's plot function, I would place the chromosome label between the
#tick marks as follows:
v1 <- c(4,8)
v2 <- c(2,6,10)
data$indice <- seq(1:12)
plot(data$indice, -log10(data$Pvalue), type="l", xaxt="n", main="Result",
 xlab="Chromosome", ylab=expression(paste(-log[10], "p-value")))
axis(1, v1,labels=FALSE )
axis(1, v2, seq(1:3), tick=FALSE, cex.axis=.6)

Can this be done with lattice xyplot?


-- 
Leah Preus
Biostatistician
Roswell Park Cancer Institute


[R] functions of vectors : loop or vectorization

2012-07-15 Thread Julien Salanie

I have read a lot about the benefits of vectorization in R. I have a
program that takes almost forever to run. A good way to see if I have
learned something ... My problem can be summarized like this: I have a
nonlinear function of several variables that I want to optimize over one,
letting the others describe a family of curves. In short, I want to optimize
f(x,a,b) for several values of a and b.

It is easily done with a loop. Here's an example :

a = 1:5;
b = 1:5;
myfunction = function(x){y*x-(x+z)^2};
myresults = array(dim=c(length(a),length(b)));
for(y in a){ for(z in b) { myresults[y,z] =
optimize(myfunction,c(-10,10),maximum=TRUE)$maximum }};
myresults;

     [,1] [,2] [,3] [,4] [,5]
[1,] -0.5 -1.5 -2.5 -3.5 -4.5
[2,]  0.0 -1.0 -2.0 -3.0 -4.0
[3,]  0.5 -0.5 -1.5 -2.5 -3.5
[4,]  1.0  0.0 -1.0 -2.0 -3.0
[5,]  1.5  0.5 -0.5 -1.5 -2.5

Of course, my real life problem is a bit more complicated and runs in days
...

I didn't find a straightforward way to do this using the apply family. I did
a small script that works. Here it is :

c = 1:5;
d = 1:5;
myfunction2 =
function(c,d){optimize(function(x){c*x-(x+d)^2},c(-10,10),maximum=TRUE)$maximum};
v.myfunction2 = Vectorize(myfunction2, c("c","d"));
outer(c, d, v.myfunction2);

all.equal(myresults,outer(c, d, v.myfunction2));
[1] TRUE

I was quite happy with my trick of separating and wrapping the functions
until I increased the size of the two input vectors and checked for the
processing time. I made no gain. In that case :

 time.elapsed; time.elapsed2;
Time difference of 0.0816 secs
Time difference of 0.0792 secs

When I changed the size of the vectors and added a logarithm here and there
to complicate a bit, it doesn't change the problem. The two methods perform
identically. Am I missing something ? Is there a better way to vectorize the
problem to gain time ? How is it that my loop performs as well as outer ?
Thanks in advance for your help. All the best, Julien
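
A hedged note on why the timings likely match: Vectorize() wraps mapply(), so outer() with the vectorized function still calls optimize() once per (a, b) pair, just as the loop does. The real saving comes from reducing the work per pair. For this toy function the maximizer has a closed form (setting the derivative y - 2(x + z) to zero gives x = y/2 - z), which the sketch below compares against the loop. The names `loop_res` and `closed` are illustrative, and a real problem may well have no closed form.

```r
a <- 1:5
b <- 1:5

# Explicit loop: one numerical optimisation per (a, b) pair
loop_res <- matrix(NA_real_, length(a), length(b))
for (i in seq_along(a)) {
  for (j in seq_along(b)) {
    f <- function(x) a[i] * x - (x + b[j])^2
    loop_res[i, j] <- optimize(f, c(-10, 10), maximum = TRUE)$maximum
  }
}

# Truly vectorised: the analytic maximiser evaluated on the whole grid at once
closed <- outer(a, b, function(y, z) y / 2 - z)

all.equal(loop_res, closed, tolerance = 1e-3)
```

When no closed form exists, the usual options are a cheaper objective, better starting intervals, or running the independent optimisations in parallel.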

--
View this message in context: 
http://r.789695.n4.nabble.com/functions-of-vectors-loop-or-vectorization-tp4636494.html
Sent from the R help mailing list archive at Nabble.com.


[R] R combining many vectors of predictable name into one data frame

2012-07-15 Thread Mathew Vickers

G'day R (power) users,

I have many vectors, called:

ib1
ib2
ib3
...
ib100

and I would like them in one data frame (df) such that:

 df

ib1  ib2  ib3  ib4  ...  ib100
x    x    x    x    ...  x
x    x    x    x    ...  x
x    x    x    x    ...  x

I have attempted:

hold.list <- list(objects(pattern="ib"))
df <- data.frame(hold.list)

but that didn't work

also:
do.call(rbind, (objects(pattern="ib")))

and that also didn't work. I tried a whole pile of other things, where I
also failed.


The number of vectors might differ each time I want to make the data frame,
so that in the example above, I have ib1 : ib100, but next time, I might
only have ib1 : ib2

Below is my (probably somewhat embarrassing) example script for generating
the vectors in the first place. Commented out toward the end are a few
attempts at doing the job I wanted to do.


temp <- runif(100)
tripID <- rep(1:10, 10)
uni <- rep(1:4, 25)
temp <- data.frame(temp, tripID, uni)

trips <- unique(temp$tripID)
uni <- unique(temp$uni[temp$tripID==trips[1]])

for (jj in 1:length(uni)){
  a <- c()
  for (ii in 1:10){
    a <- c(a, IQR(temp$temp[temp$uni %in% sample(uni,jj)]))
    assign(paste("ib",jj,sep=""), a) # ib is short for ibuttons. The number
    # is how many were used to calc IQR
  }
#   hold.list <- list(objects(pattern="ib"))
#   trip <- data.frame(list=hold.list) # I am trying to put everything into
#   a dataframe
#   do.call(rbind, list=hold.list)
#   do.call(rbind, list(objects(pattern="ib")))
}


thanks heaps if you can help. And sorry if this is mostly garble. This is
my first crack at soliciting help from the list.

cheers,

mat
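
A sketch of one way this could be done, assuming all the vectors have the same length. objects() (an alias of ls()) returns only the names as character strings, which may be why the attempts above did not work; mget() fetches the objects those names refer to. The three short vectors and the names `env` and `nms` are stand-ins for illustration.

```r
ib1 <- runif(10); ib2 <- runif(10); ib3 <- runif(10)   # stand-in vectors

env <- environment()
nms <- ls(envir = env, pattern = "^ib[0-9]+$")   # names only, as character strings
nms <- nms[order(as.integer(sub("^ib", "", nms)))]  # ib1, ib2, ..., ib10 rather than ib1, ib10, ib2

df <- as.data.frame(mget(nms, envir = env))      # fetch the objects, one column each
str(df)
```

Because the names are found by pattern, the same lines work whether there are two vectors or a hundred. Storing the results in a list inside the loop, instead of calling assign(), would avoid the lookup altogether.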


-- 
Mathew Vickers
PhD Student
James Cook University
CSRIO
Australia, mate.


[R] Quantile Regression - Testing for Non-causalities in quantiles

2012-07-15 Thread stefan23

Dear all,
I am searching for a way to compute a test comparable to that of Chuang et al.
("Causality in Quantiles and Dynamic Stock
Return-Volume Relations").  The aim of this test is to check whether the
coefficient of a quantile regression Granger-causes Y in a quantile range. I
have nearly computed everything, but I am searching for an estimator of the
density of the distribution at several points of the distribution. As the
quantreg package of Roger Koenker is also able to compute confidence
intervals for quantile regression (which also contain data concerning the
estimated density), I wanted to ask whether someone could tell me if it is
possible to extract the density of the underlying distribution by using
the quantreg package.
I hope my question is not too confusing. Thank you very, very much in
advance; I appreciate every comment =)
Cheers
Stefan

--
View this message in context: 
http://r.789695.n4.nabble.com/Quantile-Regression-Testing-for-Non-causalities-in-quantiles-tp4636511.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] set Rprofile.site,can't work

2012-07-15 Thread Duncan Murdoch


On 12-07-14 7:54 PM, 水静流深 wrote:

my system :debian
in console:
nano   /home/tiger/R-2.15.1/etc/Rprofile.site


here is my content:


.First <- function(){

cat("\nWelcome at", date(), "\n")
}

#
.Last <- function(){
cat("\nGoodbye at ", date(), "\n")
}


when i save it ,reopen my  R ,
why there is no
Welcome at Sun Jul 15 07:53:58 2012
in my  R?


That works for me, so I'd guess you've put the changes in the wrong 
place.  What do you have in your R_PROFILE environment variable?  What 
about R_HOME?  You should look at these from within R, using Sys.getenv().


Duncan Murdoch
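
A small sketch of the check suggested above, to be run inside the R session that fails to show the message. The variable name `site_file` is illustrative. It relies on the documented startup rule that R reads the file named by R_PROFILE if that variable is set, and otherwise R_HOME/etc/Rprofile.site.

```r
# What R itself sees for these two environment variables
Sys.getenv(c("R_HOME", "R_PROFILE"))

# Work out which site profile this session would have read
site_file <- Sys.getenv("R_PROFILE", unset = NA)
if (is.na(site_file)) site_file <- file.path(R.home("etc"), "Rprofile.site")

site_file
file.exists(site_file)
```

If `site_file` is not the file that was edited with nano, the running R is probably a different installation from the one under /home/tiger/R-2.15.1. Starting R with --no-site-file or --vanilla also skips the site profile.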


[R] how to pool imputed data sets with latent class analysis and binary logistic regression

2012-07-15 Thread Niklas Fischer

Dear All,

I've used the mice package for my latent class analysis and binary logistic
regression.

I've imputed five data sets and, in long format, I've added a new variable
that shows latent class membership.

Then, in addition to other variables, I'll use binary logistic
regression and try to pool the estimates.
However, I couldn't convert the data.frame to a mids object, and therefore it
produced the error below:

Error in pool(fit) : The object must have class 'mira'

Do you have any suggestions? I'd appreciate it if you have time to respond to
my e-mail.

Bests,
Niklas


[R] HLOOKUP in R

2012-07-15 Thread Silje Nord

Hi,

Is there a function similar to excel's hlookup in R ?

Thanks,
Silje
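
Base R has no function called hlookup, as far as I know, but match() on the column names gives the same kind of lookup. A minimal sketch, assuming the "header row" of the Excel table corresponds to the column names of a data frame; `hlookup()` and `tbl` are hypothetical names made up for this example.

```r
tbl <- data.frame(Q1 = c(10, 20), Q2 = c(30, 40), Q3 = c(50, 60))

# hlookup(): hypothetical helper, not a base R function
hlookup <- function(key, table, row) {
  j <- match(key, names(table))   # position of the matching column, NA if none
  if (is.na(j)) return(NA)
  table[row, j]
}

hlookup("Q2", tbl, 2)   # value in row 2 of the column named "Q2"
hlookup("Q9", tbl, 1)   # no such column
```

For a one-off lookup, `tbl[2, "Q2"]` does the same thing directly. Note that match() is an exact match, so this mirrors HLOOKUP with range_lookup set to FALSE.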


Re: [R] significance test interquartile ranges

2012-07-15 Thread Schaber , Jörg

Dear Peter,

thanks for your clarifications. Sample size is around 200 in each group. Would 
that justify your approach?

I found a couple more tests for scale on continuous variables, i.e.
Mood Test
Ansari-Bradley Test (that one is also implemented in R)
Klotz Test
Conover Test

Would one of those be suitable to test for different dispersion (e.g. IQR or 
the like) in non-normal distributions?

thanks,

joerg



From: peter dalgaard [pda...@gmail.com]
Sent: Saturday, 14 July 2012 10:01
To: Prof Brian Ripley
Cc: Greg Snow; R-help; Schaber, Jörg
Subject: Re: [R] significance test interquartile ranges

On Jul 14, 2012, at 08:16 , Prof Brian Ripley wrote:

 On 13/07/2012 21:37, Greg Snow wrote:
 A permutation test may be appropriate:

 Yes, it may, but precisely which one is unclear.  You are testing whether the 
 two samples have an identical distribution, whereas I took the question to be 
 a test of differences in dispersion, with differences in location allowed.

 I do not think this can be solved without further assumptions.  E.g people 
 often replace the two-sample t-test by the two-sample Wilcoxon test as a test 
 of differences in location, not realizing that the latter is also sensitive 
 to other aspects of the difference (e.g. both dispersion and shape).

(Brian knows this, of course, but I thought it useful to insert a little 
quibbling.)

"Sensitive" is perhaps a little misleading here. The test statistic in the 
Wilcoxon test is essentially an estimate of the probability that a random 
observation in one group is bigger than a random observation in the other 
group. It isn't hard to imagine situations where that quantity is unaffected by 
a dispersion change, so the test is not sensitive in the sense that it can 
detect dispersion changes between sufficiently large samples.

However, the point is that p values _rely on_ the null hypothesis that two 
distributions are exactly the same. This is mostly uncontroversial if you are 
testing for an irrelevant grouping, but if you need confidence intervals for 
the difference, you are implicitly assuming a location-shift model.

The same thing is true for permutation tests in general: You need to be rather 
careful about what the assumptions are that allow you to interchange things. 
Asymptotically, the distribution of the IQR depends on the values of the 
density at the true quartiles. These could be different in the two groups, and 
easily completely unrelated to those of a  pooled sample.

I think that I would suggest finding an error estimate for the IQR (or maybe 
log IQR) in each group separately, perhaps by bootstrapping, and then compare 
between groups with an asymptotic z test. The main caveat is whether you have 
sufficiently large sample sizes for asymptotics to hold.

Peter D.
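
A rough sketch of the bootstrap-plus-z-test idea described above, under stated assumptions: the two groups are simulated (n = 200 each, matching the sample size mentioned in the reply), the log IQR is used as the statistic, and 2000 bootstrap replicates are drawn per group. The function name `boot_se_log_iqr` is hypothetical.

```r
set.seed(1)
x <- rexp(200)               # simulated stand-in for group 1
y <- rexp(200, rate = 0.5)   # simulated stand-in for group 2

# Bootstrap standard error of log(IQR) within one group
boot_se_log_iqr <- function(v, B = 2000) {
  sd(replicate(B, log(IQR(sample(v, replace = TRUE)))))
}

se_x <- boot_se_log_iqr(x)
se_y <- boot_se_log_iqr(y)

# Asymptotic z test for the difference in log IQR between groups
z <- (log(IQR(x)) - log(IQR(y))) / sqrt(se_x^2 + se_y^2)
p_value <- 2 * pnorm(-abs(z))
c(z = z, p = p_value)
```

Resampling within each group separately avoids assuming that the two groups share a distribution, which is the caveat raised about pooled permutation.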


 I nearly suggested (yesterday) doing the permutation test on differences from 
 medians in the two groups.  But really this is off-topic for R-help and needs 
 interaction with a knowledgeable statistician to refine the question.

 1. compute the ratio of the 2 IQR values (or other comparison of interest)
 2. combine the data from the 2 samples into 1 pool, then randomly
 split into 2 groups (matching sample sizes of original) and compute
 the ratio of the IQR values for the 2 new samples.
 3. repeat #2 a bunch of times (like for a total of 999 random splits)
 and combine with the original value.
 4. (optional, but strongly suggested) plot a histogram of all the
 ratios and place a reference line of the original ratio on the plot.
 5. calculate the proportion of ratios that are as extreme or more
 extreme than the original, this is the (approximate) p-value.
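
 The five steps above can be sketched as follows. The data are simulated, and "as extreme or more extreme" is read here as a two-sided comparison on the log ratio; both are assumptions of this sketch rather than part of the original suggestion.

```r
set.seed(1)
x <- rexp(200)               # simulated stand-in for sample 1
y <- rexp(200, rate = 0.5)   # simulated stand-in for sample 2

obs <- IQR(x) / IQR(y)                      # step 1: observed ratio
pooled <- c(x, y)

perm <- replicate(999, {                    # steps 2 and 3: 999 random splits
  idx <- sample(length(pooled), length(x))  # same group sizes as the original
  IQR(pooled[idx]) / IQR(pooled[-idx])
})
all_ratios <- c(obs, perm)                  # combine with the original value

if (interactive()) {                        # step 4: histogram with reference line
  hist(log(all_ratios), main = "Permutation distribution of log IQR ratio")
  abline(v = log(obs), col = "red")
}

# step 5: proportion of ratios at least as far from 1 as the observed one
p_value <- mean(abs(log(all_ratios)) >= abs(log(obs)))
p_value
```

 Because the observed ratio is included in `all_ratios`, the p-value can never be smaller than 1/1000.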

 I think it is an 'exact' (but random) p-value.


 On Fri, Jul 13, 2012 at 5:32 AM, Schaber, Jörg
 joerg.scha...@med.ovgu.de wrote:
 Hi,

 I have two non-normal distributions and use interquartile ranges as a 
 dispersion measure.
 Now I am looking for a test, which tests whether the interquartile ranges 
 from the two distributions are significantly different.
 Any idea?

 Thanks,

 joerg



 --
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com










[R] read midnight as 24:00 and not as 0:00

2012-07-15 Thread Sandy Adriaenssens

Dear all,

I have a dataset which contains date and time in the format
yearmonthdayhour. I can read in these data correctly as follows:

mydata <- read.csv("pm10_corine_gridcel_hourly_2011.csv", header = TRUE)
mydata$date <- as.POSIXct(strptime(mydata$date, format = "%Y%m%d%H",
tz="UTC"))

However, midnight is defined as 24:00 in my original file (so the end of the
day), while the POSIXct function changes this to 0:00 (the beginning of the
next day).
So, my data now go from January 1 2011 1:00 to January 1 2012 0:00, instead
of December 31 2011 24:00.

summary(mydata$date)
               Min.             1st Qu.              Median 
2011-01-01 01:00:00 2011-04-02 06:45:00 2011-07-02 12:30:00 
               Mean             3rd Qu.                Max. 
2011-07-02 12:30:00 2011-10-01 18:15:00 2012-01-01 00:00:00 

I would like to change this 0:00 to 24:00 again since I want to include
these values in daily averages of the previous day (and not of the next
day). So the day of the month should also be diminished by 1.

I have tried extracting the hours which are 0 and converting them to 24, but
then I can't paste them back into the date/time of the original data.frame
again.

Are there maybe other solutions?

Thanks in advance,

Sandy
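
One possible workaround, sketched under assumptions: rather than trying to store 24:00 (POSIXct has no such value), keep the timestamps as they are and derive the averaging day from the timestamp shifted back by one second, so that 00:00 falls on the previous day. The three timestamps and the `pm10` values are made up for illustration.

```r
times <- as.POSIXct(c("2011-01-01 23:00:00",
                      "2011-01-02 00:00:00",   # "24:00" of 1 January in the source file
                      "2011-01-02 01:00:00"), tz = "UTC")
pm10 <- c(10, 20, 30)                          # hypothetical hourly values

# Shift by one second so that midnight is grouped with the previous day
day <- format(as.Date(times - 1, tz = "UTC"))

daily <- tapply(pm10, day, mean)
daily
```

All other hours are unaffected, because subtracting one second never moves them across a day boundary.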

ifelse (as.POSIXlt(mydata[24,1])$hour = 0,as.POSIXlt(mydata[24,1])$hour =
24 


--
View this message in context: 
http://r.789695.n4.nabble.com/read-mignight-as-24-00-and-not-as-0-00-tp4636423.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Maximum number of patterns and speed in grep

2012-07-15 Thread mdvaan

Here's some data (which should give you the error messages):

# read in data
data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header =
T, sep = ",")

# first paste all data
data1 <- paste(data[,1], collapse = "|")

# second paste subsets of the data
data2a <- paste(data[1:750,1], collapse = "|")
data2b <- paste(data[751:1500,1], collapse = "|")

# define the object to be searched
text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings")

# match
strapplyc(text, data1)
strapplyc(text, data2a)
strapplyc(text, data2b)

Thanks in advance!

Math
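
A possible cause, offered as a guess: if some names in the data contain regular expression metacharacters such as an unmatched parenthesis, pasting them together with "|" produces an invalid pattern. The sketch below escapes those characters first. It uses base R (gsub, gregexpr, regmatches) instead of strapplyc from gsubfn, and the third company name and the `escape_regex()` helper are hypothetical.

```r
names_vec <- c("Santa Fe Gold Corp", "Starpharma Holdings",
               "Acme (Holdings) Inc.")          # last name is made up

# escape_regex(): hypothetical helper that backslash-escapes regex metacharacters
escape_regex <- function(x) {
  gsub("([.|()^{}+$*?]|\\[|\\]|\\\\)", "\\\\\\1", x, perl = TRUE)
}

pattern <- paste(escape_regex(names_vec), collapse = "|")

text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings",
          "the third is Acme (Holdings) Inc. as well")

hits <- regmatches(text, gregexpr(pattern, text, perl = TRUE))
hits
```

Running escape_regex() over data[,1] before paste() may also show which rows contain the troublesome characters, since only those rows change.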



Gabor Grothendieck wrote
 
 On Fri, Jul 13, 2012 at 9:40 AM, mdvaan <mathijsdevaan@> wrote:
 Thanks, I see that it is working in the sample data. My data, however,
 gives
 me an error message:

 data <- strapplyc(text, batch[[l]])
 Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class =
 "tclObj") :
   [tcl] couldn't compile regular expression pattern: parentheses () not
 balanced.

 batch[[l]] is similar to your re string except that there is a larger
 variety of characters. I haven't been able to figure out which characters
 are causing trouble here. Any thoughts?

 Thank you very much.

 Math
 ...

 __
 R-help@ mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 Note part on last line about posting reproducible code.
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4636472.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] significance test interquartile ranges

2012-07-15 Thread Schaber , Jörg

Thanks for your suggestions! 
The Siegel Tukey test and the permutation test sound promising, indeed.

I applied the Wilcoxon test already, but understood that it mainly tests 
differences in the medians (location), even though it is sensitive to all kinds 
of differences between distributions, similar to the K-S test.
I once heard that the K-S test is more sensitive to differences in the tails 
between distributions, whereas the U-test is more sensitive to differences in 
location in general. Can some knowledgeable statistician comment on that?

I do not understand the concern of Brian, saying that the permutation test 
suggested by Greg tests equality in distribution. When the test statistic is 
the ratio of IQRs, the permutation test calculates the p-value of this ratio 
under the null hypothesis that group label does not matter, i.e. that they are 
equal, right? But I am probably not a knowledgeable enough statistician to judge 
that.

best,

joerg




From: Prof Brian Ripley [rip...@stats.ox.ac.uk]
Sent: Saturday, 14 July 2012 08:16
To: Greg Snow
Cc: Schaber, Jörg; R-help
Subject: Re: [R] significance test interquartile ranges

On 13/07/2012 21:37, Greg Snow wrote:
> A permutation test may be appropriate:

Yes, it may, but precisely which one is unclear.  You are testing
whether the two samples have an identical distribution, whereas I took
the question to be a test of differences in dispersion, with differences
in location allowed.

I do not think this can be solved without further assumptions.  E.g.
people often replace the two-sample t-test by the two-sample Wilcoxon
test as a test of differences in location, not realizing that the latter
is also sensitive to other aspects of the difference (e.g. both
dispersion and shape).

I nearly suggested (yesterday) doing the permutation test on differences
from medians in the two groups.  But really this is off-topic for R-help
and needs interaction with a knowledgeable statistician to refine the
question.

> 1. compute the ratio of the 2 IQR values (or other comparison of interest)
> 2. combine the data from the 2 samples into 1 pool, then randomly
> split into 2 groups (matching sample sizes of original) and compute
> the ratio of the IQR values for the 2 new samples.
> 3. repeat #2 a bunch of times (like for a total of 999 random splits)
> and combine with the original value.
> 4. (optional, but strongly suggested) plot a histogram of all the
> ratios and place a reference line of the original ratio on the plot.
> 5. calculate the proportion of ratios that are as extreme or more
> extreme than the original, this is the (approximate) p-value.

I think it is an 'exact' (but random) p-value.
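
The five steps quoted above could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the function name perm_iqr_ratio, the two-sided comparison on the log scale, and the simulated data are not from the thread. Pooling the raw data tests whether the two distributions are identical; subtracting each group's median before pooling is one possible variant if differences in location are to be allowed.

```r
# Hypothetical helper: permutation test on the ratio of two IQRs.
perm_iqr_ratio <- function(x, y, n_perm = 999) {
  obs <- IQR(x) / IQR(y)                    # step 1: observed ratio
  pooled <- c(x, y)                         # step 2: pool the two samples
  n_x <- length(x)
  perm <- replicate(n_perm, {               # step 3: repeat the random split
    idx <- sample(length(pooled), n_x)
    IQR(pooled[idx]) / IQR(pooled[-idx])
  })
  all_ratios <- c(obs, perm)                # original value joins the permuted ones
  # step 5: proportion at least as extreme as the original (two-sided on the
  # log scale, which is an assumption of this sketch)
  p <- mean(abs(log(all_ratios)) >= abs(log(obs)))
  list(ratio = obs, p.value = p, ratios = all_ratios)
}

set.seed(42)
x <- rnorm(200, sd = 1)                     # simulated data, not from the thread
y <- rnorm(200, sd = 2)
res <- perm_iqr_ratio(x, y)
res$p.value
# step 4 (optional): hist(log(res$ratios)); abline(v = log(res$ratio))
```

The sketch assumes continuous data; with heavily tied data an IQR of zero would break the ratio.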


> On Fri, Jul 13, 2012 at 5:32 AM, Schaber, Jörg
> <joerg.scha...@med.ovgu.de> wrote:
>> Hi,
>>
>> I have two non-normal distributions and use interquartile ranges as a 
>> dispersion measure.
>> Now I am looking for a test, which tests whether the interquartile ranges 
>> from the two distributions are significantly different.
>> Any idea?
>>
>> Thanks,
>>
>> joerg



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Compilation Error with Rcpp

2012-07-15 Thread Sven D

Hello,

I am trying to reproduce a code example from babelgraph
(http://www.babelgraph.org/wp/?p=358). When compiling the function
that calls the C++ code, I get the following error:

Error in compileCode(f, code, language = language, verbose = verbose) : 
  Compilation ERROR, function(s)/method(s) not created! 
In addition: Warning message:
running command 'C:/PROGRA~1/R/R-215~1.0/bin/i386/R CMD SHLIB
file141c7ac23195.cpp 2> file141c7ac23195.cpp.err.txt' had status 1 

Does anyone have an idea what this means? It's not clear to me what the error
would be. I doubt it's a source code error, but I am happy to provide the
source if necessary. My sessionInfo:

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252 
[2] LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] inline_0.3.8 Rcpp_0.9.10  plyr_1.7.1  

loaded via a namespace (and not attached):
[1] tools_2.15.0


Thanks


Sven





--
View this message in context: 
http://r.789695.n4.nabble.com/Compilation-Error-with-Rcpp-tp4636522.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] HLOOKUP in R

2012-07-15 Thread Michael Sumner

What does Excel's HLOOKUP do?

On Saturday, July 14, 2012, Silje Nord wrote:

 Hi,

 Is there a function similar to excel's hlookup in R ?

 Thanks,
 Silje




-- 
Michael Sumner
Hobart, Australia
e-mail: mdsum...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] significance test interquartile ranges

2012-07-15 Thread peter dalgaard


On Jul 14, 2012, at 19:58 , Schaber, Jörg wrote:

> Dear Peter,
> 
> thanks for your clarifications. Sample size is around 200 in each group. 
> Would that justify your approach?

It's certainly better than 10... 

I did a small check on the IgM data from the ISwR package (298 obs.) and found 
something somewhat amusing: Discretization effects can kick in rather 
profoundly with data sets of that magnitude. 

The IgM data are discretized to 1 decimal digit, which is fairly common for 
continuous data in practice

> table(IgM)
IgM
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9   1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8   2 2.1 
  3   7  19  27  32  35  38  38  22  16  16   6   7   9   6   2   3   3   3   2 
2.2 2.5 2.7 4.5 
  1   1   1   1 
> summary(IgM)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.100   0.500   0.700   0.803   1.000   4.500 
> IQR(IgM)
[1] 0.5

However, if we want to look at the sample distribution of a quantile, we get 
some curious effects as the variation of the estimate is close to the 
discretization error. Try a simple bootstrap sample from the empirical CDF:

> medians <- replicate(10000,median(sample(IgM,replace=T)))
> table(medians)
medians
 0.6 0.65  0.7 0.75  0.8 
  136 9035  179  767 

However, if we smoothen the empirical CDF by adding a little noise, we do get 
something that does look passably (although not perfectly) gaussian:

> x <- IgM + runif(IgM, -.05,.05)
> medians2 <- replicate(10000,median(sample(x,replace=T)))
> hist(medians2)
> qqnorm(medians2)

Interestingly, adding noise has the counterintuitive effect of reducing the 
standard error of the medians:

> sd(medians)
[1] 0.02748966
> sd(medians2)
[1] 0.02347363

(It's not _that_ counterintuitive given that the definition of the median isn't 
quite the same for discrete data.)

Back to the IQR. You can do much the same thing:

> iqrs <- replicate(10000,IQR(sample(IgM,replace=T)))
> table(iqrs)
iqrs
  0.3 0.375   0.4  0.45 0.475   0.5  0.55 0.575   0.6 
   6042  3885 7   640  5100 387   176 

or, use the smoothed one replacing IgM by x (defined above).

Now, what if we wanted to compare two IQRs? I'll cheat and reuse the same ECDF 
for both groups.

> i1 <- replicate(10000,IQR(sample(IgM,replace=T)))
> i2 <- replicate(10000,IQR(sample(IgM,replace=T)))
> qqnorm((i1-i2)/sd(i1-i2))
> mean(abs(i1-i2)/sd(i1-i2) < 2)
[1] 0.9698

So, not really all that bad, but it is a bit fortuitous given the discreteness 
of the distribution.

Same thing with the x comes out quite a bit nicer

> ix1 <- replicate(10000,IQR(sample(x,replace=T)))
> ix2 <- replicate(10000,IQR(sample(x,replace=T)))
> qqnorm((ix1-ix2)/sd(ix1-ix2))
> mean(abs(ix1-ix2)/sd(ix1-ix2) < 2)
[1] 0.9546

So, my conclusion would be that yes, you can use bootstrap techniques with data 
of that size, but you need to watch out for discretization effects by checking 
the bootstrap sample distributions and you might want to add a little 
smoothing-noise for stability. 
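
As a rough sketch of how that conclusion might be applied to two actual groups: the data below are simulated (log-normal, rounded to one decimal to mimic the discretization), and the helper names jitter_unif and boot_iqr_diff, the 2000 resamples, and the normal-approximation p-value are all assumptions of this illustration, not part of the analysis above.

```r
set.seed(1)
# Hypothetical data: two groups of 200, discretized to one decimal digit.
g1 <- round(rlnorm(200, meanlog = -0.3, sdlog = 0.5), 1)
g2 <- round(rlnorm(200, meanlog = -0.3, sdlog = 0.5), 1)

# Smooth the empirical CDF by adding uniform noise of half the rounding step.
jitter_unif <- function(v, h = 0.05) v + runif(length(v), -h, h)

# Bootstrap distribution of the difference in IQR between two samples.
boot_iqr_diff <- function(a, b, B = 2000) {
  replicate(B, IQR(sample(a, replace = TRUE)) - IQR(sample(b, replace = TRUE)))
}

a <- jitter_unif(g1)
b <- jitter_unif(g2)
d_obs  <- IQR(a) - IQR(b)                 # observed difference in IQR
d_boot <- boot_iqr_diff(a, b)             # resampled differences

z        <- d_obs / sd(d_boot)            # bootstrap standard error as yardstick
p_approx <- 2 * pnorm(-abs(z))            # relies on approximate normality
ci       <- quantile(d_boot, c(0.025, 0.975))  # percentile interval

# Check the shape before trusting z: hist(d_boot); qqnorm(d_boot)
```

The diagnostic plots in the last comment are the important part; if the resampled differences pile up on a few discrete values, the normal approximation should not be trusted.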

As always with bootstrapping, beware that the simulation is never done under 
the null hypothesis, one merely hopes that the distribution of the resampled 
estimates around the observed estimate is sufficiently similar to that of the 
estimator around the true estimate that it can be used for tests and confidence 
intervals, implicitly using a location-shift argument. This gets particularly 
dubious when there are discretization effects because the jumps occur at values 
that do not depend on the parameters. 

(Pragmatically speaking, you might not be interested at all in differences in 
IQR which are comparable to discretization error, though.) 


 
> I found a couple more tests for scale on continuous variables, i.e. 
> Mood Test
> Ansari-Bradley Test (that one is also implemented in R)
> Klotz Test
> Conover Test
> 
> Would one of those be suitable to test for different dispersion (e.g. IQR or 
> the like) in non-normal distributions?
 

That is what they were designed to do... I'm not all that well acquainted with 
them, but given what I have seen from that general area and period, they should 
likely be studied with a critical eye to hidden assumptions. Quite a lot of 
work has been published with the general structure of "let's do some sensible 
transformations of data and apply a nonparametric test", then call the whole 
procedure "assumption-free" (in those days, 1950s and 1960s, essentially, 
computer simulations were not readily available to show people the error of 
their ways...).
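
For readers who want to try two of the tests named in the question, a minimal sketch follows. ansari.test() and mood.test() are both in the stats package. The simulated data and the centring on sample medians are assumptions of this illustration; both tests take a common location for granted, which is exactly the kind of hidden assumption mentioned above.

```r
set.seed(7)
# Simulated skewed data with population median 0; y has twice the scale of x.
x <- rexp(200) - log(2)
y <- 2 * (rexp(200) - log(2))

# Centre each sample on its own median, since both tests assume equal location.
xc <- x - median(x)
yc <- y - median(y)

ab <- ansari.test(xc, yc)   # Ansari-Bradley test for a difference in scale
md <- mood.test(xc, yc)     # Mood test for a difference in scale

c(ansari = ab$p.value, mood = md$p.value)
```

Centring on estimated medians changes the null distribution somewhat, so the p-values are best read as approximate.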

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] NaN in hurdle model please?

2012-07-15 Thread Highland Statistics Ltd

Simplify your model. Does your TandemRepeat have a lot of levels? Or is 
your sample size very small?

Alain


Dear all,
I am fitting a hurdle model in the following way:

HNB <- hurdle(chro ~ as.factor(TandemRepeat) | as.factor(TandemRepeat),
              data = data_negbin_fin, dist = "negbin")

But the std. error for log(theta) is NA:

Count model coefficients (truncated negbin with log link):
            Estimate Std. Error z value Pr(>|z|)
Log(theta)   13.5062         NA      NA       NA

And it gives the following error:

In sqrt(diag(vc_count)[kx + 1]) : NaNs

Can somebody help me please?
Thanks so much,

Ana



-- 

Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7


2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9


3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3


4. Zero Inflated Models and Generalized Linear Mixed Models with R. (2012) 
Zuur, Saveliev, Ieno.
http://www.highstat.com/book4.htm

Other books: http://www.highstat.com/books.htm


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highs...@highstat.com
URL: www.highstat.com
URL: www.brodgar.com


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Getting objects from quantmod ticker list

2012-07-15 Thread Cren

# Thank you, Michael: it works fine!

--
View this message in context: 
http://r.789695.n4.nabble.com/Getting-objects-from-quantmod-ticker-list-tp4635708p4636440.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Multivariate apply.rolling()

2012-07-15 Thread Cren

# I've read that rollapply, and its wrapper apply.rolling()
# from the PerformanceAnalytics package, do not work with multivariate
# time series, nor can their output be a multivariate time series.

# So I was wondering whether any other function like those exists, or
# whether I need to write my own function to perform multivariate
# time series rolling analysis.

# Something like this:

# Let 'X' be your multivariate time series and 'f' the function to apply:
# output <- matrix(NA, ncol = ncol(X), nrow = nrow(X))
# width <- 199
# for (i in 1:(nrow(output) - width)) {
#   data <- X[i:(i + width), ]
#   output[i, ] <- f(data)
# }
# rownames(output) <- rownames(as.timeSeries(X))

# ...and this should be a (probably not efficient) way to do it.
# Any better idea?
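
A self-contained base-R version of the loop sketched above might look like the following. The function name roll_apply_rows and its arguments are hypothetical; unlike the loop above, each window here has exactly `width` rows, the output has one row per complete window, and rows are labelled by the last observation in the window.

```r
# Hypothetical helper: apply FUN to every window of `width` consecutive rows
# of a matrix-like object; FUN receives a sub-matrix and returns a vector.
roll_apply_rows <- function(X, width, FUN, ...) {
  X <- as.matrix(X)
  n_win <- nrow(X) - width + 1
  stopifnot(n_win >= 1)
  # Evaluate the first window to learn how long the result vector is.
  first <- FUN(X[seq_len(width), , drop = FALSE], ...)
  out <- matrix(NA_real_, nrow = n_win, ncol = length(first))
  out[1, ] <- first
  for (i in seq_len(n_win)[-1]) {
    out[i, ] <- FUN(X[i:(i + width - 1), , drop = FALSE], ...)
  }
  colnames(out) <- names(first)
  rownames(out) <- rownames(X)[width:nrow(X)]  # stays NULL if X has no row names
  out
}

# Toy example: rolling column means over windows of 3 rows.
X <- matrix(1:20, ncol = 2, dimnames = list(NULL, c("a", "b")))
res <- roll_apply_rows(X, width = 3, FUN = colMeans)
head(res, 2)
```

Because FUN sees the whole sub-matrix, it can return anything multivariate, for example the columns of a covariance matrix flattened with as.vector(cov(w)).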

# Thanks,


--
View this message in context: 
http://r.789695.n4.nabble.com/Multivariate-apply-rolling-tp4636442.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Column create and Update using function

2012-07-15 Thread arun

Hi Antony,

There is still some confusion as to what you actually want as result.  

For example, your statement

-In this i need to check each particular column values are between Max and
Min value.
If the coulmn value not coming between Max and Min, then i need to create
another coulmn

This implies to check for each columns in the dataset and create colname_QF

My reply was based on these facts.  As the Min and Max were assigned to 3 and 
6, I was under the assumption that this is for the whole dataset.  Then, you 
mentioned that it was only for column ABC.  I guess the Min and Max were the 
ones you found from the original dataset for the first column.


So, based on Min and Max for each column, your condition should be met. 


-If the coulmn value not coming between Max and Min, then i need to create
another coulmn
by adding column name header with _QF. and assign a string like RC for
that particular row.



If that is the case, try the code below:

dat1<-read.table(text="
ABC   XYZ  PQR  RDN   SQT
2 4 3 2 8
5 4 8 3 9
7 1 3 4 1
3 2 4 3 2
4 6 5 7 4
",sep="",header=TRUE)

mindat1<-apply(dat1,2,min)
maxdat1<-apply(dat1,2,max)
minmaxdat<-data.frame(rbind(mindat1,maxdat1))
rownames(minmaxdat)<-1:nrow(minmaxdat)
 func1<-function(x,y,z)
 {ifelse(y[[x]] < max(z[[x]]) & y[[x]] > min(z[[x]]),"","RC")}
 dat2<-data.frame(sapply(names(dat1),function(x) func1(x,dat1,minmaxdat)))
colnames(dat2)<-paste(colnames(dat2),"QF",sep="_")
dat3<-data.frame(cbind(dat1,dat2))
dat3
  ABC XYZ PQR RDN SQT ABC_QF XYZ_QF PQR_QF RDN_QF SQT_QF
1   2   4   3   2   8     RC            RC     RC       
2   5   4   8   3   9                   RC            RC
3   7   1   3   4   1     RC     RC     RC            RC
4   3   2   4   3   2                                   
5   4   6   5   7   4            RC            RC       
##

A.K.





- Original Message -
From: Rantony antony.akk...@ge.com
To: r-help@r-project.org
Cc: 
Sent: Friday, July 13, 2012 2:42 AM
Subject: [R] Column create and Update using function

Hi,

Here I have Max and Min values:
Min <- 3
Max <- 6
and also a matrix like this:

ABC   XYZ   PQR
---   ---   ---
2     4     3
5     4     8
7     1     3

I need to check whether the values in a particular column lie between Min and
Max. If a value does not lie between Min and Max, I need to create another
column, named by appending _QF to the column name, and assign a string like
"RC" to that row.

For example, I need to check column ABC.
Here 2, 5, 7 are the values to check against Min <- 3 and Max <- 6.
First, create a new column called ABC_QF in the current matrix. The value 2
does not lie between 3 and 6, so put "RC":

ABC   XYZ   PQR   ABC_QF
---   ---   ---   ------
2     4     3     RC
5     4     8
7     1     3

Next, 5 lies between 3 and 6, so there is nothing to do:

ABC   XYZ   PQR   ABC_QF
---   ---   ---   ------
2     4     3     RC
5     4     8
7     1     3

Next, 7 does not lie between 3 and 6, so put "RC":

ABC   XYZ   PQR   ABC_QF
---   ---   ---   ------
2     4     3     RC
5     4     8
7     1     3     RC
---
This is the requirement. I did it using a for loop, but it checks each value
one at a time and is slow when a bulk of data comes in.
Is there any hope of doing this with lapply/apply-type functions, so that the
complete column is updated at once?
Could you please help me urgently?

- Thanks
Antony. 

--
View this message in context: 
http://r.789695.n4.nabble.com/Column-create-and-Update-using-function-tp4636400.html
Sent from the R help mailing list archive at Nabble.com.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Column create and Update using function

2012-07-15 Thread arun

Hi,

Try this:

dat1<-read.table(text="
ABC    XYZ    PQR
2    4    3
5    4    8
7    1    3
",sep="",header=TRUE)

 newdat<-apply(dat1,2,function(x) ifelse(x<6 & x>3,"","RC"))
 colnames(newdat)<-paste(colnames(newdat),"QF",sep="_")
 dat2<-data.frame(cbind(dat1,newdat))
dat2
  ABC XYZ PQR ABC_QF XYZ_QF PQR_QF
1   2   4   3     RC            RC
2   5   4   8                   RC
3   7   1   3     RC     RC     RC
 
A.K.
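
If only some columns need a flag column, the same idea could be wrapped in a small helper. The sketch below is hypothetical (the name flag_out_of_range and its arguments are not from the thread) and treats the bounds as inclusive, so 3 and 6 themselves count as inside the range, whereas the code above uses strict inequalities.

```r
# Hypothetical helper: for each named column, add <name>_QF holding "RC"
# where the value lies outside [lo, hi] and "" otherwise.
flag_out_of_range <- function(df, cols, lo, hi, flag = "RC") {
  for (cn in cols) {
    inside <- df[[cn]] >= lo & df[[cn]] <= hi      # vectorised over the column
    df[[paste(cn, "QF", sep = "_")]] <- ifelse(inside, "", flag)
  }
  df
}

dat1 <- data.frame(ABC = c(2, 5, 7), XYZ = c(4, 4, 1), PQR = c(3, 8, 3))
res <- flag_out_of_range(dat1, cols = "ABC", lo = 3, hi = 6)
res
```

Passing cols = names(dat1) flags every column at once; the loop runs over columns only, so each column is still updated in a single vectorised step.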




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Arrange two columns into a five variable dataframe

2012-07-15 Thread arun



Hi,

You could use either one of these methods:
#Method 1:

#dat1 : data

list1<-split(dat1,dat1$group)
dat2<-data.frame(list1)
dat2<-data.frame(list1[[5]][1],list1[[4]][1],list1[[3]][1],list1[[2]][1],list1[[1]][1])
colnames(dat2)<-c(rev(levels(dat1$group)))
head(dat2)
  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33


#Method 2:
#dat1:data
library(reshape)

dat3<-data.frame(dat1,ID=rep(1:25,5))
dat4<-reshape(dat3,idvar="ID",timevar="group",direction="wide")
dat4<-dat4[,-1]
colnames(dat4)<-rev(levels(dat3$group))
head(dat4)
  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33


#Method 3:
#dat1: data
dat3<-data.frame(dat1,ID=rep(1:25,5))
library(reshape2)

dat5<-dcast(melt(dat3,id.vars=c("ID","group")),ID~variable+group)

dat5<-dat5[,-1]
colnames(dat5)<-levels(dat3$group)
dat5<-dat5[,c(5:1)]
head(dat5)

  Group 1 Group 2 Group 3 Group 4 Group 5
1      40      46      21      35      16
2      37      42      40      37      19
3      44      65      44      49      19
4      47      46      54      46      32
5      47      58      36      63      33
6      47      42      40      39      33




> identical(dat2,dat4)
[1] TRUE
> identical(dat2,dat5)
[1] TRUE


A.K.
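
One more base-R variant, offered only as a sketch: split the values by group and bind the pieces side by side. The toy data below reuse the first three values of each group from the posted data set; the object names flies_small, parts and wide are hypothetical, and the approach assumes every group has the same number of rows.

```r
# Toy subset: first three "long" values per group, with the factor levels
# stored in the same reversed order as in the posted data.
flies_small <- data.frame(
  long  = c(40, 37, 44, 46, 42, 65, 21, 40, 44, 35, 37, 49, 16, 19, 19),
  group = factor(rep(paste("Group", 1:5), each = 3),
                 levels = paste("Group", 5:1))
)

# One vector per group, in level order (Group 5 first).
parts <- split(flies_small$long, flies_small$group)

# Bind as columns; setNames() restores the names with spaces, which
# data.frame() would otherwise turn into syntactic names.
wide <- setNames(data.frame(parts), names(parts))

# Reorder columns to Group 1 ... Group 5.
wide <- wide[, sort(names(wide))]
wide
```

With unequal group sizes the data.frame() call would fail, so a padded or list-based layout would be needed instead.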







- Original Message -
From: darnold dwarnol...@suddenlink.net
To: r-help@r-project.org
Cc: 
Sent: Friday, July 13, 2012 11:37 PM
Subject: [R] Arrange two columns into a five variable dataframe

Hi,

I hope that folks can give me some simple approaches to taking the data set
below, which is accumulated in two columns called "long" and "group", and
arranging the data in the "long" column into a data frame containing five
variables: "Group 1", "Group 2", "Group 3", "Group 4", and "Group 5".  I am
hoping for a few different techniques which I can pass on to my students.

Thanks

David Arnold
College of the Redwoods


> dput(flies)
structure(list(long = c(40L, 37L, 44L, 47L, 47L, 47L, 68L, 47L, 
54L, 61L, 71L, 75L, 89L, 58L, 59L, 62L, 79L, 96L, 58L, 62L, 70L, 
72L, 74L, 96L, 75L, 46L, 42L, 65L, 46L, 58L, 42L, 48L, 58L, 50L, 
80L, 63L, 65L, 70L, 70L, 72L, 97L, 46L, 56L, 70L, 70L, 72L, 76L, 
90L, 76L, 92L, 21L, 40L, 44L, 54L, 36L, 40L, 56L, 60L, 48L, 53L, 
60L, 60L, 65L, 68L, 60L, 81L, 81L, 48L, 48L, 56L, 68L, 75L, 81L, 
48L, 68L, 35L, 37L, 49L, 46L, 63L, 39L, 46L, 56L, 63L, 65L, 56L, 
65L, 70L, 63L, 65L, 70L, 77L, 81L, 86L, 70L, 70L, 77L, 77L, 81L, 
77L, 16L, 19L, 19L, 32L, 33L, 33L, 30L, 42L, 42L, 33L, 26L, 30L, 
40L, 54L, 34L, 34L, 47L, 47L, 42L, 47L, 54L, 54L, 56L, 60L, 44L
), group = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("Group 5", "Group 4", "Group 3", "Group 2", 
"Group 1"), class = "factor")), .Names = c("long", "group"), row.names =
c(NA, 
-125L), class = "data.frame")

--
View this message in context: 
http://r.789695.n4.nabble.com/Arrange-two-columns-into-a-five-variable-dataframe-tp4636503.html
Sent from the R help mailing list archive at Nabble.com.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fw: Column create and Update using function

2012-07-15 Thread arun





- Forwarded Message -
From: arun smartpink...@yahoo.com
To: Akkara, Antony (GE Energy, Non-GE) antony.akk...@ge.com
Cc: R help r-help@r-project.org
Sent: Friday, July 13, 2012 10:05 AM
Subject: Re: [R] Column create and Update using function

Hi,

I thought you wanted to check all the columns at once using apply or similar 
functions.

dat1[!(dat1$ABC<6 & dat1$ABC>3),"ABC_QF"]<-"RC"
dat1[is.na(dat1)]<-""
 dat1
  ABC XYZ PQR ABC_QF
1   2   4   3     RC
2   5   4   8       
3   7   1   3     RC
A.K.



- Original Message -
From: Akkara, Antony (GE Energy, Non-GE) antony.akk...@ge.com
To: arun smartpink...@yahoo.com
Cc: 
Sent: Friday, July 13, 2012 9:46 AM
Subject: RE: [R] Column create and Update using function

Hi Arun,

Here I need to check only one particular column, not all columns.
I tried with 

newdat<-apply(dat1[,1],2,function(x) ifelse(x<6 & x>3,"","RC"))

then I get an error like this:
 Error in apply(dat1[, 1], 1, function(x) ifelse(x < 6 & x > 3, "", "RC")) : 
  dim(X) must have a positive length 



- thanks 
Antony.
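
The dim(X) error reported in this message arises because dat1[,1] drops to a plain vector, which has no dimensions for apply() to work on. A minimal sketch of two ways around it follows, using the same toy data as in the thread; the object names one_col, flags and flags2 are made up for the illustration.

```r
dat1 <- data.frame(ABC = c(2, 5, 7), XYZ = c(4, 4, 1), PQR = c(3, 8, 3))

# Selecting a single column drops the result to a vector with no dim().
is.null(dim(dat1[, 1]))

# Option 1: keep the data-frame shape with drop = FALSE, then apply() works.
one_col <- dat1[, 1, drop = FALSE]
flags <- apply(one_col, 2, function(x) ifelse(x < 6 & x > 3, "", "RC"))

# Option 2: skip apply() altogether; ifelse() is already vectorised.
flags2 <- ifelse(dat1$ABC < 6 & dat1$ABC > 3, "", "RC")
```

For a single column the second form is the simpler one; apply() only pays off when several columns are processed together.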


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.table with numeric row names

2012-07-15 Thread arun

Hi Peter,

I copied the data from your email and ran it again.

dat1<-read.table(text="
  2.5  3.6  7.1  7.9
  100  3  4  2    3
  200  3.1  4  3  3
  300  2.2  3.3  2    4
  ",sep="",header=TRUE) 

dat1
    X2.5 X3.6 X7.1 X7.9
100  3.0  4.0    2    3
200  3.1  4.0    3    3
300  2.2  3.3    2    4


colnames(dat1)<-gsub("^[X](.*)","\\1",colnames(dat1))


I am not sure what happened at your end.  Maybe you could try
read.table(..., fill=TRUE)

I guess Chi was able to read it, as I understood from his email: 
"Thanks. It works very good. Chi"

A.K.




- Original Message -
From: peter dalgaard pda...@gmail.com
To: arun smartpink...@yahoo.com
Cc: kexinz zhangchic...@gmail.com; R help r-help@r-project.org
Sent: Friday, July 13, 2012 10:27 AM
Subject: Re: [R] read.table with numeric row names


On Jul 13, 2012, at 04:27 , arun wrote:

 Hello,
 
 I saw your reply in nabble.  Sorry about that.  I thought the dataset had 
 only few columns.
 
> #You can read the first line of a file using:
> readLines("foo.txt",n=1)[1]
> 
> 
> #The more generic colname substitution
> dat1<-read.table(text=" 
>  2.5  3.6  7.1  7.9 
>  100  3      4      2    3 
>  200  3.1  4      3      3 
>  300  2.2  3.3  2    4 
>  ",sep="",header=TRUE)

(This didn't survive too well in mail:

> dat1<-read.table(text=" 
+  2.5  3.6  7.1  7.9 
+  100  3      4      2    3 
+  200  3.1  4      3      3 
+  300  2.2  3.3  2    4 
+  ",sep="",header=TRUE)  
Error in read.table(text = " \n 2.5  3.6  7.1  7.9 \n 100  3      4      2    3 
\n 200  3.1  4      3      3 \n 300  2.2  3.3  2    4 \n ",  : 
  more columns than column names

Not sure exactly what happened there...)



  
 #The code should remove the X from the column names (row names?)
 

However, adding check.names=FALSE should be more expedient.
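
A minimal sketch of that suggestion, using the small table from the original question: with check.names=FALSE the header values are kept verbatim, so no "X" prefix has to be stripped afterwards. The object names txt, dat, x and col_means are made up for the illustration.

```r
txt <- "2.5 3.6 7.1 7.9
100 3 4 2 3
200 3.1 4 3 3
300 2.2 3.3 2 4"

# The header has one field fewer than the data rows, so the first column
# becomes the row names; check.names = FALSE keeps "2.5" etc. as they are.
dat <- read.table(text = txt, header = TRUE, check.names = FALSE)

x <- as.numeric(names(dat))   # numeric positions for the x-axis
col_means <- colMeans(dat)

# plot(x, col_means, xlab = "Column name", ylab = "Column mean")
```

Columns with non-syntactic names such as "2.5" then have to be addressed with backticks or dat[["2.5"]] rather than dat$X2.5.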

> colnames(dat1)<-gsub("^[X](.*)","\\1",colnames(dat1))
> dat1
>     2.5 3.6 7.1 7.9
> 100 3.0 4.0   2   3
> 200 3.1 4.0   3   3
> 300 2.2 3.3   2   4
> plot(colMeans(dat1)~as.numeric(names(dat1)),xlab="Column_Name",ylab="Column_Mean")
 
 A.K.
 
 
 
 
 - Original Message -
 From: kexinz zhangchic...@gmail.com
 To: r-help@r-project.org
 Cc: 
 Sent: Thursday, July 12, 2012 2:50 PM
 Subject: [R] read.table with numeric row names
 
 I have a text file like this
          2.5  3.6  7.1  7.9
 100   3      4       2     3
 200   3.1   4      3      3
 300   2.2   3.3   2     4
 
> I used r <- read.table("a.txt", header=T)
 The row names becomes X2.5, X3.6... What I need is the row names are
 numeric, so I can use the row names as numbers on x-axis for plotting. e.g.
 plot(colMeans(r)~names(r)), something like this. How to do this?
 
 Thanks.
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/read-table-with-numeric-row-names-tp4636342.html
 Sent from the R help mailing list archive at Nabble.com.
 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] R combining many vectors of predictable name into one date frame

2012-07-15 Thread Rui Barradas


Hello,

Try the following.


ib1 <- 1:10
ib2 <- rnorm(10)

hold.list <- objects(pattern="ib")
df <- sapply(hold.list, get)
df


Note that you don't need list(), and that sapply() simplifies its result 
to a matrix if possible; wrap it in as.data.frame() if you need a data.frame.


Also, 'df' is the name of an R function; use something else like 'df1'.
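
A sketch of a variant that yields a data.frame directly, under the assumption that all the vectors have the same length. The vector ib3 and the names ib.names and df1 are made up for illustration, and mget() is swapped in for sapply(hold.list, get):

```r
# Hypothetical example vectors following the ib<number> naming scheme
ib1 <- 1:10
ib2 <- rnorm(10)
ib3 <- runif(10)

# Anchor the pattern so that only names like ib1, ib2, ... are picked up
ib.names <- ls(pattern = "^ib[0-9]+$")

# Sort by the numeric suffix so that ib10 would come after ib9
ib.names <- ib.names[order(as.numeric(sub("^ib", "", ib.names)))]

# mget() returns a named list; data.frame() turns it into columns
df1 <- data.frame(mget(ib.names))
str(df1)
```

Because the names are collected at run time, the same lines work whether there are two vectors or a hundred.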

Hope this helps

Rui Barradas

On 14-07-2012 00:51, Mathew Vickers wrote:

G'day R (power) users,

I have many vectors, called:

ib1
ib2
ib3
...
ib100

and I would like them in one data frame (df) such that:


df


ib1  ib2  ib3  ib4  ...  ib100
x    x    x    x    ...  x
x    x    x    x    ...  x
x    x    x    x    ...  x

I have attempted:

hold.list <- list(objects(pattern="ib"))
df <- data.frame(hold.list)

but that didn't work

also:
do.call(rbind, (objects(pattern="ib")))

and that also didn't work. I tried a whole pile of other things, where I
also failed.


The number of vectors might differ each time I want to make the data frame,
so that in the example above, I have ib1 : ib100, but next time, I might
only have ib1 : ib2

Below is my (probably somewhat embarrassing) example script for generating
the vectors in the first place. Commented out toward the end are a few
attempts at doing the job I wanted to do.


temp <- runif(100)
tripID <- rep(1:10, 10)
uni <- rep(1:4, 25)
temp <- data.frame(temp, tripID, uni)

trips <- unique(temp$tripID)
uni <- unique(temp$uni[temp$tripID==trips[1]])

for (jj in 1:length(uni)){
   a <- c()
   for (ii in 1:10){
     a <- c(a, IQR(temp$temp[temp$uni %in% sample(uni,jj)]))
     # ib is short for ibuttons. The number is how many were used to calc IQR
     assign(paste("ib",jj,sep=""), a)
   }
#   hold.list <- list(objects(pattern="ib"))
#   trip <- data.frame(list=hold.list) # I am trying to put everything into a dataframe
#   do.call(rbind, list=hold.list)
#   do.call(rbind, list(objects(pattern="ib")))
}


thanks heaps if you can help. And sorry if this is mostly garble. This is
my first crack at soliciting help from the list.

cheers,

mat





Re: [R] rgamma function

2012-07-15 Thread peter dalgaard


On Jul 14, 2012, at 04:55 , Chandler Zuo wrote:

 Hi,
 
 Has anyone encountered the problem of rgamma function in C? The following 
 simplified program always dies for me, and I wonder if anyone can tell me the 
 reason.
 
 #include <Rmath.h>
 #include <time.h>
 #include <Rinternals.h>
 
 SEXP generateGamma ()
 {
     srand(time(NULL));
     return (rgamma(5000,1));
 }
 
 Has anyone encountered a similar problem before? Is there another way of 
 generating Gamma random variable in C?
 
 P.S. I have no problem compiling and loading this function in R.

It doesn't even give off a warning??

The prototype in Rmath.h is

double  rgamma(double, double);

and you should be returning an SEXP. As soon as something tries to interpret 
the double value as a pointer -- Poof!

Notice that rgamma in C is not the same function as the R counterpart, in 
particular it isn't vectorized, so only generates one random number at a time. 
The long and the short of it is that you need to read up on sections 5.9 and 
5.10 of Writing R Extensions.
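
For comparison, a small sketch of the R-level counterpart (the variable name x and the seed value are illustrative). At this level rgamma() is vectorized, so one call returns all 5000 draws, and reproducibility is controlled with set.seed() rather than with srand(), which seeds the C library's rand() and not R's generator:

```r
set.seed(42)                      # seeds R's own random number generator
x <- rgamma(5000, shape = 1)      # one call, 5000 draws
length(x)

set.seed(42)                      # same seed again ...
same <- identical(x, rgamma(5000, shape = 1))  # ... gives the same draws
same
```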




 
 Thanks for suggestions in advance!
 
 --Chandler
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Loading in Large Dataset + variables via loop

2012-07-15 Thread Rui Barradas


Hello,

Why do you need 9 variables in your environment if they are time series 
that correspond to the same period? You should use time series functions.


#install.packages('zoo')
library(zoo)

# Make up a dataset
Year <- seq(from=as.Date("1901-01-01"), by="year", length.out=100)
dat <- data.frame(matrix(rnorm(100*9), ncol=9), Year)

# assign names.
varNames <- expand.grid(c("temp", "precip", "pressure"), 1:3, 
stringsAsFactors=FALSE)

varNames <- as.vector(apply(varNames, 1, paste, collapse=""))
varNames <- c(varNames, "Year")
names(dat) <- varNames
head(dat)

# and transform it into a time series of class 'zoo'
z <- zoo(dat[, 1:9], order.by=dat$Year)
str(z)
head(z)


Another way would be, like you say, to use a loop to put the variables 
in a list. Something like


lst <- list()
for(i in 1:9) lst[[i]] <- dat[, i]
names(lst) <- varNames[1:9]


Note that I've used a dataset called 'dat' in place of your 'A'. You 
should post a data example, like the posting guide says, using dput().
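
The same idea can be sketched for an arbitrary number of locations. Everything here is an assumption for illustration: the value of n, the simulated stand-in for 'A', and the names varNames and series:

```r
# Made-up stand-in for 'A': n locations, 3 series each, plus a time column
n <- 4
A <- data.frame(matrix(rnorm(100 * 3 * n), ncol = 3 * n), Year = 1901:2000)

# temp1, precip1, pressure1, temp2, ... in increasing column order
varNames <- paste0(c("temp", "precip", "pressure"), rep(seq_len(n), each = 3))
names(A)[seq_len(3 * n)] <- varNames

# One named list element per series instead of 3*n free-standing variables
series <- as.list(A[varNames])
head(series$precip2)
```

Each series is then reached by name, e.g. series$temp3 or series[["pressure1"]], without creating a separate variable for every column.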


Hope this helps,

Rui Barradas



On 14-07-2012 03:44, cmc0605 wrote:

Hello, I'm new to R with a (probably elementary) question.

Suppose I have a dataset called /A/ with /n/ locations, and each location
contains within it 3 time series of different variables (all of 100 years
length); each time series is of a weather variable (for each location there
is a temperature, precipitation, and pressure).  For instance, location 1
has a temperature1 time series, a precip1 time series, and a pressure1 time
series; location two has a temperature2, precip2, and pressure2
timeseries...That is, there are 100 rows, and (/n/*3)+1 columns.  The extra
column is the time.

I want to load in this dataset and declare a variable for each time series.
The columns are in order of location, so it goes temp1, precip1,pressure1,
temp2,... and so forth in increasing column order.  There are always 100
rows.  Manually, I'd have to do:

temp1=A[,1]
precip1=A[,2]
pressure1=A[,3]
temp2=A[,4]
precip2=A[,5]
pressure2=A[,6]
temp3=A[,7]
and so forth.

Problem is, n is large, so I don't want to repeat this pattern forever.  I
figure I need a loop both for the variable name (ie.., the variable at a
particular location) as well as for what column it reads from.

Any help...?



--
View this message in context: 
http://r.789695.n4.nabble.com/Loading-in-Large-Dataset-variables-via-loop-tp4636501.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Maximum number of patterns and speed in grep

2012-07-15 Thread Gabor Grothendieck

On Fri, Jul 13, 2012 at 1:41 PM, mdvaan mathijsdev...@gmail.com wrote:
 Here's some data (which should give you the error messages):

 # read in data
 data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header = T, sep = ",")

 # first paste all data
 data1 <- paste(data[,1], collapse = "|")

 # second paste subsets of the data
 data2a <- paste(data[1:750,1], collapse = "|")
 data2b <- paste(data[751:1500,1], collapse = "|")

 # define the object to be searched
 text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings")

 # match
 strapplyc(text, data1)
 strapplyc(text, data2a)
 strapplyc(text, data2b)

 Thanks in advance!


Although it seems that strapplyc can handle larger regular expressions
than grep in R, it seems neither can handle as many alternatives as in
your example, so process them in chunks:

library(gsubfn) # provides strapply and strapplyc

k <- 3000 # chunk size

f <- function(from, text) {
  to <- min(from + k - 1, nrow(data))
  r <- paste(data[seq(from, to), 1], collapse = "|")
  r <- gsub("[().*?+{}]", "", r)
  strapply(text, r)
}
ix <- seq(1, nrow(data), k)
out <- lapply(text, function(text) unlist(lapply(ix, f, text)))
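
The chunking idea can also be sketched in base R alone, with regmatches() and gregexpr() swapped in for strapply(). The pattern list below is made up and is assumed to contain no regex metacharacters; chunk.size, chunks and match.chunk are illustrative names:

```r
# Made-up pattern list standing in for the first column of 'data'
patterns <- c("Santa Fe Gold Corp", "Starpharma Holdings",
              "Rio Tinto", "Acme Mining")
text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings")

chunk.size <- 2  # deliberately tiny so that more than one chunk is used

# Consecutive chunks of at most chunk.size patterns
chunks <- split(patterns, ceiling(seq_along(patterns) / chunk.size))

# All matches of one chunk's alternation in a single string
match.chunk <- function(chunk, string) {
  rx <- paste(chunk, collapse = "|")
  regmatches(string, gregexpr(rx, string))[[1]]
}

# For each string, collect the matches from every chunk
out <- lapply(text, function(string)
  unlist(lapply(chunks, match.chunk, string = string), use.names = FALSE))
out
```

Each regular expression stays small, at the cost of scanning every string once per chunk.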


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] HLOOKUP in R

2012-07-15 Thread Liviu Andronic

On Fri, Jul 13, 2012 at 9:25 PM, Silje Nord silje.nordg...@gmail.com wrote:
 Is there a function similar to excel's hlookup in R ?

Try match(). I think it provides hlookup() functionality.
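
A minimal sketch of an HLOOKUP-style lookup built on match(), assuming a made-up table whose header row holds the lookup keys (tbl and col are illustrative names):

```r
# Made-up lookup table: keys in the header, values underneath
tbl <- data.frame(Jan = c(10, 11), Feb = c(20, 21), Mar = c(30, 31))

# Position of the matching header; NA if the key is absent
col <- match("Feb", names(tbl))

# Value in row 2 of the matched column
tbl[2, col]
```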

Liviu


[R] variable (column) in a data frame

2012-07-15 Thread Paulo Barata


To the R help list,

When using a data frame, there is no warning or error message 
when I refer to a non-existent variable inside the data frame.

Example:

##--

a <- c(1,2,3)
b <- c(11,22,33)
df <- data.frame(a,b)
df

## correct: there is a column in df named 'a'
## the sum is correctly performed
sum(df$a==2)

## incorrect: there is no column in df named 'aaa', 
## but the sum is performed anyway without either warning or error
sum(df$aaa==2)

##--

Is there some way to make R issue either a warning or an error
message in such a situation?

I am using R version 2.15.1 64-bit on Windows 7 Professional.

Thank you very much.

Paulo Barata

-
Paulo Barata

ENSP - Fundação Oswaldo Cruz
Rua Leopoldo Bulhões 1480 - 8A
21041-210  Rio de Janeiro - RJ
Brazil
E-mail: paulo.bar...@ensp.fiocruz.br


Re: [R] variable (column) in a data frame

2012-07-15 Thread John Kane

This seems more or less correct to me.

1> sum(df$a==1)
[1] 1
1> sum(df$a==2)
[1] 1
1> sum(df$aaa==2)
[1] 0

There is no df$aaa so the length is 0 which is what I think you are asking.
What am I missing?
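
The steps behind that 0 can be traced one at a time; a small sketch using the data frame from the question:

```r
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

is.null(df$aaa)       # $ on a missing column gives NULL, with no warning
length(df$aaa == 2)   # comparing NULL with 2 gives a zero-length logical
sum(df$aaa == 2)      # and the sum of an empty vector is 0
```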


John Kane
Kingston ON Canada


 -Original Message-
 From: paulo.bar...@ensp.fiocruz.br
 Sent: Sun, 15 Jul 2012 11:30:37 -0300
 To: r-help@r-project.org
 Subject: [R] variable (column) in a data frame
 
 
 To the R help list,
 
 When using a data frame, there is no warning or error message
 when I refer to a non-existent variable inside the data frame.
 
 Example:
 
 ##--
 
 a <- c(1,2,3)
 b <- c(11,22,33)
 df <- data.frame(a,b)
 df
 
 ## correct: there is a column in df named 'a'
 ## the sum is correctly performed
 sum(df$a==2)
 
 ## incorrect: there is no column in df named 'aaa',
 ## but the sum is performed anyway without either warning or error
 sum(df$aaa==2)
 
 ##--
 
 Is there some way to make R issue either a warning or an error
 message in such a situation?
 
 I am using R version 2.15.1 64-bit on Windows 7 Professional.
 
 Thank you very much.
 
 Paulo Barata
 
 -
 Paulo Barata
 
 ENSP - Fundação Oswaldo Cruz
 Rua Leopoldo Bulhões 1480 - 8A
 21041-210  Rio de Janeiro - RJ
 Brazil
 E-mail: paulo.bar...@ensp.fiocruz.br
 

Re: [R] variable (column) in a data frame

2012-07-15 Thread Paulo Barata


Dr. Dalgaard,

Thank you. But pre-checking with is.null() or using with()
doesn't solve the problem of catching spelling mistakes
in the name of a variable inside a data frame, when using 
the df$var notation often in a program.

Is there some way for R to behave, in relation to a variable 
inside a data frame, the same way it behaves for a variable 
not in a data frame? For example:

##
a <- c(1,2,3)

## the variable exists, we get a correct answer
a==1

## the variable does not exist, R rightly points this out
aaa==1
##

My point is, if we make a spelling mistake in a program when referring
to a variable inside a data frame, using the df$var notation, 
there seems to be no way of getting warned about that. 

Thank you once again.

Paulo Barata

-


-- Original Message ---
From: peter dalgaard pda...@gmail.com
To: Paulo Barata paulo.bar...@ensp.fiocruz.br
Sent: Sun, 15 Jul 2012 16:47:35 +0200
Subject: Re: [R] variable (column) in a data frame

 On Jul 15, 2012, at 16:30 , Paulo Barata wrote:
 
  
  To the R help list,
  
  When using a data frame, there is no warning or error message 
  when I refer to a non-existent variable inside the data frame.
  
  Example:
  
  ##--
  
  a <- c(1,2,3)
  b <- c(11,22,33)
  df <- data.frame(a,b)
  df
  
  ## correct: there is a column in df named 'a'
  ## the sum is correctly performed
  sum(df$a==2)
  
  ## incorrect: there is no column in df named 'aaa', 
  ## but the sum is performed anyway without either warning or error
  sum(df$aaa==2)
  
  ##--
  
  Is there some way to make R issue either a warning or an error
  message in such a situation?
 
 
 You can pre-check for is.null(df$aaa) or use with(df, sum(aaa==2)).
 
 -- 
 Peter Dalgaard, Professor,
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com
 
--- End of Original Message ---


Re: [R] variable (column) in a data frame

2012-07-15 Thread peter dalgaard


On Jul 15, 2012, at 17:41 , Paulo Barata wrote:

 
 Dr. Dalgaard,
 
 Thank you. But pre-checking with is.null() or using with()
 doesn't solve the problem of catching spelling mistakes
 in the name of a variable inside a data frame, when using 
 the df$var notation often in a program.
 
 Is there some way for R to behave, in relation to a variable 
 inside a data frame, the same way it behaves for a variable 
 not in a data frame? For example:

You could try reading the 2nd half of my one-line reply

 
 You can pre-check for is.null(df$aaa) or use with(df, sum(aaa==2)).
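
A small sketch of how the with() form reports a misspelled name, using the data frame from the question; tryCatch() is only there to capture the message, and res is an illustrative name. One caveat: with() also looks in the enclosing environments, so a stray global object called aaa would be used silently.

```r
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

# Existing column: evaluated inside the data frame
with(df, sum(a == 2))

# Misspelled column: the name lookup fails, so an error is raised
res <- tryCatch(with(df, sum(aaa == 2)),
                error = function(e) conditionMessage(e))
res
```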

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Can't understand syntax

2012-07-15 Thread Rui Barradas


Hello,

Thank you, I'm glad it helped. Two notes.

1. I don't believe it's a problem with the documentation, though many 
times, and R is no exception, there are books that explain in 
simpler terms what the docs already explain well. Check out the 
contributed link at http://cran.r-project.org/
(it's on the left, bottom-most). There are several books that, though 
they cover specific areas, start with an introduction to R.


2. You've misinterpreted a point in my post: data.frames are lists. 
Strictly speaking they can also, like any other list, be collections of 
lists, but that's NOT the way they should be seen. It's more natural to 
see them as implementing the statistical concepts of variables and 
observations. In this case a column is a variable/vector (not a list) and, 
within a vector, we have observations. In the general case the variables 
in a list need not have the same number of observations; if they do, the 
list can become tabular, a data.frame. And we can speak of rows.


Rule: call the columns variables and call the rows observations or, 
well, columns and rows. But think of the columns as vectors.


Like I've said above, a column can still be a list, and can hold any type of 
data. Some questions are about keeping entire matrices as elements of a 
data.frame column; the answer is yes, it is possible, but NO, don't 
do that.
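
The distinction can be checked directly; a minimal sketch on a made-up two-column data frame:

```r
test <- data.frame(v1 = c(3, NA, NA), v2 = c(NA, 3, NA))

is.list(test)           # the data.frame itself is a list ...
is.list(test[["v1"]])   # ... but an ordinary column is an atomic vector
length(test)            # one list member per column
sapply(test, length)    # every member has the same length: the number of rows
```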


On 15-07-2012 16:05, Charles Stangor wrote:

Rui,

Thank you SO MUCH!! This was exactly the explanation I needed

Now I can see that dataframes are collections of lists where each column
is a list.

I find that R documentation is either very superficial or completely
arcane but I'm getting it!

Thanks again.

Chuck

On Sat, Jul 14, 2012 at 7:41 PM, Rui Barradas <ruipbarra...@sapo.pt> wrote:

Hello,

It's more simple than you believe it is. One thing at a time.

First, in order to lighten the instructions, create index vectors.

test2 <- test  # save 'test' for later

na.v1 <- is.na(test[["v1"]])
na.v2 <- is.na(test[["v2"]])
na.v3 <- is.na(test[["v3"]])


Now use them.


test[[ "result" ]][ !na.v1 ] <- test[[ "v1" ]][ !na.v1 ]
test[[ "result" ]][ !na.v2 ] <- test[[ "v2" ]][ !na.v2 ]
test[[ "result" ]][ !na.v3 ] <- test[[ "v3" ]][ !na.v3 ]


Note that above, for instance, in the first line, on each side of
'<-' we have two different types of indexing, in a certain sense.

One, a data.frame is a list of a special type, each list member is a
(random?) variable and all variables have the same number of
observations. So test[[ result ]] refers to a vector of the
data.frame.

Another is the indexing of that vectors' elements. Imagine that we
had assigned

test.res <- test[[ "result" ]]

and then accessed the elements of 'test.res' with

test.res[ !na.v1 ] <- ...etc...

That's what we are doing.
Considering that a df is a list with a tabular form, we could also
use the row/column type of indexing. Maybe this would be more
intuitive. Equivalent, exactly equivalent to the code above is:


test2[  !na.v1 , "result" ] <- test2[ !na.v1 , "v1" ]
test2[  !na.v2 , "result" ] <- test2[ !na.v2 , "v2" ]
test2[  !na.v3 , "result" ] <- test2[ !na.v3 , "v3" ]

all.equal(test, test2) # TRUE


Hope this helps,

Rui Barradas

On 14-07-2012 21:22, Charles Stangor wrote:

OK, I need help!!

I've been searching, but I don't understand the logic of some of this
dataframe addressing syntax.

What is this type of code called?

test [["v3"]] [is.na(test[["v2"]])] <- 10  #choose column v3 where column v2 is == 4 and replace with 10

and where is it documented?


The code below works for what I want to do (find the non-missing value
in a row), but why?

test <- read.table(text="
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA

", header=TRUE)

test [["result"]] [!(is.na(test[["v1"]]))] <- test [["v1"]] [!(is.na(test[["v1"]]))]
test [["result"]] [!(is.na(test[["v2"]]))] <- test [["v2"]] [!(is.na(test[["v2"]]))]
test [["result"]] [!(is.na(test[["v3"]]))] <- test [["v3"]] [!(is.na(test[["v3"]]))]

thanks!


On Fri, Jul 13, 2012 at 6:41 AM, Rui Barradas
<ruipbarra...@sapo.pt> wrote:

 Hello,

 Check the structure of what you have, df and newdf. You will see
 that in df dateTime is of class POSIXlt and in newDf newDateTime is
 of class POSIXct.

 Solution:

 [...]
 df$dateTime <-

Re: [R] variable (column) in a data frame

2012-07-15 Thread Peter Ehlers


On 2012-07-15 08:41, Paulo Barata wrote:


Dr. Dalgaard,

Thank you. But pre-checking with is.null() or using with()
doesn't solve the problem of catching spelling mistakes
in the name of a variable inside a data frame, when using
the df$var notation often in a program.

Is there some way for R to behave, in relation to a variable
inside a data frame, the same way it behaves for a variable
not in a data frame? For example:

##
a <- c(1,2,3)

## the variable exists, we get a correct answer
a==1

## the variable does not exist, R rightly points this out
aaa==1
##

My point is, if we make a spelling mistake in a program when referring
to a variable inside a data frame, using the df$var notation,
there seems to be no way of getting warned about that.


You could wean yourself from the $-habit. It's convenient but can
lead to the problems you're experiencing (and this has been
discussed before). For programming, if you're prone to make
spelling errors, you should prefer df[, "aaa"]. See ?Extract.
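
A small sketch of the difference, using the data frame from the question; tryCatch() is only there to capture the condition, and err is an illustrative name:

```r
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

# Matrix-style indexing refuses a column name that does not exist
err <- tryCatch(df[, "aaa"], error = function(e) e)
inherits(err, "error")
conditionMessage(err)

# List-style extraction with [[ is as silent as $ for a missing name
is.null(df[["aaa"]])
```

So it is the df[, "name"] form in particular that gives the early failure.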

Peter Ehlers



Thank you once again.

Paulo Barata

-


-- Original Message ---
From: peter dalgaard pda...@gmail.com
To: Paulo Barata paulo.bar...@ensp.fiocruz.br
Sent: Sun, 15 Jul 2012 16:47:35 +0200
Subject: Re: [R] variable (column) in a data frame


On Jul 15, 2012, at 16:30 , Paulo Barata wrote:



To the R help list,

When using a data frame, there is no warning or error message
when I refer to a non-existent variable inside the data frame.

Example:

##--

a <- c(1,2,3)
b <- c(11,22,33)
df <- data.frame(a,b)
df

## correct: there is a column in df named 'a'
## the sum is correctly performed
sum(df$a==2)

## incorrect: there is no column in df named 'aaa',
## but the sum is performed anyway without either warning or error
sum(df$aaa==2)

##--

Is there some way to make R issue either a warning or an error
message in such a situation?



You can pre-check for is.null(df$aaa) or use with(df, sum(aaa==2)).

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


--- End of Original Message ---


[R] Imposing more than one condition to if

2012-07-15 Thread Santiago Guallar

Hi,
 
I have a dataset which contains several time records for a number of days, plus 
a variable (light) that allows one to determine night time (light = 0) and daytime 
(light > 0). I need to get dusk time and dawn time for each day and place 
them in two columns.

This is the starting point (d):
day time light
  1    1    20
  1   12    10
  1   11     6
  1    9     0
  1    6     0
  1   12     0
...
 30    8     0
 30    3     0
 30    8     0
 30    3     0
 30    8     8
 30    9    20


And this is what I want to get:
day time light dusk dawn
  1    1    20   11   10
  1   12    10   11   10
  1   11     6   11   10
  1    9     0   11   10
  1    6     0   11   10
  1   12     0   11   10
...
 30    8     0    9    5
 30    3     0    9    5
 30    8     0    9    5
 30    3     0    9    5
 30    8     8    9    5
 30    9    20    9    5

This is the code for data frame d:
day= rep(1:30, each=10)
n= length(day); x= c(1:24)
time= sample(x, 300, replace= T)
light= rep(c(20,10,6,0,0,0,0,0,8,20), 30)
d=data.frame(day,time,light)

I'd need to impose a double condition like the next one, but if does not take more 
than one:
attach(d)
for (i in 1: n){
if (light[i-1]>2 & light[i]<2){
d$dusk<- time[i-1]
}
if (light[i-1]<2 & light[i]>2){
d$dawn<- time[i]
}
}
detach(d)
d

Thank you for your help
[[alternative HTML version deleted]]


Re: [R] Can't understand syntax

2012-07-15 Thread Rui Barradas


Hello, again.

Inline.

On 15-07-2012 16:17, Charles Stangor wrote:

Rui,

Since you are so generous, may I ask you one more question?  What is the
deal with the text after the semicolon in the statement below? Is this
an ifelse or something?  Why is it needed ?

obrigado.

df1 <- read.table(text="
cola colb colc cold cole
1   NA    5    9   NA   17
2    2    6   NA   14   NA
3    3   NA   11   15   19
4    4    8   12   NA   NA
", header=TRUE)

df2 <- read.table(text="
cola colb colc cold cole
1  8 10 12 14 16
", header=TRUE)

df1[["cola"]][is.na(df1[["cola"]])] <- df2[["cola"]];
df1[["cola"]]  #?? what's happening after the semi?


It's printing df1[["cola"]]. Just that. The semicolon ends an 
instruction and starts a new one.

If it's confusing, put what follows it on a new line.
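
A tiny sketch of the same pattern on a made-up vector x: the first statement assigns, the second prints.

```r
x <- c(1, NA, 3)

# Two statements on one line: replace the NA, then print the result
x[is.na(x)] <- 0; x

# Exactly the same thing written on two lines
x[is.na(x)] <- 0
x
```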

Rui Barradas


df1



On Sat, Jul 14, 2012 at 7:41 PM, Rui Barradas <ruipbarra...@sapo.pt> wrote:

Hello,

It's more simple than you believe it is. One thing at a time.

First, in order to lighten the instructions, create index vectors.

test2 <- test  # save 'test' for later

na.v1 <- is.na(test[["v1"]])
na.v2 <- is.na(test[["v2"]])
na.v3 <- is.na(test[["v3"]])


Now use them.


test[[ "result" ]][ !na.v1 ] <- test[[ "v1" ]][ !na.v1 ]
test[[ "result" ]][ !na.v2 ] <- test[[ "v2" ]][ !na.v2 ]
test[[ "result" ]][ !na.v3 ] <- test[[ "v3" ]][ !na.v3 ]


Note that above, for instance, in the first line, on each side of
'<-' we have two different types of indexing, in a certain sense.

One, a data.frame is a list of a special type, each list member is a
(random?) variable and all variables have the same number of
observations. So test[[ result ]] refers to a vector of the
data.frame.

Another is the indexing of that vectors' elements. Imagine that we
had assigned

test.res <- test[[ "result" ]]

and then accessed the elements of 'test.res' with

test.res[ !na.v1 ] <- ...etc...

That's what we are doing.
Considering that a df is a list with a tabular form, we could also
use the row/column type of indexing. Maybe this would be more
intuitive. Equivalent, exactly equivalent to the code above is:


test2[  !na.v1 , "result" ] <- test2[ !na.v1 , "v1" ]
test2[  !na.v2 , "result" ] <- test2[ !na.v2 , "v2" ]
test2[  !na.v3 , "result" ] <- test2[ !na.v3 , "v3" ]

all.equal(test, test2) # TRUE


Hope this helps,

Rui Barradas

On 14-07-2012 21:22, Charles Stangor wrote:

OK, I need help!!

I've been searching, but I don't understand the logic of some of this
dataframe addressing syntax.

What is this type of code called?

test [["v3"]] [is.na(test[["v2"]])] <- 10  #choose column v3 where column v2 is == 4 and replace with 10

and where is it documented?


The code below works for what I want to do (find the non-missing value
in a row), but why?

test <- read.table(text="
v1  v2  v3  result
3  NA  NA  NA
NA  3   NA NA
NA  NA   3 NA

", header=TRUE)

test [["result"]] [!(is.na(test[["v1"]]))] <- test [["v1"]] [!(is.na(test[["v1"]]))]
test [["result"]] [!(is.na(test[["v2"]]))] <- test [["v2"]] [!(is.na(test[["v2"]]))]
test [["result"]] [!(is.na(test[["v3"]]))] <- test [["v3"]] [!(is.na(test[["v3"]]))]

thanks!


On Fri, Jul 13, 2012 at 6:41 AM, Rui Barradas
<ruipbarra...@sapo.pt> wrote:

 Hello,

 Check the structure of what you have, df and newdf. You will see
 that in df dateTime is of class POSIXlt and in newDf newDateTime is
 of class POSIXct.

 Solution:

 [...]
 df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")
 df$dateTime <- as.POSIXct(df$dateTime)
 [...]

 Hope this helps,

 Rui Barradas

 On 13-07-2012 10:24, vioravis wrote:

 I have the following dataframe with the first column being of
 type datetime:

 dateTime <- c("10/01/2005 0:00",
 "10/01/2005 0:20",
 "10/01/2005 0:40",
 "10/01/2005 1:00",
 "10/01/2005 1:20")
 var1 <- c(1,2,3,4,5)
 var2 <- c(10,20,30,40,50)
 df <- data.frame(dateTime = dateTime, var1 = var1, var2 = var2)
 df$dateTime <- strptime(df$dateTime,"%m/%d/%Y %H:%M")

 I want to create 10 minute interval

Re: [R] minor axis ticks in trellis graphics?

2012-07-15 Thread Peter Ehlers


On 2012-07-13 01:05, Martin Ivanov wrote:

  Dear R users,

I need to add minor axis ticks to my graph. In traditional R this is easily 
achievable by simply
adding a second axis with the minor ticks. But how to do that in trellis? I am 
already out of ideas.

Any suggestions will be appreciated.


Haven't seen a response yet, so I'll give it a shot,
sure to be replaced by something much simpler by
Deepayan when he finds the time.

Here are two ways:

1.
Assign appropriate values to the elements of
the xscale.components list. I prefer this.

## make some data
d <- data.frame(x = 1:12, y = rnorm(12))
at.ticks <- c(4,8)
at.labels <- c(2,6,10)
the_labels <- letters[1:3]

library(lattice)

## define a function to modify the xscale components;
## this function will be used inside xyplot().
myxscale.components <- function(...)
{
ans <- xscale.components.default(...)
ans$bottom$ticks$at <- at.ticks
ans$bottom$labels$at <- at.labels
ans$bottom$labels$labels <- the_labels
ans
}

## do the plot
xyplot(y ~ x, data = d,
scales = list(tck = c(1,0)),
xscale.components = myxscale.components)

You can put the modifying function inside the xyplot call.
See ?axis.components.


2.
This is more like the base graphics way.
We create the plot without the x-axis and then
use the trellis.focus/unfocus functions in
conjunction with the panel.axis() function.
See ?panel.axis for details.

Here's the function to apply after the xyplot call:

myfocus <- function(){
  trellis.focus("panel", 1, 1,
     clip.off = TRUE,
     highlight = FALSE)

  ## put the ticks in
  panel.axis(side = "bottom",
     at = at.ticks,
     labels = FALSE,
     ticks = TRUE,
     tck = 1, outside = TRUE
  )

  ## put the labels in
  panel.axis(side = "bottom",
     at = at.labels,
     labels = the_labels,
     ticks = FALSE,
     tck = 0, outside = TRUE,
     rot = 0   # optional; try it without
  )
  trellis.unfocus()
}


xyplot(y ~ x, data = d,
scales = list(
 y = list(tck = c(1,0)),
 x = list(tck = c(0,0),
at = 1, label = ""  # to give us some bottom space
 )))

## Now add the axis ticks and labels
myfocus()


Peter Ehlers




Best regards,

Martin



Re: [R] variable (column) in a data frame

2012-07-15 Thread Paulo Barata


Dr. Dalgaard,

Thank you. You are right, with() is able to catch
spelling errors in the name of variables inside a data frame.

But couldn't some error or warning be included in R when referring
to a non-existent variable inside a data frame with the df$var
notation, without the use of with()? 

Is there any reason why R does not have such a kind of error
message?

Thank you again.

Paulo Barata

-


-- Original Message ---
From: peter dalgaard pda...@gmail.com
To: Paulo Barata paulo.bar...@ensp.fiocruz.br
Cc: r-help@r-project.org
Sent: Sun, 15 Jul 2012 18:14:22 +0200
Subject: Re: [R] variable (column) in a data frame

 On Jul 15, 2012, at 17:41 , Paulo Barata wrote:
 
  
  Dr. Dalgaard,
  
  Thank you. But pre-checking with is.null() or using with()
  doesn't solve the problem of catching spelling mistakes
  in the name of a variable inside a data frame, when using 
  the df$var notation often in a program.
  
  Is there some way for R to behave, in relation to a variable 
  inside a data frame, the same way it behaves for a variable 
  not in a data frame? For example:
 
 You could try reading the 2nd half of my one-line reply
 
  
  You can pre-check for is.null(df$aaa) or use with(df, sum(aaa==2)).
 
 -- 
 Peter Dalgaard, Professor,
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com
 
 -- 
 This message has been scanned for viruses and
 dangerous content by MailScanner, and is
 believed to be clean.
--- End of Original Message ---


Re: [R] LiblineaR: read/write model files?

2012-07-15 Thread Sam Steingold

 * Sam Steingold f...@tah.bet [2012-07-13 15:51:46 -0400]:

 How do I read/write liblinear models to files?
 E.g., if I train a model using the command line interface, I might want
 to load it into R to look at the histogram of the weights.
 Or I might want to train a model in R and then apply it using a command
 line interface.

read.liblinear <- function (file) {
  cat("read.liblinear(",file,")\n")
  lines <- readLines(file)
  stopifnot(lines[6]=="w")
  parsed <- strsplit(lines[1:5]," ",fixed=TRUE)
  stopifnot(parsed[[1]][1] == "solver_type")
  stopifnot(parsed[[2]][1] == "nr_class")
  stopifnot(parsed[[3]][1] == "label")
  stopifnot(parsed[[4]][1] == "nr_feature")
  stopifnot(parsed[[5]][1] == "bias")
  stopifnot(as.numeric(parsed[[2]][2]) + 1 == length(parsed[[3]]))
  stopifnot(as.numeric(parsed[[4]][2]) + 6 == length(lines))
  ret <- list(solver.type=parsed[[1]][2],
              label=parsed[[3]][2:length(parsed[[3]])],
              bias=as.numeric(parsed[[5]][2]),
              weight=as.numeric(lines[7:length(lines)]))
  nattr <- length(ret$weight)
  n0 <- length(which(ret$weight==0))
  cat("solver.type:",ret$solver.type,"\nlabel:",ret$label,"\nbias:",ret$bias,
      "\nweight(total:",nattr,"; 0:",n0,"=",(100*n0/nattr),"%)\n")
  print(summary(ret$weight))
  ret
}
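
For the other direction (writing a model out from R), a minimal sketch could
mirror the file layout that read.liblinear() expects. The name
write.liblinear and the shape of its 'model' argument are assumptions made
for illustration; they are not part of LiblineaR, and the output has not been
checked against the liblinear command line tools. It only covers the case the
reader handles: one weight per feature and no extra bias row.

```r
# Hypothetical counterpart to read.liblinear(): writes the same six header
# lines followed by one weight per line.
write.liblinear <- function(model, file, nr.class = length(model$label)) {
  lines <- c(paste("solver_type", model$solver.type),
             paste("nr_class", nr.class),
             paste(c("label", model$label), collapse = " "),
             paste("nr_feature", length(model$weight)),
             paste("bias", model$bias),
             "w",
             as.character(model$weight))
  writeLines(lines, file)
  invisible(file)
}

# Usage with a made-up model
model <- list(solver.type = "L2R_LR", label = c("1", "-1"),
              bias = -1, weight = c(0.5, 0, -0.25))
f <- tempfile()
write.liblinear(model, f)
written <- readLines(f)
```

The resulting file can then be passed to read.liblinear(f) as a round-trip
check.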


-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://palestinefacts.org http://honestreporting.com
http://www.PetitionOnline.com/tap12009/ http://americancensorship.org
Incorrect time synchronization.


Re: [R] Imposing more than one condition to if

2012-07-15 Thread John Kane

No idea of how to do what you want but your data set is not working.
I think that you want

 x= c(1:24)
day= rep(1:30, each=10)
time= sample(x, 300, replace= T)
light= rep(c(20,10,6,0,0,0,0,0,8,20), 30)
d=data.frame(day,time,light)
n= length(day)

John Kane
Kingston ON Canada


 -Original Message-
 From: sgual...@yahoo.com
 Sent: Sun, 15 Jul 2012 09:32:33 -0700 (PDT)
 To: r-help@r-project.org
 Subject: [R] Imposing more than one condition to if
 
 Hi,
 
 I have a dataset which contains several time records for a number of
 days, plus a variable (light) that distinguishes night time (light =
 0) from daytime (light > 0). I need to obtain the dusk time and dawn time
 for each day and place them in two columns.
 
 This is the starting point (d):
 day time light
   1    1    20
   1   12    10
   1   11     6
   1    9     0
   1    6     0
   1   12     0
 ...
  30    8     0
  30    3     0
  30    8     0
  30    3     0
  30    8     8
  30    9    20


 And this is what I want to get:
 day time light dusk dawn
   1    1    20   11   10
   1   12    10   11   10
   1   11     6   11   10
   1    9     0   11   10
   1    6     0   11   10
   1   12     0   11   10
 ...
  30    8     0    9    5
  30    3     0    9    5
  30    8     0    9    5
  30    3     0    9    5
  30    8     8    9    5
  30    9    20    9    5
 
 This is the code for data frame d:
 day= rep(1:30, each=10)
 n= length(dia); x= c(1:24)
 time= sample(x, 300, replace= T)
 light= rep(c(20,10,6,0,0,0,0,0,8,20), 30)
 d=data.frame(day,time,light)
 
 I need to impose a double condition like the following, but if() does not
 seem to accept more than one:
 attach(d)
 for (i in 1: n){
 if (light[i-1]>2 & light[i]<2){
 d$dusk<- time[i-1]
 }
 if (light[i-1]<2 & light[i]>2){
 d$dawn<- time[i]
 }
 }
 detach(d)
 d
 
 Thank you for your help
   [[alternative HTML version deleted]]
 




Re: [R] Imposing more than one condition to if

2012-07-15 Thread Rui Barradas


Hello,

There are obvious bugs in your code: you are testing for light < 2 or 
light > 2, but this would mean that dusk and dawn are undetermined for 
light == 2 and that they happen at light == 1.


Without loops or compound logical conditions:


f <- function(x){
    x$dawn <- x$time[ which.min(x$light) ]
    x$dusk <- x$time[ max(which(x$light == 0)) + 1 ]
    x
}

do.call(rbind, by(d, d$day, f))
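
If a loop is preferred, one likely reason the original attempt fails is that
it starts at i = 1, so light[i-1] is light[0], a zero-length vector that if()
cannot evaluate. The following is only a sketch for a single day, assuming
the two comparisons were meant to be "> 2" and "< 2"; the readings are made
up for illustration.

```r
# One day of made-up readings
light <- c(20, 10, 6, 0, 0, 0, 0, 0, 8, 20)
time  <- c(7, 9, 14, 22, 5, 22, 23, 16, 16, 2)

dusk <- NA
dawn <- NA
# Start at 2 so that light[i - 1] always exists
for (i in 2:length(light)) {
  # && combines two scalar conditions inside if()
  if (light[i - 1] > 2 && light[i] < 2) dusk <- time[i - 1]
  if (light[i - 1] < 2 && light[i] > 2) dawn <- time[i]
}
c(dusk = dusk, dawn = dawn)
```

Inside if(), && is the usual choice because it works on single values; the
vectorised & is what you want when comparing whole columns at once.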

Hope this helps,

Rui Barradas

On 15-07-2012 17:32, Santiago Guallar wrote:

Hi,

I have a dataset which contains several time records for a number of days, plus a 
variable (light) that distinguishes night time (light = 0) from daytime 
(light > 0). I need to obtain the dusk time and dawn time for each day and place 
them in two columns.

This is the starting point (d):
day time light
  1    1    20
  1   12    10
  1   11     6
  1    9     0
  1    6     0
  1   12     0
...
 30    8     0
 30    3     0
 30    8     0
 30    3     0
 30    8     8
 30    9    20


And this is what I want to get:
day time light dusk dawn
  1    1    20   11   10
  1   12    10   11   10
  1   11     6   11   10
  1    9     0   11   10
  1    6     0   11   10
  1   12     0   11   10
...
 30    8     0    9    5
 30    3     0    9    5
 30    8     0    9    5
 30    3     0    9    5
 30    8     8    9    5
 30    9    20    9    5

This is the code for data frame d:
day= rep(1:30, each=10)
n= length(dia); x= c(1:24)
time= sample(x, 300, replace= T)
light= rep(c(20,10,6,0,0,0,0,0,8,20), 30)
d=data.frame(day,time,light)

I need to impose a double condition like the following, but if() does not seem 
to accept more than one:
attach(d)
for (i in 1: n){
if (light[i-1]>2 & light[i]<2){
d$dusk<- time[i-1]
}
if (light[i-1]<2 & light[i]>2){
d$dawn<- time[i]
}
}
detach(d)
d

Thank you for your help

Re: [R] Loading in Large Dataset + variables via loop

2012-07-15 Thread Rui Barradas


Hello,

Right, it should be 'varNames' in the apply. I guess I had something 
called colNames in my environment. I've just rm(list=ls()) and rerun the 
code, corrected. No errors this time.


varNames is the result of expand.grid, therefore does have a dim attribute.

The faulty instruction corrected is:

varNames <- as.vector(apply(varNames, 1, paste, collapse=""))
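
A slightly shorter way to build the same names, shown here only as a sketch:
paste0() on the two columns of the expand.grid() result avoids apply()
altogether. The column labels 'var' and 'loc' and the random stand-in for the
data set 'A' are assumptions for illustration.

```r
# Variable names cycle fastest, locations slowest: temp1, precip1, pressure1, temp2, ...
grid <- expand.grid(var = c("temp", "precip", "pressure"), loc = 1:3,
                    stringsAsFactors = FALSE)
varNames <- paste0(grid$var, grid$loc)

# Made-up stand-in for the data set 'A' (100 rows, 9 series)
set.seed(1)
A <- as.data.frame(matrix(rnorm(100 * 9), ncol = 9))
names(A) <- varNames

# One named list element per series, instead of nine separate variables
series <- as.list(A)
head(series$precip2)
```

Keeping the series in one named list (or in the data frame itself) scales to
any number of locations without creating variables by hand.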

Rui Barradas

On 15-07-2012 18:12, arun wrote:

Hi Rui,

Getting some error messages:


varNames <- as.vector(apply(colNames, 1, paste, collapse=""))
Error in apply(colNames, 1, paste, collapse = "") :
   dim(X) must have a positive length

A.K.



- Original Message -
From: Rui Barradas ruipbarra...@sapo.pt
To: cmc0605 colos...@gmail.com
Cc: r-help@r-project.org
Sent: Sunday, July 15, 2012 8:12 AM
Subject: Re: [R] Loading in Large Dataset + variables via loop

Hello,

Why do you need 9 variables in your environment if they are time series
that correspond to the same period? You should use time series functions.

#install.packages('zoo')
library(zoo)

# Make up a dataset
Year <- seq(from=as.Date("1901-01-01"), by="year", length.out=100)
dat <- data.frame(matrix(rnorm(100*9), ncol=9), Year)

# assign names.
varNames <- expand.grid(c("temp", "precip", "pressure"), 1:3,
stringsAsFactors=FALSE)
varNames <- as.vector(apply(colNames, 1, paste, collapse=""))
varNames <- c(varNames, "Year")
names(dat) <- varNames
head(dat)

# and transform it into a time series of class 'zoo'
z <- zoo(dat[, 1:9], order.by=dat$Year)
str(z)
head(z)


Another way would be, like you say, to use a loop to put the variables
in a list. Something like

lst <- list()
for(i in 1:9) lst[[i]] <- dat[, i]
names(lst) <- varNames[1:9]


Note that I've used a dataset called 'dat' in place of your 'A'. You
should post a data example, as the posting guide says, using dput().

Hope this helps,

Rui Barradas



On 14-07-2012 03:44, cmc0605 wrote:

Hello, I'm new to R with a (probably elementary) question.

Suppose I have a dataset called /A/ with /n/ locations, and each location
contains within it 3 time series of different variables (all of 100 years
length); each time series is of a weather variable (for each location there
is a temperature, precipitation, and pressure).  For instance, location 1
has a temperature1 time series, a precip1 time series, and a pressure1 time
series; location two has a temperature2, precip2, and pressure2
timeseries...That is, there are 100 rows, and (/n/*3)+1 columns.  The extra
column is the time.

I want to load in this dataset and declare a variable for each time series.
The columns are in order of location, so it goes temp1, precip1,pressure1,
temp2,... and so forth in increasing column order.  There are always 100
rows.  Manually, I'd have to do:

temp1=A[,1]
precip1=A[,2]
pressure1=A[,3]
temp2=A[,4]
precip2=A[,5]
pressure2=A[,6]
temp3=A[,7]
and so forth.

Problem is, n is large, so I don't want to repeat this pattern forever.  I
figure I need a loop both for the variable name (ie.., the variable at a
particular location) as well as for what column it reads from.

Any help...?




[R] R-implementation of Local Outlier Probabilities (LoOP)?

2012-07-15 Thread Johannes Graumann

Dear all,

Is anyone aware of an R implementation of LoOP (H.-P. Kriegel, P. Kröger, E. 
Schubert, A. Zimek; LoOP: Local Outlier Probabilities; In Proceedings of the 
18th ACM Conference on Information and Knowledge Management (CIKM), Hong 
Kong, China: 1649–1652, 2009.)? I found 
http://cran.r-project.org/web/packages/Rlof/index.html, but would prefer the 
p-value'ish measure provided by LoOP.
Alternatives implemented in R would also be valuable ...

Thank you for your consideration. Sincerely, Joh


Re: [R] variable (column) in a data frame

2012-07-15 Thread Berend Hasselman


Paulo Barata-3 wrote
 
 Dr. Dalgaard,
 
 Thank you. You are right, with() is able to catch
 spelling errors in the name of variables inside a data frame.
 
 But couldn't some error or warning be included in R when referring
 to a non-existent variable inside a data frame with the df$var
 notation, without the use of with()? 
 
 Is there any reason why R does not have such a kind of error
 message?
 

See this discussion:
https://stat.ethz.ch/pipermail/r-help/2012-July/317562.html

Berend



Re: [R] HLOOKUP in R

2012-07-15 Thread jim holtman

Depending on what options of hlookup you want, 'match' will do exact
matching and 'findInterval' will determine range/interval matching.
What you need to do is follow the posting guide and provide an example
of exactly what you data looks like and what you expect the result to
be.
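
As a rough illustration of the two options, here is a sketch with a made-up
lookup table laid out horizontally, the way HLOOKUP expects it. The helper
names hlookup_exact and hlookup_range are hypothetical, not existing R
functions.

```r
# Made-up table: first row holds the keys, second row the values to return
tbl <- rbind(key   = c(10, 20, 30, 40),
             value = c(1.5, 2.5, 3.5, 4.5))

# Exact matching, like HLOOKUP(..., FALSE): NA when the key is absent
hlookup_exact <- function(x, tbl) tbl["value", match(x, tbl["key", ])]

# Range matching, like HLOOKUP(..., TRUE): last key that is <= x
# (keys must be sorted in increasing order)
hlookup_range <- function(x, tbl) tbl["value", findInterval(x, tbl["key", ])]

hlookup_exact(30, tbl)
hlookup_exact(25, tbl)
hlookup_range(25, tbl)
```

Note that findInterval() returns 0 for values below the smallest key, so the
range version returns an empty result in that case and may need a guard.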

On Fri, Jul 13, 2012 at 3:25 PM, Silje Nord silje.nordg...@gmail.com wrote:
 Hi,

 Is there a function similar to excel's hlookup in R ?

 Thanks,
 Silje




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] HLOOKUP in R

2012-07-15 Thread santosh

Try ?match
Adapt it to your need

On Saturday, July 14, 2012 12:55:33 AM UTC+5:30, Silje Nord wrote:

 Hi, 

 Is there a function similar to excel's hlookup in R ? 

 Thanks, 
 Silje 


Re: [R] permutation test on paired samples

2012-07-15 Thread Henric (Nilsson) Winell


Holger,

Thanks for providing a reproducible example.  However, since your
space key only works sporadically, the below is a little hard to read... ;)

On 2012-07-12 20:26, Holger Taschenberger wrote:
 Hi,

  I'm trying to run a permutation test on paired samples.

 First I tried the package exactRankTests:

 require(exactRankTests)
 x <- c(1.83,0.50,1.62,2.48,1.68,1.88,1.55,3.06,1.30)
 y <- c(0.878,0.647,0.598,2.05,1.06,1.29,1.06,3.14,1.29)

The relevant output missing here is

 wilcox.test(x,y,paired = TRUE,alternative = "greater")

Wilcoxon signed rank test

data:  x and y
V = 40, p-value = 0.01953
alternative hypothesis: true location shift is greater than 0

 perm.test(y,x,paired = TRUE,exact = TRUE,alternative = "greater")

1-sample Permutation Test (scores mapped into 1:m using rounded
scores)

data:  y and x
T = 41, p-value = 0.003906
alternative hypothesis: true mu is greater than 0


Firstly, you've interchanged the 'x' and 'y' in the second call. 
Secondly, and more importantly, the output says "(scores mapped into 
1:m using rounded scores)".  In this case this can easily be avoided, 
and note the interchange of 'x' and 'y' to match your 'wilcox.test' 
call, using:


> yy <- 1000 * y
> xx <- 1000 * x
> perm.test(xx, yy, paired = TRUE, exact = TRUE,
+           alternative = "greater")

1-sample Permutation Test

data:  xx and yy
T = 4114, p-value = 0.01367
alternative hypothesis: true mu is greater than 0


So, now that we've computed the correct p-value, let's see how to obtain 
this using the 'coin' package:



 Then I wanted to use the package 'coin':

 require(coin)
 x <- c(1.83,0.50,1.62,2.48,1.68,1.88,1.55,3.06,1.30)
 y <- c(0.878,0.647,0.598,2.05,1.06,1.29,1.06,3.14,1.29)
 xydat <- data.frame(y = c(y,x),x = gl(2,length(x)),block = 
factor(rep(1:length(x),2)))


The relevant output missing here is

 wilcoxsign_test(y ~ x | block,data = xydat,alternative = 
"greater",distribution = exact())


Exact Wilcoxon-Signed-Rank Test

data:  y by x (neg, pos)
 stratified by block
Z = 2.0732, p-value = 0.01953
alternative hypothesis: true mu is greater than 0

 oneway_test(y ~ x | block,data = xydat,alternative = 
"greater",distribution = exact())


Exact 2-Sample Permutation Test

data:  y by x (1, 2)
 stratified by block
Z = -2.1948, p-value = 0.6982
alternative hypothesis: true mu is greater than 0


Using 'oneway_test' in this way does *not* correspond to a paired test. 
 The raw scores version of the Wilcoxon signed-rank test can be 
constructed using


> diff <- x - y
> y <- as.vector(t(cbind(abs(diff) * (diff < 0),
+                        abs(diff) * (diff >= 0))))
> x <- factor(rep(c("neg", "pos"), length(diff)),
+             levels = c("pos", "neg"))
> b <- gl(length(diff), 2)

> oneway_test(y ~ x | b, alternative = "greater", distr = "exact")

Exact 2-Sample Permutation Test

data:  y by x (pos, neg)
 stratified by b
Z = 2.1948, p-value = 0.01367
alternative hypothesis: true mu is greater than 0


And, as you can see, this is equal to the 'perm.test' result.
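
The same p-value can also be checked by hand in base R, without either
package. This is only a sketch of the sign-flipping idea behind the paired
permutation test: it enumerates all 2^9 sign assignments of the paired
differences and counts how many give a sum of positive differences at least
as large as the observed one. The names d, obs, signs and perm_stats are
illustrative.

```r
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)

# Scale to integers, as above, so that ties are not lost to floating point
d <- round(1000 * (x - y))

# Observed statistic: sum of the positive differences (4114)
obs <- sum(d[d > 0])

# Each row marks which differences are counted as positive (1) or negative (0)
signs <- as.matrix(expand.grid(rep(list(c(0, 1)), length(d))))
perm_stats <- as.vector(signs %*% abs(d))

# One-sided exact p-value: 7 of the 512 assignments reach the observed value
p_value <- mean(perm_stats >= obs)
p_value
```

Full enumeration is only practical for small samples, since the number of
sign assignments doubles with every additional pair.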


HTH,
Henric




 While the results of the Wilcoxon test are the same for both packages,
 those of the permutation test are very different. So,
 obviously I'm doing something wrong here. Can somebody please help?

 Thanks a lot,
  Holger


Re: [R] Power analysis for Cox regression with a time-varying covariate

2012-07-15 Thread Paul Miller

Hi Greg,
 
Thanks for your response. So far I've just been asked to investigate what the 
analysis would likely involve. The hope was that there would be some sort of 
quick and easy canned approach. I don't really think this is the case though. 
If I'm asked to do the actual analysis itself, I'll start out using the steps 
you've listed and see where that takes me.
 
Paul  

--- On Fri, 7/13/12, Greg Snow 538...@gmail.com wrote:


From: Greg Snow 538...@gmail.com
Subject: Re: [R] Power analysis for Cox regression with a time-varying covariate
To: Paul Miller pjmiller...@yahoo.com
Cc: r-help@r-project.org
Received: Friday, July 13, 2012, 3:29 PM


For something like this the best (and possibly only reasonable) option
is to use simulation. I have posted on the general steps for using
simulation for power studies in this list and elsewhere before, but
probably never with coxph.

The general steps still hold, but the complicated part here will be to
simulate the data.  I would recommend something along the lines of:

1. generate a value for the censoring time, possibly exponential or
weibull (for simplicity I would make this not dependent on the
covariates if reasonable).
2. generate a value for the covariate for the given time period
(sample function possibly), then generate a survival time for this
covariate value (possibly weibull distribution, or lognormal,
exponential, etc.)  If the survival time is less than the time period
and censoring time then you have an event and a time to the event.  If
the survival time is longer than the censoring time, but not longer
than the time period (for the covariate), then you have censoring and
you can record the time to censoring.  If the survival time is longer
than the time period then you have the row information for that time
period and can move on to the next time period where you will first
randomly choose the covariate value again, then generate another
survival time based on the covariate and given that they have already
survived a given amount.  Continue with this until you have an event
or censoring time for each subject.
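
The two steps above might be sketched as follows. Everything specific here is
an assumption made for illustration and not part of the advice itself:
fixed-length time periods, a binary covariate redrawn in every period,
exponential censoring, and exponential survival within a period (chosen
because it is memoryless, so conditioning on having survived so far needs no
extra work). The function and argument names are hypothetical.

```r
# Simulate one subject in counting-process form (tstart, tstop, event, x)
sim_subject <- function(id, n_periods = 5, period_len = 1,
                        base_rate = 0.2, log_hr = 0.7, cens_rate = 0.1) {
  cens_time <- rexp(1, cens_rate)                  # step 1: censoring time
  rows <- vector("list", n_periods)
  for (k in seq_len(n_periods)) {
    p_start <- (k - 1) * period_len
    p_end   <- k * period_len
    x <- sample(0:1, 1)                            # step 2: covariate for this period
    t_event <- p_start + rexp(1, base_rate * exp(log_hr * x))
    t_stop  <- min(t_event, cens_time, p_end)
    event   <- as.integer(t_event <= min(cens_time, p_end))
    rows[[k]] <- data.frame(id = id, tstart = p_start, tstop = t_stop,
                            event = event, x = x)
    if (t_stop < p_end) break                      # event or censoring ends follow-up
  }
  do.call(rbind, rows[seq_len(k)])
}

# Estimated power: share of simulated data sets in which the Cox model
# rejects at level alpha. Needs the 'survival' package for the fit.
power_sim <- function(n_subj = 100, n_sim = 200, alpha = 0.05, ...) {
  if (!requireNamespace("survival", quietly = TRUE))
    stop("the 'survival' package is needed for the Cox fit")
  pvals <- vapply(seq_len(n_sim), function(s) {
    dat <- do.call(rbind, lapply(seq_len(n_subj), sim_subject, ...))
    fit <- survival::coxph(survival::Surv(tstart, tstop, event) ~ x, data = dat)
    summary(fit)$coefficients["x", "Pr(>|z|)"]
  }, numeric(1))
  mean(pvals < alpha)
}

set.seed(1)
one <- sim_subject(1)
one
```

A call such as power_sim(n_subj = 150, log_hr = 0.5) would then give a
simulated power for that scenario; the distributions and rates should be
replaced by whatever is plausible for the study at hand.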

On Fri, Jul 13, 2012 at 9:17 AM, Paul Miller pjmiller...@yahoo.com wrote:
 Hello All,

 Does anyone know where I can find information about how to do a power 
 analysis for Cox regression with a time-varying covariate using R or  some 
 other readily available software? I've done some searching online but haven't 
 found anything.

 Thanks,

 Paul




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


[R] computing a subset using a loop

2012-07-15 Thread burton030

Dear all,

I have a data frame with different variables, and I want to build different
subsets of this data frame using some conditions. I want to use a loop
because there will be a lot of subsets and this would save a lot of time.

Here is an overview of my data frame. It is named "Baumdaten" and has one
column named "transectID" with different IDs (A_SEF, A_LEF, B_SEF etc.).
There is another column named "Baumart" with different species like Abies
alba, Betula pendula, etc. I now want to build subsets, and the first subset
should be named "A_2_SEF_Abies_alba" and should contain all Abies alba that
are living in A_2_SEF. So the normal code would be

A_2_SEF_Abies_alba <- subset(Baumdaten, Baumart=="Abies alba" & pointID=="A_2_SEF")

The next step would be to replace Abies alba with Betula pendula, and so on.
After doing this for A_SEF I have to start again with A_LEF, which takes a
lot of time. That's why I want to ask whether it is possible to do this with
a loop. I hope you can understand my problem...




Re: [R] variable (column) in a data frame

2012-07-15 Thread arun

Hi,

I guess you can try this:

# You will get the same result here:

> df$aaa==2
logical(0)
> !df$aaa==2
logical(0)
# But it is different for a variable present in the data frame

> df$a==4
[1] FALSE FALSE FALSE
> !df$a==4
[1] TRUE TRUE TRUE
> identical(df$aaa==2,!df$aaa==2)
[1] TRUE
> identical(df$a==4,!df$a==4)
[1] FALSE
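
If an error is wanted instead of a silent NULL, one option is a small
accessor that checks the name first. The following is a sketch; the helper
name col_strict is hypothetical. Matrix-style indexing such as df[, "aaa"]
also stops with an "undefined columns selected" error.

```r
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

# $ returns NULL for a missing column, so the sum is silently 0
is.null(df$aaa)
sum(df$aaa == 2)

# Hypothetical helper that fails loudly on a misspelled column name
col_strict <- function(data, name) {
  if (!name %in% names(data))
    stop("no column named '", name, "' in the data frame")
  data[[name]]
}

sum(col_strict(df, "a") == 2)

# The misspelled name now raises an error, captured here for display
res <- tryCatch(sum(col_strict(df, "aaa") == 2),
                error = function(e) conditionMessage(e))
res
```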


A.K.






- Original Message -
From: Paulo Barata paulo.bar...@ensp.fiocruz.br
To: r-help@r-project.org
Cc: 
Sent: Sunday, July 15, 2012 10:30 AM
Subject: [R] variable (column) in a data frame


To the R help list,

When using a data frame, there is no warning or error message 
when I refer to a non-existent variable inside the data frame.

Example:

##--

a <- c(1,2,3)
b <- c(11,22,33)
df <- data.frame(a,b)
df

## correct: there is a column in df named 'a'
## the sum is correctly performed
sum(df$a==2)

## incorrect: there is no column in df named 'aaa', 
## but the sum is performed anyway without either warning or error
sum(df$aaa==2)

##--

Is there some way to make R issue either a warning or an error
message in such a situation?

I am using R version 2.15.1 64-bit on Windows 7 Professional.

Thank you very much.

Paulo Barata

-
Paulo Barata

ENSP - Fundação Oswaldo Cruz
Rua Leopoldo Bulhões 1480 - 8A
21041-210  Rio de Janeiro - RJ
Brazil
E-mail: paulo.bar...@ensp.fiocruz.br


Re: [R] computing a subset using a loop

2012-07-15 Thread jim holtman

It looks like you want to use the 'split' function, which would create a
list of data frames, one for each combination of the conditions:

result <- split(Baumdaten, list(Baumdaten$transectID,
Baumdaten$Baumart), drop = TRUE)
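
A small illustration with made-up data (the 'dbh' column and all values are
invented): each element of the list is named after its combination of
transect and species, so a single subset can be pulled out by name instead
of being stored in its own variable.

```r
# Made-up stand-in for the real Baumdaten
Baumdaten <- data.frame(
  transectID = c("A_SEF", "A_SEF", "A_LEF", "B_SEF"),
  Baumart    = c("Abies alba", "Betula pendula", "Abies alba", "Abies alba"),
  dbh        = c(30, 12, 25, 40),
  stringsAsFactors = FALSE
)

result <- split(Baumdaten, list(Baumdaten$transectID, Baumdaten$Baumart),
                drop = TRUE)

names(result)                  # one name per combination that actually occurs
result[["A_SEF.Abies alba"]]   # all Abies alba in transect A_SEF
```

Further processing of every subset can then be done with lapply(result, ...)
rather than by repeating code for each name.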

On Sun, Jul 15, 2012 at 11:31 AM, burton030 burto...@hotmail.de wrote:
 Dear all,

 I have a data frame with different variables and I want to build different
 subsets out of this data frame using some conditions and I want to use a
 loop because there will be a lot of subsets and this would be saving a lot
 of time.

 I try to give you an overview about my data frame. I have a data frame named
 Baumdaten and it has one column named transectID with different IDs
 (A_SEF ,A_LEF, B_SEF etc.) there is another column named Baumart with
 different species like Abies alba, Betula pendula, etc. I want to build now
 subsets and the first subset should be named A_2_SEF_Abies_alba and should
 contain all Abies alba that are living in A_2_SEF. So the normal code would
 be

 A_2_SEF_Abies_alba <- subset(Baumdaten, Baumart=="Abies alba" & pointID=="A_2_SEF")

  The following step would be to replace Abies alba with Betula pendula and
 so on after doing this for A_SEF I have to start with A_LEF so a lot of time
 is needing thats why I want to ask if it is possible doing this by using a
 loop? Hope you can understand my problem...






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


[R] extracting rows and columns from a big matrix

2012-07-15 Thread A J


Hi there and thanks in advance.

I have a large symmetrical matrix stored in a text file. After loading it into R I 
would like to extract the same number of columns and rows (a symmetrical 
submatrix) using their labels.

I have tried this code in order to extract columns, but the R console gives me the 
+ sign at the end of the code, pointing out an incomplete command, so it is not 
working:

m<-read.table("C:/backup/symmetrical.csv")

n<-subset(m, select=c(X1, X7, X12, X15, X22, X26, X31, X34, 
X39, X44, x51, X58)

Therefore, I have not tried with row names yet.

Any suggestions? Sorry for the inconvenience. I have read some information 
about this but always have the same problem with + and I do not have any idea 
how to proceed.

Best,

AJ



  

Re: [R] extracting rows and columns from a big matrix

2012-07-15 Thread jim holtman

For a start, you are missing a quote and a parenthesis in the
statement; it should probably be (another quote was also missing):

n<-subset(m, select=c("X1", "X7", "X12", "X15", "X22", "X26", "X31",
"X34", "X39", "X44", "X51", "X58"))

Not sure what you want with the rownames; an example would help, so please
post one with 'dput'.
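
If the aim is a symmetric submatrix, one possibility is to index rows and
columns with the same vector of labels. This is a sketch on a made-up 5 x 5
matrix; it assumes the object is a matrix (or has been converted with
as.matrix()) whose row names match its column names.

```r
# Made-up symmetric matrix with matching row and column labels
labs <- paste0("X", 1:5)
m <- matrix(1:25, 5, 5, dimnames = list(labs, labs))
m <- (m + t(m)) / 2

# Use the same labels for rows and columns
keep <- c("X1", "X3", "X5")
sub <- m[keep, keep]

sub
isSymmetric(sub)
```

With a data frame read by read.table(), the row names would first have to be
set (for example through the row.names argument) for this to work.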

On Sun, Jul 15, 2012 at 2:47 PM, A J anxu...@hotmail.com wrote:

 Hi there and thanks in advance.

 I have a large symmetrical matrix stored in a text file. After load in R I 
 would like to extract the same number of columns and rows (symmetrical 
 submatrix) using their labels.

 I have tried this code in order to extract columns, but R console gives me 
 the + sign at the end of the code, pointing out incomplete command, so it 
 is not working:

 m<-read.table("C:/backup/symmetrical.csv")

 n<-subset(m, select=c(X1, X7, X12, X15, X22, X26, X31, X34, 
 X39, X44, x51, X58)

 Therefore, I have no tried with row names yet.

 Any suggestions? Sorry for the inconvenience. I have read some information 
 about this but always have the same problem with + and I do not have any 
 idea to follow.

 Best,

 AJ







-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] extracting rows and columns from a big matrix

2012-07-15 Thread Sarah Goslee

You're missing a )

You close c() but not subset().

That's the most common cause of incomplete commands: it often works to
just keep typing ) return until you get the regular prompt.

Sarah

On Sun, Jul 15, 2012 at 2:47 PM, A J anxu...@hotmail.com wrote:

 Hi there and thanks in advance.

 I have a large symmetrical matrix stored in a text file. After load in R I 
 would like to extract the same number of columns and rows (symmetrical 
 submatrix) using their labels.

 I have tried this code in order to extract columns, but R console gives me 
 the + sign at the end of the code, pointing out incomplete command, so it 
 is not working:

 m<-read.table("C:/backup/symmetrical.csv")

 n<-subset(m, select=c(X1, X7, X12, X15, X22, X26, X31, X34, 
 X39, X44, x51, X58)

 Therefore, I have no tried with row names yet.

 Any suggestions? Sorry for the inconvenience. I have read some information 
 about this but always have the same problem with + and I do not have any 
 idea to follow.

 Best,

 AJ



-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Imposing more than one condition to if

2012-07-15 Thread jim holtman

try this:

> set.seed(1)
> day= rep(1:30, each=10)
> n= length(day); x= c(1:24)
> time= sample(x, 300, replace= T)
> light= rep(c(20,10,6,0,0,0,0,0,8,20), 30)
> d=data.frame(day,time,light)
>
> # create a dawn/dusk column to mark where it happens
> d$dawn <- c(FALSE, (head(d$light, -1) < 2) & (tail(d$light, -1) > 2))
> d$dusk <- c((head(d$light, -1) > 2) & (tail(d$light, -1) < 2), FALSE)
>
> # now split and recombine to get the values for the day
> result <- do.call(rbind, lapply(split(d, d$day), function(.day){
+     # create new dataframe with the values
+     cbind(.day[, c('day', 'time', 'light')]
+         , dawn = .day$time[.day$dawn]
+         , dusk = .day$time[.day$dusk]
+         )
+ }))



> result
     day time light dawn dusk
1.1    1    7    20   16   14
1.2    1    9    10   16   14
1.3    1   14     6   16   14
1.4    1   22     0   16   14
1.5    1    5     0   16   14
1.6    1   22     0   16   14
1.7    1   23     0   16   14
1.8    1   16     0   16   14
1.9    1   16     8   16   14
1.10   1    2    20   16   14
2.11   2    5    20   10   17
2.12   2    5    10   10   17
2.13   2   17     6   10   17
2.14   2   10     0   10   17
2.15   2   19     0   10   17
2.16   2   12     0   10   17
2.17   2   18     0   10   17
2.18   2   24     0   10   17
2.19   2   10     8   10   17
2.20   2   19    20   10   17
3.21   3   23    20   21   16
3.22   3    6    10   21   16
3.23   3   16     6   21   16
3.24   3    4     0   21   16
3.25   3    7     0   21   16
3.26   3   10     0   21   16
3.27   3    1     0   21   16
3.28   3   10     0   21   16
3.29   3   21     8   21   16
3.30   3    9    20   21   16
4.31   4   12    20   18   12
4.32   4   15    10   18   12
4.33   4   12     6   18   12
4.34   4    5     0   18   12
4.35   4   20     0   18   12
4.36   4   17     0   18   12
4.37   4   20     0   18   12
4.38   4    3     0   18   12
4.39   4   18     8   18   12
4.40   4   10    20   18   12
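
To see why the head()/tail() comparison picks out the transitions, it may help to run it on a single day. The sketch below uses the day 1 values copied from the result printed above; the variable names prev and nxt are only illustrative.

```r
# Day 1 of the example data (time and light copied from the printed result)
light <- c(20, 10, 6, 0, 0, 0, 0, 0, 8, 20)
time  <- c(7, 9, 14, 22, 5, 22, 23, 16, 16, 2)

prev <- head(light, -1)   # light[1], ..., light[n-1]
nxt  <- tail(light, -1)   # light[2], ..., light[n]

# dawn: dark -> light, flagged on the later record of the pair
dawn <- c(FALSE, prev < 2 & nxt > 2)
# dusk: light -> dark, flagged on the earlier record of the pair
dusk <- c(prev > 2 & nxt < 2, FALSE)

time[dawn]   # 16
time[dusk]   # 14
```

Padding with FALSE at the front or at the back is what decides whether the flag lands on the second or the first record of each pair.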




On Sun, Jul 15, 2012 at 12:32 PM, Santiago Guallar sgual...@yahoo.com wrote:
 Hi,

 I have a dataset which contains several time records for a number of days,
 plus a variable (light) that makes it possible to tell night time (light = 0) from
 daytime (light > 0). I need to obtain dusk time and dawn time for each day
 and place them in two columns.

 This is the starting point (d):
 day time light
 1 1   20
 1 12 10
 1 11 6
 1 9   0
 1 6   0
 1 12 0
 ...
 30 8 0
 30 3 0
 30 8 0
 30 3 0
 30 8 8
 30 9 20


 And this what I want to get:
 day time light dusk dawn
 1 1  20 11 10
 1 12 10 11 10
 1 11 6  11 10
 1 9   0  11 10
 1 6   0  11 10
 1 12 0  11 10
 ...
 30 8 0   9 5
 30 3 0   9 5
 30 8 0   9 5
 30 3 0   9 5
 30 8 8   9 5
 30 9 20 9 5

 This is the code for data frame d:
 day= rep(1:30, each=10)
 n= length(day); x= c(1:24)
 time= sample(x, 300, replace= T)
 light= rep(c(20,10,6,0,0,0,0,0,8,20), 30)
 d=data.frame(day,time,light)

 I need to impose a double condition like the following, but if does not accept more
 than one:
 attach(d)
 for (i in 1: n){
 if (light[i-1]>2 & light[i]<2){
 d$dusk <- time[i-1]
 }
 if (light[i-1]<2 & light[i]>2){
 d$dawn <- time[i]
 }
 }
 detach(d)
 d

 Thank you for your help




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting rows and columns from a big matrix

2012-07-15 Thread A J


Sorry for the mistakes.

That was example code and I made some mistakes typing it. Assuming the
original code is right (I have checked it several times), I am still not sure how
to extract columns and rows by their labels from a square
matrix. I have enclosed a text file that illustrates the idea, to make it
clearer.

Thanks again, and sorry for the inconvenience.

Best,

AJ



 Date: Sun, 15 Jul 2012 14:53:47 -0400
 Subject: Re: [R] extracting rows and columns from a big matrix
 From: jholt...@gmail.com
 To: anxu...@hotmail.com
 CC: r-help@r-project.org
 
 For a start, you are missing a quote and a parenthesis on the
 statement; probably should be: (another quote was also missing)
 
 n <- subset(m, select=c("X1", "X7", "X12", "X15", "X22", "X26", "X31",
 "X34", "X39", "X44", "X51", "X58"))
 
 Not sure what you want with the rownames; an example would help and
 post with 'dput'.
 
Original Square Matrix

     X1  X7  X12 X15 X22 X26 X31 X34 X39 X44 X51
X1   1   2   3   4   5   6   7   8   9   10  11
X7   11  9   7   5   3   1   10  8   6   4   2
X12  3   4   7   8   5   7   2   9   1   3   2
X15  9   9   8   4   7   1   1   3   2   5   3
X22  6   7   7   4   4   2   9   8   8   1   1
X26  3   9   4   8   5   7   6   1   2   3   8
X31  1   2   1   3   1   4   1   5   1   6   1
X34  6   7   8   5   2   9   5   1   6   8   9
X39  4   8   7   4   6   5   1   9   2   7   5
X44  2   2   2   8   6   7   9   5   3   7   7
X51  9   9   9   6   6   4   8   7   2   1   3



Final Square Submatrix

     X1  X12 X22 X31
X1   1   3   5   7
X12  3   7   5   2
X22  6   7   4   9
X31  1   1   1   1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read mignight as 24:00 and not as 0:00

2012-07-15 Thread Daniel Nordlund

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Sandy Adriaenssens
 Sent: Friday, July 13, 2012 3:52 AM
 To: r-help@r-project.org
 Subject: [R] read mignight as 24:00 and not as 0:00

 Dear all,

 I have a dataset which contains date and time in the format
 yearmonthdayhour. I can read in these data correctly as follows:

 mydata <- read.csv("pm10_corine_gridcel_hourly_2011.csv", header = TRUE)
 mydata$date <- as.POSIXct(strptime(mydata$date, format = "%Y%m%d%H",
 tz = "UTC"))

 However, midnight is defined as 24:00 in my original file (so the end of
 the day), while the POSIXct function changes this to 0:00 (the beginning of
 the next day).
 So, my data now go from January 1 2011 1:00 to January 1 2012 0:00, instead
 of December 31 2011 24:00.

 summary(mydata$date)
                 Min.             1st Qu.              Median
  2011-01-01 01:00:00 2011-04-02 06:45:00 2011-07-02 12:30:00
                 Mean             3rd Qu.                Max.
  2011-07-02 12:30:00 2011-10-01 18:15:00 2012-01-01 00:00:00

 I would like to change this 0:00 to 24:00 again since I want to include
 these values in daily averages of the previous day (and not of the next
 day). So the day of the month should also be diminished by 1.

 I have tried extracting the hours which are 0 and converting them to 24,
 but then I can't paste them back into the date/time of the original data frame
 again.

 Are there maybe other solutions?

 Thanks in advance,

 Sandy

 ifelse (as.POSIXlt(mydata[24,1])$hour = 0,as.POSIXlt(mydata[24,1])$hour = 24

Sandy,

You really haven't given us enough information to provide a solution, but 
here are some questions and suggestions.  

Do you have any times less than 01:00:00?  You mention going from 01:00:00 to
24:00:00 in your data.  I presume these are text fields and not time objects. Do
you have fractional hours represented in your data, or are all times on the
hour?

1.  If your times are always on the hour, with no minutes or seconds, i.e. 01:00 to
24:00, then you could read them as is and then just subtract 1 hour from all
date/time values.

2.  If you have fractional hours, e.g. 00:32:00 or 11:45, then you could
possibly just read the date/time values and whenever the time is exactly
00:00:00, subtract 1 second from the value.  This will at least get you just
before midnight on the previous day.
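
The two options above could look roughly like this. It is a minimal sketch under stated assumptions: the timestamps are already POSIXct in UTC, the pm10 values are made up, and the names day1, day2 and is_midnight are illustrative only.

```r
# Three parsed timestamps; the last one was "24:00 on 2011-12-31" in the file
# and ended up as 00:00 on 2012-01-01 after conversion.
stamp <- as.POSIXct(c("2011-12-31 22:00:00", "2011-12-31 23:00:00",
                      "2012-01-01 00:00:00"), tz = "UTC")
pm10  <- c(10, 20, 30)   # made-up measurements

# Option 1: all times are on the hour, so shift everything back one hour
day1 <- as.Date(stamp - 3600, tz = "UTC")

# Option 2: only move exact midnights back by one second
is_midnight <- format(stamp, "%H:%M:%S") == "00:00:00"
day2 <- as.Date(stamp - ifelse(is_midnight, 1, 0), tz = "UTC")

# Daily mean, grouped by the shifted day
tapply(pm10, format(day1), mean)
```

Either way the shifted value is only used to decide which day a record belongs to; the original timestamp column can be kept unchanged.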

Whether either of these approaches will work for you depends on what your
actual needs are.  If this doesn't work for you, you will need to write back to
R-help and explain more about what your actual needs are, and provide more
detail about your actual dates and times (see questions above).

Hope this is somewhat helpful,

Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting rows and columns from a big matrix

2012-07-15 Thread jim holtman

Is this what you want:

> x <- read.table(text = "    X1 X7 X12 X15 X22 X26 X31 X34 X39 X44 X51
+ X1 1 2 3 4 5 6 7 8 9 10 11
+ X7 11 9 7 5 3 1 10 8 6 4 2
+ X12 3 4 7 8 5 7 2 9 1 3 2
+ X15 9 9 8 4 7 1 1 3 2 5 3
+ X22 6 7 7 4 4 2 9 8 8 1 1
+ X26 3 9 4 8 5 7 6 1 2 3 8
+ X31 1 2 1 3 1 4 1 5 1 6 1
+ X34 6 7 8 5 2 9 5 1 6 8 9
+ X39 4 8 7 4 6 5 1 9 2 7 5
+ X44 2 2 2 8 6 7 9 5 3 7 7
+ X51 9 9 9 6 6 4 8 7 2 1 3", header = TRUE)
>
> indx <- c("X1", "X12", "X22", "X31")
> x[indx, indx]
    X1 X12 X22 X31
X1   1   3   5   7
X12  3   7   5   2
X22  6   7   4   9
X31  1   1   1   1
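
Since one of the labels in the original question was misspelled (x51 instead of X51), it may be worth checking the labels before indexing. The following is only a sketch on a made-up 4 x 4 matrix; sub_by_label is a hypothetical helper, not something from the thread.

```r
# Made-up square matrix with matching row and column labels
m <- matrix(1:16, nrow = 4,
            dimnames = list(c("X1", "X7", "X12", "X15"),
                            c("X1", "X7", "X12", "X15")))

# Hypothetical helper: square submatrix by label, failing early
# with a readable message when a label is unknown
sub_by_label <- function(mat, labels) {
  unknown <- setdiff(labels, intersect(rownames(mat), colnames(mat)))
  if (length(unknown) > 0)
    stop("unknown label(s): ", paste(unknown, collapse = ", "))
  mat[labels, labels, drop = FALSE]
}

s <- sub_by_label(m, c("X1", "X12"))
s
```

drop = FALSE keeps the result a matrix even when only one label is requested.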





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rgamma function

2012-07-15 Thread Thomas Lumley

On Fri, Jul 13, 2012 at 7:55 PM, Chandler Zuo z...@stat.wisc.edu wrote:
 Hi,

 Has anyone encountered the problem of rgamma function in C? The following
 simplified program always dies for me, and I wonder if anyone can tell me
 the reason.

 #include <Rmath.h>
 #include <time.h>
 #include <Rinternals.h>

 SEXP generateGamma ()
 {
 srand(time(NULL));
 return (rgamma(5000,1));
 }

rgamma doesn't return an SEXP, it returns a double.  Also, the srand()
call is pointless.
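
The reason srand() has no effect is that the rgamma() routine draws from R's own random number generator rather than from C's rand(). Seen from the R side, reproducibility is controlled with set.seed(). The sketch below uses the R-level rgamma() with the same shape (5000) and scale (1) as the C call; it illustrates the seeding only and is not a fix for the C code.

```r
# Same seed, same draws: R's generator state is what matters
set.seed(42)
x1 <- rgamma(3, shape = 5000, scale = 1)

set.seed(42)
x2 <- rgamma(3, shape = 5000, scale = 1)

identical(x1, x2)
```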

 Has anyone encountered a similar problem before? Is there another way of
 generating Gamma random variable in C?

 P.S. I have no problem compiling and loading this function in R.

Strange. You should get compiler warnings that the return type is
incompatible.  I get

foo.c: In function ‘generateGamma’:
foo.c:7: warning: implicit declaration of function ‘srand’
foo.c:8: error: incompatible types in return

I thought the ANSI standard actually *required* a diagnostic for the
incompatible return types.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] About dpik function

2012-07-15 Thread chester123

Hi there and thanks in advance.

I am currently working on plug-in bandwidth selection with R. First, my
data are 1010 return rates from Yahoo Finance.
Second, my code is the following:
 r=read.table("/Users/user/Desktop/research/a.txt",sep=",",header=TRUE)
 x <- r[8:1010,]
 library(KernSmooth)
 dpik(x,scalest="minim",level=2L,kernel="normal",canonical=FALSE,gridsize=401L,range.x=range(x),truncate=TRUE)

But the error happens like this:
Error in Summary.factor(c(233L, 917L, 381L, 748L, 272L, 242L, 269L, 963L,  : 
  range not meaningful for factors

I don't know what's wrong and I am a rookie, so please help with that. Thanks!
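
One possible reading of the error message is that x is a factor rather than a numeric vector, which happens when a column contains something that is not a number. The sketch below reproduces that situation with made-up values; whether it matches the actual file is an assumption, and str(r) would show the real column types.

```r
# Made-up stand-in for a column that was read in as a factor
f <- factor(c("0.012", "-0.004", "0.021", "-0.004"))

# range() on a factor fails; this is the call inside range.x = range(x)
res <- tryCatch(range(f), error = function(e) conditionMessage(e))
res

# Convert via character to recover the numbers; as.numeric(f) alone
# would return the internal level codes instead
x <- as.numeric(as.character(f))
range(x)
```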


--
View this message in context: 
http://r.789695.n4.nabble.com/About-dpik-function-tp4636590.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extracting rows and columns from a big matrix

2012-07-15 Thread arun

Hello,

Try this:

dat1 <- read.table(text="
    X1 X7 X12 X15 X22 X26 X31 X34 X39 X44 X51
X1   1  2   3   4   5   6   7   8   9  10  11
X7  11  9   7   5   3   1  10   8   6   4   2
X12  3  4   7   8   5   7   2   9   1   3   2
X15  9  9   8   4   7   1   1   3   2   5   3
X22  6  7   7   4   4   2   9   8   8   1   1
X26  3  9   4   8   5   7   6   1   2   3   8
X31  1  2   1   3   1   4   1   5   1   6   1
X34  6  7   8   5   2   9   5   1   6   8   9
X39  4  8   7   4   6   5   1   9   2   7   5
X44  2  2   2   8   6   7   9   5   3   7   7
X51  9  9   9   6   6   4   8   7   2   1   3
",sep="", header=TRUE)

# In order to get your final submatrix:
# Either this:
dat1[c(1,3,5,7),c(1,3,5,7)]


# or
dat1[(select=c("X1","X12","X22","X31")),(select=c("X1","X12","X22","X31"))]
    X1 X12 X22 X31
X1   1   3   5   7
X12  3   7   5   2
X22  6   7   4   9
X31  1   1   1   1


# You can convert this data.frame to a matrix
dat2 <- as.matrix(dat1[(select=c("X1","X12","X22","X31")),(select=c("X1","X12","X22","X31"))])
is.matrix(dat2)
[1] TRUE



A.K.









__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] computing a subset using a loop

2012-07-15 Thread burton030

http://r.789695.n4.nabble.com/file/n4636585/Baumdaten_aufbereitet.csv
Baumdaten_aufbereitet.csv 

Here you have an overview about my data frame...

--
View this message in context: 
http://r.789695.n4.nabble.com/computing-a-subset-using-a-loop-tp4636564p4636585.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] computing a subset using a loop

2012-07-15 Thread burton030

Hi,

thanks for your reply, but this code just gives me a list and not separate subsets. I
need subsets because I want to do some calculations with them and
make some plots, etc. Is there a solution for my problem? I've posted
an example for the first subset...

http://r.789695.n4.nabble.com/file/n4636591/A_SEF_Abies_alba.csv
A_SEF_Abies_alba.csv 

--
View this message in context: 
http://r.789695.n4.nabble.com/computing-a-subset-using-a-loop-tp4636564p4636591.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to extract p-value in GenMatch function

2012-07-15 Thread shyam basnet

Dear R-Users,

I have a problem on extracting T-Stat and P-Value. I have written R-code below

library(Matching)

data(lalonde)
attach(lalonde)
names(lalonde)
Y <- lalonde$re78

Tr <- lalonde$treat
glm1 <- glm(Tr~age+educ+black+hisp+married+nodegr+re74+re75,family=binomial,data=lalonde)

pscore.predicted <- predict(glm1)

rr1 <- Match(Y=Y,Tr=Tr,X=glm1$fitted,estimand="ATT", M=1,ties=TRUE,replace=TRUE)

summary(rr1)

> summary(rr1)

Estimate...  2624.3 
AI SE......  802.19 
T-stat.....  3.2714 
p.val......  0.0010702 

Original number of observations..............  445 
Original number of treated obs...............  185 
Matched number of observations...............  185 
Matched number of observations  (unweighted).  344 

In the above output, I can extract Estimate and AI SE with the code below:
rr1$est

rr1$se

But the problem is that I could not extract the T-statistic and P-value from the above
output.


Could someone please help me to resolve this problem?
Thanking you,

Best Regards,

Shyam Basnet
SLU, Uppsala, Sweden

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] variable (column) in a data frame

2012-07-15 Thread Peter Ehlers


On 2012-07-15 10:01, Paulo Barata wrote:


Dear Peter,

Thank you. I will try to modify my programming habits.
But it seems there is a flaw in R, when it accepts a reference
to a non-existent variable inside a data frame with the df$var
notation. This should be corrected somehow.

Paulo Barata



Paulo,

I understand your concerns and I do think that the best
thing would be to excise the $ shortcut from the language
or, at least, make y$x equivalent to
y[["x", exact = TRUE]]. But, as has been pointed out
before, that might not be easy. Nevertheless, even y[["x"]]
may not be the ultimate panacea. Consider your own
example:

df <- data.frame(a = 1:3, b=11:13)
sum(df[["aaa"]] == 2)
#[1] 0

which results from

df[["aaa"]] == 2
#logical(0)

The safest extraction is y[ , "x"]:

sum(df[ , "aaa"] == 2)
#Error in `[.data.frame`(df, , "aaa") : undefined columns selected

But then, this comes down to whether one thinks that
addressing a nonexistent variable should result in an
error or should return NULL.
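
For those who prefer the error, a small wrapper can make the choice explicit. This is a sketch only: get_col is a hypothetical helper name, not part of R, and it simply checks names() before extracting.

```r
df <- data.frame(a = 1:3, b = 11:13)

# Hypothetical helper: extract a column by exact name, or stop
get_col <- function(data, name) {
  if (!name %in% names(data))
    stop("no column named '", name, "' in the data frame")
  data[[name]]
}

sum(get_col(df, "a") == 2)
# get_col(df, "aaa") stops with an error instead of returning NULL
```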

The bottom line probably is that the $ behaviour will not change
in the near future and one would simply be well advised to be
aware of its behaviour. Every language has its quirks. Just be
thankful that the R language isn't as big a mess as the English
language (which I do love dearly).

Peter Ehlers


-


-- Original Message ---
From: Peter Ehlers ehl...@ucalgary.ca
To: Paulo Barata paulo.bar...@ensp.fiocruz.br
Cc: r-help@r-project.org r-help@r-project.org, peter dalgaard
pda...@gmail.com
Sent: Sun, 15 Jul 2012 09:29:11 -0700
Subject: Re: [R] variable (column) in a data frame


On 2012-07-15 08:41, Paulo Barata wrote:


Dr. Dalgaard,

Thank you. But pre-checking with is.null() or using with()
doesn't solve the problem of catching spelling mistakes
in the name of a variable inside a data frame, when using
the df$var notation often in a program.

Is there some way for R to behave, in relation to a variable
inside a data frame, the same way it behaves for a variable
not in a data frame? For example:

##
a <- c(1,2,3)

## the variable exists, we get a correct answer
a==1

## the variable does not exist, R rightly points this out
aaa==1
##

My point is, if we make a spelling mistake in a program when referring
to a variable inside a data frame, using the df$var notation,
there seems to be no way of getting warned about that.


You could wean yourself from the $-habit. It's convenient but can
lead to the problems you're experiencing (and this has been
discussed before). For programming, if you're prone to make
spelling errors, you should prefer df[, "aaa"]. See ?Extract.

Peter Ehlers



Thank you once again.

Paulo Barata

-


-- Original Message ---
From: peter dalgaard pda...@gmail.com
To: Paulo Barata paulo.bar...@ensp.fiocruz.br
Sent: Sun, 15 Jul 2012 16:47:35 +0200
Subject: Re: [R] variable (column) in a data frame


On Jul 15, 2012, at 16:30 , Paulo Barata wrote:



To the R help list,

When using a data frame, there is no warning or error message
when I refer to a non-existent variable inside the data frame.

Example:

##--

a <- c(1,2,3)
b <- c(11,22,33)
df <- data.frame(a,b)
df

## correct: there is a column in df named 'a'
## the sum is correctly performed
sum(df$a==2)

## incorrect: there is no column in df named 'aaa',
## but the sum is performed anyway without either warning or error
sum(df$aaa==2)

##--

Is there some way to make R issue either a warning or an error
message in such a situation?



You can pre-check for is.null(df$aaa) or use with(df, sum(aaa==2)).
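
Both suggestions can be tried on the example data frame. This is a sketch: the tryCatch() wrapper is only there so the error can be shown without stopping the script, and it assumes no object called aaa exists elsewhere in the session.

```r
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

is.null(df$a)     # FALSE: the column exists
is.null(df$aaa)   # TRUE: no such column, so sum(df$aaa == 2) silently gives 0

# with() evaluates the expression inside df; a misspelled name is then
# an ordinary "object not found" error
out <- tryCatch(with(df, sum(aaa == 2)), error = function(e) "error")
out

with(df, sum(a == 2))
```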

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to extract p-value in GenMatch function

2012-07-15 Thread Peter Ehlers


On 2012-07-15 14:37, shyam basnet wrote:

Dear R-Users,

I have a problem on extracting T-Stat and P-Value. I have written R-code below

library(Matching)

data(lalonde)
attach(lalonde)
names(lalonde)
Y <- lalonde$re78

Tr <- lalonde$treat
glm1 <- glm(Tr~age+educ+black+hisp+married+nodegr+re74+re75,family=binomial,data=lalonde)

pscore.predicted <- predict(glm1)

rr1 <- Match(Y=Y,Tr=Tr,X=glm1$fitted,estimand="ATT", M=1,ties=TRUE,replace=TRUE)

summary(rr1)


> summary(rr1)


Estimate...  2624.3
AI SE......  802.19
T-stat.....  3.2714
p.val......  0.0010702

Original number of observations..............  445
Original number of treated obs...............  185
Matched number of observations...............  185
Matched number of observations  (unweighted).  344

In above output, I can extract Estimate and AI SE with below code:
rr1$est

rr1$se

But the problem is I could not extract T-statistic and P-value from the above 
output.


Could you please someone help me to resolve this problem?


You could look at the code for summary.Match to see that
T-stat (not surprisingly) is calculated as est/se and
p.val is calculated as (1 - pnorm(abs(est/se))) * 2.
summary.Match() doesn't return these values, it just
prints them.
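
In other words, the two numbers can be recomputed from the stored components. A minimal sketch follows; match_stats is a hypothetical helper, and the inputs are the rounded values from the printed summary, standing in for rr1$est and rr1$se (which may need as.numeric() if they are stored as matrices).

```r
# Hypothetical helper: same formulas as quoted from summary.Match above
match_stats <- function(est, se) {
  t_stat <- est / se
  p_val  <- (1 - pnorm(abs(t_stat))) * 2
  c(t.stat = t_stat, p.val = p_val)
}

# Rounded values taken from the printed summary in the question
s <- match_stats(est = 2624.3, se = 802.19)
s
```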

Peter Ehlers




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] computing a subset using a loop

2012-07-15 Thread jim holtman

Here is an example of using your data to split it into the subsets and
then computing a summary of each subset.  You have to remember that
what is returned from 'split' is a 'list' of 'data.frames' that are the
subsets that you want; you then use 'lapply' to process each of the
subsets in the list.
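
On a small scale the idea looks like this. The data frame below is a made-up miniature, and its column names (plot, species, height) are illustrative, not the ones in the real file.

```r
# Made-up miniature of the tree data
trees <- data.frame(
  plot    = c("A", "A", "B", "B", "B"),
  species = c("fir", "fir", "fir", "oak", "oak"),
  height  = c(10, 12, 9, 20, 22)
)

# One data frame per plot/species combination that actually occurs
parts <- split(trees, list(trees$plot, trees$species), drop = TRUE)
names(parts)

# Each element is an ordinary data frame and can be used on its own
fir_a <- parts[["A.fir"]]
mean(fir_a$height)

# The same calculation applied to every subset
sapply(parts, function(p) mean(p$height))
```

So the list is the collection of subsets: parts[["A.fir"]] can be plotted or summarised like any other data frame.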

> df <- read.table("C:\\Documents and Settings\\kon9407\\My Documents\\Downloads\\Baumdaten_aufbereitet (1).csv"
+ , sep = ';'
+ , as.is = TRUE
+ , header = TRUE
+ )
> # split the data into a list of dataframe
> df.s <- split(df, list(df$Plot..ID., df$Baumart), drop = TRUE)
> head(names(df.s), 20)
 [1] "A_2_1.Abies alba" "A_2_2.Abies alba" "A_2_3.Abies alba" "A_2_4.Abies alba"
 [5] "A_2_5.Abies alba" "A_2_6.Abies alba" "A_3_1.Abies alba" "A_3_4.Abies alba"
 [9] "A_3_5.Abies alba" "A_3_6.Abies alba" "A_4_1.Abies alba" "A_4_3.Abies alba"
[13] "A_4_4.Abies alba" "A_4_5.Abies alba" "A_4_6.Abies alba" "B_1_2.Abies alba"
[17] "B_1_4.Abies alba" "B_1_5.Abies alba" "B_1_6.Abies alba" "B_2_4.Abies alba"
> df.s[1]
$`A_2_1.Abies alba`
  X Plot..ID. Alter.Neuer.Wald Hoehe..m. Radius..cm.          Familie
1 1     A_2_1                2       647        5,64 Kieferngewaechse
2 2     A_2_1                2       647        5,64 Kieferngewaechse
3 3     A_2_1                2       647        5,64 Kieferngewaechse
4 4     A_2_1                2       647        5,64 Kieferngewaechse
5 5     A_2_1                2       647        5,64 Kieferngewaechse
6 6     A_2_1                2       647        5,64 Kieferngewaechse
7 7     A_2_1                2       647        5,64 Kieferngewaechse
8 8     A_2_1                2       647        5,64 Kieferngewaechse
9 9     A_2_1                2       647        5,64 Kieferngewaechse
     Baumart    Deutsch            Englisch Umfang..cm.    DBH..cm. Gehoelz
1 Abies alba Weisstanne European silver fir          38 12,09577569       0
2 Abies alba Weisstanne European silver fir          NA        <NA>       1
3 Abies alba Weisstanne European silver fir          NA        <NA>       1
4 Abies alba Weisstanne European silver fir          NA        <NA>       1
5 Abies alba Weisstanne European silver fir          NA        <NA>       1
6 Abies alba Weisstanne European silver fir          NA        <NA>       1
7 Abies alba Weisstanne European silver fir          NA        <NA>       1
8 Abies alba Weisstanne European silver fir          NA        <NA>       1
9 Abies alba Weisstanne European silver fir          NA        <NA>       1
  Bemerkungen Fotos Waldart pointID transectID         DBH_inch              age
1                NA     SEF A_2_SEF      A_SEF 4,76211641338583 35,7158731003937
2                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
3                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
4                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
5                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
6                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
7                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
8                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>
9                NA     SEF A_2_SEF      A_SEF             <NA>             <NA>


> lapply(df.s, summary)  # notice the names of each of the subsets is printed
$`A_2_1.Abies alba`
       X      Plot..ID.         Alter.Neuer.Wald   Hoehe..m.     Radius..cm.
 Min.   :1   Length:9           Min.   :2        Min.   :647   Length:9
 1st Qu.:3   Class :character   1st Qu.:2        1st Qu.:647   Class :character
 Median :5   Mode  :character   Median :2        Median :647   Mode  :character
 Mean   :5                      Mean   :2        Mean   :647
 3rd Qu.:7                      3rd Qu.:2        3rd Qu.:647
 Max.   :9                      Max.   :2        Max.   :647

   Familie            Baumart            Deutsch            Englisch          Umfang..cm.
 Length:9           Length:9           Length:9           Length:9           Min.   :38
 Class :character   Class :character   Class :character   Class :character   1st Qu.:38
 Mode  :character   Mode  :character   Mode  :character   Mode  :character   Median :38
                                                                             Mean   :38
                                                                             3rd Qu.:38
                                                                             Max.   :38
                                                                             NA's   :8
   DBH..cm.            Gehoelz        Bemerkungen         Fotos          Waldart
 Length:9           Min.   :0.0000   Length:9           Mode:logical   Length:9
 Class :character   1st Qu.:1.0000   Class :character   NA's:9         Class :character
 Mode  :character   Median :1.0000   Mode  :character                  Mode  :character
                    Mean   :0.8889
                    3rd Qu.:1.0000
                    Max.   :1.0000

   pointID           transectID          DBH_inch             age
 Length:9           Length:9           Length:9           Length:9
 Class :character   Class :character   Class :character   Class :character
 Mode  :character   Mode  :character   Mode  :character   Mode  :character


$`A_2_2.Abies alba`
       X       Plot..ID.         Alter.Neuer.Wald   Hoehe..m.     Radius..cm.
 Min.   :12   Length:1           Min.   :2        Min.   :660   Length:1
 1st Qu.:12   Class :character   1st Qu.:2

Re: [R] read mignight as 24:00 and not as 0:00

2012-07-15 Thread Jeff Newmiller

Extract the date separately from the time initially, and keep it separate. When 
you want to process daily data, use that column.
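
As a rough sketch of that suggestion: split the raw yearmonthdayhour stamp into a date part and an hour part before any date-time conversion, so the 24:00 record stays on the day it was recorded. The raw strings and pm10 values below are made up, and as.character() would be needed first if the column was read in as numbers.

```r
# Raw stamps in the yearmonthdayhour layout from the question (made-up values)
raw  <- c("2011123122", "2011123123", "2011123124")
pm10 <- c(10, 20, 30)

day  <- as.Date(substr(raw, 1, 8), format = "%Y%m%d")  # calendar day as recorded
hour <- as.integer(substr(raw, 9, 10))                 # 1..24, kept as a plain number

# Daily mean uses the separate day column, so hour 24 counts for 2011-12-31
tapply(pm10, format(day), mean)
```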
---
Jeff Newmiller                      The .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.us        Basics: ##.#.   ##.#.  Live Go...
                                    Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries          O.O#.   #.O#.  with
/Software/Embedded Controllers)             .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

Daniel Nordlund djnordl...@frontier.com wrote:

 -Original Message-
 From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
 On Behalf Of Sandy Adriaenssens
 Sent: Friday, July 13, 2012 3:52 AM
 To: r-help@r-project.org
 Subject: [R] read mignight as 24:00 and not as 0:00
 
 Dear all,
 
 I have a dataset which contains date and time in the format
 yearmonthdayhour. I can read in these data correctly as follows:
 
 mydata <- read.csv("pm10_corine_gridcel_hourly_2011.csv", header = TRUE)
 mydata$date <- as.POSIXct(strptime(mydata$date, format = "%Y%m%d%H",
 tz="UTC"))
 
 However, midnight is defined as 24:00 in my original file (so the end of
 the day), while the POSIXct function changes this to 0:00 (the beginning
 of the next day).
 So, my data now go from January 1 2011 1:00 to January 1 2012 0:00,
 instead of December 31 2011 24:00.
 
 summary(mydata$date)
                 Min.              1st Qu.               Median
  2011-01-01 01:00:00  2011-04-02 06:45:00  2011-07-02 12:30:00
                 Mean              3rd Qu.                 Max.
  2011-07-02 12:30:00  2011-10-01 18:15:00  2012-01-01 00:00:00
 
 I would like to change this 0:00 to 24:00 again since I want to include
 these values in daily averages of the previous day (and not of the next
 day). So the day of the month should also be decreased by 1.
 
 I have tried extracting the hours which are 0 and converting them to 24,
 but then I can't paste them back into the date/time of the original
 data.frame again.
 
 Are there maybe other solutions?
 
 Thanks in advance,
 
 Sandy
 
 ifelse (as.POSIXlt(mydata[24,1])$hour =
0,as.POSIXlt(mydata[24,1])$hour
 =
 24
 

Sandy,

You really haven't given us enough information to provide a solution,
but here are some questions and suggestions.  

Do you have any times less than 01:00:00?  You mention going from
01:00:00 to 24:00:00 in your data.  I presume these are text fields and
not time objects. Do you have fractional hours represented in your
data, or are all times on the hour?

1.  If your times are always on the hour with no minutes or seconds, i.e.
01:00 to 24:00, then you could read them as is and then just subtract 1
hour from all date/time values.

2.  If you have fractional hours, e.g. 00:32:00 or 11:45, then you
could possibly just read the date/time values and, whenever the time is
exactly 00:00:00, subtract 1 second from the value.  This will at least
get you just before midnight on the previous day.
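
Both suggestions can be sketched in a few lines. This is only an illustration under assumptions: the three timestamps are invented, and the names stamp, day1, stamp2, is_midnight and day2 are hypothetical.

```r
# Hypothetical timestamps; the last one is the midnight that ends 1 January
stamp <- as.POSIXct(c("2011-01-01 01:00", "2011-01-01 23:00", "2011-01-02 00:00"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")

# Suggestion 1: shift every value back one hour, then take the date
day1 <- as.Date(stamp - 3600, tz = "UTC")

# Suggestion 2: move only the exact midnights back by one second
is_midnight <- format(stamp, "%H:%M:%S") == "00:00:00"
stamp2 <- stamp
stamp2[is_midnight] <- stamp2[is_midnight] - 1
day2 <- as.Date(stamp2, tz = "UTC")
```

In this sample both day1 and day2 put all three readings on 2011-01-01. Suggestion 1 changes every timestamp, so it is only appropriate when all times are on the hour.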

Whether either of these approaches will work for you depends on what
your actual needs are.  If this doesn't work for you, you will need to
write back to R-help and explain more about what your actual needs are,
and provide more detail about your actual dates and times (see
questions above).


Hope this is somewhat helpful,

Dan

Daniel Nordlund
Bothell, WA USA


[R] Is it possible to start R-GUI in a windowed state under Windows OS

2012-07-15 Thread Wei Wu

Is it possible to start R-Gui in a windowed state under windows? (I am running 
Windows 7 and Vista)
I have set the property for the R icon to the normal window option, but that has 
no effect.
Thanks.


Re: [R] Package MuMIn (dredge): Error in ret[, ] - cbind(x, se, rep(if (is.null(df)) NA_real_ else df, : number of items to replace is not a multiple of replacement length.

2012-07-15 Thread Jeremy Little

I have also reinstalled the MuMIn package as suggested at...

http://r.789695.n4.nabble.com/Error-message-number-of-items-to-replace-is-not-a-multiple-of-replacement-length-td3257893.html

...however, this made no difference.

any help is appreciated.

thank you


Re: [R] extracting rows and columns from a big matrix

2012-07-15 Thread arun

Hello,

In my previous email, I used index to subset the data.  Then, I looked at your 
code.  I guess you wanted to try the subset function to get the same output.

Try this:
dat1 <- read.table(text="
    X1 X7 X12 X15 X22 X26 X31 X34 X39 X44 X51
X1   1  2   3   4   5   6   7   8   9  10  11
X7  11  9   7   5   3   1  10   8   6   4   2
X12  3  4   7   8   5   7   2   9   1   3   2
X15  9  9   8   4   7   1   1   3   2   5   3
X22  6  7   7   4   4   2   9   8   8   1   1
X26  3  9   4   8   5   7   6   1   2   3   8
X31  1  2   1   3   1   4   1   5   1   6   1
X34  6  7   8   5   2   9   5   1   6   8   9
X39  4  8   7   4   6   5   1   9   2   7   5
X44  2  2   2   8   6   7   9   5   3   7   7
X51  9  9   9   6   6   4   8   7   2   1   3
", sep="", header=TRUE)

subset(dat1, subset=row.names(dat1) %in% c("X1","X12","X22","X31"),
       select=c(X1,X12,X22,X31))
    X1 X12 X22 X31
X1   1   3   5   7
X12  3   7   5   2
X22  6   7   4   9
X31  1   1   1   1
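
For comparison, the same kind of symmetric submatrix can be taken by indexing with one vector of labels for both rows and columns. A small sketch on a made-up 4 x 4 matrix; the names m, keep and sub are hypothetical and the values are not from the attached file:

```r
# Hypothetical symmetric matrix with matching row and column labels
labels <- c("X1", "X7", "X12", "X15")
m <- matrix(1:16, nrow = 4, dimnames = list(labels, labels))
m <- m + t(m)                        # force symmetry for the illustration

# Use the same label vector for rows and columns
keep <- c("X1", "X12")
sub  <- m[keep, keep, drop = FALSE]  # drop = FALSE keeps a matrix even for one label
```

The same m[keep, keep] form also works on a data frame read with read.table, as long as the row names and column names carry the labels.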

A.K.






- Original Message -
From: A J anxu...@hotmail.com
To: jholt...@gmail.com
Cc: r-help@r-project.org
Sent: Sunday, July 15, 2012 3:43 PM
Subject: Re: [R] extracting rows and columns from a big matrix


I am very sorry for the mistakes.

It was example code and I made some mistakes while typing it. The original 
code is correct (I have checked it several times), but I am still not sure how 
to extract columns and rows by label from a square matrix. I have enclosed a 
text file that illustrates the idea, to make it easier to understand.

Thanks again, and sorry for the inconvenience.

Best,

AJ



 Date: Sun, 15 Jul 2012 14:53:47 -0400
 Subject: Re: [R] extracting rows and columns from a big matrix
 From: jholt...@gmail.com
 To: anxu...@hotmail.com
 CC: r-help@r-project.org
 
 For a start, you are missing a quote and a parenthesis on the
 statement; it should probably be (another quote was also missing):
 
 n <- subset(m, select=c("X1", "X7", "X12", "X15", "X22", "X26", "X31",
 "X34", "X39", "X44", "X51", "X58"))
 
 Not sure what you want with the rownames; an example would help and
 post with 'dput'.
 
 On Sun, Jul 15, 2012 at 2:47 PM, A J anxu...@hotmail.com wrote:
 
  Hi there and thanks in advance.
 
  I have a large symmetrical matrix stored in a text file. After loading it 
  into R, I would like to extract the same set of columns and rows (a 
  symmetrical submatrix) using their labels.
 
  I have tried this code in order to extract columns, but R console gives me 
  the + sign at the end of the code, pointing out incomplete command, so it 
  is not working:
 
  m <- read.table("C:/backup/symmetrical.csv")
 
  n-subset(m, select=c(X1, X7, X12, X15, X22, X26, X31, X34, 
  X39, X44, x51, X58)
 
  Therefore, I have not tried with row names yet.
 
  Any suggestions? Sorry for the inconvenience. I have read some information 
  about this but always have the same problem with + and I do not have any 
  idea to follow.
 
  Best,
 
  AJ
 
 
 
 
 
 
 
 -- 
 Jim Holtman
 Data Munger Guru
 
 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.
                          

[R] RSQLite install problem

2012-07-15 Thread Jake Burkhead

Hi,

I'm trying to install RSQLite on R 2.14 Ubuntu 12.04 i686. The installation
always gets stalled and ends up not working. I installed libsqlite3-dev but
still no luck. Anyone know how to solve this?

$ R CMD INSTALL RSQLite_0.11.1.tar.gz
* installing to library /home/ubuntu/R/i686-pc-linux-gnu-library/2.14
* installing *source* package RSQLite ...
** package RSQLite successfully unpacked and MD5 sums checked
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for gcc... (cached) gcc -std=gnu99
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc -std=gnu99 accepts -g... (cached) yes
checking for gcc -std=gnu99 option to accept ISO C89... (cached) none needed
checking for library containing fdatasync... none required
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
RS-DBI.c -o RS-DBI.o
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
RS-SQLite.c -o RS-SQLite.o
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
param_binding.c -o param_binding.o
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
sqlite-all.c -o sqlite-all.o
sqlite-all.c:1:35: warning: extra tokens at end of #ifdef directive
[enabled by default]
^Cmake: *** wait: No child processes.  Stop.
make: *** Waiting for unfinished jobs
make: *** wait: No child processes.  Stop.
** R
^C
* removing /home/ubuntu/R/i686-pc-linux-gnu-library/2.14/RSQLite

ubuntu@ip-10-99-65-94:~/R/kaggle/diabetes$
ubuntu@ip-10-99-65-94:~/R/kaggle/diabetes$
ubuntu@ip-10-99-65-94:~/R/kaggle/diabetes$
ubuntu@ip-10-99-65-94:~/R/kaggle/diabetes$ R CMD INSTALL
RSQLite_0.11.1.tar.gz
* installing to library /home/ubuntu/R/i686-pc-linux-gnu-library/2.14
* installing *source* package RSQLite ...
** package RSQLite successfully unpacked and MD5 sums checked
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for gcc... (cached) gcc -std=gnu99
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc -std=gnu99 accepts -g... (cached) yes
checking for gcc -std=gnu99 option to accept ISO C89... (cached) none needed
checking for library containing fdatasync... none required
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
RS-DBI.c -o RS-DBI.o
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
RS-SQLite.c -o RS-SQLite.o
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE
-DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS
-DSQLITE_SOUNDEX -DSQLITE_MAX_VARIABLE_NUMBER=4
-DSQLITE_MAX_COLUMN=3 -DTHREADSAFE=0 -fpic  -O3 -pipe  -g -c
param_binding.c -o param_binding.o
gcc -std=gnu99 -I/usr/share/R/include -DRSQLITE_USE_BUNDLED_SQLITE

[R] enquiry

2012-07-15 Thread Karan Anand

Hi,
I am new to R. I have an xlsx file with 12 sheets in it and need to
convert it to csv first and then convert it into a time series, so
could you please guide me a little on how to do it?


Regards
karan


[R] Error in as.xts

2012-07-15 Thread Yolande Tra

Hi
I got the following error using as.xts
Error in xts(x, order.by = order.by, frequency = frequency, ...) :
  NROW(x) must match length(order.by)
Here is what the data looks like
 d1 <- read.csv(file.path(dataDir,"AppendixA-FishCountsTable-2009.csv"),
 as.is=T)
 d1[1:3,]
  dive_id       date  time      species count size    site depth level TRANSECT VIS_M
1      62 10/12/2009 12:44 E. lateralis     2   15 Hopkins    15     B        1     4
2      62 10/12/2009 12:44 E. lateralis     1   22 Hopkins    15     B        1     4
3      62 10/12/2009 12:44 E. lateralis     1   25 Hopkins    15     B        1     4
 diveData_2009 <- as.xts( d1,order.by=as.POSIXct(strptime(paste(d$date,
 d$TIME ), "%d/%m/%Y %H:%M") ))
Error in xts(x, order.by = order.by, frequency = frequency, ...) :
  NROW(x) must match length(order.by)

I could not figure out how to correct it
Thank you for your help
Yolande
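
One thing that stands out in the call above is that the data frame is d1, while the index is built from d$date and d$TIME, and the column shown in the printout is named time in lower case. If that mismatch is the cause (an assumption, since d itself is not shown), a base R check along these lines may help. The three-row d1 here is a made-up stand-in for the real file:

```r
# Hypothetical stand-in for the first rows of d1 shown above
d1 <- data.frame(date  = c("10/12/2009", "10/12/2009", "10/12/2009"),
                 time  = c("12:44", "12:44", "12:44"),
                 count = c(2, 1, 1),
                 stringsAsFactors = FALSE)

# Build the index from d1 itself, with the column names as they appear in d1
idx <- as.POSIXct(strptime(paste(d1$date, d1$time), "%d/%m/%Y %H:%M", tz = "UTC"))

# The index needs one entry per row; a failed parse would show up as NA
stopifnot(length(idx) == nrow(d1), !anyNA(idx))

# With the xts package loaded, a numeric column could then be passed as
# xts(d1$count, order.by = idx)
```

Whether "%d/%m/%Y" or "%m/%d/%Y" is the right format for 10/12/2009 depends on how the dates were recorded, which cannot be told from the three rows shown.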

