[R] grouping by consecutive integers

2006-07-24 Thread Kevin J Emerson
Hello R-helpers!

I have a question concerning extracting sequence information from a
vector.  I have a vector (representing the bins of a time series where
the frequency of occurrences is greater than some threshold) where I
would like to extract the min, median and max of each group of
consecutive numbers.

For Example:

tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

I would like to have the max,min,median of the following groups:

24,25
29
35,36,37,38,39,40,41,42,43,44,45,46,47
68,69,70,71

I would like to be able to perform this for many time series so an
automated process would be nice.  I am hoping to use this as a peak
detection protocol.

Any advice would be greatly appreciated,
Kevin

-
-
Kevin J Emerson
Center for Ecology and Evolutionary Biology
1210 University of Oregon
Eugene, OR 97403
USA
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
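
One way to get these groups, offered as a minimal sketch rather than a definitive answer: a new group starts wherever the difference between neighbouring values is not 1, so a running count of those breaks labels each run. The names grp, runs and run.stats are made up for this illustration, and the input is assumed to be sorted in increasing order.

```r
tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

# TRUE wherever a value does not directly follow its predecessor;
# the leading 1 opens the first group, and cumsum() turns the breaks
# into a group label for every element.
grp <- cumsum(c(1, diff(tmp) != 1))

# One vector per run of consecutive integers.
runs <- split(tmp, grp)

# min, median and max of each run, one row per run.
run.stats <- t(sapply(runs, function(g)
    c(min = min(g), median = median(g), max = max(g))))
print(run.stats)
```

Each row of run.stats corresponds to one run of consecutive integers. Wrapping these lines in a function would let the same steps be applied to many time series.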


Re: [R] grouping by consecutive integers

2006-07-24 Thread Kevin J Emerson
Let me clarify one thing that I don't think I made clear in my posting.
I am looking for the max, min and median of the indices, not of the
time series frequency counts.  I am looking to find the max, min, and
median time of peaks in a time series, so I am looking for the
information concerning that.

So mostly my question is how to extract the max, min, and median of
each run of sequential numbers in a vector.  I will reword my original
posting below.

  Hello R-helpers!
 
  I have a question concerning extracting sequence information from a
  vector.  I have a vector (representing the bins of a time series where
  the frequency of occurrences is greater than some threshold) where I
  would like to extract the min, median and max of each group of
  consecutive numbers in the index vector.
 
  For Example:
 
  tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
 
  I would like to have the max,min,median of the following groups:
 
  24,25 - max = 25, min = 24, median = 24.5
  29 - max = min = median = 29
  35,36,37,38,39,40,41,42,43,44,45,46,47 - max = 47, min = 35, etc.
  68,69,70,71
 
  I would like to be able to perform this for many time series so an
  automated process would be nice.  I am hoping to use this as a peak
  detection protocol.
 
  Any advice would be greatly appreciated,
  Kevin
 


[R] Getting rid of for loops

2006-07-16 Thread Kevin J Emerson
Hello R-users!

I have a style question.  I know that for loops are somewhat frowned upon in
R, and I was trying to figure out a nice way to do something without using
loops, but figured that I could get it done quickly using them.  I am now
looking to see what kind of tricks I can use to make this code a bit more
aesthetically appealing to other R users (and learn something about R along
the way...).

Here's the problem.  I have a data.frame with 4 columns of dependent
variables and then ~35 columns of predictor variables (factors) [for those
interested, it is a qtl problem, where the predictors are genotypes at DNA
markers and the dependent variable is a biological trait].  I want to go
through all pairwise combinations of predictor variables and perform an
anova with two predictors and their interaction on a given dependent
variable.  I then want to store the p.value of the interaction term, along
with the predictor variable information.  So I want to end up with a
dataframe at the end with the two variable names and the interaction p value
in each row, for all pairwise combinations of predictors.  I used the
following code:

# qtl is the original data.frame, and my dependent var in this case is
# qtl$CPP.

marker1 <- NULL
marker2 <- NULL
p.interaction <- NULL
for (i in 5:40) {   # cols 5 - 41 are the predictor factors
    for (j in (i+1):41) {
        marker1 <- rbind(marker1, names(qtl)[i])
        marker2 <- rbind(marker2, names(qtl)[j])
        # rows of the anova table: predictor i, predictor j, interaction, residuals
        tmp2 <- summary(aov(qtl$CPP ~ qtl[,i] * qtl[,j]))[[1]]
        p.interaction <- rbind(p.interaction, tmp2$"Pr(>F)"[3])
    }
}

I have two questions:
(1) is there a nicer way to do this without having to invoke for loops?
(2) my other dependent variables are categorical in nature.  I need
basically the same information - I am looking for information regarding the
interaction of predictors on a categorical variable.  Any ideas on what
tests to use? (I am new to analysis of all-categorical data).

Thanks in advance!
Kevin

--
--
Kevin Emerson
Center for Ecology and Evolutionary Biology
1210 University of Oregon
Eugene, OR 97403
USA
[EMAIL PROTECTED]
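
For question (1), one possible loop-free version, as a sketch under stated assumptions: combn() lists every pair of predictor names once, and apply() fits one model per pair. The data below are simulated stand-ins (the marker names m1 to m4, the sample size and the random CPP values are invented for illustration), not the real qtl data.

```r
set.seed(1)
n <- 80

# Simulated stand-in for the real data: one trait, four two-level markers.
rand.marker <- function() factor(sample(rep(c("A", "B"), length.out = n)))
qtl <- data.frame(CPP = rnorm(n),
                  m1 = rand.marker(), m2 = rand.marker(),
                  m3 = rand.marker(), m4 = rand.marker())

markers <- c("m1", "m2", "m3", "m4")    # with the real data: names(qtl)[5:41]
pairs <- combn(markers, 2)              # one column per unordered pair

# Fit CPP ~ a * b for each pair and keep the interaction p-value,
# which is the third row of the anova table.
p.interaction <- apply(pairs, 2, function(p) {
    f <- reformulate(paste(p, collapse = " * "), response = "CPP")
    summary(aov(f, data = qtl))[[1]][["Pr(>F)"]][3]
})

result <- data.frame(marker1 = pairs[1, ], marker2 = pairs[2, ],
                     p.interaction = p.interaction)
print(result)
```

This still fits one model per pair, so it is tidier rather than faster than the nested loops, but it avoids growing vectors with rbind() and returns the data frame directly.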



[R] logistic regression asymptote problem

2005-07-05 Thread Kevin J Emerson
R-helpers,

I have a question about logistic regressions.

Consider a case where you have binary data that reaches an asymptote
that is not 1, maybe it's 0.5.  Can I still use a logistic regression to
fit a curve to this data?  If so, how can I do this in R?  As far as I
can figure out, using a logit link function assumes that the asymptote
is at y = 1.

An example.  Consider the following data:

tmp <-
structure(list(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14), yes = c(0, 0, 0, 2, 1, 14, 24, 15, 23, 18, 22, 20, 14, 17
), no = c(94, 101, 95, 80, 81, 63, 51, 56, 30, 38, 31, 18, 21,
20)), .Names = c("x", "yes", "no"), row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14"), class =
"data.frame")

where x is the independent variable, and yes and no are counts of
events.  Plotting the data, you can see that the proportions seem to
reach an asymptote at around y = 0.5.  Fitting a logistic regression
with glm, it is easy to see that it does not fit well.

tmp.glm <- glm(cbind(yes, no) ~ x, data = tmp,
               family = binomial(link = "logit"))
plot(tmp.glm$fitted, type = "l", ylim = c(0, 1))
par(new = TRUE)
plot(tmp$yes / (tmp$yes + tmp$no), ylim = c(0, 1))

Any suggestions would be greatly appreciated.

Cheers,
Kevin

-- 


Kevin J Emerson
Center for Ecology and Evolutionary Biology
1210 University of Oregon
University of Oregon
Eugene, OR 97403
[EMAIL PROTECTED]
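
One possible approach, sketched here as an assumption rather than an established answer: keep the logistic shape but multiply it by an upper asymptote a, so that p(x) = a / (1 + exp(-(b0 + b1*x))), and estimate all three parameters by maximising the binomial likelihood with optim() instead of glm(). The function name negll, the parameter order and the starting values are invented for this illustration.

```r
tmp <- data.frame(
    x   = 1:14,
    yes = c(0, 0, 0, 2, 1, 14, 24, 15, 23, 18, 22, 20, 14, 17),
    no  = c(94, 101, 95, 80, 81, 63, 51, 56, 30, 38, 31, 18, 21, 20))

# Negative binomial log-likelihood of a logistic curve scaled by an
# asymptote. par[1] is the asymptote on the logit scale, which keeps
# the asymptote itself strictly between 0 and 1.
negll <- function(par, d) {
    a <- plogis(par[1])
    p <- a * plogis(par[2] + par[3] * d$x)
    -sum(dbinom(d$yes, d$yes + d$no, p, log = TRUE))
}

# Start from an asymptote of 0.5 and a rising curve (guessed values).
fit <- optim(c(0, -5, 1), negll, d = tmp)
a.hat <- plogis(fit$par[1])     # estimated asymptote

# Observed proportions with the fitted curve drawn on top.
plot(tmp$x, tmp$yes / (tmp$yes + tmp$no), ylim = c(0, 1),
     xlab = "x", ylab = "proportion yes")
lines(tmp$x, a.hat * plogis(fit$par[2] + fit$par[3] * tmp$x))
```

The ordinary logit fit is the special case where the asymptote is fixed at 1, so the two fits can be compared on the same plot.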



[R] frequency tables

2005-06-20 Thread Kevin J Emerson
R-masters,

I have a problem that I have been working on for a while and it seems
that there may be a simple solution that I have yet to figure out, so I
thought that I would venture to post to the help list.

Let's say there was a data.frame with three vectors, two that are
factors identifying the data, and one that holds the frequency of
occurrence (the events are binary, yes or no).  I would like to perform
logistic regression on this data, and it seems that I need a vector of
0s and 1s for input into lrm.  How might I convert between a frequency
table and a vector of binary data while still maintaining all identifier
information?

I have thought about using the rep command over and over again and
basically building the data.frame by hand but that seems long and
tedious.  Is there a quick and dirty way of doing this?

Thanks in advance!
Kevin
-- 


Kevin J Emerson
Center for Ecology and Evolutionary Biology
1210 University of Oregon
University of Oregon
Eugene, OR 97403
[EMAIL PROTECTED]
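
A minimal sketch of one way to expand such a table, assuming it holds separate counts of yes and no events: rep() is called once on the row indices, so the identifying factors are carried along automatically. The data frame freq and its columns site, trt, yes and no are hypothetical.

```r
# Hypothetical frequency table: two identifying factors plus counts.
freq <- data.frame(site = factor(c("a", "a", "b", "b")),
                   trt  = factor(c("x", "y", "x", "y")),
                   yes  = c(2, 0, 1, 3),
                   no   = c(1, 2, 2, 0))

# Repeat each row index once per observed event (yes + no).
idx <- rep(seq_len(nrow(freq)), times = freq$yes + freq$no)
long <- freq[idx, c("site", "trt")]

# Within each original row: 'yes' ones followed by 'no' zeros.
long$y <- unlist(Map(function(y, n) rep(c(1, 0), times = c(y, n)),
                     freq$yes, freq$no))
rownames(long) <- NULL
print(long)
```

The resulting long data frame has one row per event with a 0/1 response y. If glm() is an option, a binomial fit also accepts the counts directly as cbind(yes, no), which avoids the expansion.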
