date:20111021

[R] cph/nomogram Design/RMS package hazard ratio: interquartile vs per unit

2011-10-21 Thread renee

Hello,

I am constructing a nomogram using the cph and nomogram commands in Dr.
Harrell's Design/RMS package. The HRs that I obtain for dichotomous and
categorical variables are identical to those that I obtain using STATA
stcox. However, the inter-quartile HR I obtain for continuous variables is
obviously different, since STATA gives me the HR for each unit (year,
centimeter, etc.), as coxph would. My question is whether this will affect
the output of the nomogram. I'm assuming that the nomogram is constructed using
the hazard per unit rather than per inter-quartile range - is this true?

Also, I've found that I do not need to create indicator variables for my
categorical variables when I use cph. Is this also correct? 
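
Both points in this question can be checked numerically. As far as I know, summary() on an rms fit reports the effect of a continuous predictor across its interquartile range by default, whereas the fitted model itself (and therefore the nomogram, which is drawn from the fitted linear predictor) carries a single per-unit coefficient; the two hazard ratios are related by HR_IQR = HR_unit^IQR. The sketch below is only an illustration under assumptions: it uses coxph from the survival package and its lung data as stand-ins (neither appears in the post), on the assumption that cph fits the same Cox model. It also passes a factor straight into the formula, without hand-made indicator variables.

```r
library(survival)

# lung ships with survival; sex is wrapped in factor() so that R builds the
# indicator (dummy) column itself.
fit <- coxph(Surv(time, status) ~ age + factor(sex), data = lung)

beta_age <- unname(coef(fit)["age"])   # log hazard ratio per year of age
iqr_age  <- IQR(lung$age)              # 75th minus 25th percentile of age

hr_unit <- exp(beta_age)               # per-unit HR, the kind stcox/coxph print
hr_iqr  <- exp(beta_age * iqr_age)     # HR across the interquartile range

c(per_unit = hr_unit, per_iqr = hr_iqr)
```

The two numbers differ only in the size of the change in age they describe; the underlying coefficient is the same.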

I appreciate your feedback. Thank you.

~Renee

--
View this message in context: 
http://r.789695.n4.nabble.com/cph-nomogram-Design-RMS-package-hazard-ratio-interquartile-vs-per-unit-tp3923896p3923896.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Foreach (doMC)

2011-10-21 Thread Jannis


Dear list members, dear Jay,

Well, I personally do not care about Revolution Analytics selling their 
products, as this is also part of the idea of many open source 
licences. Especially as Revolution provide their packages to the 
community and it is everybody's personal choice to buy their special R 
version.


I was just wondering about this issue as usually most questions on 
r-help are answered pretty soon and by many different people and I had 
the impression that this is not the case for posts regarding the 
foreach/doMC/doSMP etc packages. This may, however, be also due to the 
probably limited use of these packages for most users who do not need 
these high performance computing things. Or it was just my personal 
perception or pure chance.


Thanks, however, to the authors of such packages! They were of great help 
to me on several occasions and I have deep respect for everybody devoting 
their time to open source software!


Jannis



On 10/19/2011 01:26 PM, Jay Emerson wrote:

P.S. Is there any particular reason why there are so seldom answers to posts 
regarding foreach and all these doMC/doSMP packages ?  Do so few people use 
these packages or does this have anything to do with the commercial origin of 
these packages?

Jannis,

An interesting question.  I'm a huge fan of foreach and the parallel
backends, and have used foreach in some of my packages.  It leaves the
choice of backend to the user, rather than forcing some environment.
If you like multicore, great -- the package doesn't care.  Someone
else may use doSNOW.  No problem.

To answer your question, foreach was originally written by (primarily,
at least) Steve Weston, previously of REvolution Computing.  It, along
with some of the parallel backends (perhaps all at this point, I'm out
of touch) are available open-source.  Hence, I'd argue that the
commercial origin is a moot point -- it doesn't matter, it will
always be available, and it's really useful.  Steve is no longer with
REvolution, however, and I can't speak for the responsiveness/interest
of current REvolution folks on this point.  Scanning R-help daily for
things relating to my own packages is something I try to do, but it
doesn't always happen.

I would like to think foreach is widely used -- it does have a growing
list of reverse depends/suggests.  And was updated as recently as last
May, I just noticed.
http://cran.r-project.org/web/packages/foreach/index.html

Jay




Re: [R] identifying groups in xyplot

2011-10-21 Thread wisc_maier

There's a great tutorial online that helped me out a lot - Lattice and Other
Graphics in R, by J H Maindonald at the Centre for Mathematics and Its
Applications at Australian National University.

http://maths.anu.edu.au/~johnm/r-book/2edn/xtras/rgraphics.pdf

I gave my lattice object a name, and then I was able to superimpose changes
via the update fcn. I'm sure there are many other ways to do this, but this
was very simple to follow and delivered results quickly.

library(lattice)
Fieldplots = xyplot(Hill.s.diversity ~ Year | Field, groups = Management,
                    layout = c(2, 3),
                    data = summer_pr_avg,
                    auto.key = TRUE)
Fieldplots
update(Fieldplots,
       main = "Hill's evenness by Field, June 09-11",
       par.settings = simpleTheme(pch = c(1, 3, 4)))

--
View this message in context: 
http://r.789695.n4.nabble.com/identifying-groups-in-xyplot-tp3922985p3923338.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Scatterplot with the 3rd dimension = color?

2011-10-21 Thread Kerry

If it would help get any assistance with my issue, here's another
method I'm trying (using R sample data):

ggplot(mtcars, aes(disp)) +
  geom_point(aes(y = mpg, colour = qsec)) +
  scale_colour_gradient(low = "yellow", high = "green") +
  geom_point(aes(y = cyl, colour = qsec)) +
  scale_colour_gradient(low = "red", high = "blue")

What I want is the var mpg to be colored by the var qsec from
yellow to green and then the var cyl to be colored by the var qsec
from red to blue. Instead, both colors end up being from red to
blue.

Thanks again,
kb


Re: [R] Aggregating data help

2011-10-21 Thread F Mai

check this out
http://www.r-bloggers.com/pivot-tables-in-r/

--
View this message in context: 
http://r.789695.n4.nabble.com/Aggregating-data-help-tp3923138p3923397.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Vegan: Anova.CCA accessing original data using option by=

2011-10-21 Thread Steve Pawson

Dear Jari,

Thank you for your quick reply and the time you have spent assisting with this 
problem. Indeed the alias tool identifies one variable that when removed from 
the capscale model solves the problem.

Once again greatly appreciate your assistance.

Regards

Steve Pawson
Scientist (Entomology)
Scion
Forestry Rd, P.O. Box 29-237, Christchurch, New Zealand
DDI +64 (0)3 3642987 Ext 4832
Cell +64 (0)27 4400727
www.scionresearch.com

From: Jari Oksanen [via R] [mailto:ml-node+s789695n3921456...@n4.nabble.com]
Sent: Thursday, 20 October 2011 11:18 p.m.
To: Steve Pawson
Subject: Re: Vegan: Anova.CCA accessing original data using option by=

Steve Pawson Steve.Pawson at scionresearch.com writes:


 My apologies for the delay in responding to your request for further
 information. I have been travelling for work since you replied and have
 only just returned to email contact.

 The output from the traceback is as follows
 # This is the capscale model that I called
 beetlecap <- capscale(log(beetles+1) ~ size + Clearfell + Absolute.Distance +
   Distance_from_edge + clearfell.harvest_area + Canopy.Cover + X500mnative +
   Litter3 + X500mexotic + X5000exotic +
   Condition(AdjLong + AdjLat + AdjLat.2 + AdjLat.2.long + AdjLong.3), environ,
   distance = "bray")

 This is the ANOVA by margin option with the error
 anova(beetlecap, by = "margin")
 Error in dimnames(x) <- dn :
   length of 'dimnames' [2] not equal to array extent

 Corresponding traceback
 traceback()
 9: `colnames<-`(`*tmp*`, value = c("CAP1", "CAP0"))
 8: capscale(formula = log(beetles + 1) ~ size + Clearfell + Absolute.Distance +
   Distance_from_edge + clearfell.harvest_area + Canopy.Cover +
   X500mnative + Litter3 + X500mexotic + X5000exotic + Condition(AdjLong +
   AdjLat + AdjLat.2 + AdjLat.2.long + AdjLong.3) + Condition(size +
   Clearfell + Absolute.Distance + Distance_from_edge +
   clearfell.harvest_area +
   Canopy.Cover + Litter3 + X500mexotic + X5000exotic + AdjLong +
   AdjLat + AdjLat.2 + AdjLat.2.long + AdjLong.3), data = environ,
   distance = "bray")
 [...clip...]

Dear Steve Pawson,

With the help of this message I was able to construct an example that gives
the same error message -- this does not prove that the cause of the problem
is the same, but it is possible.

It may be that your *huge* model has redundant variables that cannot be
analysed in a marginal test: the other variables explain everything, and the
marginal effect of some variables is zero. With as high a number of variables
as you have, this is very likely. It seems that capscale() cannot cope with
this case.

I fixed capscale in http://vegan.r-forge.r-project.org and it now handles
these redundant variables smoothly (skips them in the permutation test, and
reports df=0). From your point of view it may be unfortunate that I released
a new version of vegan a couple of hours before checking R-News mail, and
therefore this fix is not yet in a release, and as we just had a release
we probably (hopefully) will not have a new revision very soon. So your
choices are either to use the vegan version in R-Forge (which must be at
least r1958) or to simplify your model so that you don't have redundant
variables.
One way of achieving this is to use command

alias(beetlecap, names = TRUE)

which will list the names of the variables that cannot be analysed. You
can remove these variables without influencing your fitted model, because
they really are redundant variables.

Cheers, Jari Oksanen


[R] Collapse UP a dendrogram

2011-10-21 Thread jshrager

I created a dendrogram (ddg0) using hclust in the usual way. I want to
collapse UP the tree in various ways, that is, from the leaves up to the
root. Optimally, I would give the id of a member of a final split in ddg0,
and get back a new ddg1 with that split collapsed. Alternatively, I could give
a depth to collapse up to (such that ddg1 would have n fewer levels than ddg0).
That sort of thing. I could, of course, program this myself, but it seems like
something that is so obviously needed that there is very likely to be a
package that does it already. Is there? TIA, Jeff
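
For the depth-based variant of this request, base R may already be enough: the cut() method for dendrogram objects splits a tree at a given height and returns the part above the cut, with every branch below it collapsed into a single leaf. What follows is a minimal sketch, assuming that a height threshold is an acceptable stand-in for "depth". The data (USArrests) and the choice of the median merge height are illustrative only, and collapsing one particular named split would still need custom code.

```r
hc   <- hclust(dist(USArrests[1:10, ]))   # example tree on 10 observations
ddg0 <- as.dendrogram(hc)

h     <- median(hc$height)                # illustrative cut height
parts <- cut(ddg0, h = h)                 # list with $upper and $lower

ddg1 <- parts$upper                       # branches below h are now single leaves
n_before <- attr(ddg0, "members")         # number of leaves in the full tree
n_after  <- attr(ddg1, "members")         # number of leaves after collapsing

if (interactive()) plot(ddg1)
```

The collapsed sub-trees are not lost: they are returned in parts$lower, one dendrogram per new leaf of ddg1.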

--
View this message in context: 
http://r.789695.n4.nabble.com/Collapse-UP-a-dendrogram-tp3923907p3923907.html
Sent from the R help mailing list archive at Nabble.com.


[R] 'Apply' giving me errors

2011-10-21 Thread kickout

So I have a simple function:

bmass = function(y) {
  weight = y$WT * y$MSTR
  return(weight)
}

And I want to apply it to a whole bunch of rows in my data.frame:

final1 = apply(final, 1, bmass)


BUT... I receive the following error:
Error in y$WT : $ operator is invalid for atomic vectors


However, when I try:
 final[1,]$WT*final[1,]$MSTR
[1] 156.3


It gives me the correct answer... what is apply not liking in my code?

Thanks



--
View this message in context: 
http://r.789695.n4.nabble.com/Apply-giving-me-errors-tp3923880p3923880.html
Sent from the R help mailing list archive at Nabble.com.


[R] plotting average effects.

2011-10-21 Thread poliscigradstudent

hi...  i am a phd student using r.  i am having difficulty plotting average
effects.  admittedly, i am not really understanding what each of the
commands mean so when i get the error i am not sure where the issue is.

here is my code...  i will include the points at which there are errors

> dat2 <- dat3 <- dat
> dat2$popc100 <- dat2$popc100 + 1000
> dat2$popc100[which(dat2$popc100 > max(dat$popc100))] <- max(dat$popc100)
> dat3$popc100 <- dat$popc100 - 1000
> dat3$popc100[which(dat3$popc100 < min(dat$popc100))] <- min(dat$popc100)
> pred1 <- predict(mod, type="response")
> pred2 <- predict(mod, newdata=dat2, type="response")
> pred3 <- predict(mod, newdata=dat3, type="response")
> pop.group <- cut(dat$popc100kpc, breaks=quantile(dat$popc100kpc,
+   seq(0,1,by=.3)), include.lowest=T)
> means <- by(cbind(pred1, pred2, pred3), list(pop.group), apply, 2, mean)
> means <- do.call(rbind, means)


> par(mar=c(7,4,4,2))
> plot(c(1,10), range(c(means)), type="n", xlab="",
+   ylab="Predicted Probability", axes=F)
> plot(c(1,10), range(c(means)), type="n", xlab="pop pc by 100k",
+   ylab="Predicted Probability", axes=F)
> arrows(1:10, means[,1], 1:10, means[,2], code=2, length=.1)
> arrows(1:10, means[,1], 1:10, means[,3], code=2, length=.1, col="red")
> points(1:10, means[,1], pch=16)
Error in xy.coords(x, y) : 'x' and 'y' lengths differ


as i understand it, i need to change the means[,1]  i have tried a few
combos and i am not getting anywhere...  

further, my arrows are huge and points are not appearing in my plot.

is there anywhere i can find a break down of each of these commands and what
each part means?  i understand the lengths, colors, xlab, ylab, etc etc.

thanks in advance for any insight you can give me.
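
One possible reading of the error, offered as a sketch rather than a diagnosis of the actual data: seq(0,1,by=.3) yields only four quantile break points (0, 0.3, 0.6, 0.9), so cut() creates three groups and means ends up with three rows, while the plotting calls use 1:10 on the x axis. The example below reproduces that with simulated values; x, pred and the number of groups are made up for illustration.

```r
set.seed(1)
x <- runif(200)                        # stand-in for dat$popc100kpc

probs <- seq(0, 1, by = 0.3)           # 0.0 0.3 0.6 0.9: four breaks, three groups
grp3  <- cut(x, breaks = quantile(x, probs), include.lowest = TRUE)
n_groups  <- nlevels(grp3)
n_dropped <- sum(is.na(grp3))          # values above the 90% quantile get no group

# Ten groups need eleven break points
grp10 <- cut(x, breaks = quantile(x, seq(0, 1, length.out = 11)),
             include.lowest = TRUE)
pred  <- cbind(pred1 = x, pred2 = x + 0.1, pred3 = x - 0.1)   # made-up predictions
means <- apply(pred, 2, function(p) tapply(p, grp10, mean))

# Take the x positions from the data instead of hard-coding 1:10
k <- nrow(means)
if (interactive()) {
  plot(c(1, k), range(means), type = "n", xlab = "group", ylab = "mean")
  arrows(seq_len(k), means[, 1], seq_len(k), means[, 2], code = 2, length = 0.1)
  points(seq_len(k), means[, 1], pch = 16)
}
```

Using seq_len(nrow(means)) keeps the x positions and the group means the same length whatever number of groups is chosen.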


http://r.789695.n4.nabble.com/file/n3923982/effplot_copy.jpg 

--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3923982.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] plotting average effects.

2011-10-21 Thread poliscigradstudent

let me clarify, i understand what differing x, y lengths mean. i understand
the concept of average effects, etc.  i just don't understand how one would
fix it.

thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3924003.html
Sent from the R help mailing list archive at Nabble.com.


[R] geom_tile rendering problems

2011-10-21 Thread rayzlor

Hi, I'm trying to overlay a geographical map with a heat map by following the
directions on
http://pages.stern.nyu.edu/~achinco/programming_examples/Example__PlotGeographicDensity.html.

However, the smaller my zoom level (the farther I zoom out), the more white
horizontal lines I have interspersed in the tiled data after calling
geom_tile.  Is there any way around this?  I tried setting the image to
panel.background, but found it impossible to scale back to the original heat
map matrix.

--
View this message in context: 
http://r.789695.n4.nabble.com/geom-tile-rendering-problems-tp3924100p3924100.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Ordered probit model -marginal effects and relative importance of each predictor-

2011-10-21 Thread poliscigradstudent

late to the game but maybe this will help:

https://docs.google.com/viewer?a=vq=cache:8PhCkZxP9zQJ:www.quantoid.net/Effects_package_4up.pdf+plot+effects+of+variables+in+R+nonlinear+arrowshl=engl=uspid=blsrcid=ADGEEShUKOEcifzuWGxWvakh0yD4KtnLgBhLFvX5cCAkwewyQ75uznTw1OYybx6vrGuJflgMw6QYKGwuXQViNGZCh_lt8H4DAqKNPxI8y2hYTQJTyaMD4tZ0DKrYMNGtIY3B34qp-LBXsig=AHIEtbTdt10fNCOPXtg5nPSs85NqPFlAvA

--
View this message in context: 
http://r.789695.n4.nabble.com/Ordered-probit-model-marginal-effects-and-relative-importance-of-each-predictor-tp3773504p3924287.html
Sent from the R help mailing list archive at Nabble.com.


[R] re coercing data frame rows to character: Am I right that this is a bug?

2011-10-21 Thread andrewH

Dear Folks--
All this seems to me to behave the way you expect, recognising that column b
is a factor:
> AA <- data.frame(a=3:4, b=c('x', 'y'))
> AA[1,]
  a b
1 3 x
> as.numeric(AA[1,])
[1] 3 1
> AA[,2]
[1] x y
Levels: x y
> as.numeric(AA[,2])
[1] 1 2
> as.character(AA[,2])
[1] "x" "y"

But this seems to me to be wrong:
> as.character(AA[1,])
[1] "3" "1"

Shouldn't it be:
[1] "3" "x"
to be consistent with the normal pattern of coercing factors to character
values?
If it is a bug, is this the right place to post it?
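
A likely explanation, stated with some caution: AA[1,] is still a data frame, that is, a list of columns, and as.character() applied to the list as a whole does not go through the factor method, so the factor's integer code is what comes out. Converting column by column gives the labels. The sketch below sets stringsAsFactors = TRUE explicitly, because from R 4.0.0 onwards data.frame() no longer creates factors by default; the vapply() and lapply() workarounds are suggestions, not something proposed in the thread.

```r
AA <- data.frame(a = 3:4, b = c("x", "y"), stringsAsFactors = TRUE)
row1 <- AA[1, ]

# Convert each column on its own, so the factor method of as.character() is used
labels_vapply <- vapply(row1, as.character, character(1))
labels_unlist <- unlist(lapply(row1, as.character))

labels_vapply
```

Both results are named character vectors with one element per column of the row.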

sincerely, andrewH

--
View this message in context: 
http://r.789695.n4.nabble.com/re-coercing-data-frame-rows-to-character-Am-I-right-that-this-is-a-bug-tp3924449p3924449.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] stop R from rounding

2011-10-21 Thread Martin Maechler

 David Winsemius dwinsem...@comcast.net
 on Thu, 20 Oct 2011 01:51:28 -0400 writes:

 On Oct 19, 2011, at 11:29 PM, Alyse wrote:

 Hello,
 
 I have a column in a data frame that need to be 10 digits long.  As  
 such:
 
 Decimal.Year
 1  1994.25997
 2  1994.26020
 
 However, R keeps rounding the digits.  As such:
 
 Decimal.Year
 1  1994.260
 2  1994.260
 
 *Is there any way to stop this from happening?*
 
 Here is how I created the data frame:
 
 x <- read.table('bats_1994_CTD.txt')
 colnames(x) <-
   c('Cruise','Dec.Year','Lat.N','Long.W','Press','Depth','Temp','Sal','Oxy')
 date <- subset(x, select=c(Dec.Year), (Depth<201) & (Depth>199))
 datelist <- list(date$Dec.Year)
 temp <- subset(x, select=c(Temp), (Depth<201) & (Depth>199))
 tempmean <- aggregate(temp, by=datelist, FUN=mean)
 # the first column of this data frame is the one that I don't want R to round
 tempframe <- data.frame(tempmean)

 R is not rounding. It is displaying with less than full precision. You  
 can control that with format or sprintf or formatC.

Well, or more simply in such situations by

  options(digits = 10) # if it's  10 (significant) digits you want
   # uses 10 (sig..) digits *FROM NOW ON*
   
or, if it's just for this one printing,
instead of saying
  
x

which is *equivalent* to  print(x), use

   print(x, digits = 10)
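
The options mentioned in this thread can be compared side by side. This is a small sketch using the two values from the question in a made-up data frame; the point it illustrates is that only the display changes, while the stored numbers keep their full precision.

```r
x <- data.frame(Decimal.Year = c(1994.25997, 1994.26020))

print(x)                      # default display: 7 significant digits
print(x, digits = 10)         # more digits for this one call only

# Character versions with a fixed number of decimals
shown <- sprintf("%.5f", x$Decimal.Year)
shown

# The underlying values were never rounded
diff_stored <- x$Decimal.Year[2] - x$Decimal.Year[1]
diff_stored
```

sprintf() and format() return character strings, so they are best kept for output and not used in further arithmetic.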


[R] use of segments in PLS

2011-10-21 Thread arunkumar1111

How do I use the segments argument in PLS?

fit1 <- mvr(formula=Y~X1+X2+X3+X4+x5++x27, data=Dataset, comp=5, segment=7)

Here, when I use segments, the error was like this:

Error in mvrCv(X, Y, ncomp, method = method, scale = sdscale, ...) : 
  argument 7 matches multiple formal arguments


Please help


--
View this message in context: 
http://r.789695.n4.nabble.com/use-of-segments-in-PLS-tp3924397p3924397.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] bar plot issues

2011-10-21 Thread Uwe Ligges




On 20.10.2011 22:29, Henri-Paul Indiogine wrote:

Hi Uwe!

2011/10/20 Uwe Ligges <lig...@statistik.tu-dortmund.de>:

arrange it outside by, e.g. increasing the size of margins (see argument
mar in ?par) and place a separate legend (see ?legend) into the margins
(see xps argument in ?par).


I could not find 'xps',  do you mean 'xpd'?


Yes, sorry, a typo.

Best,
Uwe




This is what I have so far:


par(mar=c(5.1,4.1,4.1,12.1))
barplot(t(file.codes), beside = FALSE, legend = FALSE, main="test stacked bar plot",
        xlab="documents", ylab="number of codes", col=rainbow(ncol(file.codes)),
        names.arg = rep(NA, nrow(file.codes)))
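
The advice quoted above (a wider right margin, a separate legend() call, and xpd so that drawing is not clipped to the plot region) might be combined roughly as follows. This is a sketch with made-up counts; the inset value of -0.45 is an arbitrary choice that would need tuning for a real figure.

```r
# Made-up counts: two documents, five codes
file.codes <- matrix(c(2,  0, 0, 5, 4,
                       3, 18, 1, 0, 2),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("file.1", "file.2"),
                                     paste0("code.", 1:5)))

# When run non-interactively, draw to a null device instead of a file
if (!interactive()) pdf(NULL)

cols <- rainbow(ncol(file.codes))
op <- par(mar = c(5.1, 4.1, 4.1, 12.1), xpd = TRUE)  # wide right margin, no clipping
mids <- barplot(t(file.codes), beside = FALSE, col = cols,
                main = "test stacked bar plot",
                xlab = "documents", ylab = "number of codes")
# A negative horizontal inset pushes the legend out of the plot region
legend("topright", inset = c(-0.45, 0), legend = colnames(file.codes),
       fill = cols, title = "codes")
par(op)                                              # restore previous settings
```

barplot() returns the bar midpoints, which is handy if labels have to be added by hand afterwards.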


danke,
Henri-Paul




[R] R square and F - stats in PLS

2011-10-21 Thread arunkumar1111

For lm, summary(lmobject) gives the adjusted R-squared and the F
statistic.

Is there something similar in the pls package, and how do I get it?

--
View this message in context: 
http://r.789695.n4.nabble.com/R-square-and-F-stats-in-PLS-tp3924484p3924484.html
Sent from the R help mailing list archive at Nabble.com.


[R] POT package

2011-10-21 Thread Amina Shahzadi

Hi Sir

It is requested to please tell the reason why the range c(20945, 20947)
is used in this function:

> npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"],
+ na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))

Please explain the logic.

Looking for quick response.

Regards


-- 
*Amina Shahzadi*


Re: [R] stacked plot

2011-10-21 Thread Dennis Murphy

It appears that your object is currently a matrix. Here's a toy
example to illustrate how to get a stacked bar chart in ggplot2:

library('ggplot2')
m <- matrix(1:9, ncol = 3, dimnames = list(letters[1:3], LETTERS[1:3]))
(d <- as.data.frame(as.table(m)))
  Var1 Var2 Freq
1    a    A    1
2    b    A    2
3    c    A    3
4    a    B    4
5    b    B    5
6    c    B    6
7    a    C    7
8    b    C    8
9    c    C    9

ggplot(d, aes(x = Var1, y = Freq, fill = Var2)) +
   geom_bar(position = 'stack', stat = 'identity') +
   labs(x = 'Variable 1', y = 'Frequency', fill = 'Group') +
   scale_fill_manual(values = c('A' = 'red', 'B' = 'blue', 'C' = 'green'))

This plot uses Var1 as the x-variable, Freq as the response and Var2
as the variable whose frequencies are to be stacked, distinguished by
fill color. position = 'stack' designates the stacking while stat =
'identity' indicates that the y variable Freq should be used to
represent the counts.
labs()  designates the labels for each axis; the fill = label
indicates the legend title for the fill colors. Finally, the
scale_fill_manual() function is used to manually assign specific
colors to levels of the fill variable Var2. The scale_fill_manual()
code could also have been written as

... +
scale_fill_manual(breaks = levels(d$Var2), values = c('red', 'blue', 'green'))

with the same result.
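
Applied to the wide file.codes layout from the question, the same as.table() route gives the long format without the reshape package. This is a sketch that uses only the two rows shown in the question; the column names file, code and count are made up here, and the plotting step runs only if ggplot2 is installed.

```r
file.codes <- data.frame(code.1 = c(2, 3), code.2 = c(0, 18), code.3 = c(0, 1),
                         code.4 = c(5, 0), code.5 = c(4, 2),
                         row.names = c("file.1", "file.2"))

# One row per (file, code) combination
long <- as.data.frame(as.table(as.matrix(file.codes)))
names(long) <- c("file", "code", "count")
long

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = file, y = count, fill = code)) +
    geom_bar(position = "stack", stat = "identity") +
    labs(x = "File", y = "Number of codes", fill = "Code")
  if (interactive()) print(p)
}
```

Each file becomes one bar and each code one stacked segment, which matches what barplot(t(file.codes)) draws.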

HTH,
Dennis

On Thu, Oct 20, 2011 at 10:08 PM, Henri-Paul Indiogine
hindiog...@gmail.com wrote:
 Hi!

 I am trying to use ggplot2 to create a stacked bar plot.  Previously I
 tried using barplot() but gave up because of problems with the
 positioning of the legend and other appearance problems.   I am now
 trying to learn ggplot2 and use it for all the plots that I need to
 create for my dissertation.

 I am able to create normal bar plots using ggplot2, but I am stumped
 by the stacked bar plots.

 This works:

 barplot(t(file.codes), beside = FALSE)

 the data.frame file.codes looks like this:

         code.1 code.2 code.3 code.4 code.5
 file.1       2      0      0      5      4
 file.2       3     18      1      0      2
 

 I would like each file to be a bar and then each code stacked for each
 file.    By transposing the file.codes data.frame barplot() will allow
 me to do so.   I am trying to obtain the same result in ggplot2  but i
 think that qplot wants the data to be like this:

 file.1 code.1  2
 file.1 code.2  0
 file.1 code.3  0
 file.1 code.4  5
 file.1 code.5  4
 file.2 code.1  3
 file.2 code.2  18
 

 I think that I need to use the package reshape, but I am not sure
 whether to use cast(), melt(), or recast() and how to set up the
 function.

 Thanks,
 Henri-Paul


 --
 Henri-Paul Indiogine

 Curriculum & Instruction
 Texas A&M University
 TutorFind Learning Centre

 Email: hindiog...@gmail.com
 Skype: hindiogine
 Website: http://people.cehd.tamu.edu/~sindiogine


Re: [R] use of segments in PLS

2011-10-21 Thread Bjørn-Helge Mevik

arunkumar akpbond...@gmail.com writes:

 How do I use the segments argument in PLS?

 fit1 <- mvr(formula=Y~X1+X2+X3+X4+x5++x27, data=Dataset, comp=5, segment=7)

 Here, when I use segments, the error was like this:

 Error in mvrCv(X, Y, ncomp, method = method, scale = sdscale, ...) : 
   argument 7 matches multiple formal arguments

This cannot be true.  mvr does not call mvrCv unless you give it the
argument validation = "CV" or validation = "LOO".

Anyway, the argument is segments, not segment, which - as the error
message says - matches multiple arguments, in this case segments and
segment.type.
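
The partial-matching rule behind this error can be reproduced without the pls package. In the sketch below, cv_demo is a hypothetical stand-in whose only relevant property is that it has two formal arguments starting with "segment", as mvrCv does.

```r
# Hypothetical function with two formals sharing the prefix "segment"
cv_demo <- function(segments = 10, segment.type = "random") {
  paste(segments, segment.type)
}

# The full argument name is unambiguous
ok <- cv_demo(segments = 7)

# An abbreviated name matches both formals, so R refuses to guess
err <- tryCatch(cv_demo(segment = 7),
                error = function(e) conditionMessage(e))
ok
err

# With pls, the corrected call would presumably look like this (not run here):
# mvr(Y ~ ., data = Dataset, ncomp = 5, validation = "CV", segments = 7)
```

Spelling argument names out in full avoids this class of error altogether.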

-- 
Regards,
Bjørn-Helge Mevik


Re: [R] R square and F - stats in PLS

2011-10-21 Thread Bjørn-Helge Mevik

arunkumar akpbond...@gmail.com writes:

 For lm, summary(lmobject) gives the adjusted R-squared and the F
 statistic.

 Is there something similar in the pls package, and how do I get it?

No.  Both of these require theory about the model that doesn't exist
for PLSR.  (I should note that a couple of generalisations of the degrees
of freedom to general regression models have been published,
and these could be used to calculate an adjusted R^2.  However, they
have not been implemented in the pls package.)

It seems you would like to use PLSR the way you use OLS, with classical
hypothesis tests and performance statistics.  This is not how PLSR is
usually applied, and there are few such tools.  The traditional/typical
focus amongst PLSR practitioners is much more on prediction performance
(RMSEP) and interpretation by plotting scores and loadings.

-- 
Regards,
Bjørn-Helge Mevik


Re: [R] 'Apply' giving me errors

2011-10-21 Thread Uwe Ligges




On 21.10.2011 02:09, kickout wrote:

So I have a simple function:

bmass = function(y) {
  weight = y$WT * y$MSTR
  return(weight)
}

And I want to apply it to a whole bunch of rows in my data.frame:

final1 = apply(final, 1, bmass)


BUT... I receive the following error:
Error in y$WT : $ operator is invalid for atomic vectors


However, when I try:

final[1,]$WT*final[1,]$MSTR

[1] 156.3


It gives me the correct answer... what is apply not liking in my code?


Because apply passes the rows as vectors into your function, not as a 
data.frame of 1 row.


I wonder why you need apply() at all, since
 final$WT * final$MSTR
should do.
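
Both points can be seen in a small example. The data frame below is made up; only the column names WT and MSTR come from the question.

```r
final <- data.frame(WT = c(10, 20, 30), MSTR = c(0.5, 0.25, 0.1))

# apply() converts the data frame to a matrix and hands each row over as a
# named atomic vector, so `$` fails ...
msg <- tryCatch(apply(final, 1, function(y) y$WT * y$MSTR),
                error = function(e) conditionMessage(e))

# ... while indexing by name with `[` works on such a vector
by_row <- apply(final, 1, function(y) y["WT"] * y["MSTR"])

# The vectorised form needs no apply() at all
vectorised <- final$WT * final$MSTR

msg
vectorised
```

The vectorised version is also the faster one, since it avoids the conversion to a matrix and the per-row function calls.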

Uwe Ligges











Thanks




Re: [R] R code Error : Hybrid Censored Weibull Distribution

2011-10-21 Thread peter dalgaard


On Oct 20, 2011, at 21:25 , ritwi...@isical.ac.in wrote:

 Dear Sir/madam,
 
 I'm getting a problem with a R-code which calculate Fisher Information
 Matrix for Hybrid Censored Weibull Distribution. My problem is that:
 
 when I take weibull(scale=1,shape=2) {i.e. shape>1} I get my desired
 result, but when I take weibull(scale=1,shape=0.5) {i.e. shape<1} it gives
 the error: Error in integrate(int2, lower = 0, upper = t) : the integral is
 probably divergent. I could not find any theoretical interpretation of
 it. I'm sending the code :

The code doesn't work...

 output=f3(5,10)
Error in f(x, ...) : object 'p' not found

Furthermore, if I guess p=.5, lamda=1, n=10, the code doesn't even break:

 output=f3(5,10)
 output
[1] 1.155917

So what do you expect _us_ to do about it?

I strongly suspect that actually testing the code (in a clean R session) would 
have revealed issues causing you not to have to submit the post at all...

-pd

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] replicating SAS's proc rank procedure

2011-10-21 Thread riskcalc

Hi, try this function I've written.
It should be self-explanatory, but let me know if you have any problems.
I've only been using R for a few weeks, so apologies if it's not the most
efficient!

rankit2 <- function(rankvar, cuts, data, factor) {
  ranker <- rankvar
  ranker <- 0
  range <- c(1:cuts)
  range2 <- range/cuts
  # upper quantile cut-off for each of the `cuts` groups
  range3 <- quantile(factor, range2)
  over <- length(factor)

  for (i in 1:over) {
    for (j in 1:cuts) {
      # column 1 holds the value to rank, column 3 receives the group number
      if (data[[i,1]] <= range3[[j]]) {
        data[[i,3]] <- j
        ##test <- j
        ##print(j)
      }
      if (data[[i,3]] > 0) break
    }
  }
  out2 <- data
  return(out2)
}

cars$rank <- 0
try2 <- rankit2(rank, 15, cars, cars$speed)
try2

all the best

Leigh
RCalc partner
www.RCalc.co.uk

--
View this message in context: 
http://r.789695.n4.nabble.com/replicating-SAS-s-proc-rank-procedure-tp820510p3924739.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] R-help Digest, Vol 104, Issue 19

2011-10-21 Thread peter dalgaard


On Oct 21, 2011, at 09:01 , Martin Maechler wrote:

 ARE == Alex Ruiz Euler rruizeu...@ucsd.edu
on Wed, 19 Oct 2011 14:05:16 -0700 writes:
 
ARE Motion supported. Very.
 
ARE On Wed, 19 Oct 2011 15:40:14 +0200
ARE peter dalgaard pda...@gmail.com wrote:
 
 Argh!
 
 Someone please unsubscribe this guy?
 
 He did this over Summer too and still hasn't learned that 1
 recipients of R-help do not care whether he is out of office!
 
 -pd
 
 Well, there are hundreds like him.
 The only difference being that he speaks Hungarian..
 

You might filter on the Subject line being Re: [R] R-help Digest.*, with no 
attention to content. That has an obvious side effect, but maybe not a harmful 
one...

-pd


 Why?  I (as R-* mailing list site maintainer)
 have had (procmail) filters that automatically catch such 'out of office'
 messages, so the 10'000 readers don't have to get them.
 The current set of filters catches  a set of English, French,
 German,.. (and I don't know) messages
 So I have (many!!) filters like this:
 
 :0
 * ^Subject: (Re|Holiday|Vacation): .*[-A-za-z]+ Digest, Vol [1-9][0-9]*, Issue [1-9][0-9]*
 {
  :0B
  * I( will not be reading.*\e?[-]?mail|.* away .* attend to your message when I get)
  mlist-bounced.spool
 }
 
 ---
 but can't start doing that for Hungarian or Chinese or ...
 
 Martin

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

Re: [R] Foreach (doMC)

2011-10-21 Thread Jannis


Jay,


Sorry if my post was not precise enough. I simply wanted to point out 
that I personally have no problem at all with commercial R products, as I 
have the free choice to use them or their open source alternatives. In 
addition, Revolutions is supplying their packages for free to the R 
community, which is great! I was purely curious whether other R users may 
have different opinions, but as you are the only one replying I would 
imagine that this is no problem for most users. I will browse the list 
archive as you suggested to get some impression of this.


So, it is probably time to close this thread rather than beat a dead 
horse? Thanks anyway, Jay, for your detailed explanations of the origin 
of these R packages!



Best
Jannis


On 10/21/2011 02:34 AM, Jay Emerson wrote:

Jannis,

I'm not completely sure I understand your first point, but maybe someone
from REvolution will weigh in.  Nobody is forcing anyone to purchase
any products, and there are attractive alternatives such as the CRAN R
and R Studio (to name two).  This issue has arisen many times on the
various lists and you are welcome to search the archives and read many
very intelligent, thoughtful opinions.

As for foreach, etc... if you have fairly focused questions
(preferably with a reproducible example if there is a problem) and if
you have done reading on examples available on using it, then you
might try joining the r-sig-...@r-project.org group.  Clearly there
are far more users of core R and hence mainstream questions on
r-help are likely to be answered more quickly (on average) than
specialized questions.

Regards,

Jay

On Thu, Oct 20, 2011 at 4:27 PM, Jannis bt_jan...@yahoo.de wrote:

Dear list members, dear Jay,

Well, I personally do not care about Revolutions Analytics selling their
products as this is also included in the idea of many open source
licences. Especially as Revolutions provide their packages to the community
and it is everybody's personal choice to buy their special R version.

I was just wondering about this issue as usually most questions on r-help
are answered pretty soon and by many different people and I had the
impression that this is not the case for posts regarding the
foreach/doMC/doSMP etc packages. This may, however, be also due to the
probably limited use of these packages for most users who do not need these
high performance computing things. Or it was just my personal perception or
pure chance.

Thanks, however, to the authors of such packages! They were of great help to
me on several occasions and I have deep respect for everybody devoting their
time to open source software!

Jannis



On 10/19/2011 01:26 PM, Jay Emerson wrote:

P.S. Is there any particular reason why there are so seldom answers to
posts regarding foreach and all these doMC/doSMP packages ?  Do so few
people use these packages or does this have anything to do with the
commercial origin of these packages?

Jannis,

An interesting question.  I'm a huge fan of foreach and the parallel
backends, and have used foreach in some of my packages.  It leaves the
choice of backend to the user, rather than forcing some environment.
If you like multicore, great -- the package doesn't care.  Someone
else may use doSNOW.  No problem.
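
To illustrate the point about leaving the backend to the user, here is a minimal sketch that is not part of the original posts. It assumes the foreach package may or may not be installed, so it falls back to a plain sapply() result; registerDoSEQ() is the sequential backend that ships with foreach, and the squaring task is purely illustrative.

```r
# Base-R reference result, used as a fallback if foreach is not installed.
squares <- sapply(1:3, function(i) i^2)

if (requireNamespace("foreach", quietly = TRUE)) {
  library(foreach)
  # Register a backend; here the sequential one that ships with foreach.
  registerDoSEQ()
  # The loop itself does not mention the backend at all.
  squares <- foreach(i = 1:3, .combine = c) %dopar% i^2
}

squares
```

Replacing the registerDoSEQ() call with a parallel registration such as registerDoMC() from doMC should change where the iterations run without touching the loop body.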

To answer your question, foreach was originally written by (primarily,
at least) Steve Weston, previously of REvolution Computing.  It, along
with some of the parallel backends (perhaps all at this point, I'm out
of touch) are available open-source.  Hence, I'd argue that the
commercial origin is a moot point -- it doesn't matter, it will
always be available, and it's really useful.  Steve is no longer with
REvolution, however, and I can't speak for the responsiveness/interest
of current REvolution folks on this point.  Scanning R-help daily for
things relating to my own packages is something I try to do, but it
doesn't always happen.

I would like to think foreach is widely used -- it does have a growing
list of reverse depends/suggests.  And was updated as recently as last
May, I just noticed.
http://cran.r-project.org/web/packages/foreach/index.html

Jay








[R] Multiple factorial comparison LSD

2011-10-21 Thread Vera Marjorie E. Velasco

Please help.  I really like R and I have been looking at how to do LSD multiple 
comparison test with data that has more than one factor.  So far, I am 
unsuccessful. Please help!

Me

[R] cullen and Frey graph in fitdistrplus

2011-10-21 Thread patraopedro

Hi,

I’ve come across something that I can’t explain and I would appreciate it if
anyone could have a go at it.
In the library “fitdistrplus” there is a function “descdist” to help with the
decision of choosing a distribution to fit. The same function also allows
bootstrapping, to take into account the uncertainty of the calculated
values.
If you run the line below a few times you will find quite a big variation
each time you run it; in fact that’s what made me suspicious. If I’m
always bootstrapping from the same distribution curve, why do I have such
variation?
A second question comes to mind, and it is probably due to my
lack of knowledge on the subject. If the Cullen and Frey graph is meant to help
with the decision on which distribution to choose, is the line below just
giving me the uncertainty of the estimates of kurtosis and skewness, and
should I ignore all the lines in the graph, as I have already fitted a Weibull
distribution to the original data?



library(fitdistrplus)
descdist(rweibull(1000,shape=13.74286,scale=38.07489),boot=1000)


Cheers
Patrao


--
View this message in context: 
http://r.789695.n4.nabble.com/cullen-and-Frey-graph-in-fitdistrplus-tp3924732p3924732.html
Sent from the R help mailing list archive at Nabble.com.
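
One likely source of the run-to-run variation in the question above is that every call to rweibull() draws a fresh sample, so the data change before any bootstrapping happens. The base-R sketch below illustrates this under that assumption; the helper name skew_kurt is hypothetical, and it uses simple moment estimators rather than whatever descdist computes by default.

```r
# Hypothetical helper: simple moment-based skewness and kurtosis.
skew_kurt <- function(x) {
  z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))
  c(skewness = mean(z^3), kurtosis = mean(z^4))
}

# Two calls without a seed summarise two different random samples ...
a <- skew_kurt(rweibull(1000, shape = 13.74286, scale = 38.07489))
b <- skew_kurt(rweibull(1000, shape = 13.74286, scale = 38.07489))

# ... whereas fixing the seed makes the sample, and so the summary, repeatable.
set.seed(1)
s1 <- skew_kurt(rweibull(1000, shape = 13.74286, scale = 38.07489))
set.seed(1)
s2 <- skew_kurt(rweibull(1000, shape = 13.74286, scale = 38.07489))

identical(s1, s2)
```

Storing the sample once, for example x <- rweibull(1000, shape = 13.74286, scale = 38.07489), and then calling descdist(x, boot = 1000) repeatedly should leave only the bootstrap variability.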

Re: [R] 'Apply' giving me errors

2011-10-21 Thread Kenn Konstabel

On Fri, Oct 21, 2011 at 3:09 AM, kickout kyle.ko...@gmail.com wrote:
 So i have a simple function:

 bmass=function(y){
 weight=y$WT*y$MSTR
 return(bio)
 }

But this just returns bio and since an object with that name is not
defined in the function, it will be looked up in the global
environment (workspace) and if it's not there either, you will get
Error: object 'bio' not found. So even if you could fix the apply
issue it would still not work. But Uwe Ligges showed you don't need
apply to do what you seem to intend here, final$WT*final$MSTR should
work.

But if you do insist on apply for whatever reason then ... apply
converts X (the first argument) to a matrix so you can't use the $
operator any more. The column names are preserved though, so what you
could do is

bmass <- function(y) y["WT"] * y["MSTR"]
apply(final, 1, bmass)
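
A small sketch comparing the two approaches, using a made-up data frame in place of final (the values are illustrative only):

```r
# Made-up stand-in for the poster's data frame.
final <- data.frame(WT = c(10, 20, 30), MSTR = c(0.5, 0.25, 0.1))

# Vectorised: no apply() needed.
direct <- final$WT * final$MSTR

# Row-wise with apply(): each row arrives as a named numeric vector,
# so index by name with [ ] rather than with $.
bmass <- function(y) y["WT"] * y["MSTR"]
rowwise <- apply(final, 1, bmass)

all.equal(direct, unname(rowwise))
```

The vectorised form is shorter and avoids the conversion to a matrix, which matters if the data frame also holds non-numeric columns.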

 And want to apply to a whole bunch of rows in my data.frame:

 final1=apply(final,1,yldbu)

What is yldbu? I suppose you meant the function you defined above?

Kenn



 BUT...recieve the following error:
 Error in y$WT : $ operator is invalid for atomic vectors


 However when i try:
 final[1,]$WT*final[1,]$MSTR
 [1] 156.3


 It gives me the correct answer... what is apply not liking in my code?

 Thanks



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Apply-giving-me-errors-tp3923880p3923880.html
 Sent from the R help mailing list archive at Nabble.com.

Re: [R] Scatterplot with the 3rd dimension = color?

2011-10-21 Thread Jim Lemon


On 10/21/2011 06:25 AM, Kerry wrote:

Can someone please help me out with this? The ggplot2 suggestion works
great but I've spent a few days trying to figure out how to plot 2
variables with it and I'm stuck. Here's my example code:
...


Hi Kerry,
This isn't ggplot2, but it may do what you want.

library(plotrix)
oldmar<-par(mar=c(5,4,4,4))
plot(x,y,type="n")
plotlim<-par("usr")
rect(plotlim[1],plotlim[3],plotlim[2],plotlim[4],col="lightgray")
grid(col="white")
box()
points(x,y,col=color.scale(z,c(1,0),0,c(0,1)),pch=19)
points(x1,y2,col=color.scale(z3,1,c(0,1),0),pch=19)
legendval1<-seq(min(z),max(z),length.out=5)
color.legend(2.9,0.5,3.1,1.5,round(legendval1,1),align="rb",gradient="y",
 rect.col=color.scale(legendval1,c(1,0),0,c(0,1)))
legendval2<-seq(min(z3),max(z3),length.out=5)
color.legend(2.9,-1.5,3.1,-0.5,round(legendval2,1),align="rb",gradient="y",
 rect.col=color.scale(legendval2,c(1,1),c(0,1),0))
par(xpd=TRUE)
text(3,1.6,"z")
text(3,-0.4,"z3")
par(xpd=FALSE,oldmar)

Jim

Re: [R] What does \Sexpr[results=rd]{} exactly mean in Rd?

2011-10-21 Thread Duncan Murdoch


On 11-10-17 9:53 AM, Yihui Xie wrote:

Thanks a lot! Sorry for cross-posting, but I did it intentionally
because I tend to believe Barry Rowlingson ("Why R-help Must Die!"), and
I will later summarize the answers given here on StackOverflow.

Another user also told me this worked for 2.13.1, but not later versions.


This should now be fixed.  Could you please test a version of 2.14.0 
beta or R-devel, after r57531?


Thanks.

Duncan Murdoch



Regards,
Yihui
--
Yihui Xiexieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Mon, Oct 17, 2011 at 8:45 AM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:

On Sun, 2011-10-16 at 19:36 -0500, Yihui Xie wrote:

Hi,

I have spent a few hours on the R-exts manual and the documentation of
parse_Rd() (as well as the PDF document in the references), but I
still have not figured out what results=rd means. I thought I could
use an R code fragment to create an Rd fragment dynamically. Here is
an example, in which I expected the output to be a description list
(<DL>) in HTML, but that turns out not to be the case.


Perhaps best not to cross post to several internet resources at once. I
replied to the same Q on StackOverflow:

http://stackoverflow.com/q/7788628/429846

Suffice it to say that your example works for me with 2.13.1 (still need
to compile 2.13.2 on my workstation). I left some additional comments
and examples, which might help understand this. I had trouble when I
first started playing this and didn't pursue further, but I think I am
starting to understand how to use this now after taking a look when I
tried to answer your Q.

G


(I was actually building a package with Rd's containing \Sexpr{}
instead of really using Rd2HTML(); the content was not rendered after
I run R CMD build.)

des <- "\\describe{\\item{def}{ghi}}"
con <- textConnection(c("\\title{abc}\\name{abc}",
 "\\details{\\Sexpr[results=rd,stage=build]{des}}"))
z <- parse_Rd(con)
Rd2HTML(z, stages = "build")
close(con)

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>R: abc</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" type="text/css" href="R.css">
</head><body>

<table width="100%" summary="page for abc"><tr><td>abc</td><td
align="right">R Documentation</td></tr></table>

<h2>abc</h2>

<h3>Details</h3>

<p>defghi</p>


</body></html>



sessionInfo()

R version 2.13.2 (2011-09-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
  [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] tools stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
[1] devtools_0.4

loaded via a namespace (and not attached):
[1] RCurl_1.6-10


Thanks!

Regards,
Yihui
--
Yihui Xiexieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%




Re: [R] plotting average effects.

2011-10-21 Thread gradstudent

Hi Denins, and thanks for your reply.

I understand x,y are not lining up.  I just don't know how to fix it in the
code.

There is only a small group of us at my university using R (4 people of
which I am one).  2 are not even touching the average effects plot option,
however myself and my study partner feel it is best.  So, we really don't
have anyone to ask.  Everyone else in our class was taught on STATA.  The
reasons why are sort of complicated and I don't wish to bore you with
details. Basically, we were the first group to be trained using R.  This is
our 3rd semester using it.

Is there an online guide anywhere that will describe exactly what is going
on in the plot function for average effects?  I have been googling and have
not come across anything useful, except this site.

-
Ph.D. Candidate
--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925293.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Change column/row-name

2011-10-21 Thread R. Michael Weylandt

I'm not sure I follow: the matrix Iske doesn't have row or column
names... though if you perhaps mean you want to use the pasted-together
rows as names on the distance matrix rather than the
converted characters, this will do it:

# Perhaps subtract out the 33 you added in
Iske.rows <- apply(Iske, 1, paste, collapse = "")
dimnames(Iske.levens) <- list(Iske.rows, Iske.rows)
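
As a self-contained sketch of the same idea, the example below uses adist() from base R's utils package as a stand-in for vwr's levenshtein.distance (an assumption made so that it runs without extra packages), on a small made-up matrix rather than Iske:

```r
# Small made-up matrix; each row will become one string.
m <- matrix(c(1, 1, 2,
              2, 2, 3,
              1, 5, 3), ncol = 3)

# One printable string per row (adding 33 shifts the codes into printable ASCII).
m.char <- apply(m + 33, 1, function(x) rawToChar(as.raw(x)))

# Pairwise Levenshtein distances between the row strings.
d <- adist(m.char)

# Label rows and columns with the original numbers pasted together.
labels <- apply(m, 1, paste, collapse = "")
dimnames(d) <- list(labels, labels)
d
```

Because the labels are built from the original matrix, they stay readable even when the character versions contain awkward symbols such as quotes.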

On Fri, Oct 21, 2011 at 1:57 AM, Jörg Reuter jo...@reuter.at wrote:
 Hi,
 I am very happy. My problems are solved without one little thing:

 (Iske <- matrix(c(1, 1, 1, 2, 2, 2, 1, 1, 1, 5, 1, 2, 2, 2, 1, 1, 1,
 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 4, 4, 4, 4, 4, 4, 2,
 2, 2, 2, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2), ncol = 5)) #My Matrix

 Iske <- Iske+33 #I want to see the letters

 (Iske.char <- apply(Iske, 1, function(x) rawToChar(as.raw(x)))) #Numbers to Char
 LD <- function(s1, s2){
    require(vwr)
    s1 = as.character(s1)
    s2 = as.character(s2)
    t(sapply(s1, levenshtein.distance, s2))
 }
 Iske.levens <- (LD(Iske.char,Iske.char)) #Calculate the Levenshtein distance

 The result:
 !#$% !#$% !#$% !#$% 
 !#$%     0     0     0
 !#$%     0     0     0
 !#$%     0     0     0
 .
 .
 .
 It is all beautiful. But is there a simple way to change the
 column/row-name to the original from the Matrix Iske?

 Thanks a lot for the help yesterday. It was a big step in my life :-)

Re: [R] plotting average effects.

2011-10-21 Thread gradstudent

Bart,

I apologize.  I posted the code I was using in my first comment, to include
the error and the plot that is coming up.  I was unaware that was not
enough.  I am not looking for anyone to give me  the actual answer to my
specific issue, only looking to be pointed in a direction for an online
guide that i can read to understand how average effects are plotted in R. 
Our text book doesn't cover any topics for using R, only theory.

-
Ph.D. Candidate
--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925350.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] POT package

2011-10-21 Thread R. Michael Weylandt

Perhaps you could tell us what function you are talking about? "npy"
is not part of the POT package in the version on my machine, and those
letters don't seem to show up anywhere consecutively at all on my
system, according to ??npy. Similarly, this trick produces no results:

sapply(ls("package:POT"), function(f) sum(grepl("20945", deparse(get(f)))))

Michael
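
The search trick above can be wrapped in a small helper. This is a sketch only: the name find_in_package is hypothetical, and it is demonstrated on the stats package (always attached) rather than on POT.

```r
# Hypothetical helper: names of objects in an attached package whose
# deparsed source contains a given literal string.
find_in_package <- function(pkg, pattern) {
  env <- as.environment(paste0("package:", pkg))
  hit <- sapply(ls(env), function(f) {
    any(grepl(pattern, deparse(get(f, envir = env)), fixed = TRUE))
  })
  names(hit)[hit]
}

# Which exported objects in stats mention "lm.fit" in their source?
hits <- find_in_package("stats", "lm.fit")
head(hits)
```

Calling find_in_package("POT", "20945") after library(POT) would repeat the check described in the message; an empty result suggests the code in question comes from somewhere else, such as an example script or vignette.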

On Fri, Oct 21, 2011 at 3:35 AM, Amina Shahzadi aminashahz...@gmail.com wrote:
 Hi Sir

 It is requested to please tell the reason why the range of c(20945, 209547)
 is used in this function

 npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"],
     na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))
 Please tell logic.

 Looking for quick response.

 Regards


 --
 *Amina Shahzadi*

        [[alternative HTML version deleted]]

[R] windows limits

2011-10-21 Thread Ben qant

Hello,

Using the rgl package, I can set the device window to any dimension (that I
have tested):
par3d(windowRect=c(1,1,700,700))

With windows I can't get the window to span from the top to the bottom of
the monitor. In the following, no matter how large the ypinch value gets it
stops, leaving about 2 inches of space at the bottom of my screen:
windows(record=TRUE, ypinch=1100, xpinch=10, xpos=1,ypos=1, rescale='fixed')

...I've read about some limits on windows. Is there any way around these
limits? Any way I can get windows to perform like the rgl package 3d device?

Thanks for your help!

ben

Re: [R] xyplot() or splom()?: two factors from same data frame

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, Duncan Mackay wrote:


Without a dataset I am not sure what you need.


Duncan,

  Part of the problems I'm trying to resolve come from changing priorities
from my client and the regulators. I end up stopping one process and
starting on a different one. But, that's life in the real world of
environmental consulting. :-)

  What I need now is to compare TDS (total dissolved solids) with specific
conductivity and the ions that normally comprise TDS. Before running any
regression models I need to look at these data from three points of view:
all data from all sites collected during the past 30 years; average (or
total) concentrations (not yet decided on what makes the most ecological
sense) within a stream having multiple collection sites; and by site within
certain streams.

  I think that I need to subset the data frame to create distinct analytical
data frames for each comparison, then rm() them until needed again (or I'd
have a very large number of files in the directory). If I have a subset, for
example, of TDS and conductivity regardless of sample date or location I
will have two columns of numbers that will fit the xyplot() formula; e.g.,
xyplot(TDS ~ Cond). This is the broad picture. I can then use the
hydrographic basins (2 of 'em) or streams (24 of 'em) as factors to
condition the analysis. Repeat for other parameter pairs (TDS vs. Ca, TDS
vs. Mg, etc.).

  Another part of the issue, perhaps, is that the data are in a single data
frame:

 str(chemdata)
'data.frame':   47244 obs. of  6 variables:
 $ site: Factor w/ 143 levels "BC-0.5","BC-1",..: 134 134 134 127 127
 $ sampdate: Date, format: "2006-12-06" "2006-12-06" ...
 $ param   : Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 66 12 24 59 66
 $ quant   : num  1.08e+04 7.95 1.80e-02 2.80e+02 1.90e+01 8.44 1.62e+03
 $ stream  : Factor w/ 24 levels "BCrk","CCrk",..: 4 4 4 21 21 21 4
 $ basin   : Factor w/ 2 levels "BasinEast","BasinWest": 1 1 1 1 1 1 1 1 1 2 ...

while all the data sets used in the books I've read are simpler. What I've
not read is guidance on how complex data sets could (or should) be
partitioned into smaller but still related data sets to facilitate analyses.

  I hope this clarifies my initial request.

Rich
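
One way the subsetting described above might look is sketched here. It assumes the long layout shown in the str() output (one row per site, date and parameter); the toy values, the parameter codes "TDS" and "Cond", and the object names chem, tds, cond and wide are all made up for illustration.

```r
# Toy stand-in for chemdata: one row per site/date/parameter (made-up values).
chem <- data.frame(
  site     = rep(c("S1", "S2", "S3"), each = 2),
  sampdate = as.Date("2006-12-06"),
  param    = rep(c("TDS", "Cond"), times = 3),
  quant    = c(280, 410, 310, 455, 150, 230),
  stream   = rep(c("A", "A", "B"), each = 2),
  basin    = rep(c("BasinEast", "BasinEast", "BasinWest"), each = 2)
)

keys <- c("site", "sampdate", "stream", "basin")

# Pull each parameter into its own data frame and rename the value column.
tds  <- subset(chem, param == "TDS",  select = c(keys, "quant"))
cond <- subset(chem, param == "Cond", select = c(keys, "quant"))
names(tds)[names(tds) == "quant"]   <- "TDS"
names(cond)[names(cond) == "quant"] <- "Cond"

# Pair up measurements taken at the same site on the same date.
wide <- merge(tds, cond, by = keys)
wide$basin <- factor(wide$basin)

# With lattice available, the paired columns fit the xyplot() formula,
# and basin (or stream) can be used as the conditioning factor.
if (requireNamespace("lattice", quietly = TRUE)) {
  p <- lattice::xyplot(TDS ~ Cond | basin, data = wide)
}
```

Building each paired data frame on demand like this means only the original long data frame needs to be kept; the intermediate objects can be removed with rm() and recreated when needed.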

Re: [R] windows limits

2011-10-21 Thread Duncan Murdoch


On 21/10/2011 9:36 AM, Ben qant wrote:

Hello,

Using the rgl package, I can set the device window to any dimension (that I
have tested):
par3d(windowRect=c(1,1,700,700))

With windows I can't get the window to span from the top to the bottom of
the monitor. In the following, no matter how large the ypinch value gets it
stops, leaving about 2 inches of space at the bottom of my screen:
windows(record=TRUE, ypinch=1100, xpinch=10, xpos=1,ypos=1, rescale='fixed')


This doesn't affect your question, but you shouldn't be using xpinch and 
ypinch for this:  they describe the pixels per inch.  You should be 
using width and height.


The real problem is that you aren't allowed to open windows that are 
more than 85% of the available size in either direction.  This is 
documented in ?windows.  I don't know the reason for this restriction, 
but it may be so that you can't inadvertently lose the controls on the 
window (something that happens in rgl in your example code).


You can manually resize the window after creating it (using the mouse); 
you could write a function to do that if you know Windows API 
programming, but I don't believe there's one in base R.


Duncan Murdoch


...I've read about some limits on windows. Is there any way around these
limits? Any way I can get windows to perform like the rgl package 3d device?

Thanks for your help!

ben

[R] How to create time series objects combining two vectors

2011-10-21 Thread sarelseerower

I am new to R and trying to understand time series objects.

I have 2 vectors, one containing rainfall values (let's call the vector
"rain") and the other the time/date in seconds (let's call it "time"). Is
there a method to create a time series object simply by giving the rain
and time vectors as inputs? Something along the lines of:

time_series_object <- time_series_function(rain, time)

Most of the packages I have investigated don't have a simple option like
this.  Can this be done however?
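
A minimal sketch of one possible approach, assuming the seconds count from a known origin (the origin, time zone and data values below are made up). The zoo package is used only if it is installed; zoo::zoo() takes the values and an order.by index.

```r
rain <- c(0.0, 1.2, 0.4, 3.1)        # made-up rainfall values
time <- c(0, 3600, 7200, 10800)      # seconds since an assumed origin

# Convert seconds to date-times; origin and time zone are assumptions.
stamp <- as.POSIXct(time, origin = "1970-01-01", tz = "UTC")

# Base-R fallback: keep values and time stamps together in a data frame.
rain.df <- data.frame(stamp = stamp, rain = rain)

# If zoo is available, build a time-indexed series directly.
if (requireNamespace("zoo", quietly = TRUE)) {
  rain.zoo <- zoo::zoo(rain, order.by = stamp)
}
```

Base R's ts() expects regularly spaced observations, which is why an index-based class is sketched here for data stamped in seconds.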


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-create-time-series-objects-combining-two-vectors-tp3924883p3924883.html
Sent from the R help mailing list archive at Nabble.com.

[R] varying coefficients model

2011-10-21 Thread Soberon Velez, Alexandra Pilar

Dear members,



I'm trying to estimate a varying coefficients model using the local polynomial 
estimation method in two cases (univariate and bivariate) and with two smooth 
functions.

In the univariate case I use:

m1 <- smooth.lf(x=lp(z1,by=x1,deg=1,h=0.7,ev=z1)+lp(z3,by=expl2,deg=1,h=0.7,ev=z1)-1,y=dep,kern="gauss",kt="prod")

while in the bivariate case I use:

m1 <- smooth.lf(x=lp(z1,z2,by=x1,deg=1,h=0.7,ev=z1)+lp(z3,z4,by=expl2,deg=1,h=0.7,ev=z1)-1,y=dep,kern="gauss",kt="prod")

where the varying coefficient variable (x1) is a random normal variable 
and (z1,z2,z3,z4) are also random normal variables.

My problem is that when I increase the sample size the bivariate case doesn't 
work because a "newsplit" error appears.

Please, can somebody help me? I don't know if I am correctly using the term "by" to 
specify the varying coefficient.

I'll appreciate any help.



Thanks a lot

Re: [R] How to remove multiple outliers

2011-10-21 Thread aajit75

Hi Michael,

Thanks for the help.

Yes, I have gone through the documentation for ?outlier. As it removes one
outlier at a time, and being new to R, I was wondering whether there is any
function available for removing multiple outliers without calling, say,
rm.outlier n times, because n is not known in advance here.

On the second point, I am using the piece of code below, because I get an
error when rm.outlier with the fill = FALSE option is applied to the
same dataset.

outlier_tf1 = outlier(x1,logical=TRUE) 
find_outlier1 = which(outlier_tf1==TRUE, arr.ind=TRUE) 
beh_input_ro1 = x1[-find_outlier1] 

 library(outliers)
 beh_input_ro <- rm.outlier(beh_input_dr, fill = FALSE, median = FALSE,
 opposite = FALSE)
Error in data.frame(X1 = c(28.7812, 24.8923, 31.3987, 25.774, 27.1798,  : 
arguments imply differing number of rows: 2398, 2390, 2399

Regards,
-Ajit
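
One possible way to drop several extreme values in a single pass, without calling rm.outlier repeatedly, is the boxplot rule (points beyond 1.5 times the interquartile range from the hinges) via boxplot.stats() from base R. This is a different criterion from the one used in the outliers package, and the data below are made up.

```r
# Made-up data with two obvious extremes.
x1 <- c(1:10, 100, -50)

# boxplot.stats() returns every point beyond the whiskers in $out.
flagged <- boxplot.stats(x1)$out

# Drop all flagged values at once.
cleaned <- x1[!x1 %in% flagged]
cleaned
```

For a data frame, flagging rows (for example, rows where any column is flagged) and dropping whole rows keeps the columns the same length, which avoids the "differing number of rows" error shown above.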

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-remove-multiple-outliers-tp3921689p3924904.html
Sent from the R help mailing list archive at Nabble.com.

[R] stair-step plot

2011-10-21 Thread knut-o

Hello
Is it possible to map a plot with horizontal lines like in the step-plot,
but without the vertical lines?

Thanks, knut
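
A minimal sketch of one way to do this in base graphics, assuming each step should run horizontally from one x value to the next at the height of the earlier point (data and object names are made up):

```r
x <- 1:5
y <- c(2, 3, 1, 4, 2)
n <- length(x)

# Each step is a horizontal segment from x[i] to x[i + 1] at height y[i].
steps <- data.frame(x0 = x[-n], y0 = y[-n], x1 = x[-1], y1 = y[-n])

pdf(NULL)                      # null device so the sketch runs anywhere
plot(x, y, type = "n")         # set up the axes without drawing anything
segments(steps$x0, steps$y0, steps$x1, steps$y1)
dev.off()
```

Dropping the pdf(NULL) and dev.off() lines draws on the default device instead.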

--
View this message in context: 
http://r.789695.n4.nabble.com/stair-step-plot-tp3924903p3924903.html
Sent from the R help mailing list archive at Nabble.com.

[R] (no subject)

2011-10-21 Thread Lisa Henault

Can I be taken off this mailing list, please?

Is there another way to access this without having to get all the
emails?

[R] How to use gev.fit (package ismev) under box constraints?

2011-10-21 Thread NoSkill ButStyle



[R] glm-poisson fitting 400.000 records

2011-10-21 Thread D_Tomas

Hi, 

I am trying to fit a glm-poisson model to 400.000 records. I have tried biglm
and glmulti but I have problems... can it really be the case that 400.000
are too many records???

I am thinking of using random samples of my dataset.

Many thanks,

--
View this message in context: 
http://r.789695.n4.nabble.com/glm-poisson-fitting-400-000-records-tp3925100p3925100.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] R-help Digest, Vol 104, Issue 19

2011-10-21 Thread Dénes TÓTH



 On Oct 21, 2011, at 09:01 , Martin Maechler wrote:

 ARE == Alex Ruiz Euler rruizeu...@ucsd.edu
on Wed, 19 Oct 2011 14:05:16 -0700 writes:

ARE Motion supported. Very.

ARE On Wed, 19 Oct 2011 15:40:14 +0200
ARE peter dalgaard pda...@gmail.com wrote:

 Argh!

 Someone please unsubscribe this guy?

 He did this over Summer too and still hasn't learned that 1
 recipients of R-help do not care whether he is out of office!

 -pd

 Well, there are hundreds like him.
 The only difference being that he speaks Hungarian..


 You might filter on the Subject line being Re: [R] R-help Digest.*, with
 no attention to content. That has an obvious side effect, but maybe not a
 harmful one...

 -pd


 Why?  I (as R-* mailing list site maintainer)
 have had (procmail) filters that automatically catch such 'out of
 office'
 messages, so the 10'000 readers don't have to get them.
 The current set of filters catches  a set of English, French,
 German,.. (and I don't know) messages
 So I have (many!!) filters like this:

 :0
 * ^Subject: (Re|Holiday|Vacation): .*[-A-za-z]+ Digest, Vol [1-9][0-9]*,
 Issue [1-9][0-9]*
 {
  :0B
  * I( will not be reading.*\e?[-]?mail|.* away .* attend to your
 message when I get)
  mlist-bounced.spool
 }

 ---
 but can't start doing that for Hungarian or Chinese or ...

FYI:
holiday=szabadság
on holiday=szabadságon
out of office=nem tartózkodom az irodában OR irodán kívül vagyok

Expressions like nem tartózkodom az irodában OR irodán kívül vagyok
will never occur in a real post sent to the R-list, so could be used for
filtering.
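
A rough sketch of that idea in R, purely for illustration (the list's real filters are procmail recipes; is_auto_reply and the sample message bodies are assumptions made up here):

```r
# Phrases suggested above as safe markers of a Hungarian out-of-office reply
phrases <- c("nem tartózkodom az irodában", "irodán kívül vagyok")

# Hypothetical helper: TRUE if any marker phrase occurs literally in the body
is_auto_reply <- function(body) {
  any(vapply(phrases, grepl, logical(1), x = body, fixed = TRUE))
}

is_auto_reply("Október 19-től 21-ig irodán kívül vagyok.")
is_auto_reply("How do I fit a Poisson glm to 400.000 records?")
```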

HTH,
Dénes



 Martin

 --
 Peter Dalgaard, Professor
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com




[R] add=TRUE or similar in spplot?

2011-10-21 Thread infochat

Dear Helper,

I have a spatial lines data frame object 'spRiverDf'. The data frame consists 
of numbers {0,1,...,5}. And I have a vector 'colorS' of length 6 with different 
colours.

If I make a plot with spplot, I get a plot of the lines, with colours depending on 
their number in the data frame column:

spplot(spRiverDf['data.col.1'], zcol=..., names.attr=..., col.regions=colorS, 
lwd=10) # (A)

My problem:
- I'd like to overlay narrow lines (lwd=5) with the appropriate colors of a 
second data column.

My tests:
1) I tried it with the spplot options: lwd=10, 
sp.layout=list(list(spRiverDf['data.col.2'], col=colorS, lwd=5))
2) Second try: result <- spplot(entries_see_(A)); result + 
layer(sp.lines(spRiverDf['data.col.2'], col=colorS, lwd=5))

My results:
- In both tests spplot draws the desired base plot, but the additional lines are 
single-coloured (only the first entry of 'colorS' is used).

My questions:
- Is there a possibility to overlay two spplots? Something like the option add=TRUE 
for the usual 'plot' command.
- Or is there an easy way to select the desired lines of a spatial plot data frame 
(e.g. those with a particular colour) so that I can use 'sp.layout' or 'layer'?
- Of course, for each additional data frame column I could create a separate spatial lines 
data frame per colour (depending on the number in the column), but 
this is very time-consuming and really not elegant.

Thank you for your help!
Thomas
--


Re: [R] R-help Digest, Vol 104, Issue 21

2011-10-21 Thread mihalicza . peter

Október 19-től 21-ig irodán kívül vagyok, és az emailjeimet nem érem el.

Sürgős esetben kérem forduljon Kárpáti Edithez (karpati.e...@gyemszi.hu).

Üdvözlettel,
Mihalicza Péter


I will be out of the office from 19 till 21 October with no access to my emails.

In urgent cases please contact Ms. Edit Kárpáti (karpati.e...@gyemszi.hu).

With regards,
Peter Mihalicza


Re: [R] 'Apply' giving me errors

2011-10-21 Thread kickout

Thanks for the tips/advice. I actually used a different solution to
circumvent this, but Uwe's solutions would also work.

--
View this message in context: 
http://r.789695.n4.nabble.com/Apply-giving-me-errors-tp3923880p3925377.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] plotting average effects.

2011-10-21 Thread Bart Joosen

How about posting a reproducible sample, so that we can see what is going on?
Read the posting guide!!!

--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925324.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] POT package

2011-10-21 Thread Amina Shahzadi

 npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"],
     na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))

This line is from the manual "A user's Guide to POT Approach". I am just
trying to ask why the values 20945 and 20947 are used.
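
One plausible reading of the expression, offered as an assumption and not as the guide's own explanation: the denominator is the total time span minus the time elapsed between those two rows, which would make sense if the record has a gap there. A toy sketch with made-up numbers (time, gap_rows and n_events are all hypothetical):

```r
# Made-up record: times in years, nothing observed between year 3 and year 7
time <- c(0, 1, 2, 3, 7, 8, 9, 10)
n_events <- 12            # hypothetical number of extracted events
gap_rows <- c(4, 5)       # rows on either side of the gap

total_span <- diff(range(time, na.rm = TRUE))  # whole record length
gap_span <- diff(time[gap_rows])               # length of the unobserved stretch
npy <- n_events / (total_span - gap_span)      # events per observed year
npy
```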

Regards

On Fri, Oct 21, 2011 at 4:19 PM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Perhaps you could tell us what function you are talking about: "npy"
 is not part of the POT package in the version on my machine and those
 letters don't seem to show up anywhere consecutively at all on my
 system, according to ??npy. Similarly, this trick produces no results:

 sapply(ls("package:POT"), function(f) sum(grepl("20945", deparse(get(f)))))

 Michael

 On Fri, Oct 21, 2011 at 3:35 AM, Amina Shahzadi aminashahz...@gmail.com
 wrote:
  Hi Sir
 
  Please tell me why the range c(20945, 20947)
  is used in this function:
 
  npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"],
      na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))
  Please explain the logic.
 
  Looking for quick response.
 
  Regards
 
 
  --
  *Amina Shahzadi*
 
 [[alternative HTML version deleted]]
 
 




-- 
*Amina Shahzadi*

[[alternative HTML version deleted]]


Re: [R] How to create time series objects combining two vectors

2011-10-21 Thread Gabor Grothendieck

On Fri, Oct 21, 2011 at 6:24 AM, sarelseerower sarelseero...@gmail.com wrote:
 I am new to R and trying to understand time series objects.

 I have 2 vectors, one containing rainfall values  (lets call the vector
 rain) and the other the time/date in seconds (lets call it time). Is
 there a method to create a time series object simply by giving the rain
 and time vectors as inputs? Something in the line of:

 time_series_object <- time_series_function(rain, time)

 Most of the packages I have investigated don't have a simple option like
 this.  Can this be done however?



See:
http://rwiki.sciviews.org/doku.php?id=guides:tutorials:hydrological_data_analysis:miscellaneous_data_import
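
A minimal sketch of the kind of call asked for, assuming the time vector holds seconds since 1970-01-01 UTC (the origin, time zone and sample values are assumptions) and using the zoo package only if it is installed:

```r
# Made-up inputs: rainfall values and their time stamps in seconds
rain <- c(0.0, 1.2, 0.4)
time <- c(1319191200, 1319194800, 1319198400)

# Convert the seconds to date-times; origin and tz are assumed here
stamps <- as.POSIXct(time, origin = "1970-01-01", tz = "UTC")

# Build an irregular time series indexed by those date-times
if (requireNamespace("zoo", quietly = TRUE)) {
  rain_ts <- zoo::zoo(rain, order.by = stamps)
  print(rain_ts)
}
```

zoo is used because it accepts an arbitrary ordered index; ts() in base R expects regularly spaced observations.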

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] stair-step plot

2011-10-21 Thread Duncan Murdoch


On 21/10/2011 6:39 AM, knut-o wrote:

Hello
Is it possible to map a plot with horizontal lines like in the step-plot,
but without the vertical lines?



Not in the basic plot function, but you can write your own fairly 
easily, using segments().  For example:


x <- y <- 1:10
plot(x,y, type='n')
segments(x[-10], y[-10], x[-1], y[-10])

Duncan Murdoch


Re: [R] (no subject)

2011-10-21 Thread Uwe Ligges




On 21.10.2011 13:02, Lisa Henault wrote:

can i be taken off of this mailing list please?

is there another way that you can access this without having to get all the
emails??

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



For both questions (unsubscribe, digest form, archived) see the bottom 
of each mail you got so far:

https://stat.ethz.ch/mailman/listinfo/r-help

Uwe Ligges


Re: [R] glm-poisson fitting 400.000 records

2011-10-21 Thread Ben Bolker

D_Tomas tomasmeca at hotmail.com writes:

 
 Hi, 
 
 I am trying to fit a glm-poisson model to 400.000 records. I have tried biglm
 and glmulti but I have problems... can it really be the case that 400.000
 are too many records???
 
 I am thinking of using random samples of my dataset.
 

  "I have problems" isn't enough for us to diagnose.  I tried
this trivial example in base R:

 d <- data.frame(x=runif(4e5),y=rpois(4e5,5))
 system.time(glm(y~x,family=poisson,data=d,trace=TRUE))
Deviance = 438614.6 Iterations - 1 
Deviance = 417968.2 Iterations - 2 
Deviance = 417921.2 Iterations - 3 
Deviance = 417921.2 Iterations - 4 
   user  system elapsed 
  5.444  12.952  18.429 

  Can you give us a hint about what went wrong??


[R] question about aggregate

2011-10-21 Thread Adel ESSAFI

Hi list

I am discovering R, and

-- 
PhD candidate in Computer Science
Address
3 avenue lamine, cité ezzahra, Sousse 4000
Tunisia
tel: +216 97 246 706 (+33640302046 jusqu'au 15/6)
fax: +216 71 391 166

[[alternative HTML version deleted]]


[R] question about aggregate

2011-10-21 Thread Adel ESSAFI

Hello
I am discovering R and I find it is really very powerful.

However, I find some newbie difficulties.

Here, I have a data frame with many values, and I want to calculate the
frequency (the number of lines) matching some criteria.
For example, here I want it to print the number of occurrences where
sci[,2]=0 and sci[,1]="L". In my example, it is printing the number of the
line in the result data frame.
However, I have at least 90 lines with sci[,2]=0 and sci[,1]="L".
Thank you in advance for any input.


  aggregate(sci[,5],list(sci[,2],sci[,1]),frequency)
   Group.1 Group.2 x
1  0.0   L 1
2  0.2   L 1
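
A likely cause of the column of 1s: frequency() reports the sampling frequency of a time series, which is 1 for a plain vector, so it does not count rows. A small sketch with made-up data (the column names type, level and value are assumptions; the real sci has other columns) using length instead:

```r
# Made-up stand-in for the sci data frame
sci <- data.frame(type  = c("L", "L", "L", "M"),
                  level = c(0, 0, 0.2, 0),
                  value = c(10, 11, 12, 13))

# Count rows per (level, type) group
counts <- aggregate(sci$value, list(level = sci$level, type = sci$type), length)
counts

# Direct count for a single combination
sum(sci$level == 0 & sci$type == "L")
```

table(sci$level, sci$type) gives the same counts as a cross-tabulation.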

-- 
PhD candidate in Computer Science
Address
3 avenue lamine, cité ezzahra, Sousse 4000
Tunisia
tel: +216 97 246 706 (+33640302046 jusqu'au 15/6)
fax: +216 71 391 166

[[alternative HTML version deleted]]


Re: [R] Change column/row-name

2011-10-21 Thread David Winsemius



On Oct 21, 2011, at 1:57 AM, Jörg Reuter wrote:


Hi,
I am very happy. My problems are solved except for one little thing:

(Iske <- matrix(c(1, 1, 1, 2, 2, 2, 1, 1, 1, 5, 1, 2, 2, 2, 1, 1, 1,
4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 4, 4, 4, 4, 4, 4, 2,
2, 2, 2, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2), ncol = 5)) #My Matrix


First transform to characters. (Using raw type seems failure-prone):

Iltrs <- c(letters, LETTERS)[Iske]

Then work with that.

(No testing in absence of the levenshtein.distance function definition  
or package location.)
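
A sketch of how the labelling could go, under the assumption that base R's adist() (generalized Levenshtein distance, in the utils package) is an acceptable stand-in for levenshtein.distance; the small matrix is made up:

```r
# Made-up matrix: each row is one sequence of codes
Iske <- matrix(c(1, 2, 3,
                 2, 3, 1,
                 1, 2, 3), ncol = 3, byrow = TRUE)

# One string of letters per row, used only to compute distances
keys <- apply(Iske, 1, function(r) paste(letters[r], collapse = ""))

# Labels built from the original numbers in each row
labels <- apply(Iske, 1, paste, collapse = " ")

d <- adist(keys)                     # pairwise edit distances between rows
dimnames(d) <- list(labels, labels)  # show the original values as row/column names
d
```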




Iske <- Iske + 33  # I want to see the letters

(Iske.char <- apply(Iske, 1, function(x) rawToChar(as.raw(x))))  # numbers to characters

LD - function(s1, s2){
   require(vwr)
   s1 = as.character(s1)
   s2 = as.character(s2)
   t(sapply(s1, levenshtein.distance, s2))
}
Iske.levens <- LD(Iske.char, Iske.char)  # calculate the Levenshtein distance


The result:
!#$% !#$% !#$% !#$% 
!#$% 0 0 0
!#$% 0 0 0
!#$% 0 0 0
.
.
.
It is all beautiful. But is there a simple way to change the
column/row-name to the original from the Matrix Iske?

Thanks a lot for the help yesterday. It was a big step in my life :-)



David Winsemius, MD
West Hartford, CT


[R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

2011-10-21 Thread Michael Friendly

In the HistData package, I have a data frame, PearsonLee, containing 
observations on heights of parent and child, in weighted form:


library(HistData)

 str(PearsonLee)
'data.frame':   746 obs. of  6 variables:
 $ child    : num  59.5 59.5 59.5 60.5 60.5 61.5 61.5 61.5 61.5 61.5 ...
 $ parent   : num  62.5 63.5 64.5 62.5 66.5 59.5 60.5 62.5 63.5 64.5 ...
 $ frequency: num  0.5 0.5 1 0.5 1 0.25 0.25 0.5 1 0.25 ...
 $ gp       : Factor w/ 4 levels "fd","fs","md",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ par      : Factor w/ 2 levels "Father","Mother": 1 1 1 1 1 1 1 1 1 1 ...
 $ chl      : Factor w/ 2 levels "Daughter","Son": 2 2 2 2 2 2 2 2 2 2 ...

I want to make a 2x2 set of plots of child ~ parent | par+chl, with
regression lines and loess smooths, that incorporate weights=frequency.
The frequencies are not integers, so I can't simply expand the data frame.

I'd also like to use different colors for the regression and smoothed lines.
Here's what I've tried using xyplot, all unsuccessful.  I suppose I
could also use ggplot2, if I could do what I want.

xyplot(child ~ parent|par+chl, data=PearsonLee, weights=frequency,
       type=c("p", "r", "smooth"))

xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"))

panel.lmline and panel.smooth don't have a weights= argument, though
lm() and loess() do.


# Try to control line colors: unsuccessfully -- only one value of
# col.line is used
xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"),
       col.line=c("red", "blue"))


## try to use panel functions ... unsuccessfully
xyplot(child ~ parent|par+chl, data=PearsonLee, type="p",
   panel = function(x, y, ...) {
       panel.xyplot(x, y, ...)
       panel.lmline(x, y, col="blue", ...)
       panel.smooth(x, y, col="red", ...)
   }
)
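
One possible direction, sketched on made-up data and not on PearsonLee: pass the weights to xyplot as an extra argument and let a custom panel function fit lm() and loess() itself, picking out each panel's weights via subscripts. The names d, wts and weighted_panel are assumptions for illustration only.

```r
library(lattice)

set.seed(42)
# Made-up data: two groups and a non-integer weight per row
d <- data.frame(x = rep(seq(1, 10, length.out = 40), 2),
                g = rep(c("a", "b"), each = 40),
                wts = runif(80, 0.25, 1))
d$y <- 2 + 0.5 * d$x + rnorm(80)

# Hypothetical panel function: weighted line in blue, weighted loess in red
weighted_panel <- function(x, y, wts, subscripts, ...) {
  w <- wts[subscripts]                 # weights for the rows shown in this panel
  panel.xyplot(x, y, ...)
  panel.abline(coef(lm(y ~ x, weights = w)), col = "blue")
  xs <- seq(min(x), max(x), length.out = 50)
  lo <- loess(y ~ x, weights = w)
  panel.lines(xs, predict(lo, data.frame(x = xs)), col = "red")
}

p <- xyplot(y ~ x | g, data = d, wts = d$wts, panel = weighted_panel)
print(p)
```

Doing the fits inside the panel function also makes it straightforward to give each curve its own colour.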

The following, using base graphics, illustrates the difference between
the weighted and unweighted lines, for the total data frame:

with(PearsonLee,
{
    lim <- c(55,80)
    xv <- seq(55,80, .5)
    sunflowerplot(parent, child, number=frequency, xlim=lim, ylim=lim,
                  seg.col="gray", size=.1)

    # unweighted
    abline(lm(child ~ parent), col="green", lwd=2)
    lines(xv, predict(loess(child ~ parent), data.frame(parent=xv)),
          col="green", lwd=2)

    # weighted
    abline(lm(child ~ parent, weights=frequency), col="blue", lwd=2)
    lines(xv, predict(loess(child ~ parent, weights=frequency),
          data.frame(parent=xv)), col="blue", lwd=2)
})

thanks,
-Michael



--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


[R] Specifying Greek Character in Lattice Plot Label

2011-10-21 Thread Rich Shepard


  For an axis label I want to include the Greek letter mu within the string.
I've not found the proper way of including that expression within the
string.

  What I want is "Conductivity (uS/cm)" with the 'u' replaced by mu. When I
try "Conductivity (" expression(paste(mu)) "S/cm)" I get an error. If I
don't separate "Conductivity" and "S/cm" with parentheses the string
'expression(paste(mu))' displays in the label.

  What am I doing incorrectly?

Rich


Re: [R] stair-step plot

2011-10-21 Thread David Winsemius



On Oct 21, 2011, at 6:39 AM, knut-o wrote:


Hello
Is it possible to map a plot with horizontal lines like in the step-plot,
but without the vertical lines?


There is no function named 'step-plot'. If you are talking about the  
plot.stepfun function then look at the verticals argument.
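
A short sketch of that suggestion with made-up values: build a step function with stepfun() and plot it with verticals = FALSE so only the horizontal pieces are drawn.

```r
x <- 1:10
y <- 1:10

# stepfun() needs one more value than knots: knots at x[2..10], values y[1..10]
sf <- stepfun(x[-1], y)

# Horizontal segments only; do.points = FALSE also drops the knot markers
plot(sf, verticals = FALSE, do.points = FALSE)
```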




Thanks, knut

--
View this message in context: 
http://r.789695.n4.nabble.com/stair-step-plot-tp3924903p3924903.html
Sent from the R help mailing list archive at Nabble.com.




David Winsemius, MD
West Hartford, CT


Re: [R] Specifying Greek Character in Lattice Plot Label

2011-10-21 Thread David Winsemius



On Oct 21, 2011, at 11:27 AM, Rich Shepard wrote:

 For an axis label I want to include the Greek letter mu within the string.
 I've not found the proper way of including that expression within the
 string.

 What I want is "Conductivity (uS/cm)" with the 'u' replaced by mu. When I
 try "Conductivity (" expression(paste(mu)) "S/cm)" I get an error. If I
 don't separate "Conductivity" and "S/cm" with parentheses the string
 'expression(paste(mu))' displays in the label.


try:

 plot(1,1, xlab=expression(Conductivity~"("*mu*S/cm*")") )

Parens are the only characters that need to be quoted and you do need
to use proper plotmath connectors, "~" and "*", depending on whether
you want a space to appear or not. I don't think you can join character
values and expression values in the manner that you offer but I admit
I never tried it, so I don't know for sure. Generally language and
expression objects have their own special set of functions and syntax.




 What am I doing incorrectly?

Rich



David Winsemius, MD
West Hartford, CT


Re: [R] cph/nomogram Design/RMS package hazard ratio: interquartile vs per unit

2011-10-21 Thread David Winsemius

On Oct 20, 2011, at 8:22 PM, renee wrote:

Hello,

I am constructing a nomogram using cph and nomogram commands in Dr.
Harrell's Design/RMS package. The HR that I obtain for dichotomous and
categorical variables are identical to those that I obtain using STATA
stcox.

When posting to r-help it is advised to produce the code used. One
might guess that you were talking about output from summary(fit) but
that would be a guess.

However, the inter-quartile HR I obtain for continuous variables is
obviously different, since STATA gives me HR for each unit (year,
centimeter, etc) like coxph would give.

If you want HR's for single unit difference on the scale of the
measured units then this should produce those:

exp(coef(fit))
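
For a variable that enters the model as a single linear term (an assumption; it does not hold for spline or other nonlinear terms), the two kinds of hazard ratio are related by simple arithmetic. A sketch with made-up numbers:

```r
beta <- 0.04        # hypothetical coefficient: log hazard ratio per year of age
q <- c(45, 62)      # hypothetical lower and upper quartiles of age

hr_unit <- exp(beta)            # hazard ratio per one-unit difference
hr_iqr <- exp(beta * diff(q))   # hazard ratio for upper vs lower quartile

c(per_unit = hr_unit, interquartile = hr_iqr)
```

Both numbers describe the same fitted coefficient, only over different distances on the predictor's scale.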

My question is if this will affect
the output of the nomogram. I'm assuming that nomogram is constructed using
hazard between each unit rather than quartiles - is this true?

Yes.

Also, I've found that I do not need to create indicator variables for my
categorical variables when I use cph. Is this also correct?

If they are factor classed variables, then that is correct.

I appreciate your feedback. Thank you.

~Renee

--
View this message in context:
http://r.789695.n4.nabble.com/cph-nomogram-Design-RMS-package-hazard-ratio-interquartile-vs-per-unit-tp3923896p3923896.html
(At least until Nabble deletes it in a year or two.)

David Winsemius, MD
West Hartford, CT


Re: [R] plotting average effects.

2011-10-21 Thread gradstudent

I will include the data to read in, if you so choose.

library(foreign)  # provides read.dta()
dat <- read.dta("http://quantoid.net/hw1_2011.dta")

model in question:

mod99 <- glm(democracy ~ popc100kpc + ngrevpc, data=dat, family=binomial)
--

looking for average effects code, with error on mod99.  popckpc is coded in
1k per capita.

 dat2 <- dat3 <- dat
 dat2$popc100kpc <- dat2$popc100kpc + 100
 dat2$popc100kpc[which(dat2$popc100kpc > max(dat$popc100kpc))] <-
     max(dat$popc100kpc)
 dat3$popc100kpc <- dat$popc100kpc - 100
 dat3$popc100kpc[which(dat3$popc100kpc < min(dat$popc100kpc))] <-
     min(dat$popc100kpc)
 pred1 <- predict(mod99, type="response")
 pred2 <- predict(mod99, newdata=dat2, type="response")
 pred3 <- predict(mod99, newdata=dat3, type="response")

breaking the variable into groups:

 pop1.group <- cut(dat$popc100kpc, breaks=quantile(dat$popc100kpc,
     seq(0,1,by=.25)), include.lowest=TRUE)

 means <- by(cbind(pred1, pred2, pred3), list(pop1.group),
     apply, 2, mean)
 means <- do.call(rbind, means)


and finally attempting to plot:

 par(mar=c(7,4,4,2))
 plot(c(1,10), range(c(means)), type="n", xlab="",
     ylab="Predicted Probability", axes=FALSE)
 arrows(1:10, means[,1], 1:10, means[,2], code=2, length=.1)
 arrows(1:10, means[,1], 1:10, means[,3], code=2, length=.1, col="red")
 points(1:10, means[,1], pch=16)
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
 axis(1, at=1:10, labels=rownames(means), las=2)
Error in axis(1, at = 1:10, labels = rownames(means), las = 2) : 
  'at' and 'labels' lengths differ, 10 != 4



I am not sure how to fix the error.  Thank you for your time.
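
The second error message ("10 != 4") suggests that means has four rows, one per quartile group, while the plotting calls use positions 1:10. A sketch of the plotting step with the positions derived from the matrix itself, using a made-up means matrix in place of the real one:

```r
set.seed(1)
# Made-up stand-in for the 4 x 3 matrix of group means
means <- matrix(runif(12), nrow = 4,
                dimnames = list(c("Q1", "Q2", "Q3", "Q4"),
                                c("pred1", "pred2", "pred3")))

idx <- seq_len(nrow(means))   # one x position per row of means

par(mar = c(7, 4, 4, 2))
plot(range(idx), range(means), type = "n", xlab = "",
     ylab = "Predicted Probability", axes = FALSE)
arrows(idx, means[, 1], idx, means[, 2], code = 2, length = 0.1)
arrows(idx, means[, 1], idx, means[, 3], code = 2, length = 0.1, col = "red")
points(idx, means[, 1], pch = 16)
axis(1, at = idx, labels = rownames(means), las = 2)
axis(2)
box()
```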




-
Ph.D. Candidate
--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925945.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Multiple factorial comparison LSD

2011-10-21 Thread Richard M. Heiberger

Vera,

The glht function in the multcomp package provides the capability you are
looking for.
The MMC functions in the HH package build on the glht function.
There are examples in ?MMC on data with more than one factor.

Rich

On Fri, Oct 21, 2011 at 4:55 AM, Vera Marjorie E. Velasco 
velas...@univmail.cis.mcmaster.ca wrote:

 Please help.  I really like R and I have been looking at how to do LSD
 multiple comparison test with data that has more than one factor.  So far, I
 am unsuccessful. Please help!

 Me



[[alternative HTML version deleted]]


Re: [R] Specifying Greek Character in Lattice Plot Label

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, David Winsemius wrote:


plot(1,1, xlab=expression(Conductivity~"("*mu*S/cm*")") )


  Thank you, David. It did not occur to me to look for a help page. I'll
read that now that I looked and found it.

Rich


Re: [R] replicating SAS's proc rank procedure

2011-10-21 Thread David L Carlson

You can get the same results with the cut() function in R:

cut(cars$speed, breaks=quantile(cars$speed, probs=c(0:15/15)), labels=1:15,
include.lowest=TRUE)
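
When the quantile breaks are not unique, cut() stops with an error. A rank-based alternative is sketched here, assuming (from memory of the SAS documentation, so treat it as an assumption) that PROC RANK with GROUPS=k assigns floor(rank * k / (n + 1)) with tied values given their mean rank:

```r
k <- 15                 # number of groups requested
n <- nrow(cars)

# Group numbers run from 0 to k - 1 under this formula
grp <- floor(rank(cars$speed, ties.method = "average") * k / (n + 1))
table(grp)
```

The group boundaries can differ slightly from the quantile-based cut() result, because ties are resolved through ranks and not through interpolated quantiles.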

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of riskcalc
Sent: Friday, October 21, 2011 4:06 AM
To: r-help@r-project.org
Subject: Re: [R] replicating SAS's proc rank procedure

Hi, try this function I've written.
It should be self-explanatory, but let me know if you have any problems.
I've only been using R for a few weeks, so apologies if it's not the most
efficient!

rankit2 <- function(rankvar, cuts, data, factor) {
  # rankvar is not used; it is kept so the call below still works
  range3 <- quantile(factor, (1:cuts) / cuts)  # upper bound of each group
  over <- length(factor)

  for (i in 1:over) {
    for (j in 1:cuts) {
      # column 1 holds the value to rank, column 3 receives the group number
      if (data[[i, 1]] <= range3[[j]]) {
        data[[i, 3]] <- j
      }
      if (data[[i, 3]] > 0) break
    }
  }
  data
}

cars$rank <- 0
try2 <- rankit2(rank, 15, cars, cars$speed)
try2

all the best

Leigh
RCalc partner
www.RCalc.co.uk

--
View this message in context:
http://r.789695.n4.nabble.com/replicating-SAS-s-proc-rank-procedure-tp820510p3924739.html
Sent from the R help mailing list archive at Nabble.com.



[R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard


  Because of regulatory requirement changes over several decades and weather
conditions preventing site access the variables in my data set have
different lengths. I'd like guidance on how to perform linear regressions
and other models with these variables.

  For example, there are 2206 rows for the parameter TDS but only 1191
rows for the parameter Cond. Such discrepancies are common in these data.

  Is there a reference I can read to learn how to analyze such data?
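
One common arrangement, sketched with made-up data: keep each parameter in its own table keyed by site and date, merge them so missing measurements become NA, and let lm() drop the incomplete rows. The key columns site and date and all values are assumptions; only the names TDS and Cond come from the description above.

```r
# Made-up measurements: TDS recorded more often than Cond
tds <- data.frame(site = rep(c("A", "B"), each = 4),
                  date = rep(as.Date("2011-01-01") + 0:3, times = 2),
                  TDS = c(210, 250, 190, 300, 400, 380, 420, 390))
cond <- data.frame(site = c("A", "A", "B", "B", "B"),
                   date = as.Date("2011-01-01") + c(0, 2, 0, 1, 3),
                   Cond = c(330, 300, 610, 590, 600))

# Outer join: every sampling event is kept, missing values become NA
both <- merge(tds, cond, by = c("site", "date"), all = TRUE)

# lm() uses only the rows where both variables are present
fit <- lm(TDS ~ Cond, data = both)
nobs(fit)
```

Whether dropping incomplete rows is appropriate depends on why the values are missing, which is a modelling question and not a coding one.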

Rich


Re: [R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

2011-10-21 Thread Dennis Murphy

Hi Michael:

Here's one way to get it from ggplot2. To avoid possible overplotting,
I jittered the points horizontally by +/- 0.2. I also reduced the point
size from the default 2 and increased the line thickness to 1.5 for
both fitted curves. In ggplot2, the term faceting is synonymous with
conditioning (by groups).

library('HistData')
library('ggplot2')
ggplot(PearsonLee, aes(x = parent, y = child)) +
   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
   geom_smooth(method = lm, aes(weight = frequency),
   colour = 'green', se = FALSE, size = 1.5) +
   geom_smooth(aes(weight = frequency),
   colour = 'red', se = FALSE, size = 1.5) +
   facet_grid(chl ~ par)

# If you prefer a legend, here's one take, pulling the legend inside
# to the upper left corner. This requires a bit more 'trickery', but
# the tricks are found in the ggplot2 book.

ggplot(PearsonLee, aes(x = parent, y = child)) +
   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
   geom_smooth(method = lm, aes(weight = frequency,
   colour = 'Linear'), se = FALSE, size = 1.5) +
   geom_smooth(aes(weight = frequency,
   colour = 'Loess'), se = FALSE, size = 1.5) +
   facet_grid(chl ~ par) +
   scale_colour_manual(breaks = c('Linear', 'Loess'),
   values = c('green', 'red')) +
   opts(legend.position = c(0.14, 0.885),
legend.background = theme_rect(fill = 'white'))


HTH,
Dennis

On Fri, Oct 21, 2011 at 8:22 AM, Michael Friendly frien...@yorku.ca wrote:
 In the HistData package, I have a data frame, PearsonLee, containing
 observations on heights of parent and child, in weighted form:

 library(HistData)

 str(PearsonLee)
 'data.frame':   746 obs. of  6 variables:
  $ child    : num  59.5 59.5 59.5 60.5 60.5 61.5 61.5 61.5 61.5 61.5 ...
  $ parent   : num  62.5 63.5 64.5 62.5 66.5 59.5 60.5 62.5 63.5 64.5 ...
  $ frequency: num  0.5 0.5 1 0.5 1 0.25 0.25 0.5 1 0.25 ...
  $ gp       : Factor w/ 4 levels "fd","fs","md",..: 2 2 2 2 2 2 2 2 2 2 ...
  $ par      : Factor w/ 2 levels "Father","Mother": 1 1 1 1 1 1 1 1 1 1 ...
  $ chl      : Factor w/ 2 levels "Daughter","Son": 2 2 2 2 2 2 2 2 2 2 ...

 I want to make a 2x2 set of plots of child ~ parent | par+chl, with
 regression lines and loess smooths, that
 incorporate weights=frequency.  The frequencies are not integers, so I
 can't simply expand the
 data frame.

 I'd also like to use different colors for the regression and smoothed lines.
 Here's what I've tried using xyplot, all unsuccessful.  I suppose I could
 also use ggplot2, if I could do what
 I want.

 xyplot(child ~ parent|par+chl, data=PearsonLee, weights=frequency,
        type=c("p", "r", "smooth"))
 xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"))

  panel.lmline  and panel.smooth don't have a weights= argument, though lm()
 and loess() do.

 # Try to control line colors: unsuccessfully -- only one value of col.line is
 # used
 xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"),
        col.line=c("red", "blue"))

 ## try to use panel functions ... unsuccessfully
 xyplot(child ~ parent|par+chl, data=PearsonLee, type="p",
       panel = function(x, y, ...) {
           panel.xyplot(x, y, ...)
           panel.lmline(x, y, col="blue", ...)
           panel.smooth(x, y, col="red", ...)
           }
 )

 The following, using base graphics, illustrates the difference between the
 weighted and unweighted lines,
 for the total data frame:

 with(PearsonLee,
    {
    lim <- c(55,80)
    xv <- seq(55,80, .5)
    sunflowerplot(parent,child, number=frequency, xlim=lim, ylim=lim,
                  seg.col="gray", size=.1)
    # unweighted
    abline(lm(child ~ parent), col="green", lwd=2)
    lines(xv, predict(loess(child ~ parent), data.frame(parent=xv)),
          col="green", lwd=2)
    # weighted
    abline(lm(child ~ parent, weights=frequency), col="blue", lwd=2)
    lines(xv, predict(loess(child ~ parent, weights=frequency),
          data.frame(parent=xv)), col="blue", lwd=2)
  })

 thanks,
 -Michael



 --
 Michael Friendly     Email: friendly AT yorku DOT ca
 Professor, Psychology Dept.
 York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
 4700 Keele Street    Web:   http://www.datavis.ca
 Toronto, ONT  M3J 1P3 CANADA


Re: [R] (no subject)

2011-10-21 Thread Trevor Davies

Alternatively, since you are on gmail you can set up a folder and a filter so
that all r-help emails bypass your inbox and go straight to an r-help folder
(or something similar). I find it very useful for browsing during down time,
so I can offer my assistance or move the little gems that I would like to use
later to an 'r-keepers' folder.


On Fri, Oct 21, 2011 at 4:02 AM, Lisa Henault lisa.hena...@gmail.com wrote:

 can i be taken off of this mailing list please?

 is there another way that you can access this without having to get all the
 emails??


Re: [R] Specifying Greek Character in Lattice Plot Label

2011-10-21 Thread Luke Miller

The following produces something very similar to David's method:

plot(1,1, xlab = expression(paste("Conductivity (", mu, "S / cm)")))

but with a slightly different slash character. I think David's method
is more correct, but I've used the above method in the past with
some success.
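
For reference, here is a small self-contained sketch of the two label styles
discussed in this thread. The object names lab_tilde and lab_paste are made
up for illustration, and the one-point data frame exists only so the calls
have something to draw.

```r
# Two ways to build the axis label "Conductivity (mu S/cm)" with plotmath.
# '~' inserts a space, '*' juxtaposes without one; parentheses are quoted.
lab_tilde <- expression(Conductivity ~ "(" * mu * S / cm * ")")
# paste() inside expression() concatenates strings and plotmath symbols.
lab_paste <- expression(paste("Conductivity (", mu, "S/cm)"))

# Base graphics
plot(1, 1, xlab = lab_tilde)

# The same expression objects can be passed to lattice labels
library(lattice)
pt <- data.frame(x = 1, y = 1)
print(xyplot(y ~ x, data = pt, xlab = lab_paste))
```

Storing the expression in a variable first makes it easy to reuse the same
label across several plots.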

On Fri, Oct 21, 2011 at 11:45 AM, David Winsemius
dwinsem...@comcast.net wrote:

 On Oct 21, 2011, at 11:27 AM, Rich Shepard wrote:

  For an axis label I want to include the Greek letter mu within the string.
 I've not found the proper way of including that expression within the
 string.

  What I want is "Conductivity (uS/cm)" with the 'u' replaced by mu. When I
 try Conductivity ( expression(paste(mu)) S/cm) I get an error. If I
 don't separate Conductivity and S/cm with parentheses the string
 'expression(paste(mu))' displays in the label.

 try:

  plot(1,1, xlab=expression(Conductivity~"("*mu*S/cm*")") )

 Parens are the only characters that need to be quoted and you do need to use
 proper plotmath connectors, ~ and *, depending on whether you want a space
 to appear or not. I don't think you can join character values and expression
 values in the manner that you offer but I admit I never tried it, so I don't
 know for sure. Generally language and expression objects have their own
 special set of functions and syntax.


  What am I doing incorrectly?

 Rich


 David Winsemius, MD
 West Hartford, CT




--
___
Luke Miller
Postdoctoral Researcher
Marine Science Center
Northeastern University
Nahant, MA
(781) 581-7370 x318


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Weidong Gu

Sounds like you are dealing with a missing data problem. By default, lm
or glm keeps only observations with complete records (complete-case
analysis). This can be problematic if many variables have missing values
and the values are not missing completely at random (i.e., missingness
depends on other (un)measured variables or on the missing values
themselves). Imputation is a common tool for handling incomplete data.
In R, you can find packages which conduct single or multiple imputation,
e.g. randomForest, norm, mice, mi, etc.

There is no easy way out with missing data problems; all imputations are
based on some strong and untestable assumptions.
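
To make the complete-case default concrete, here is a toy sketch. The data
frame d and its numbers are invented for illustration; only the column names
TDS and Cond are borrowed from the question.

```r
# Invented measurements: NA marks a value that was never recorded.
d <- data.frame(
  TDS  = c(120, 340,  NA,  95, 410, 230),
  Cond = c(200,  NA, 150, 160, 640,  NA)
)

# Rows where both variables were observed
both <- d[complete.cases(d[, c("TDS", "Cond")]), ]

# lm() drops incomplete rows on its own (na.action = na.omit by default),
# so the fit silently uses the same three rows.
fit <- lm(TDS ~ Cond, data = d)
nobs(fit)
```

Comparing nobs(fit) with nrow(d) is a quick way to see how many rows a
model actually used.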


Weidong Gu


On Fri, Oct 21, 2011 at 12:13 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
  Because of regulatory requirement changes over several decades and weather
 conditions preventing site access the variables in my data set have
 different lengths. I'd like guidance on how to perform linear regressions
 and other models with these variables.

  For example, there are 2206 rows for the parameter TDS but only 1191
 rows for the parameter Cond. Such discrepancies are common in these data.

  Is there a reference I can read to learn how to analyze such data?

 Rich


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread B77S

I know in my experience Cond (conductivity??) doesn't vary much within a
stream except for during high flow events, and I would imagine the same is
true for TDS. If these are all low flow values, you could possibly
determine a mean/median value to use for the missing data points. Obviously
this is going to be different if you are sampling storm events. If you have
stage data and lots of data points, you may be able to model the parameters
as a function of stage.
HTH

Rich Shepard wrote:

Because of regulatory requirement changes over several decades and weather
conditions preventing site access the variables in my data set have
different lengths. I'd like guidance on how to perform linear regressions
and other models with these variables.

For example, there are 2206 rows for the parameter TDS but only 1191
rows for the parameter Cond. Such discrepancies are common in these
data.

Is there a reference I can read to learn how to analyze such data?

Rich


--
View this message in context:
http://r.789695.n4.nabble.com/Working-With-Variables-Having-Different-Lengths-tp3926023p3926158.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Specifying Greek Character in Lattice Plot Label

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, Luke Miller wrote:


The following produces something very similar to David's method:
plot(1,1, xlab = expression(paste("Conductivity (", mu, "S / cm)")))
but with a slightly different slash character. I think David's method
is more correct, but I've used the above method in the past with
some success.


  Thanks, Luke.

Rich


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, Weidong Gu wrote:


No easy way out with missing data problems, all imputations are based on
some strong and untestable assumptions.


  Thanks for the insights.

  Let me rephrase my question in a way that should work: is there a way to
subset my comprehensive data frame ('chemdata') to select only those rows
that have values for two different parameters (i.e., in the same column)?

  I suspect not. But, I can select from the database table on those criteria
and read in a new R data frame.

Rich


Re: [R] (no subject)

2011-10-21 Thread Daniel Nordlund

I believe you could also set your subscription to NOMAIL and then read the 
posts from the R-help archive.  This would also allow you to post to R-help 
since you are still subscribed.

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Trevor Davies
 Sent: Friday, October 21, 2011 9:15 AM
 To: Lisa Henault
 Cc: R-help@r-project.org
 Subject: Re: [R] (no subject)
 
 Alternatively, since you are on gmail you can set up a folder and filter
 so
 all r-help emails bypass you inbox and go right to an r-help folder (or
 something).  I find it very useful for just browsing during down time so I
 can offer my assistance or move to an 'r-keepers ' folder the for little
 gems that I would like to use later.
 
 
 On Fri, Oct 21, 2011 at 4:02 AM, Lisa Henault
 lisa.hena...@gmail.comwrote:
 
  can i be taken off of this mailing list please?
 
  is there another way that you can access this without having to get all
 the
  emails??
 

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, B77S wrote:


I know in my experience Cond (conductivity??) doesn't vary much within a
stream except for during high flow events, and I would imagine the same is
true for TDS.


  This is generally true, but not in the streams with which we're working.
TDS values, for example, vary by orders of magnitude between sampling
locations on the same stream, and not with any pattern. Also, specific
conductance/conductivity (Cond) varies within a stream.

  These variations may well be on different dates, but this first look needs
to ignore time. I'll eventually get to that aspect.

Thanks,

Rich


Re: [R] plotting average effects.

2011-10-21 Thread Dennis Murphy

Hi:

Your approach to computing the means is not efficient; a better way
would be to use the aggregate() function. I would start by combining
the grouping variable and the three prediction variables into a data
frame. To get the groupwise mean for all three prediction variables,
you can use a formula interface for aggregate() if you have R-2.11.0
or later, cbinding the three prediction variables into a matrix on the
LHS of the model formula, the grouping variable on the RHS, followed
by the data frame name and the summary function. See ?aggregate for
details; in particular, study the examples with a formula interface.
Save the result to an object. Since this is homework, the details are
left to you.

As far as the base graphics plot goes, I suggest the following:
  - use arrows() to produce the lines with arrows
  - plot the means by group as points with the points() function.

The arrows() function can take vector arguments; read its help page carefully.

The ggplot2 version of the plot I think you're trying to generate is
given below:

library('ggplot2')
ggplot(pmeans) +
   geom_point(aes(x = grp, y = pred1), colour = 'red') +
   geom_segment(aes(x = grp, xend = grp, y = pred3, yend = pred2),
                arrow = arrow(length = unit(0.4, 'cm')),
                colour = 'red', size = 1)

pmeans is the name I gave for the averaged predictions by group, with
grp representing the grouping variable and pred1-pred3 per your
definitions.

In addition to the aggregate() and apply family functions, the
packages doBy, plyr and data.table are well designed for groupwise
data summarization and processing.
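
As a rough illustration of the aggregate()/arrows() route on invented
numbers (the data frame preds, its group labels Q1-Q4 and the simulated
predictions are all hypothetical, not the homework data), something along
these lines should work:

```r
set.seed(42)
# Hypothetical predictions for 20 cases in four groups
preds <- data.frame(
  grp   = rep(c("Q1", "Q2", "Q3", "Q4"), each = 5),
  pred1 = runif(20, 0.4, 0.6),
  pred2 = runif(20, 0.6, 0.8),
  pred3 = runif(20, 0.2, 0.4)
)

# Groupwise means of all three prediction columns in one call
pmeans <- aggregate(cbind(pred1, pred2, pred3) ~ grp, data = preds, FUN = mean)

# Derive the x positions from the number of groups instead of hard-coding
# them, so that x and y always have the same length.
n <- nrow(pmeans)
plot(c(1, n), range(as.matrix(pmeans[, -1])), type = "n", xlab = "",
     ylab = "Predicted probability", axes = FALSE)
arrows(1:n, pmeans$pred1, 1:n, pmeans$pred2, code = 2, length = 0.1)
arrows(1:n, pmeans$pred1, 1:n, pmeans$pred3, code = 2, length = 0.1,
       col = "red")
points(1:n, pmeans$pred1, pch = 16)
axis(1, at = 1:n, labels = pmeans$grp, las = 2)
axis(2)
box()
```

Using 1:n rather than a fixed 1:10 keeps the tick positions and labels in
step with however many groups the cut produces.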

HTH,
Dennis

On Fri, Oct 21, 2011 at 8:51 AM, gradstudent nmf...@uwm.edu wrote:
 I will include the data to read if you so choose.

 library(foreign)
 dat <- read.dta("http://quantoid.net/hw1_2011.dta")

 model in question:

 mod99 <- glm(democracy ~ popc100kpc + ngrevpc, data=dat, family=binomial)
 --

 looking for average effects code, with error on mod99.  popckpc is coded in
 1k per capita.

 dat2 <- dat3 <- dat
 dat2$popc100kpc <- dat2$popc100kpc + 100
 dat2$popc100kpc[which(dat2$popc100kpc > max(dat$popc100kpc))] <-
   max(dat$popc100kpc)
 dat3$popc100kpc <- dat$popc100kpc - 100
 dat3$popc100kpc[which(dat3$popc100kpc < min(dat$popc100kpc))] <-
   min(dat$popc100kpc)
 pred1 <- predict(mod99, type="response")
 pred2 <- predict(mod99, newdata=dat2, type="response")
 pred3 <- predict(mod99, newdata=dat3, type="response")

 breaking the variable into groups:

 pop1.group <- cut(dat$popc100kpc, breaks=quantile(dat$popc100kpc,
   seq(0,1,by=.25)), include.lowest=T)

 means <- by(cbind(pred1, pred2, pred3), list(pop1.group),
   apply, 2, mean)
 means <- do.call(rbind, means)

 and finally attempting to plot:

 par(mar=c(7,4,4,2))
 plot(c(1,10), range(c(means)), type="n", xlab="",
   ylab="Predicted Probability", axes=F)
 arrows(1:10, means[,1], 1:10, means[,2], code=2, length=.1)
 arrows(1:10, means[,1], 1:10, means[,3], code=2, length=.1, col="red")
 points(1:10, means[,1], pch=16)
 Error in xy.coords(x, y) : 'x' and 'y' lengths differ
 axis(1, at=1:10, labels=rownames(means), las=2)
 Error in axis(1, at = 1:10, labels = rownames(means), las = 2) :
  'at' and 'labels' lengths differ, 10 != 4
 


 I am not sure how to fix the error.  Thank you for your time.




 -
 Ph.D. Candidate
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925945.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius



On Oct 21, 2011, at 1:04 PM, Rich Shepard wrote:


On Fri, 21 Oct 2011, Weidong Gu wrote:

No easy way out with missing data problems, all imputations are  
based on

some strong and untestable assumptions.


 Thanks for the insights.

 Let me rephrase my question in a way that should work: is there a  
way to
subset my comprehensive data frame ('chemdata') to select only those  
rows
that have values for two different parameters (i.e., in the same  
column)?


The last part (in the same column) does not make sense, since I was
interpreting the term parameter to mean a value in a particular
column. Assuming these are R NA's then logical indexing:


with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])

If you are talking about extracting different text features from a
single character field then look at `grepl`.


patt1 <- "S2"  # any appearance of that string
patt2 <- "E5"  # any appearance of that string
with( chemdata, chemdata[ grepl(patt1, param1) & grepl(patt2, param1) , ])




 I suspect not. But, I can select from the database table on those  
criteria

and read in a new R data frame.


That too should be possible. Specifics are sorely lacking at this point,
however.




Rich



David Winsemius, MD
West Hartford, CT


[R] nls making R not responding

2011-10-21 Thread Schatzi

Here is the code I am running:
library(nls2)

modeltest <- function(A,mu,l,b,thour){
out <- vector(length=length(thour))
for (i in 1:length(thour)) {
out[i] <- b+A/(1+exp(4*mu/A*(l-thour[i])+2))
}
return(out)
}

A=1.3
mu=.22
l = 15
b = .07

thour = 1:25
Yvals <- modeltest(A,mu,l,b,thour)-.125+runif(25)/4
st2 <- expand.grid(A = seq(0.1, 1.6,.5), mu = seq(0.01, .41,.1), l=1,
                   b = seq(0,.6,.3))
lower.bound <- list(A=.01,mu=0,l=0,b=0)

try(
invisible(capture.output(mod1 <- nls2(Yvals~modeltest(A,mu,l,b,thour),
#start = list(b =5, k = 2, l=0),
start = st2,
lower = lower.bound,
algorithm = "brute-force"
)))
)

try(nmodel <- nls(Yvals~modeltest(A,mu,l,b,thour),
start=coef(mod1),
#start=list(A=1.8,mu=.2,l=.5,b=.15),
lower=lower.bound,
algorithm="port")
)

My problem seems to be with initial parameter estimates. I am running
through a couple hundred treatments, so I used nls2 to pick the best
starting values from some options, then I run through nls with those values.
If I have too many options (st2) in nls2, the run takes too long. When I cut
down options, there are either errors, or in some cases R stops responding
completely and I have to shut it down and start over. I do not know why it
shows "not responding". Is there a better way (well, I'm sure there's
always a better way) to do this where I can run through 200+ datasets with
robust enough starting values?
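
One possible pattern, offered only as a sketch and not as a diagnosis of the
freeze: score a coarse grid of starting values with plain base R, then hand
the best one to nls() inside tryCatch() so a failed fit returns NULL instead
of stopping a loop over datasets. The grid values for l (1, 8, 15), the noise
level and the object names grid, rss and fit are assumptions of this
example. Note that the grid in the post fixes l at 1 while the simulated
curve uses l = 15, which may be one reason the starting values are poor.

```r
# Vectorised version of the model: no explicit loop over thour is needed.
modeltest <- function(A, mu, l, b, thour) {
  b + A / (1 + exp(4 * mu / A * (l - thour) + 2))
}

set.seed(1)
thour <- 1:25
Yvals <- modeltest(1.3, 0.22, 15, 0.07, thour) + rnorm(25, sd = 0.02)

# Coarse grid of candidate starting values (assumed ranges)
grid <- expand.grid(A = seq(0.1, 1.6, 0.5), mu = seq(0.01, 0.41, 0.1),
                    l = c(1, 8, 15), b = seq(0, 0.6, 0.3))

# Residual sum of squares at every grid point
rss <- apply(grid, 1, function(p) {
  sum((Yvals - modeltest(p["A"], p["mu"], p["l"], p["b"], thour))^2)
})
start <- as.list(unlist(grid[which.min(rss), ]))

# A failed fit yields NULL rather than an error that halts the run.
fit <- tryCatch(
  nls(Yvals ~ modeltest(A, mu, l, b, thour), start = start,
      lower = c(A = 0.01, mu = 0, l = 0, b = 0), algorithm = "port"),
  error = function(e) {
    message("nls failed: ", conditionMessage(e))
    NULL
  }
)
```

In a loop over many treatments, the NULL results can be collected and
inspected afterwards rather than interrupting the whole run.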

Any ideas would be greatly appreciated.

Adele

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/nls-making-R-not-responding-tp3926263p3926263.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, David Winsemius wrote:

The last part (in the same column) does not make sense, since I was 
interpreting the term parameter to mean a value in a particular column.


David,

  That's what I meant: two values from the 'param' column.


Assuming these are R NA's then logical indexing:
with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])


  I'll read the with() help page again. And, I'll try the above with TDS
replacing param1 and Cond replacing param2.

Thanks,

Rich


[R] Arima Models - Error and jump error

2011-10-21 Thread Flávio Fagundes

Hi people,

I'm trying to develop a simple routine to run many ARIMA models resulting
from combinations of some parameters.
My test data cover one year at the daily level.
Part of the routine is:

for ( d in 0:1 )
  { for ( p in 0:3 )
    { for ( q in 0:3 )
      { for ( sd in 0:1 )
        { for ( sp in 0:3 )
          { for ( sq in 0:3 )
            {
Yfit=arima(Yst[,2],order=c(p,d,q),seasonal=list(order=c(sp,sd,sq),period=7),include.mean=TRUE,xreg=DU0)
            }}}}}}

It runs normally until step 187, where it returns an error and the program
stops.


Yfit=arima(Yst[,2],order=c(1,0,1),seasonal=list(order=c(2,1,2),period=7),include.mean=TRUE,xreg=DU0)

Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control
= optim.control,  :
  non-finite finite-difference value [1]

My questions are:

1. What does this error mean and why did it occur?
2. How can I make the program disregard any error and continue to run
until the end of the loop?
3. Does anyone know of an existing routine that does this?

Thanks
Flávio


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, David Winsemius wrote:


The last part (in the same column) does not make sense, since I was
interpreting the term parameter to mean a value in a particular column. 
Assuming these are R NA's then logical indexing:


with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])


David,

  I asked the question improperly. What I should have asked is how to
specify only non-missing values of a parameter to create a new subset.
Example (this includes NA rows):

tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant,
basin), drop = TRUE)

  When I try to add '!is.na' with the param selection I get errors.

  To be as specific as I should have been in my original message, how should
I write the above expression to exclude rows where TDS is missing?

Rich


Re: [R] stacked plot

2011-10-21 Thread Henri-Paul Indiogine

Hi Dennis!

Fantastic, great, wonderful, beautiful.

I slightly changed your code to adapt it to my situation:

ggplot(DF.2, aes(x=file.name, y=value, fill=codes)) +
   geom_histogram(position="stack", stat="identity") +
   labs(x="document", y="number of codings")

###
file.name   codes    value
--------------------------
file.1      code.1       2
file.1      code.2       0
file.1      code.3       0
file.1      code.4       5
file.1      code.5       4
file.2      code.1       3
file.2      code.2      18
 

There are 126 bars (file1 - file.126), so I should do the following:
(1) convert to a histogram with no gaps between the bars, and (2)
remove the labels at the bottom of each bar and just have
xlab=documents.

However, even with changing geom_bar to geom_histogram there are small
gaps between the bars.
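
For comparison, here is a dependency-free sketch of the same picture in base
graphics rather than ggplot2 (a deliberate swap of technique): barplot()
with space = 0 draws adjacent stacked bars, and blank names.arg entries
suppress the per-bar labels. The three-file data frame below is invented to
mirror the layout shown above. In ggplot2 the analogous setting is, I
believe, the width argument of geom_bar (e.g. width = 1).

```r
# Invented data in the same long layout as DF.2 above
DF.2 <- data.frame(
  file.name = rep(c("file.1", "file.2", "file.3"), each = 3),
  codes     = rep(c("code.1", "code.2", "code.3"), times = 3),
  value     = c(2, 0, 5, 3, 18, 1, 4, 4, 0)
)

# codes x file.name matrix of counts: one stacked bar per column
counts <- xtabs(value ~ codes + file.name, data = DF.2)

# space = 0 removes the gaps; empty names.arg hides the bar labels.
# barplot() returns the bar midpoints.
mids <- barplot(counts, space = 0, names.arg = rep("", ncol(counts)),
                col = c("grey30", "grey60", "grey90"),
                legend.text = rownames(counts),
                xlab = "document", ylab = "number of codings")
```

With unit-width bars and no spacing, consecutive midpoints are exactly one
unit apart, which is an easy way to confirm there are no gaps.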

Thanks for your help,

Henri-Paul



-- 
Curriculum & Instruction
Texas A&M University
TutorFind Learning Centre

Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine


Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius



On Oct 21, 2011, at 2:09 PM, Rich Shepard wrote:


On Fri, 21 Oct 2011, David Winsemius wrote:


The last part (in the same column) does not make sense, since I was
interpreting the term parameter to mean a value in a particular  
column. Assuming these are R NA's then logical indexing:


with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])


David,

 I asked the question improperly. What I should have asked is how to
specify only non-missing values of a parameter to create a new subset.
Example (this includes NA rows):

tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant,
basin), drop = TRUE)

 When I try to add '!is.na' with the param selection I get errors.


If you do not offer both the code and the verbatim copy of the error  
there will be very little that we can do to diagnose your problem.




 To be as specific as I should have been in my original message, how  
should

I write the above expression to exclude rows where TDS is missing?


First you need to clarify whether TDS is the name of a column or a
possible value in a column named param. This whole painful multi-question
process would be greatly accelerated if you offered str(chemdata).
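
Purely as a sketch of the second reading (TDS as a value in a column named
param), and without knowing the real structure of chemdata: the toy data
frame below is invented, and its site column is a hypothetical key that
identifies which measurements belong together.

```r
# Invented long-format data: one row per site/parameter measurement
chemdata <- data.frame(
  site  = c("s1", "s1", "s2", "s2", "s3", "s3"),
  param = c("TDS", "Cond", "TDS", "Cond", "TDS", "Cond"),
  quant = c(120, 200, NA, 150, 95, NA),
  basin = "b1"
)

# TDS rows whose measured value is not missing
tds.basin <- subset(chemdata, param == "TDS" & !is.na(quant),
                    select = c(param, quant, basin))

# One row per site with a column per parameter, then keep the sites
# where both TDS and Cond were actually measured.
wide <- reshape(chemdata[, c("site", "param", "quant")],
                idvar = "site", timevar = "param", direction = "wide")
both <- wide[complete.cases(wide), ]
```

The key point is that is.na() is applied to the value column (quant here),
not to param, and is combined with the param test using &.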


--

David Winsemius, MD
West Hartford, CT


Re: [R] Arima Models - Error and jump error

2011-10-21 Thread Ken

Perhaps:
require(forecast)
?auto.arima
Or look into package FitAR. The first performs seasonal optimization so it is
likely better for your application.
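
On the second question (carrying on after a failed fit), one common base-R
pattern is to wrap each arima() call in tryCatch(). The sketch below uses a
simulated series and a small non-seasonal grid; y, orders and fits are
made-up names, and Yst and DU0 from the original post are not used.

```r
set.seed(123)
y <- arima.sim(list(ar = 0.5), n = 120)   # simulated stand-in series

# All (p, d, q) combinations to try
orders <- expand.grid(p = 0:2, d = 0:1, q = 0:2)
fits <- vector("list", nrow(orders))

for (i in seq_len(nrow(orders))) {
  ord <- c(orders$p[i], orders$d[i], orders$q[i])
  # On error, report which order failed and store NULL, then keep going.
  fits[[i]] <- tryCatch(
    arima(y, order = ord),
    error = function(e) {
      message("order ", paste(ord, collapse = ","), " failed: ",
              conditionMessage(e))
      NULL
    }
  )
}

# Compare the fits that succeeded by AIC
ok  <- !vapply(fits, is.null, logical(1))
aic <- rep(NA_real_, nrow(orders))
aic[ok] <- vapply(fits[ok], AIC, numeric(1))
best <- orders[which.min(aic), ]
```

A single expand.grid() loop also avoids the six nested for loops and makes
it easy to record which combination failed.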
Ken Hutchison

On Oct 21, 2554 BE, at 1:59 PM, Flávio Fagundes flavi...@gmail.com wrote:

 Hi people,
 
 I´m trying to development a simple routine to run many Arima models result
 from some parâmeters combination.
 My data test have one year and daily level.
 A part of routine is:
 
 for ( d in 0:1 )
  { for ( p in 0:3 )
   { for ( q in 0:3 )
   { for ( sd in 0:1 )
  { for ( sp in 0:3 )
 { for ( sq in 0:3 )
 {
 Yfit=arima(Yst[,2],order=c(p,d,q),seasonal=list(order=c(sp,sd,sq),period=7),include.mean=TRUE,xreg=DU0)
 }}
 
 Until the step 187 it´s run normally, but in the step 187 return a error and
 stop the program.
 
 
 Yfit=arima(Yst[,2],order=c(1,0,1),seasonal=list(order=c(2,1,2),period=7),include.mean=TRUE,xreg=DU0)
 
 Error in optim(init[mask], armafn, method = BFGS, hessian = TRUE, control
 = optim.control,  :
  non-finite finite-difference value [1]
 
 My questions is:
 
 1. What this error mean and why it occured?
 2. How can I do to this program disregard any error and to continue to run
 until the end of looping?
 3. Someone know if already have any routine that do this?
 
 Thanks
 Flávio
 

[R] Serialization help.

2011-10-21 Thread rkevinburton


I have the following code:

c <- file("c:/temp/r/SkuSalesModel.br", "rb")
s <- unserialize(c)
close(c)
rm(c)

And it worked as late as yesterday. Today when I came in I get the 
following error:

Error in .Call("R_unserialize", connection, refhook, PACKAGE = "base") :
   negative length vectors are not allowed

I have not upgraded or changed any installation and the file has not 
changed. Any ideas on how I can get more info or solve this error?
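
One possibility worth ruling out (an assumption on my part, not a diagnosis)
is that the file on disk is no longer a complete serialized object. A
round trip on a temporary file shows what a healthy read looks like and how
to capture the error message without aborting; path and obj below are
invented for the example.

```r
# Write a small object with serialize() and read it back.
path <- tempfile(fileext = ".br")
obj  <- list(sku = c("A1", "B2"), sales = c(10.5, 3))

con <- file(path, "wb")
serialize(obj, con)
close(con)

con <- file(path, "rb")
s <- tryCatch(
  unserialize(con),
  error = function(e) {
    message("unserialize failed: ", conditionMessage(e))
    NULL
  }
)
close(con)

# File size and modification time are a cheap first check for
# truncation or an unexpected overwrite.
info <- file.info(path)[, c("size", "mtime")]
```

Running the same file.info() check on the real file and comparing it with a
backup copy, if one exists, would show whether it changed after all.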

Thank you.

Kevin


Re: [R] Scatterplot with the 3rd dimension = color?

2011-10-21 Thread Kerry

Awesome, thank you so much for this! I plan to play around with this
more next week with my actual data, but it provides a lot more options
than I had before I posted. The link will help too.

kb

On Oct 20, 8:18 pm, Dennis Murphy djmu...@gmail.com wrote:
 AFAIK, you can't 'add' two ggplot2 graphs together; the problem in
 this case is that the two color scales would clash. If you're willing
 to discretize the z values, then you could pull it off. Here's an
 example:

 d <- data.frame(x = rnorm(100), y = rnorm(100), z = factor(1 +
 (rnorm(100) > 0)))
 d1 <- data.frame(x = rnorm(100), y = rnorm(100), z = factor(3 +
 (rnorm(100) > 0)))
 dd <- rbind(d, d1)

 In each data frame, I'm assigning two factor levels depending on
 whether z > 0 or not. The factor levels are 1, 2 in d and 3, 4 in d1;
 when rbinded together, z has four distinct levels. Now call ggplot():

 ggplot(dd, aes(x = x, y = y, colour = z)) + geom_point() +
    scale_colour_manual(values = c('1' = 'red', '2' = 'blue', '3' = 'green',
                                   '4' = 'yellow'))

 This may be coarser than you like, so you could always use the cut()
 function to discretize z in each data frame; you'll want to assign the
 levels so that they are distinct in the combined data frame. Example:

 d3 <- data.frame(x = rnorm(100), y = rnorm(100),
                  z = cut(rnorm(100), breaks = c(-Inf, -0.5, 0.5, Inf),
                          labels = 1:3))
 d4 <- data.frame(x = rnorm(100), y = rnorm(100),
                  z = cut(rnorm(100), breaks = c(-Inf, -0.5, 0.5, Inf),
                          labels = 4:6))
 dd2 <- rbind(d3, d4)

 mycols <- c('red', 'maroon', 'blue', 'green', 'cyan', 'yellow')
 ggplot(dd2, aes(x = x, y = y, colour = z)) + geom_point() +
    scale_colour_manual(breaks = levels(dd2$z),
                        values = mycols)

 You can always use the labels = argument of scale_colour_manual() to
 assign more evocative legend values, or equivalently, you can assign
 the labels in the cut() function within d3 and d4 to those you want in
 the legend and leave the plot code as is.
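
 The labelling idea in the last paragraph can be sketched in base R. This is
 only an illustration on simulated data: the label text ("A: low" and so on)
 is invented here, and the named colour vector is one possible way to keep
 colours tied to levels.

```r
# Sketch only: simulated data; the label text is made up for illustration.
set.seed(1)
brks <- c(-Inf, -0.5, 0.5, Inf)

# Give each data frame its own set of descriptive, non-overlapping labels.
d3 <- data.frame(x = rnorm(100), y = rnorm(100),
                 z = cut(rnorm(100), breaks = brks,
                         labels = paste("A:", c("low", "mid", "high"))))
d4 <- data.frame(x = rnorm(100), y = rnorm(100),
                 z = cut(rnorm(100), breaks = brks,
                         labels = paste("B:", c("low", "mid", "high"))))

# rbind() on data frames takes the union of the factor levels.
dd2 <- rbind(d3, d4)

# Name each colour after the level it belongs to.
mycols <- setNames(c("red", "maroon", "blue", "green", "cyan", "yellow"),
                   levels(dd2$z))
mycols
```

 A named vector like mycols can be passed as values = mycols to
 scale_colour_manual(), in the same way as the named values in the first
 example above.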

 BTW, there is a dedicated ggplot2 list to which you can subscribe
 through http://had.co.nz/ggplot2/ (look for the ggplot2 mailing list
 near the top of the page). The list archives are accessible through
 the same link.

 HTH,
 Dennis









 On Thu, Oct 20, 2011 at 12:25 PM, Kerry kbro...@gmail.com wrote:
  Can someone please help me out with this? The ggplot2 suggestion works
  great but I've spent a few days trying to figure out how to plot 2
  variables with it and I'm stuck. Here's my example code:

  library(ggplot2)
  #Here's the 1st plot
  x <- rnorm(100)
  y <- rnorm(100)
  z <- rnorm(100)
  d <- data.frame(x,y,z)
  dg <- qplot(x,y,colour=z,data=d)
  dg + scale_colour_gradient(low="red", high="blue")

  #Here's the 2nd plot which will delete the 1st plot above but I'd
  #like them to be plotted together
  x1 <- rnorm(100)
  y2 <- rnorm(100)
  z3 <- rnorm(100)
  d1 <- data.frame(x1,y2,z3)
  dg1 <- qplot(x1,y2,colour=z3,data=d1)
  dg1 + scale_colour_gradient(low="green", high="yellow")

  I've been trying to get long format working but it just doesn't make
  any sense to me.

  Thanks,
  kb

  On Oct 17, 3:10 pm, Kerry kbro...@gmail.com wrote:
  Yes, the qplot works great, but do you know how to allow for multiple
  plots? I want one variable to be plotted say from blue to red and
  another say from yellow to green but in the same graph, each having
  their own separate legends. I've tried print() and arrange() but no
  luck.

  Thanks again,
  kb

  On Oct 2, 10:42 pm, Ben Bolker bbol...@gmail.com wrote:

   Duncan Murdoch murdoch.duncan at gmail.com writes:

On 11-10-02 1:11 PM, Kerry wrote:
 I have 3 columns of data and want to plot each row as a point in a
 scatter plot and want one column to be represented as a color 
 gradient
 (e.g. larger  values being more red). Anyone know the command or
 package for this?

It's not a particularly effective display, but here's how to do it.
Use rainbow(101) in place of rev(heat.colors(101)) if you like.

x <- rnorm(10)
y <- rnorm(10)
z <- rnorm(10)
colors <- rev(heat.colors(101))
zcolor <- colors[(z - min(z))/diff(range(z))*100 + 1]
plot(x,y,col=zcolor)
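
The index arithmetic in the zcolor line can be spelled out as follows. This is
a sketch on simulated data; the explicit floor() is an addition that makes
visible the truncation R applies when a non-integer is used as an index.

```r
# Sketch only: map a continuous z onto 101 palette entries.
set.seed(1)
z <- rnorm(10)
colors <- rev(heat.colors(101))

# Rescale z to [0, 1], stretch to [0, 100], then shift to indices 1..101.
# floor() mirrors the truncation that indexing would otherwise do silently.
idx <- floor((z - min(z)) / diff(range(z)) * 100) + 1
zcolor <- colors[idx]

range(idx)   # smallest z gets index 1, largest z gets index 101
```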

     or

   d <- data.frame(x,y,z)
   library(ggplot2)
   qplot(x,y,colour=z,data=d)

     I agree about the not particularly effective display
   comment, but if you have two continuous predictors and
   a continuous response you've got a tough display problem --
   your choices are:

     1. use color, size, or some other graphical characteristic
   (pretty far down on the Cleveland hierarchy)
     2. use a perspective plot (hard to get the right viewing
   angle, often confusing)
     3. use coplots/small multiples/faceting (requires
   discretizing one dimension)


[R] PCA and Regression with complex categorical variables

2011-10-21 Thread seanstclair


Re: [R] How to use gev.fit (package ismev) under box constraints?

2011-10-21 Thread NoSkill ButStyle

Hallo,

it seems as if something did not work with my first email

I would like to estimate the parameters of a generalized extreme value (GEV)
distribution using maximum likelihood as implemented in the gev.fit function of
package ismev. If I do the following:

y.training <- c(22, 22, 18, 19, 18, 18, 22, 27, 25, 19, 18, 21, 18, 20, 18, 19,
18, 21, 29, 18, 22, 19, 19, 24, 18, 21, 22, 20, 20, 27, 18, 20, 20, 18, 18, 18, 
21, 18, 18, 21, 26, 19, 18, 19, 19, 18, 19, 18, 20, 20, 25, 21, 26, 22, 20, 19, 
22, 21, 21, 20, 20, 19, 18, 22, 22, 27, 19, 20, 26, 29, 18, 20, 19, 22, 23, 18, 
20, 20, 22, 18, 23, 18, 20, 19, 27, 21, 22, 18, 18, 19, 18, 21, 18, 23, 18, 18, 
20, 20, 24, 19, 18, 19, 19, 23, 19, 18, 25, 18, 24, 19)
fit <- gev.fit(xdat=y.training, show=FALSE)
round(fit$mle,2) # 18.00 , 0.00 , 3.96

# The estimated shape parameter is 3.96. I would like to perform the estimation
# under the constraint that the shape parameter is smaller than 1, but the
# following does not work:

fit <- gev.fit(xdat=y.training, show=FALSE, method="L-BFGS-B", lower=c(0,0,-2), upper=c(50,10,1))
round(fit$mle,2) # 18.09 , 0.27 , 3.05

It seems as if the lower and upper values are not passed to the optim
function in the way they should be. A warning says that they are only passed to
the control part of optim. Therefore my question: (How) is it possible to use
the gev.fit function to perform the ML estimation under the constraint that the
shape parameter is smaller than 1? Or, more generally: is it possible to use the
gev.fit function under box constraints, as is possible for optim?
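
Going by the warning described above, lower and upper end up in optim's control
list rather than acting as bounds. One possible workaround, sketched here under
assumptions, is to bypass gev.fit and minimise a GEV negative log-likelihood
with optim directly. The helper gev_nll is hypothetical (written for this note,
not part of ismev), the data are simulated, and the penalty value 1e10 is an
arbitrary choice to keep the objective finite, which L-BFGS-B requires.

```r
# Sketch only: box-constrained GEV fit with base R's optim().
# gev_nll is a hypothetical helper, not an ismev function.
gev_nll <- function(par, x) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]
  z <- (x - mu) / sigma
  if (abs(xi) < 1e-6) {
    # Gumbel limit of the GEV as the shape parameter goes to 0
    val <- length(x) * log(sigma) + sum(z) + sum(exp(-z))
  } else {
    y <- 1 + xi * z
    if (any(y <= 0)) return(1e10)   # outside the support: large finite penalty
    val <- length(x) * log(sigma) + sum(y^(-1 / xi)) + (1 / xi + 1) * sum(log(y))
  }
  if (is.finite(val)) val else 1e10
}

set.seed(123)
x <- 20 - 2 * log(-log(runif(200)))   # simulated Gumbel(location 20, scale 2) data

# Moment-based starting values for location and scale, small positive shape
sigma0 <- sqrt(6 * var(x)) / pi
start  <- c(mean(x) - 0.5772 * sigma0, sigma0, 0.1)

fit <- optim(start, gev_nll, x = x, method = "L-BFGS-B",
             lower = c(-Inf, 1e-6, -2),   # scale > 0, shape >= -2
             upper = c( Inf,  Inf,  1))   # shape <= 1
round(fit$par, 2)   # location, scale, shape
```

With data like y.training, where many observations sit exactly at the minimum,
the constrained estimate may simply land on a bound, so fit$convergence and
fit$message are worth checking before the result is used.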

Thanks in advance.

Re: [R] quantmod package

2011-10-21 Thread ATANU

Thanks for the help. With that code it is possible to save the current
quotes in a text file, but the date-time in the first column is not
preserved. When I use read.table and try to convert the result into an xts
object, I get an error because the indices cannot be taken as a time object.
The same thing happens if I only save the quotes in a data frame using rbind
(I guess in the latter case that happens because the symbol, say TCS.NS, gets
attached to the date). Please suggest a solution to this problem.
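
One way to keep the timestamps through a text file, sketched with base R only:
store the time as an explicit, formatted column and parse it again after
reading. The column names and quote values below are made up for illustration,
and the actual quantmod call that produced the quotes is not shown in this
message.

```r
# Sketch only: round-trip a time-stamped table through a text file.
# Column names and values are invented for illustration.
quotes <- data.frame(
  time = as.POSIXct(c("2011-10-21 09:15:00", "2011-10-21 09:16:00"), tz = "UTC"),
  last = c(1052.5, 1053.1)
)

path <- tempfile(fileext = ".txt")

# Write the time as text in a fixed format so it can be parsed back reliably.
out <- transform(quotes, time = format(time, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
write.table(out, path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back and convert the text column to a time class again.
back <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
back$time <- as.POSIXct(back$time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

str(back)
# With the xts package loaded, xts(back["last"], order.by = back$time)
# would then build the time series (not run here).
```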

--
View this message in context: 
http://r.789695.n4.nabble.com/quantmod-package-tp3921071p3925863.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Scatterplot with the 3rd dimension = color?

2011-10-21 Thread Kerry

Beautiful! It works perfectly, thanks!

kb

On Oct 21, 7:42 am, Jim Lemon j...@bitwrit.com.au wrote:
 On 10/21/2011 06:25 AM, Kerry wrote:

  Can someone please help me out with this? The ggplot2 suggestion works
  great but I've spent a few days trying to figure out how to plot 2
  variables with it and I'm stuck. Here's my example code:
  ...

 Hi Kerry,
 This isn't ggplot2, but it may do what you want.

 library(plotrix)
 oldmar <- par(mar=c(5,4,4,4))
 plot(x,y,type="n")
 plotlim <- par("usr")
 rect(plotlim[1],plotlim[3],plotlim[2],plotlim[4],col="lightgray")
 grid(col="white")
 box()
 points(x,y,col=color.scale(z,c(1,0),0,c(0,1)),pch=19)
 points(x1,y2,col=color.scale(z3,1,c(0,1),0),pch=19)
 legendval1 <- seq(min(z),max(z),length.out=5)
 color.legend(2.9,0.5,3.1,1.5,round(legendval1,1),align="rb",gradient="y",
   rect.col=color.scale(legendval1,c(1,0),0,c(0,1)))
 legendval2 <- seq(min(z3),max(z3),length.out=5)
 color.legend(2.9,-1.5,3.1,-0.5,round(legendval2,1),align="rb",gradient="y",
   rect.col=color.scale(legendval2,c(1,1),c(0,1),0))
 par(xpd=TRUE)
 text(3,1.6,"z")
 text(3,-0.4,"z3")
 par(xpd=FALSE,oldmar)

 Jim


Re: [R] PCA and Regression with complex categorical variables

2011-10-21 Thread David Winsemius

Did you perhaps send an HTML message? As detailed in the Posting  
Guide, those get scrubbed by the mail-server.



On Oct 21, 2011, at 10:48 AM, seanstcl...@verizon.net wrote:
nothing

--
David Winsemius, MD
West Hartford, CT

Re: [R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

2011-10-21 Thread Michael Friendly


Thanks very much, Dennis.  See below for something I don't understand.

On 10/21/2011 12:15 PM, Dennis Murphy wrote:

Hi Michael:

Here's one way to get it from ggplot2. To avoid possible overplotting,
I jittered the points horizontally by ± 0.2. I also reduced the point
size from the default 2 and increased the line thickness to 1.5 for
both fitted curves. In ggplot2, the term faceting is synonymous with
conditioning (by groups).

library('HistData')
library('ggplot2')
ggplot(PearsonLee, aes(x = parent, y = child)) +
geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
geom_smooth(method = lm, aes(weights = PearsonLee$weight),
colour = 'green', se = FALSE, size = 1.5) +
geom_smooth(aes(weights = PearsonLee$weight),
colour = 'red', se = FALSE, size = 1.5) +
facet_grid(chl ~ par)

This seems to work, but I don't understand *why*, since the weight
variable is PearsonLee$frequency, not PearsonLee$weight.

> PearsonLee$weight
NULL

I get an error if I try to use PearsonLee$frequency as the weights=
variable.

> ggplot(PearsonLee, aes(x = parent, y = child)) +
+   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
+   geom_smooth(method = lm, aes(weights = PearsonLee$frequency),
+               colour = 'green', se = FALSE, size = 1.5) +
+   geom_smooth(aes(weights = PearsonLee$frequency),
+               colour = 'red', se = FALSE, size = 1.5) +
+   facet_grid(chl ~ par)
Error in eval(expr, envir, enclos) : object 'weight' not found

In the form below, it makes sense to me and does work, using
weight=frequency in the initial aes(), and weight= in geom_smooth:

ggplot(PearsonLee, aes(x = parent, y = child, weight=frequency)) +
   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
   geom_smooth(method = lm, aes(weight = PearsonLee$frequency),
   colour = 'green', se = FALSE, size = 1.5) +
   geom_smooth(aes(weight = PearsonLee$frequency),
   colour = 'red', se = FALSE, size = 1.5) +
   facet_grid(chl ~ par)
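
What a frequency weight means for the linear fit can be checked outside ggplot2.
The sketch below uses made-up counts rather than the PearsonLee data, and it
only illustrates lm(): a weighted fit gives the same coefficients as fitting
the table expanded to one row per observation. The standard errors differ
between the two fits because the residual degrees of freedom differ.

```r
# Sketch only: frequency-weighted lm() versus lm() on the expanded table.
# The grid and counts are simulated stand-ins, not the PearsonLee data.
set.seed(42)
grid <- expand.grid(parent = 60:70, child = 60:70)
grid$frequency <- rpois(nrow(grid), lambda = 3) + 1   # made-up counts, all >= 1

# Fit 1: one row per (parent, child) cell, weighted by its count
fit_w <- lm(child ~ parent, data = grid, weights = frequency)

# Fit 2: repeat each row 'frequency' times and fit without weights
expanded <- grid[rep(seq_len(nrow(grid)), grid$frequency), ]
fit_e <- lm(child ~ parent, data = expanded)

all.equal(coef(fit_w), coef(fit_e))
```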



# If you prefer a legend, here's one take, pulling the legend inside
# to the upper left corner. This requires a bit more 'trickery', but
# the tricks are found in the ggplot2 book.

ggplot(PearsonLee, aes(x = parent, y = child)) +
geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
geom_smooth(method = lm, aes(weights = PearsonLee$weight,
colour = 'Linear'), se = FALSE, size = 1.5) +
geom_smooth(aes(weights = PearsonLee$weight,
colour = 'Loess'), se = FALSE, size = 1.5) +
facet_grid(chl ~ par) +
scale_colour_manual(breaks = c('Linear', 'Loess'),
values = c('green', 'red')) +
opts(legend.position = c(0.14, 0.885),
 legend.background = theme_rect(fill = 'white'))


HTH,
Dennis



--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard


On Fri, 21 Oct 2011, David Winsemius wrote:


First you need to clarify whether TDS is the name of a column or a
possible value in a column named param. This whole painful
multi-question process would be greatly accelerated if you offered
str(chemdata).


  Yes, I did on a different thread, but not on this one.

str(chemdata)
'data.frame':   47244 obs. of  6 variables:
 $ site    : Factor w/ 143 levels "BC-0.5","BC-1",..: 134 134 134 127 127
 $ sampdate: Date, format: "2006-12-06" "2006-12-06" ...
 $ param   : Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 66 12 24 59 66
 $ quant   : num  1.08e+04 7.95 1.80e-02 2.80e+02 1.90e+01 8.44 1.62e+03
 $ stream  : Factor w/ 24 levels "B","C",..: 4 4 4 21 21 21 4
 $ basin   : Factor w/ 2 levels "Basin1","Basin2": 1 1 1 1 1 1 1 1 1 2 ...

  What I need to do is examine the relationships between the parameter TDS
and other parameters associated with it; e.g., Cond and SO4. I started
by subsetting the main data frame (chemdata)

tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant,
basin), na.rm = TRUE, drop = TRUE)

cond.basin <- subset(chemdata, param == "Cond", select = c(param, quant,
basin), na.rm = TRUE, drop = TRUE)

However, these left the NA rows in the new data frames.

  I can produce an xyplot() using tds.basin$quant and cond.basin$quant, but
it's obvious there are many points where one or the other have NA values.
When I tried a linear regression it failed because of an unequal number of
rows in both data frames.

  What I need to learn are: 1) how to write the subset() to remove the NA
rows for each one and 2) how to perform linear regression (and further
analyses) on these pairs of data frames.


If you do not offer both the code and the verbatim copy of the error there
will be very little that we can do to diagnose your problem.


str(tds.basin)
'data.frame':   2206 obs. of  3 variables:
 $ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 58 58 58 58 58 58
 $ quant: num  10800 530 3838 3658 3756 ...
 $ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

str(cond.basin)
'data.frame':   1191 obs. of  3 variables:
 $ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 24 24 24 24 24 24 24
 $ quant: num  280 3170 4220 3420 3700 ...
 $ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

then,

> m1 <- lm(tds.basin$quant ~ cond.basin$quant)
Error in model.frame.default(formula = tds.basin$quant ~ cond.basin$quant,  :
  variable lengths differ (found for 'cond.basin$quant')

Rich

[R] rgl device on web

2011-10-21 Thread Ben qant

Hello,

I'm looking for help putting an interactive rgl package 3d device on the web
so that it maintains full functionality. Where should I start? Is it
possible? Is there an example I can see? (Note: I'm also looking at putting
other normal plots on the web.) I'd like to stay within R as much as
possible...  I didn't find much online regarding rgl 3d plots.

Thanks,

Ben

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius



On Oct 21, 2011, at 3:02 PM, Rich Shepard wrote:


On Fri, 21 Oct 2011, David Winsemius wrote:


First you need to clarify whether TDS is the name of a column or a
possible value in a column named param. This whole painful
multi-question process would be greatly accelerated if you offered
str(chemdata).


 Yes, I did on a different thread, but not on this one.

str(chemdata)
'data.frame':   47244 obs. of  6 variables:
$ site    : Factor w/ 143 levels "BC-0.5","BC-1",..: 134 134 134 127 127
$ sampdate: Date, format: "2006-12-06" "2006-12-06" ...
$ param   : Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 66 12 24 59 66
$ quant   : num  1.08e+04 7.95 1.80e-02 2.80e+02 1.90e+01 8.44 1.62e+03
$ stream  : Factor w/ 24 levels "B","C",..: 4 4 4 21 21 21 4
$ basin   : Factor w/ 2 levels "Basin1","Basin2": 1 1 1 1 1 1 1 1 1 2 ...


 What I need to do is examine the relationships between the parameter TDS
and other parameters associated with it; e.g., Cond and SO4.


How are we to determine which lines contain information about the
relationships of param==TDS with whatever cases or variable has
values of Cond and SO4? Are you really trying to compare two
disjoint groups on some statistic like the means and std-dev of
quant? (This would be a job for `aggregate`.)



I started
by subsetting the main data frame (chemdata)

tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant,
basin), na.rm = TRUE, drop = TRUE)

cond.basin <- subset(chemdata, param == "Cond", select = c(param, quant,
basin), na.rm = TRUE, drop = TRUE)


So now you have two disjoint subsets. Why should we think they can be
analyzed with regression methods?




However, these left the NA rows in the new data frames.


Not for the param column I hope. And the na.rm= arguments should get  
ignored by subset.




 I can produce an xyplot() using tds.basin$quant and cond.basin$quant, but
it's obvious there are many points where one or the other have NA values.
When I tried a linear regression it failed because of an unequal number of
rows in both data frames.

 What I need to learn are: 1) how to write the subset() to remove the NA
rows for each one and 2) how to perform linear regression (and further
analyses) on these pairs of data frames.

If you do not offer both the code and the verbatim copy of the error there
will be very little that we can do to diagnose your problem.


str(tds.basin)
'data.frame':   2206 obs. of  3 variables:
$ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 58 58 58 58 58 58
$ quant: num  10800 530 3838 3658 3756 ...
$ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

str(cond.basin)
'data.frame':   1191 obs. of  3 variables:
$ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 24 24 24 24 24 24 24
$ quant: num  280 3170 4220 3420 3700 ...
$ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

then,

m1 <- lm(tds.basin$quant ~ cond.basin$quant)
Error in model.frame.default(formula = tds.basin$quant ~ cond.basin$quant,  :
 variable lengths differ (found for 'cond.basin$quant')


In regression calls it is almost always better to construct them with a
data argument:



m1 <- lm(tds.basin$quant ~ cond.basin$quant)
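
One possible way to get a single data frame for the data argument, sketched on
simulated data: pair each TDS value with the Cond value from the same site and
sampling date, then regress within that paired table. This assumes site plus
sampdate identifies a sample, which the thread does not confirm; the toy table
chem and the 0.65 slope are invented for illustration.

```r
# Sketch only: pair two parameters stored in long format, then regress.
# 'chem' is a simulated stand-in for chemdata; the values are made up.
set.seed(7)
keys <- expand.grid(site = c("A", "B", "C"),
                    sampdate = as.Date("2006-12-01") + 0:9)
cond <- data.frame(keys, param = "Cond", quant = runif(nrow(keys), 200, 4000))
tds  <- data.frame(keys, param = "TDS",
                   quant = 0.65 * cond$quant + rnorm(nrow(keys), sd = 20))
chem <- rbind(tds, cond[-(1:5), ])   # drop 5 Cond rows so the lengths differ

# Keep the identifying columns when subsetting, so rows can be matched later.
tds.q  <- subset(chem, param == "TDS",  select = c(site, sampdate, quant))
cond.q <- subset(chem, param == "Cond", select = c(site, sampdate, quant))

# merge() keeps only site/date combinations present in both subsets.
paired <- merge(tds.q, cond.q, by = c("site", "sampdate"),
                suffixes = c(".tds", ".cond"))
paired <- paired[complete.cases(paired), ]   # drop rows with NA in either value

m1 <- lm(quant.tds ~ quant.cond, data = paired)
coef(m1)
```

If a site has several samples on the same date, merge() returns every
combination of them, so such duplicates would need to be averaged or otherwise
resolved first.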




Rich


David Winsemius, MD
West Hartford, CT

[R] Kleinberg's burst detection algorithm

2011-10-21 Thread Stavros Macrakis

Has anyone here implemented Jon Kleinberg's burst detection algorithm
(Bursty and Hierarchical Structure in Streams
http://www.cs.cornell.edu/home/kleinber/bhs.pdf)?

I'd rather not reimplement if there's already running code available

Thanks,

-s

Re: [R] glm-poisson fitting 400.000 records

2011-10-21 Thread D_Tomas

My apologies for my vague comment.

My data comprises 400,000 rows x 21 columns (17 explanatory variables, plus
the response variable, plus two offsets).

If I build the full model (linear terms only) I get:

Error: cannot allocate vector of size 112.3 Mb

I have a laptop with 4 GB of RAM. Would I get any improvement on an 8 GB computer?
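
A rough back-of-the-envelope check of that error message, under two
assumptions that the message itself does not confirm: that the failed
allocation was a single double-precision matrix with 400,000 rows, and that
"Mb" here means 1024^2 bytes.

```r
# Sketch only: how many numeric columns would a 112.3 Mb block hold
# for 400,000 rows? (Assumes 8-byte doubles and Mb = 1024^2 bytes.)
n_rows <- 400000
bytes  <- 112.3 * 1024^2
n_cols <- bytes / 8 / n_rows
n_cols   # roughly 37 columns
```

If that reading is right, the block is about the size of a model matrix in
which some of the 17 predictors are factors expanded into dummy columns. The
error reports only the one allocation that failed, on top of whatever was
already in use, so total memory demand is likely several times this figure.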

Many thanks, 
 



--
View this message in context: 
http://r.789695.n4.nabble.com/glm-poisson-fitting-400-000-records-tp3925100p3925968.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Find a particular point on a curve

2011-10-21 Thread Joanie

The most important thing is the point termed C (on the image):
http://r.789695.n4.nabble.com/file/n3926631/courbe_temp%C3%A9rature.png 

which is the first point (time, temperature) where temperature stabilizes
after the temperature drop (end of feeding). The definition of that
particular point is :

The point where temperature stabilizes within 1 sd (calculated from the
prefeeding temperature) over a minimum of 10 minutes.
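
The definition can be read in more than one way. The sketch below takes one
reading, stated as an assumption: point C is the first reading after the
temperature minimum such that every reading in the following 10 minutes stays
within 1 prefeeding sd of it. The function name, its arguments and the toy
series are all hypothetical.

```r
# Sketch only: one possible reading of "stabilizes within 1 sd for 10 minutes".
# find_stable_point() and the toy data are hypothetical.
find_stable_point <- function(time, temp, pre_idx, search_from, window = 10) {
  sd_pre <- sd(temp[pre_idx])            # sd of the prefeeding readings
  for (i in seq(search_from, length(time))) {
    in_win <- time >= time[i] & time <= time[i] + window
    # stop once the record no longer covers a full window
    if (max(time[in_win]) - time[i] < window) break
    if (all(abs(temp[in_win] - temp[i]) <= sd_pre)) return(i)
  }
  NA_integer_
}

# Toy series, one reading per minute: prefeeding, drop, recovery, plateau
time <- 0:59
temp <- c(37 + rep(c(0.1, -0.1), 5),     # minutes 0-9: prefeeding
          37 - 0.2 * (1:10),             # minutes 10-19: drop to 35
          35 + 0.2 * (1:10),             # minutes 20-29: recovery to 37
          rep(37, 30))                   # minutes 30-59: stable

idx <- find_stable_point(time, temp, pre_idx = 1:10,
                         search_from = which.min(temp))
c(time = time[idx], temperature = temp[idx])
```

Starting the search at the temperature minimum is a convenience for the toy
data; with real recordings the end of feeding may need to be supplied
explicitly.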

Can someone help me write the code for it?

Thank you very much,

Joanie

--
View this message in context: 
http://r.789695.n4.nabble.com/Find-a-particular-point-on-a-curve-tp3882721p3926631.html
Sent from the R help mailing list archive at Nabble.com.
