Re: [R] Help with color coded bar graph

2007-09-07 Thread Achim Zeileis
On Fri, 7 Sep 2007, Luis Naver wrote:

 I have a list of observations that are -1, 1 or 0.  I would like to
 represent them in a horizontal bar color coded based on value like a
 stacked bar graph. I can achieve this in the form of a png with the
 following code:

 A = floor(runif(10)*3) - 1

 png(width=100, height=10)
 par(mar=c(0,0,0,0))
 image(matrix(A), col=grey(c(0.1, 0.5, 0.9)))
 dev.off()

If I understand you correctly, you want a sequence of bars with equal 
height and colors coded by A (treated like a factor). So maybe something 
like
   cA <- grey.colors(3)[factor(A)]
   barplot(rep(1, length(A)), col = cA, border = cA)
or
   barplot(rep(1, length(A)), col = cA, border = cA, space = 0,
 xaxs = "i", axes = FALSE)
?
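Putting the pieces together, a self-contained sketch of this approach (reusing the simulated A from the original post) might look like:

```r
# simulate observations in {-1, 0, 1}, as in the original post
set.seed(1)
A <- floor(runif(10) * 3) - 1

# map each value to a grey level by treating A as a factor
cA <- grey.colors(3)[factor(A)]

# one bar of equal height per observation, no gaps, no axes
barplot(rep(1, length(A)), col = cA, border = cA, space = 0,
        xaxs = "i", axes = FALSE)
```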

hth,
Z


 However I would like to do this with one of the standard plotting
 tools (i.e. barplot) to take advantage of labels and multiple
 series.  Any help would be appreciated.

 - Luis Naver

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





Re: [R] question: randomization t-test function already defined in R?

2007-09-05 Thread Achim Zeileis
On Wed, 5 Sep 2007, [EMAIL PROTECTED] wrote:

 Dear R Users,

 I am hoping you can help me.

 I have received code from a colleague who uses Matlab.  I need to
 translate it into R.

 I am wondering if there is a randomization t-test (from non-parametric
 statistics) function already defined in R.
 (In Matlab the function is randtest.m.)

I don't know what randtest in MATLAB does exactly, but I guess that you 
might want to look at the function independence_test() in package coin. 
The distribution argument should probably be set to use either exact or 
approximate p values.
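For a quick check without extra packages, a randomization (permutation) t-test can also be sketched in base R; the function name and setup below are illustrative only, not a standard API:

```r
# permutation t-test: compare the observed t statistic against the
# distribution obtained by randomly reassigning the group labels
perm.t.test <- function(x, y, nperm = 1999) {
  obs <- t.test(x, y)$statistic
  pooled <- c(x, y)
  n <- length(x)
  perm <- replicate(nperm, {
    idx <- sample(length(pooled), n)
    t.test(pooled[idx], pooled[-idx])$statistic
  })
  # two-sided p value, counting the observed statistic itself
  mean(c(abs(perm) >= abs(obs), TRUE))
}

set.seed(123)
p <- perm.t.test(rnorm(20), rnorm(20, mean = 2))
p  # small p value for well-separated groups
```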

hth,
Z

 
 **
 QUESTION:  Is anyone aware of a randomization t-test function in R?
 
 **

 Thank you for your help,

 Karen

 ---

 Karen M. Green, Ph.D.

 Research Investigator

 Drug Design Group

 Sanofi Aventis Pharmaceuticals

 1580 E. Hanley Blvd.

 Tucson, AZ  85737-9525

  USA








Re: [R] piecewise linear approximation

2007-08-29 Thread Achim Zeileis
On Wed, 29 Aug 2007, Naxerova, Kamila wrote:

 Dear list,

 I have a series of data points which I want to approximate with exactly two
 linear functions. I would like to choose the intervals so that the total
 deviation from my fitted lines is minimal. How do I best do this?

From the information you give it seems that you want to partition a model 
like
   lm(y ~ x)
along a certain ordering of the observations. Without any further 
restrictions you can do that with the function breakpoints() in package 
strucchange. If there are continuity restrictions or something like 
that, you want to look at the segmented package.

hth,
Z

 Thanks!
 Kamila








Re: [R] Est of SE of coefficients from lm() function

2007-08-24 Thread Achim Zeileis
On Fri, 24 Aug 2007, Arun Kumar Saha wrote:

 Dear all R users,

 Can anyone tell me how I can get estimate of SE of coefficients from, lm()
 function?

The vcov() function extracts the estimated covariance matrix:
  fm <- lm(...)
  vcov(fm)
Thus, you can get the estimated standard errors via
  sqrt(diag(vcov(fm)))
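As a complete worked example with simulated data, the same standard errors can be read off summary():

```r
set.seed(42)
x <- 1:20
y <- 2 + 3 * x + rnorm(20)
fm <- lm(y ~ x)

se  <- sqrt(diag(vcov(fm)))                      # SEs via vcov()
se2 <- summary(fm)$coefficients[, "Std. Error"]  # SEs from summary()
all.equal(se, se2)                               # both routes agree
```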

hth,
Z

 I tried following :

 x = 1:10
 lm(x[-1]~x[-10]-1)$coefficients

 Here I got the est. of coefficient, however I also want to get some
 automated way to get estimate of SE.

 Regards,







Re: [R] Turning a logical vector into its indices without losing its length

2007-08-24 Thread Achim Zeileis
On Fri, 24 Aug 2007, Leeds, Mark (IED) wrote:

 I have the code below which gives me what I want for temp based on
 logvec but I was wondering if there was a shorter way ( i.e :
 a one liner ) without having to initialize temp to zeros.  This is
 purely for learning purposes. Thanks.

 logvec - c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

R> logvec[logvec] <- which(logvec)
R> logvec
[1] 1 0 0 4 0 0 7 0
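An equivalent one-liner that does not overwrite logvec in place is to multiply the positions by the logical mask (FALSE coerces to 0):

```r
logvec <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)

# positions where TRUE, zero elsewhere; the length is preserved
seq_along(logvec) * logvec
# [1] 1 0 0 4 0 0 7 0
```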

hth,
Z

 temp <- numeric(length(invec))
 temp[invec] <- which(invec)
 temp

 [1] 1 0 0 4 0 0 7 0

 obviously, the code below doesn't work.

 temp <- which(invec)
  temp
 [1] 1 4 7
 







Re: [R] Conversion of zoo object (with POSIXct-attributes) to dataframe

2007-08-23 Thread Achim Zeileis
On Thu, 23 Aug 2007 [EMAIL PROTECTED] wrote:

 i'm trying to convert a zoo object of the following specifications:

  str(z)
  atomic [1:15642] 0 0 0 0 0 0 0 0 0 0 ...
  - attr(*, "index")= 'POSIXct', format: chr [1:15642] "2004-09-01 02:00:00" 
 "2004-09-01 03:00:00" "2004-09-01 04:00:00" "2004-09-01 05:00:00" ...
  - attr(*, "frequency")= num 0.000278

(Side note 1: The frequency looks suspicious...you probably want a zoo
rather than a zooreg object.)

 to convert into a dataframe. the atomic object (or what it is...) is

(Side note 2: a vector for univariate zoo series, a matrix for
multivariate zoo series.)

 converted correctly, but the time attribute in the POSIXct format is
 replaced by an index number

 how can i get my data with the corresponding date/time as a second column?

Just add it afterwards, e.g.,
  z <- zoo(cbind(a = 1:5, b = 2:6), as.POSIXct(as.Date("2007-01-01") + 0:4))
  d <- as.data.frame(z)
  d$Date <- time(z)

(Side note 3: The reason for not including the date is that we wanted the
data to be as similar as possible in the zoo series and the data frame,
e.g., for
  lm(a ~ ., data = as.data.frame(z))
)

hth,
Z



Re: [R] Conversion of zoo object (with POSIXct-attributes) to dataframe

2007-08-23 Thread Achim Zeileis
On Thu, 23 Aug 2007, Marc Gilgen wrote:

 Thanks for the inputs, but it's not exactly what I want...

You did not provide sufficient information. In particular, you still do
not provide a small reproducible example.

 My problem is, that my timeseries (precipitation) contain lots of NA-values

 and when I apply the code:

 z <- data.frame(a = p1$time.date, precip = p1$m1)

Why do you store this as data frame and not as a zoo series? If you had a
zoo series, you could use...

 #gives a dataframe with precipitation and time (2004-01-01 02:12:00)
 z <- z[complete.cases(z), ]
 m1.h <- tapply(z$precip, as.POSIXct(trunc(z$a, 'day')), sum)

...the aggregate() method for zoo series here.

 #gives me the daily sum values of this serie (I need to redo this also for
 hourly values)
 m1.h <- as.data.frame(m1.h)

 to get daily summations, the NAs are out and the time series is not
 continuous, therefore I need the time/dates corresponding to the measurement
 back in the DATA-FRAME!

 Can I do this, or is the way with tapply/trunc wrong to get hourly/daily
 summation of my measurements (which are originally in 10 and 15 min periods,
 randomly)?

I don't understand why you want to keep going back and forth between zoo
and data.frame. I would just keep it in zoo and use the corresponding
methods. See ?aggregate.zoo for example.

hth,
Z

 really, really thanks a lot...

 Marc

 2007/8/23, Achim Zeileis [EMAIL PROTECTED]:
 
  On Thu, 23 Aug 2007 [EMAIL PROTECTED] wrote:
 
   i'm trying to convert a zoo object of the following specifications:
  
str(z)
atomic [1:15642] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, index)='POSIXct', format: chr [1:15642] 2004-09-01
  02:00:00 2004-09-01 03:00:00 2004-09-01 04:00:00 2004-09-01 05:00:00
  ...
- attr(*, frequency)= num 0.000278
 
  (Side note 1: The frequency looks suspicious...you probably want a zoo
  rather than a zooreg object.)
 
   to convert into a dataframe. the atomic object (or what it is...) is
 
  (Side note 2: a vector for univariate zoo series, a matrix for
  multivariate zoo series.)
 
   convertet correctly, but the time attribute in the POSIXct format is
   replaced by an index number
  
   how can i get my data with the corresponding date/time as a second
  column?
 
  Just add it afterwards, e.g.,
    z <- zoo(cbind(a = 1:5, b = 2:6), as.POSIXct(as.Date("2007-01-01") +
  0:4))
    d <- as.data.frame(z)
    d$Date <- time(z)
 
  (Side note 3: The reason for not including the date is that we wanted the
  data to be as similar as possible in the zoo series and the data frame,
  e.g., for
lm(a ~ ., data = as.data.frame(z))
  )
 
  hth,
  Z
 
 
 


 --
 ***
 Marc Gilgen
 Plattenstrasse 21
 8032 Zürich
 +41 78 686 06 49
 [EMAIL PROTECTED]




Re: [R] Subsetting zoo object with a vector of time values.

2007-08-21 Thread Achim Zeileis
On Tue, 21 Aug 2007, Todd Remund wrote:

 I have a zoo object for which I would like to subset using a vector of time
 values.  For example, I have the following time values represented in my zoo
 object.

 -50.000 -49.996 -49.995 -49.960 -49.956 -49.955 -49.920 -49.916 -49.915
 -49.880

 and would like to get observations corresponding to times

 -50 -49.96 -49.92 -49.88.

 What can I do without using the lapply or which functions?

Use the window() function:

z <- zoo(1:10, c(-50.000, -49.996, -49.995, -49.960, -49.956, -49.955,
 -49.920, -49.916, -49.915, -49.880))
window(z, c(-50, -49.96, -49.92, -49.88))

hth,
Z

 Thank you.

 Todd Remund






Re: [R] time series with quality codes

2007-08-16 Thread Achim Zeileis

On Thu, 16 Aug 2007, Felix Andrews wrote:


Dear list,

I am working with environmental time series (eg rainfall, stream flow)
that have attached quality codes for each data point. The quality
codes have just a few factor levels, like good, suspect, poor,
imputed. I use the quality codes in plots and summaries. They are
carried through when a time series is aggregated to a longer
time-step, according to rules like worst, median or mode.

I need to support time steps of anything from hours to years. I can
assume the data are regular time series -- they might be irregular
initially but could be 'regularized'. But I would want to plot
irregular time series along with regular ones.

So far I have been using a data frame with a POSIXct column, a numeric
column and a factor column. However I would like to use zoo instead,
because of its many utility functions and easy conversion to ts. Is
there any prospect of zoo handling such numeric + factor data? Other
suggestions on elegant ways to do it are also welcome.


There is some limited support for this in zoo. You can do
  z <- zoo(myfactor, myindex)
and work with it like a zoo series and then
  coredata(z)
will recover a factor. However, you cannot bind this to other series 
without losing the factor structure. At least not in a plain zoo series. 
But you can do

  df <- merge(z, Z, retclass = "data.frame")
where every column of the resulting data.frame is a univariate zoo series.

The final option would be to just have a data.frame as usual and put your 
data/index into one column. But then it's more difficult to leverage zoo's 
functionality.


I would like to have more support for things like this, but currently this 
is what we have.


Best,
Z


Felix

--
Felix Andrews / 安福立
PhD candidate
Integrated Catchment Assessment and Management Centre
The Fenner School of Environment and Society
The Australian National University (Building 48A), ACT 0200
Beijing Bag, Locked Bag 40, Kingston ACT 2604
http://www.neurofractal.org/felix/
xmpp:[EMAIL PROTECTED]
3358 543D AAC6 22C2 D336  80D9 360B 72DD 3E4C F5D8




Re: [R] Regression tree: labels in the terminal nodes

2007-08-16 Thread Achim Zeileis

On Thu, 16 Aug 2007, Juergen Kuehn wrote:


Dear everybody,


I'm a new user of R 2.4.1 and I'm searching for information on improving
the output of regression tree graphs.

In the terminal nodes I am up to now able to indicate the number of
values (n) and the mean of all values in this terminal node by the command

 text(tree, use.n = TRUE, xpd = TRUE)


Note that this is specific to the rpart implementation. There are also 
other recursive partitioning algorithms available, see e.g., the 
MachineLearning task view at

  http://CRAN.R-project.org/src/contrib/Views/MachineLearning.html

The ctree() algorithm in package party provides more flexibility 
concerning visualization. The plotting methods are all extensible although 
this is not extensively documented.


Maybe you also want to look at Simon Urbanek's interactive KLIMT software 
for tree visualization.


hth,
Z


Yet I would like to indicate automatically in the output graph of the
tree some quality measure, e.g. impurity (or standard deviation .)
and a character to identify which cases are in one terminal node, e.g.
given a ID number or name.

Until now I did not discover in my R help scripts or in the R programm
help how to do it. So I calculate impurity by hand, and I'm identifying
the cases in each node by hand, and I am adding these values with a
graphical software to the graph (as e.g. given *.jpg file). Therefore
anybody can see that I added these values by hand. I'm quite sure that
there is an easier and faster way which looks (and is!) more professional.

Could anybody help me? That would be great!


Thank you very much for your support!


Dr. Jürgen Kühn
Leibniz-Centre for Agricultural Landscape Research (ZALF)
Institute of Soil Landscape Research_
http://www.zalf.de/home_zalf/institute/blf/blf_e/index.html_
Eberswalder Str. 84
D-15374 Müncheberg
Tel.: ++49/(0)33432/82-123
Fax: ++49/(0)33432/82-280
E-mail: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
Internet: http://www.zalf.de/home_zalf/institute/blf/blf/




Re: [R] Error message when using zero-inflated count regression model in package zicounts

2007-08-16 Thread Achim Zeileis
On Thu, 16 Aug 2007, James R. Milks wrote:

 Dr. Stevens,

 I've double-checked my variable lengths.  All of my variables
 (Total.vines, Site, Species, and DBH) came in at 549.  I did correct
 one problem in the data entry that had escaped my previous notice:
 somehow the undergrad who entered all the data managed to make the
 Acer negundo data split into two separate categories while still
 appearing to use the same ACNE abbreviation.  When I made that
 correction and re-ran zicounts, R gave me the following error messages:

Hmm, I don't know about the error messages in zicounts, but you could try 
to use the zeroinfl() implementation in package pscl:
   vines.zip <- zeroinfl(Total.vines ~ Site + Species + DBH | Site +
 Species + DBH, data = sycamores.1)
and see whether this produces a similar error.
Z

  vines.zip <- zicounts(resp = Total.vines ~ ., x = ~ Site + Species + DBH,
 z = ~ Site + Species + DBH, distname = "ZIP", data = sycamores.1)

 Error in ifelse(y == 0, 1, y/mu) : dim<- : dims [product 12] do not
 match the length of object [549]
 In addition: Warning messages:
 1: longer object length is not a multiple of shorter object length
 in: eta + offset
 2: longer object length is not a multiple of shorter object length
 in: y/mu

 In addition, zicounts would not run a normal poisson regression on
 the data, giving me the same error messages as the ZIP regression.
 Doing a poisson regression with glm did not show any error messages.
 However, the glm model with full interactions was still over-dispersed.

 Could the zicounts problem be that the individual sites and species
 had different population sizes?  For instance, Site A had 149 trees,
 site B had 55 trees, site C had 270 trees, and site D had 75 trees.
 The species had similar discrepancies in population sizes, with
 Platanus occidentalis and Acer negundo forming the majority of the
 trees.

 Thanks for your help.

 Jim Milks

 Graduate Student
 Environmental Sciences Ph.D. Program
 136 Biological Sciences
 Wright State University
 3640 Colonel Glenn Hwy
 Dayton, OH 45435

 On Aug 15, 2007, at 5:58 AM, Martin Henry H. Stevens wrote:

 Hi Jim,
 With regard to same number, I simply wanted to make sure that each
 variable was the same length. The error message you show is what
 you would get if, for instance, you misspelled one of the variables
 and it doesn't exist, in which case it would be NULL, while your
 other variables would each be 550 elements in length.
 Hank
 On Aug 14, 2007, at 4:47 PM, James Milks wrote:

 Dr. Stevens,

 Unfortunately, Poisson gives me an over-dispersed model with only
 3 out of 14 variables/interactions significant.  Doing a step-wise
 poisson regression still ended up with the same over-dispersed
 model.  Given the high number of zeros in the response variable,
 Dr. Thad Tarpey (one of our statisticians on campus) suggested
 zero-inflated poisson regression as a possible solution to the
 over-dispersion problem.

 As for variables of the same length, there are different numbers
 of trees for each species and site since we ran different numbers
 of transects at each site (some sites were larger than others) and
 there were different numbers of species and trees within each
 transect.  Acer negundo made up ~33% of all the trees we measured;
 Platanus occidentalis had 25%; Fraxinus americana was another 12%
 and ~11% was Ulmus americana.  The remaining 25% was divided among
 16 other species, all of which were excluded from the analysis due
 to singularities in the model when they were included (something
 about glm not liking singularities in the model).  So if the
 zicounts requires that each species and site have the same length,
 then I will not be able to use it unless I can get R to randomly
 select x trees from each species and site combination.

 Thanks for your input.

 Jim Milks

 Graduate Student
 Environmental Sciences Ph.D. Program
 136 Biological Sciences
 Wright State University
 3640 Colonel Glenn Hwy
 Dayton, OH 45435


 On Aug 14, 2007, at 9:37 AM, Hank Stevens wrote:

 Hi Jim,
 Two thoughts come to me, unencumbered by the thought process or
 knowledge of zicounts:
 1. Is Poisson really NOT appropriate? (do you have to use zicounts?)
 2. Are you 110% certain that all variables are the same length?
 Would NA's interfere?
 Cheers,
 Hank
 On Aug 13, 2007, at 5:10 PM, James Milks wrote:

 I have data on number of vines per tree for ~550 trees.  Over
 half of
 the trees did not have any vines and the data is fairly skewed
 (median = 0, mean = 1.158, 3rd qu. = 1.000).  I am attempting to
 investigate whether plot location (four sites), species (I'm using
 only the four most common species), or tree dbh has a significant
 influence on the number of vines per tree.  When I attempted to use
 the zicounts function, R gave me the following error message:

 vines.zip <- zicounts(resp = Total.vines ~ ., x = ~ Site + Species + DBH,
 z = ~ Site + Species + DBH, distrname = "ZIP", data = sycamores.1)
 Error in ifelse(y == 0, 1, y/mu) : dim<- : dims [product 12] do not
 

Re: [R] cannot add lines to plot

2007-08-07 Thread Achim Zeileis
On Tue, 7 Aug 2007, Zeno Adams wrote:


 Hello,

 I want to plot a time series and add lines to the plot later on.
 However, this seems to work only as long as I plot the series against
 the default index. As soon as I plot against an object
 of class chron or POSIXt (i.e. I want to add a date/time axis), the
 lines do not appear anymore. The command to add the lines is executed
 without an error message.

1. Read the posting guide!
2. Provide a reproducible example (as suggested in the posting guide).
3. Use a time series object for your time series, see e.g. package zoo.
   If you set up a multivariate time series, you can get what you want by
  plot(z, plot.type = "single")
   See the zoo vignettes for more information.
4. If you really want to modify the code below, you probably need to
   repeat (the right elements of) datum2 in lines(), e.g.,
 lines(datum2[...], gvarindus, lwd = 2)
   and you should really drop unnecessary brackets as in
  ylab = ("Return")
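To illustrate point 4 with self-contained data: when the first plot() call puts dates on the x axis, later lines() calls must supply matching x values, otherwise the lines land outside the visible x range. A minimal sketch (all object names here are invented):

```r
dates <- as.Date("2007-01-01") + 0:99
y <- cumsum(rnorm(100))
y.smooth <- filter(y, rep(1/5, 5))  # a second series to overlay

plot(dates, y, type = "l", xlab = "", ylab = "Return")
lines(dates, y.smooth, lwd = 2)     # x values must match the date axis
```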

hth,
Z

 (THIS DOES NOT ADD THE LINES)
 plot(datum2[(3653):(3653+i)], dlindus[(3653):(3653+i)], col
 = hcl(h = 60, c = 35, l = 60), ylim = c(-8, 8), type = "l", xlab = (""),
 ylab = ("Return"), main = ("Industry"))
 lines(gvarindus, type = "l", lwd = 2)
 lines(quantindustlow, col = "black", type = "l", lty = 3)
 lines(quantindusthigh, col = "black", type = "l", lty = 3)

 (THIS ADDS THE LINES, but then I don't have a date axis)
 plot(dlindus[(3653):(3653+i)], col = hcl(h = 60, c = 35, l = 60),
 ylim = c(-8, 8), type = "l", xlab = (""), ylab = ("Return"),
 main = ("Industry"))
 lines(gvarindus, type = "l", lwd = 2)
 lines(quantindustlow, col = "black", type = "l", lty = 3)
 lines(quantindusthigh, col = "black", type = "l", lty = 3)

 This sounds like a fairly simple problem, but I cannot find any answer
 in the R-help archives.

 Thanks a lot.

 Zeno






Re: [R] Functions for autoregressive Regressionmodels (Mix between times series and Regression Models) ?

2007-08-07 Thread Achim Zeileis
On Tue, 7 Aug 2007, Gabor Grothendieck wrote:

 arima (see ?arima) supports explanatory variables in the xreg= argument.

In addition to arima(), the function dynlm() in package dynlm might be
useful if you want to fit the model by OLS. See also package dyn for a
similar approach.
Z
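Absent those packages, such a model can also be fitted by OLS directly with lm() on a hand-lagged response; this is only a sketch with simulated data, not the dynlm() interface:

```r
# AR(1) with one exogenous regressor, fitted by OLS:
# y[t] = phi * y[t-1] + beta * x[t] + u[t]
set.seed(1)
n <- 200
x <- rnorm(n)
e <- rnorm(n, sd = 0.1)
y <- numeric(n)
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + 1.0 * x[t] + e[t]

# regress y on its own lag plus the regressor
fm <- lm(y[-1] ~ y[-n] + x[-1])
coef(fm)  # (intercept, phi, beta) estimates, near (0, 0.5, 1)
```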

 On 8/7/07, Maja Schröter [EMAIL PROTECTED] wrote:
  Hello everybody,
 
  I've a question about autoregressive Regressionmodels.
 
  Let Y[1],.,Y[n], be a time series.
 
  Given the model:
 
  Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + x_t^T*beta + 
  u_t,
 
 
  where x_t = (x[1t], x[2t], ..., x[mt]) and beta = (beta[1], ..., beta[m]) and 
  u_t ~ (0,1)
 
  I want to estimate the coefficients phi and beta.
 
  Are there any functions or packages in R for autoregressive regression 
  models with special summaries? I don't mean the function ar.
 
 
  Example: I have the data
 
  working.time  <- rnorm(100)   # Y
  vacation      <- rnorm(100)   # x1
  bank.holidays <- rnorm(100)   # x2
  illnes        <- rnorm(100)   # x3
  education     <- rnorm(100)   # x4
 
 
 
  Now I want to analyse:
 
   Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + 
  beta1*vacation + beta2*bank.holidays + beta3*illnes + beta4*education + 
  u_t
 
 
 
  Has anyone an idea?
 
  I would be more than glad if so.
 
  Thank you VERY much in advance.
 
  Kindly regards from the Eastern Part of Berlin,
 
  Maja
 
  --
 
 






Re: [R] - round() strange behaviour

2007-08-02 Thread Achim Zeileis
On Thu, 2 Aug 2007, Monica Pisica wrote:


 Hi,

 I am getting some strange results using round() - it seems that the result 
 depends on whether the digit before the decimal point is odd or even.

 For example:

 > round(1.5)
 [1] 2
 > round(2.5)
 [1] 2

 While I would expect round(2.5) to be 3 and not 2.

 Do you have any explanation for that?

Yes: you obviously did not read the man page! Please do.
Z
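For the record, the man page (?round) explains the behavior: R follows the IEC 60559 standard and rounds exact halves to the nearest even digit ("banker's rounding"):

```r
# round() sends exact halves to the nearest even integer
round(0.5)  # 0
round(1.5)  # 2
round(2.5)  # 2
round(3.5)  # 4
```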

 I really appreciate your input,

 Monica









Re: [R] zeroinfl() or zicounts() error

2007-07-26 Thread Achim Zeileis
On Thu, 26 Jul 2007, Rachel Davidson wrote:

 I'm trying to fit a zero-inflated poisson model using zeroinfl() from the
 pscl library. It works fine for most models I try, but when I include either
 of 2 covariates, I get an error.

 When I include PopulationDensity, I get this error: Error in solve.default
 (as.matrix(fit$hessian)) :system is computationally singular:
 reciprocal condition number = 1.91306e-34

 When I include BuildingArea, I get this error: Error in optim(fn =
 loglikfun, par = c(start$count, start$zero, if (dist ==  :
 non-finite finite-difference value [2]

Might be due to some close-to-linear dependencies in your regressor
matrix...

  I tried fitting the models using zicounts in the zicounts library as well
 and had the same difficulty.

If I recall correctly, zicounts() uses a very similar type of
optimization compared to zeroinfl(), hence the similar problems.

  When I include PopulationDensity, it runs, but outputs only the parameter
 estimates, not the standard errors or p-values (those have NaN).

This is due to the same problem as above for zeroinfl(), the Hessian
matrix is (close to) singular.

 When I include BuildingArea, I get this error: Error in
 solve.default(z0$hessian)
 : system is computationally singular: reciprocal condition number =
 2.58349e-25

 Can anyone suggest what it is about these 2 covariates that might be causing
 the problem? I don't see any obvious problems with them. They are both
 nonnegative with smooth probability distributions and no missing (NA)
 values. The dataset has 3211 observations. It doesn't matter if there are
 other covariates in the models or not. If one of these is included, I get
 the errors.

Even if you include just one covariate and nothing else?
  zeroinfl(y ~ PopulationDensity, data = ...)

Z



Re: [R] aggregate.ts

2007-07-26 Thread Achim Zeileis
Jeff,

I'm really not a fan of subjective "mine is bigger than yours"
discussions. Just three comments that I try to keep as objective as
possible.

 Bottom line: use 'tis' series from the fame package, or 'zoo` stuff from
 Gabor's zoo package.

The last time I checked
  packageDescription("zoo")$Author
had more than one entry.

 As the author of the fame package, I hope you'll excuse
 me for asserting that the 'tis' class is easier to understand and use than the
 zoo stuff,

That surely depends on the user and the task he has to do...

 which takes a more general approach.  Some day Gabor or I or some
 other enterprising soul should try combining the best ideas from zoo and fame
 into a package that is better than either one.

I think combination should be straightforward: zoo is general enough to
allow for time indexes of class ti. Overall, ti seems to be
well-written and only some methods might need to be added/improved to
cooperate fully with zoo. Maybe some of the functionality that is
currently available for tis but is not available for all conceivable
zoo+arbitrary_index objects might be special-cased for zoo+ti or zooreg or
zooreg+ti etc.

Best,
Z



Re: [R] aggregate.ts

2007-07-24 Thread Achim Zeileis
On Wed, 25 Jul 2007, laimonis wrote:

 Consider the following scrap of code:

...slightly modified to
  x1 <- ts(1:24, start = c(2000, 10), freq = 12)
  x2 <- ts(1:24, start = c(2000, 11), freq = 12)

and then
  y1 <- aggregate(x1, nfreq = 4)
gives the desired result while
  y2 <- aggregate(x2, nfreq = 4)
probably does not what you would like it to do. In both cases, the 24
observations are broken into 8 segments of equal length (as documented on
?aggregate.ts) and then aggregated. Therefore
  as.vector(y1)
  as.vector(y2)
are identical (and not matched by quarter...as you would probably like).
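To see this concretely, a minimal runnable sketch (reusing x1 and x2 from above):

```r
# aggregate.ts() splits the 24 observations into 8 consecutive blocks of 3,
# ignoring calendar quarters, so both series yield the same block sums
x1 <- ts(1:24, start = c(2000, 10), freq = 12)
x2 <- ts(1:24, start = c(2000, 11), freq = 12)
y1 <- aggregate(x1, nfreq = 4)
y2 <- aggregate(x2, nfreq = 4)
identical(as.vector(y1), as.vector(y2))  # TRUE
```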

 And don't tell me that the aggregating a monthly series into quarters
 makes no sense!! (see response to Bug 9798).

1. Your tone is not appropriate.
2. You're not quoting the reply correctly. It said: "You cannot aggregate
   a time series that does not run over quarters into quarters. The
   speculation is plain wrong."
   The reply means that aggregate.ts() does not do what you think
   it does. As I tried to illustrate with the example above.

One can probably argue about whether it makes sense to aggregate a monthly
time series into quarter when I don't have complete observations in each
quarter. But maybe it might be worth considering a change in
aggregate.ts() so that the old and new frequency are matched even with
incomplete observations?

Currently, the zoo implementation allows this: Coercing back and forth
gives:
  library(zoo)
  z1 <- as.ts(aggregate(as.zoo(x1), as.yearqtr, sum))
  z2 <- as.ts(aggregate(as.zoo(x2), as.yearqtr, sum))
where z1 is identical to y1, and z2 is what you probably want.

hth,
Z



Re: [R] Help with Dates

2007-07-20 Thread Achim Zeileis
Alex:

 I am taking an excel dataset and reading it into R using read.table.

This sets up a data.frame object. The data you have are probably more
conveniently represented as a time series, storing the date in an
appropriate format, e.g., in class Date.

 (actually I am dumping the data into a .txt file first and then reading data
 in to R).

Then you can do both steps (calling read.table() and transformation to a
time series) in one go using the function read.zoo() from package zoo.

If your text file looks like

 Date  Price Open.Int. Comm.Long Comm.Short net.comm
15-Jan-86 673.25175645 65910  2842537485
31-Jan-86 677.00167350 54060  2712026940
14-Feb-86 680.25157985 37955  2542512530
28-Feb-86 691.75162775 49760  1603033730
14-Mar-86 706.50163495 54120  2799526125
31-Mar-86 709.75164120 54715  3039024325

then you can read it in via
  z <- read.zoo("mydata.txt", format = "%d-%b-%y", header = TRUE)

Then you can do all sorts of standard things for time series, such as
  plot(z)
or...

 The dataset runs from 1986 to 2007.

 I want to be able to take subsets of my data based on date e.g. data between
 2000 - 2005.

...subsetting

  z2 <- window(z, start = as.Date("2000-01-01"), end = as.Date("2005-12-31"))

etc. Look at the zoo package vignettes for more information
  vignette("zoo-quickref", package = "zoo")
  vignette("zoo", package = "zoo")
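Staying with the plain data.frame route the poster started from, the multiple-year subset can also be done with %in% (a sketch; df and Yr are the poster's own objects from the quoted message below):

```r
# A single value can never equal both "00" and "01", which is why an
# attempted Yr == "00" & Yr == "01" returns no rows; %in% tests set
# membership instead
head(subset(df, Yr %in% c("00", "01", "02", "03", "04", "05")))
```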

hth,
Z

 As it stands, I can't work with the dates as they are not in correct format.

 I tried successfully converting the dates to just the year using:

 transform(data, Yr = format(as.Date(as.character(Date), format = '%d-%b-%y'),
 "%y"))

 This gives the following format:

Date  Price Open.Int. Comm.Long Comm.Short net.comm Yr
 1 15-Jan-86 673.25175645 65910  2842537485 86
 2 31-Jan-86 677.00167350 54060  2712026940 86
 3 14-Feb-86 680.25157985 37955  2542512530 86
 4 28-Feb-86 691.75162775 49760  1603033730 86
 5 14-Mar-86 706.50163495 54120  2799526125 86
 6 31-Mar-86 709.75164120 54715  3039024325 86

 I can subset for a single year e.g:

 head(subset(df, Yr == "00"))

 But how can I subset for multiple periods e.g 00- 05? The following won't
 work:

 head(subset(df, Yr == "00" & Yr == "01"))

 or

 head(subset(df, Yr = c("00", "01", "02", "03")))

 I can't help but feeling that I am missing something and there is a simpler
 route.

 I leafed through R newletter 4.1 which deals with dates and times but it
 seemed that strptime and POSIXct / POSIXlt are not what I need either.

 Can anybody help me?

 Regards


 Alex






Re: [R] ?R: Removing white space betwen multiple plots, traditional graphics

2007-07-20 Thread Achim Zeileis
On Fri, 20 Jul 2007, Mr Natural wrote:


 I would appreciate suggestions for removing the white spaces the graphs in a
 stack:

 par(mar=c(2,2,1,1), mfrow = c(6,1))
 mydates <- dates(1:20, origin = c(month = 1, day = 1, year = 1986))
 plot(rnorm(20,0.1,0.1)~mydates, type="b", xlab="", ylim=c(0,1), xaxt = "n")
 plot(rnorm(20,0.2,0.1)~mydates, type="b", xlab="", ylim=c(0,1), xaxt = "n")
 plot(rnorm(20,0.3,0.1)~mydates, type="b", xlab="", ylim=c(0,1), xaxt = "n")
 plot(rnorm(20,0.5,0.1)~mydates, type="b", xlab="", ylim=c(0,1), xaxt = "n")
 plot(rnorm(20,0.7,0.1)~mydates, type="b", xlab="", ylim=c(0,1), xaxt = "n")
 plot(rnorm(20,0.8,0.1)~mydates, type="b", xlab="", ylim=c(0,1))

Really, there is no need to do this all by hand. Please have a look at the
zoo package (as recommended earlier this week).

Then you can do
   x <- cbind(rnorm(20,0.1,0.1),
  rnorm(20,0.2,0.1),
  rnorm(20,0.3,0.1),
  rnorm(20,0.5,0.1),
  rnorm(20,0.7,0.1),
  rnorm(20,0.8,0.1))
   z <- zoo(x, mydates)
   plot(z)
   plot(z, nc = 1, type = "b")
and much more.

Please consult
  vignette("zoo-quickref", package = "zoo")
  vignette("zoo", package = "zoo")
as well as the Date and Time Classes article by Gabor Grothendieck and
Thomas Petzoldt in R News 4(1), 29-32. (available from your favourite CRAN
mirror).
Z

 Thanx, Don
 --
 View this message in context: 
 http://www.nabble.com/-R%3A--Removing-white-space-betwen-multiple-plots%2C-traditional-graphics-tf4119626.html#a11716176
 Sent from the R help mailing list archive at Nabble.com.






Re: [R] dates() is a great date function in R

2007-07-18 Thread Achim Zeileis
...just a follow up to reading time series data from CSV files. If you've
got data like Gavin's (only with the dates in the first column)

  Date,Data
  01/01/2007,1
  02/01/2007,2
  03/01/2007,3
  04/01/2007,4
  ...

then you can use read.zoo() in package zoo:
  x <- read.zoo("mydata.csv", sep = ",", format = "%d/%m/%Y", header = TRUE)
  plot(x)
which produces the time-series plot.

This uses the Date class contained in base R rather than dates from
chron. Concerning the different time/date classes, see the R News article
Gabor already mentioned. For some more examples of using zoo/read.zoo see
  vignette("zoo-quickref", package = "zoo")

hth,
Z


On Wed, 18 Jul 2007, Gavin Simpson wrote:

 On Wed, 2007-07-18 at 12:14 -0700, Mr Natural wrote:
  Proper calendar dates in R are great for plotting and calculating.
  However for the non-wonks among us, they can be very frustrating.
  I have recently discussed the pains that people in my lab have had
  with dates in R. Especially the frustration of bringing date data into R
  from Excel, which we have to do a lot.

 I've always found the following reasonably intuitive:

 Given the csv file that I've pasted in below, the following reads the
 csv file in, formats the dates and class Date and then draws a plot.

 I have dates in DD/MM/YYYY format so year is not first - thus attesting
 to R not hating dates in this format ;-)

 ## read in csv data
 ## as.is = TRUE stops characters being converted to factors
 ## thus saving us an extra step to convert them back
 dat <- read.csv("date_data.csv", as.is = TRUE)

 ## we convert to class Date
 ## format tells R how the dates are formatted in our character strings
 ## see ?strftime for the meaning and available codes
 dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")

 ## check this worked ok
 str(dat$Date)
 dat$Date

 ## see nicely formatted dates and not a drop of R-related hatred
 ## but just about the most boring graph I could come up with
 plot(Data ~ Date, dat, type = "l")

 And you can keep your Excel file formatted as dates as well - bonus!

 Oh, and before you get Martin'd, it is the chron *package*!

 HTH

 G

 CSV file I used, generated in OpenOffice.org, but I presume it stores
 Dates in the same way as Excel?:

 Data,Date
 1,01/01/2007
 2,02/01/2007
 3,03/01/2007
 4,04/01/2007
 5,05/01/2007
 6,06/01/2007
 7,07/01/2007
 8,08/01/2007
 9,09/01/2007
 10,10/01/2007
 11,11/01/2007
 10,12/01/2007
 9,13/01/2007
 8,14/01/2007
 7,15/01/2007
 6,16/01/2007
 5,17/01/2007
 4,18/01/2007
 3,19/01/2007
 2,20/01/2007
 1,21/01/2007
 1,22/01/2007
 2,23/01/2007
 3,24/01/2007

  Please find below a simple analgesic for R date importation that I
  discovered
  over the last 1.5 days (Learning new stuff in R is calculated in 1/2 days).
 
  The function dates() gives the simplest way to get calendar dates into
  R from Excel that I can find.
  But straight importation of Excel dates, via a csv or txt file, can be a a
  huge pain (I'll give details for anyone who cares to know).
 
  My pain killer is:
  Consider that you have Excel columns in month, day, year format. Note that R
  hates date data that does not lead with the year.
 
  a. Load the chron library by typing   library(chron)   in the console.
  You know that you need this library from information revealed by
  performing the query,
  ?dates() in the Console window. This gives the R documentation
  help file for this and related time, date functions.  In the upper left
  of the documentation, one sees dates(chron). This tells you that you
  need the library chron.
 
  b. Change the format of dates in Excel to format "general", which gives
  5 digit Julian dates. Import the csv file (I use read.csv()) with the
  Julian dates and other data of interest.
 
  c.  Now, change the Julian dates that came in with the csv file into
  calendar dates with thedates() function. Below is my code for performing
  this activity, concerning an R data file called ss,
 
  ss holds the Julian dates, illustrated below from the column MPdate,
 
  ss$MPdate[1:5]
  [1] 34252 34425 34547 34759 34773
 
  The dates() function makes calendar dates from Julian dates,
 
  dmp <- dates(ss$MPdate, origin = c(month = 1, day = 1, year = 1900))
 
   dmp[1:5]
  [1] 10/12/93 04/03/94 08/03/94 03/03/95 03/17/95
 
  I would appreciate the comments of more sophisticated programmers who
  can suggest streamlining or shortcutting this operation.
 
  regards, Don
 
 
 
 
 --
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] Time Series Data

2007-07-16 Thread Achim Zeileis
On Mon, 16 Jul 2007, livia wrote:


 Hi all, I have got some time series data. Data[[1]] is the data in the format
 1975-12-05 1975-12-12 1975-12-19..., data[[2]] is the time series data. I
 would like to generate the time series format as
 1975-12-05  1.5
 1975-12-12  2.3etc.

 I am thinking about cbind(data[[1]],data[[2]]), but it results in
      [,1] [,2]
 [1,]    1  1.5
 [2,]    2  2.3

 Could anyone give me some advice?

Have a look at the zoo package and its vignettes:
  vignette("zoo-quickref", package = "zoo")
  vignette("zoo", package = "zoo")

hth,
Z

 --
 View this message in context: 
 http://www.nabble.com/Time-Series-Data-tf4088688.html#a11621776
 Sent from the R help mailing list archive at Nabble.com.






Re: [R] Changing default library

2007-07-11 Thread Achim Zeileis
Corinna:

 I got a question concerning .libPaths(). If I do the command
 .libPaths() then the result is "/usr/lib/R/library". This is the default
 folder. I now want to change this one into "/home/csc/usr/lib/R/library".
 I thought it would work with the command
 .libPaths("/home/csc/usr/lib/R/library"). When I then do the command
 .libPaths() the result is: "/usr/lib/R/library"
 "/home/csc/usr/lib/R/library". But if I start R the next time the result
 of the command .libPaths() is again just "/usr/lib/R/library".

 Can anyone help me?

You should set the R_LIBS environment variable (that the man page of
.libPaths points you to). See also FAQ 5.2.

hth,
Z



Re: [R] RWeka control parameters classifiers interface

2007-07-11 Thread Achim Zeileis
On Wed, 11 Jul 2007 [EMAIL PROTECTED] wrote:

   The problem is, that the functions
   result=classifier(formula, data, subset, na.action, control = 
 Weka_control(mycontrol))
   do not seem to be manipulated by the mycontrol- arguments

Yes, they are... but not all parameter changes always have an effect on the
specified learner.

   Perhaps this should be resepected via the handlers- argument ,
   but the documentation in this regard is rather sparse.

Handlers are not needed here.

Re: sparse docs. In case you have not seen that paper already, there is a
technical report on the ideas behind RWeka:
  http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_ba6

Re: SMO. Compare

m1 <- SMO(Species ~ ., data = iris)
m2 <- SMO(Species ~ ., data = iris, control = Weka_control(
  K = "weka.classifiers.functions.supportVector.RBFKernel"))

which yield different results so the Weka_control() works.

The same happens if you register the mySMO() interface yourself. I'm not
sure why the E = ... argument has no influence on the SMO, please check
the Weka docs for this particular learner.

Best,
Z



Re: [R] Wilcoxon Rank Sum Test.

2007-06-28 Thread Achim Zeileis
On Thu, 28 Jun 2007 14:18:52 -0700 Nordlund, Dan (DSHS/RDA) wrote:

 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Marcus
  Vinicius Sent: Thursday, June 28, 2007 1:32 PM
  To: r-help@stat.math.ethz.ch
  Subject: [R] Wilcoxon Rank Sum Test.
  
  Dear,
  
  I'm using R software to evaluate Wilcoxon Rank Sum Test and 
  I' getting one
  Warning message as this:
  
   C1dea_com
   [1] 1.000 0.345 0.200 0.208 0.508 0.480 0.545 0.563 0.451 
  0.683 0.380 0.913
  1.000 0.506
   C1dea_sem
   [1] 1.000 0.665 0.284 0.394 0.509 0.721 0.545 0.898 0.744 
  0.683 0.382 0.913
  1.000 0.970
  
  
    wilcox.test(C1dea_sem, C1dea_com, paired = TRUE, alternative
   = "two.sided")
  
  Wilcoxon signed rank test with continuity correction
  
  data:  C1dea_sem and C1dea_com
  V = 45, p-value = 0.009152
  alternative hypothesis: true mu is not equal to 0
  
  Warning message:
  Cannot compute exact p-value with zeroes in: 
  wilcox.test.default(C1dea_sem,
   C1dea_com, paired = TRUE, alternative = "two.sided")
  
  What is happening?
  
  Best Regards,
  
  Marcus Vinicius
  
 
 Marcus,
 
 It means that you have one or more pairs of observations (5 in your
 case) where the difference is 0.  The wilcox.test can only compute an
 approximate p-value under these circumstances.

...while wilcox.exact() from package exactRankTests can evaluate the
permutation distribution correctly.
Z
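For completeness, a sketch using the poster's data (assuming the exactRankTests package is installed):

```r
library(exactRankTests)
C1dea_com <- c(1.000, 0.345, 0.200, 0.208, 0.508, 0.480, 0.545,
               0.563, 0.451, 0.683, 0.380, 0.913, 1.000, 0.506)
C1dea_sem <- c(1.000, 0.665, 0.284, 0.394, 0.509, 0.721, 0.545,
               0.898, 0.744, 0.683, 0.382, 0.913, 1.000, 0.970)
# unlike wilcox.test(), wilcox.exact() evaluates the exact permutation
# distribution even in the presence of zero differences
wilcox.exact(C1dea_sem, C1dea_com, paired = TRUE, alternative = "two.sided")
```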

 Hope this is helpful,
 
 Dan
 
 Daniel J. Nordlund
 Research and Data Analysis
 Washington State Department of Social and Health Services
 Olympia, WA  98504-5204
 




Re: [R] Sweave bug? when writing figures / deleting variable in chunk

2007-06-27 Thread Achim Zeileis
On Wed, 27 Jun 2007 21:06:04 -0400 D G Rossiter wrote:

 I have found a quite strange (to me) behaviour in Sweave. It only  
 occurs in the following situation:
 
 1. define a variable in one chunk
 2. use it within a subsequent figure-generating chunk
 3. delete it at the end of that same chunk
 Then the Sweave driver chokes, not finding the variable name when  
 generating the figure

Sweave() executes figure chunks more than once because they might
also create printed output. If you create both EPS and PDF graphics, it
executes them three times (print + EPS + PDF). Hence, data
manipulations should be avoided in figure chunks.

I was also once bitten by this when I permuted columns of a table prior
to plotting and never obtained the desired order...

Fritz, maybe this is worth an entry in the FAQ?
Z

 Example:
 
 % document bug2.Rnw
 \documentclass{article}
 \usepackage{Sweave}
 \begin{document}
 \SweaveOpts{eps=false}
 <<>>=
 sel <- 1:5
 @
 <<fig=T>>=
 plot(trees[sel,])
 rm(sel)
 @
 \end{document}
 
 Try to sweave:
 
    Sweave("bug2.Rnw")
 Writing to file bug2.tex
 Processing code chunks ...
 1 : echo term verbatim
 2 : echo term verbatim pdf
 Error: no function to return from, jumping to top level
 Error in plot(trees[sel, ]) : error in evaluating the argument 'x'
 in selecting a method for function 'plot'
 Error in driver$runcode(drobj, chunk, chunkopts) :
   Error in plot(trees[sel, ]) : error in evaluating the
 argument 'x' in selecting a method for function 'plot'
 
 The generated .tex is complete up through the rm() but no figure is  
 generated. The file bug2-002.pdf is incomplete (corrupt).
 
 ...
 \begin{Schunk}
 \begin{Sinput}
   plot(trees[sel, ])
   rm(sel)
 \end{Sinput}
 \end{Schunk}
 
 The following ALL eliminate the problem:
 
 0. Executing the code directly, also with ESS
 1. fig=F
 2. moving rm(sel) to a separate, later code chunk
 3. Stangle the source and then source it
 4. don't use a variable, i.e. in this case:  plot(trees[1:5,])
 
 It seems that Sweave is executing the rm(sel)  before it uses it in  
 the trees[sel,].
 
 Technical details: R 2.5.0, Mac OS X 10.4.10, PPC
 Same behaviour in stand-alone R for Mac and for R within Aquamacs  
 using ESS
 
 Workaround: I am putting any deletions into a later code chunk. This  
 only has the disadvantage of making more chunks, so now that I know  
 what's happening it's no big deal. But it's driving me crazy... am I  
 missing something? Thanks!
 
 D. G. Rossiter
 Senior University Lecturer
 Department of Earth Systems Analysis
 International Institute for Geo-Information Science and Earth  
 Observation (ITC)
 Hengelosestraat 99
 PO Box 6, 7500 AA Enschede, The Netherlands
 Phone:+31-(0)53 4874 499
 Fax:  +31-(0)53 4874 336
 mailto:rossiter--at--itc.nl,  Internet: http://www.itc.nl/personal/ 
 rossiter
 




Re: [R] aggregating daily values

2007-06-26 Thread Achim Zeileis
On Tue, 26 Jun 2007, antonio rodriguez wrote:

 Hi,

 I swear I have read almost all the posted messages about this issue, but
 it's evident I couldn't find an answer (surely esay) to my problem. What
 I want is the following:

 Make  8 days aggregates from a daily series like this (dput output):

I'm not sure which days you want to aggregate exactly. If you want to
replace the observations from 1985-01-02 until 1985-01-09 by a single
observation, maybe you want something like
  new.time <- as.Date(8 * floor(as.numeric(time(z))/8) + 7)
  z2 <- aggregate(z, new.time, mean)
which gives averages for 8 days (anchored on the last day of the 8-day
period).

Does this what you're looking for? Look at the zoo vignettes for more
information/examples.
Z



Re: [R] Imputing missing values in time series

2007-06-25 Thread Achim Zeileis
On Fri, 22 Jun 2007, Horace Tso wrote:

 Thanks to Mark and Erik for different versions of locf, also Erik's
 pointer to archive where I found another function due to Simon Fear. I
 haven't tested the zoo locf function.

Just as an addition to what Marc already wrote:

zoo offers at least two advantages here:
  - it does not break if the first element is NA
  - if available it incorporates the corresponding time stamps

When we wrote na.locf(), Gabor also tried to optimize it for speed. I
haven't compared it to the solutions suggested here but would be surprised
if any of them were substantially faster.

As for your time series, I would recommend that you hold it as a zoo
series with Date time stamps. See
  vignette("zoo-quickref", package = "zoo")
  vignette("zoo", package = "zoo")
for some examples of zoo series in general and na.locf() in particular.
Z



Re: [R] Averaging dates?

2007-06-20 Thread Achim Zeileis
On Wed, 20 Jun 2007, Sérgio Nunes wrote:

 Hi,

 What's the best way to average dates?
 I though mean.POISXct would work fine but...

  a
 [1] "2007-04-02 19:22:00 WEST"
  b
 [1] "2007-03-17 16:23:00 WET"
  class(a)
 [1] "POSIXt"  "POSIXct"
  class(b)
 [1] "POSIXt"  "POSIXct"
  mean(a,b)
 [1] "2007-04-02 19:22:00 WEST"
  mean(b,a)
 [1] "2007-03-17 16:23:00 WET"

Would you usually call mean() in this way?
  mean(1, 2)
  mean(2, 1)

Probably not...

mean() expects a vector, try mean(c(a, b))!
Z
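A minimal sketch of the difference (timestamps taken from the thread; the time zone is illustrative):

```r
a <- as.POSIXct("2007-04-02 19:22:00", tz = "UTC")
b <- as.POSIXct("2007-03-17 16:23:00", tz = "UTC")
mean(c(a, b))   # midpoint of the two timestamps
mean(a, b)      # not an average: only the first argument is used, as seen above
```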

 ?!

 Thanks in advance for any advice,
 --
 Sérgio Nunes






Re: [R] viewing source code

2007-06-18 Thread Achim Zeileis
Werner:

 could somebody give me a quick hint how to view the
 source code of a function if sole entering of the
 function name does not work?

 In particular, I am trying to look at cd_plot from
 the vcd package.

Strategy 1: Typing cd_plot tells you that it is a generic function
  and methods(cd_plot) shows you which methods exist (default and
  formula) which are both non-visible. You can still directly access
  them via vcd:::cd_plot (which is the main work horse).

Strategy 2 (preferred, especially if you want to take a closer look):
  Obtain the source package from CRAN. Unpack the tar.gz file and
  look into the vcd/R folder where you will find cd_plot.R containing
  the sources of both methods.

grx,
Z



Re: [R] Chow Test

2007-06-06 Thread Achim Zeileis
On Wed, 6 Jun 2007, Bernd Stampfl wrote:


 Hello R-users!
 I tried to find a package to run a CHOW TEST. As a reference package I found
 the STRUCCHANGE package. Do you know if it works well

If you have concerns regarding the reliability, you can check the
underlying source code and read the accompanying publications (which also
comment on reliability and reproducibility).

 otherwise can you recommend a different one?

If the breakpoint is known, you can easily compute the Chow test by hand:
set up a factor that codes the two regimes and then fit the un-segmented
and segmented regressions:
  fac <- my_time > my_break
  fm0 <- lm(y ~ x1 + x2 ...)
  fm1 <- lm(y ~ fac / (x1 + x2 ...))
  anova(fm0, fm1)
If you want other covariances in the test, you can also use waldtest()
from the lmtest package.

If the breakpoint is not known in advance, a supF test (aka supChow) over
all conceivable breakpoints is more appropriate. This is also provided in
strucchange along with a rich set of other testing (and dating)
techniques.
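A self-contained sketch of the known-breakpoint version, with simulated data and a hypothetical break halfway through the sample:

```r
# simulate a regression whose intercept shifts after observation 50
set.seed(1)
n <- 100
my_time <- 1:n
x1 <- rnorm(n)
y <- 1 + 2 * x1 + 3 * (my_time > 50) + rnorm(n)

fac <- factor(my_time > 50)      # codes the two regimes
fm0 <- lm(y ~ x1)                # un-segmented fit
fm1 <- lm(y ~ fac / x1)          # separate intercept and slope per regime
anova(fm0, fm1)                  # F test = Chow test for a break at t = 50
```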

Best,
Z

 Thanks, Bernd
 --
 View this message in context: 
 http://www.nabble.com/Chow-Test-tf3878416.html#a10990270
 Sent from the R help mailing list archive at Nabble.com.






Re: [R] time serie generation

2007-06-01 Thread Achim Zeileis
On Fri, 1 Jun 2007 12:51:35 +0200 [EMAIL PROTECTED] wrote:

 
 Dear all,
 
 I would like to generate a regular time serie, i.e. a list of dates
 and time for each our of the period 2002-2004.

use the seq() method for POSIXct objects:

seq(from = as.POSIXct("2002-01-01 00:00:00"),
    to = as.POSIXct("2004-12-31 23:00:00"),
    by = "hour")

hth,
Z

 the time format should be
 2002-01-01 12:00:00 (year-month-day hour:min:sec)
 
 
 so the list should contain all hours of the period 2002-2004
 2002-01-01 00:00:00
 2002-01-01 01:00:00
 2002-01-01 02:00:00
 ...
 2004-12-31 23:00:00
 Does a function exist to create that kind of list ?
 
 Thanks in advance,
 
 
 Jessica
 




Re: [R] white test to check homoscedasticity of the residuals

2007-05-30 Thread Achim Zeileis
On Wed, 30 May 2007, Benoit Chemineau wrote:

 Hi R-programmers,

 I can't find find the White test to check the homoscedasticity of the
 residuals from a linear model. Could you please help me with this?

The package lmtest includes the function bptest() for performing
Breusch-Pagan tests. White's test is a special case of this. For example,
if you fit a linear regression
  fm <- lm(y ~ x + z, data = foo)
then you can carry out White's test via
  bptest(fm, ~ x * z + I(x^2) + I(z^2), data = foo)
i.e., include all regressors and the squares/cross-products in the
auxiliary regression.

I haven't yet written a simple convenience interface for this...

hth,
Z
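A self-contained sketch (the data and variable names are made up for illustration):

```r
library(lmtest)
set.seed(42)
x <- rnorm(100)
z <- rnorm(100)
foo <- data.frame(x = x, z = z, y = 1 + x + z + rnorm(100))

fm <- lm(y ~ x + z, data = foo)
# auxiliary regression on regressors, squares, and cross-product = White's test
bptest(fm, ~ x * z + I(x^2) + I(z^2), data = foo)
```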

 Thank you !

 BC

   [[alternative HTML version deleted]]






Re: [R] aggregation of a zoo object

2007-05-29 Thread Achim Zeileis
On Tue, 29 May 2007 [EMAIL PROTECTED] wrote:

 # then I want to do the sum per hour

 z_sum_per_hour <- aggregate(na.omit(z), function(x) as.POSIXct(trunc(x,
 "hour")), sum)
 Warning message:
 some methods for "zoo" objects do not work if the index entries in
 'order.by' are not unique in: zoo(rval[i], x.index[i])

 Do anyone has an idea how to avoid that ?

The warning does not come from the aggregate() call, but from the
na.omit() call. After omitting the NAs, you have still duplicated time
stamps, hence the warning is issued again. After that, aggregating works
fine and produces no warnings.
Z



Re: [R] data in lmtest

2007-05-22 Thread Achim Zeileis
On Tue, 22 May 2007, Henning Schoelen wrote:


Hi everyone!

I am beginner in using R, so please excuse easy questions in advance.

I want to reproduce results from the data available in the lmtest-package.

That's the error message I get:

 data(bondyield)
Warning message:
file 'bondyield.rda' has magic number 'RDX1'
   Use of save versions prior to 2 is deprecated

Looks like you have an old version of lmtest (although you didn't tell us
so). Please upgrade.
Z

Can anyone help me?

Thanks in advance!

Henning
-- 
View this message in context: 
http://www.nabble.com/data-in-lmtest-tf3797285.html#a10741036
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Selecting complementary colours

2007-05-21 Thread Achim Zeileis
On Mon, 21 May 2007, John Fox wrote:

 Dear r-helpers,

 I wonder whether, given the #rrggbb representation of a colour, there is a
 simple way to select the complementary colour, also expressed as a #rrggbb
 string.

Is the complementary color uniquely defined? My understanding is that you
can take opposite colors on a color wheel, but there are of course various
color wheels available. With colorspace you can experiment with this,
e.g.:
  x <- "#81A9D0"
  y_hcl <- as(hex2RGB(x), "polarLUV")
  y_hcl@coords[, "H"] <- y_hcl@coords[, "H"] + 180
  y_hcl <- hex(y_hcl)
which is a bit more balanced than
  y_hsv <- as(hex2RGB(x), "HSV")
  y_hsv@coords[, "H"] <- y_hsv@coords[, "H"] + 180
  y_hsv <- hex(y_hsv)

hth,
Z
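A simpler notion of "complement" is reflection in RGB space, which needs only base R. The hex value below is the one from the reply; note that this generally gives a different answer than the hue rotations above:

```r
## Reflect a colour in RGB space: each channel c becomes 255 - c.
x <- "#81A9D0"
rgbvals <- col2rgb(x)                      # 3 x 1 matrix of 0-255 channel values
comp <- rgb(t(255 - rgbvals), maxColorValue = 255)
comp                                       # "#7E562F"
```

Reflection in RGB is the crudest of the colour wheels mentioned above; the HCL (and to a lesser extent HSV) rotations usually give more perceptually balanced results.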



Re: [R] Quick question on merging two time-series of different frequencies

2007-05-10 Thread Achim Zeileis
On Thu, 10 May 2007, Patnaik, Tirthankar  wrote:

 Hi,
   A quick beginner's question. I have two time series, A with
 daily data, and another B with data at varying frequencies, but mostly
 annual. Both the series are sorted ascending.

 I need to merge these two series together in the following way: For any
 entry of A, the lookup should match with B until we find an entry of B
 that's larger than A's.

I'm not sure what exactly you want. I assume that A and B are not the
observations but the corresponding time stamps, right?

In any case, I guess that the zoo package will have some useful
functionality for what you want, e.g., if you have data like
  library(zoo)
  set.seed(123)
  x <- zoo(rnorm(11), as.Date("2000-01-01") + 0:10)
  y <- zoo(rnorm(4), as.Date("2000-01-01") + c(0, 3, 7, 10))
then you can merge them with
  z <- merge(x, y)
and then eliminate NAs, e.g. by
  na.locf(z)
or you could aggregate() the x series so that it becomes a monthly
series or something like that.

See
  vignette("zoo", package = "zoo")
for more information.

Best,
Z

 For all A[i], i = 1,...,N and B[j], j=1,...M

 Match with B[j] where A[i] = B[j]

 When A[i]  B[j], match with B[j+1] where A[i] = B[j+1]

 Basically the less-frequent attributes stay true for a stock while the
 daily prices change.

 One example of this is the vlookup function in Excel with the TRUE
 option.

 So we have an implementation of this in R?

 TIA and best,
 -Tir






Re: [R] time series problem

2007-04-24 Thread Achim Zeileis
On Tue, 24 Apr 2007, Tomas Mikoviny wrote:

 Hi everybody,

 I work with data with following pattern

   comm

   Date  Value
 1 4/10/2007   361.2
 2 4/11/2007   370.1
 3 4/12/2007   357.2
 4 4/13/2007   362.3
 5 4/16/2007   363.5
 6 4/17/2007   368.7
 7 4/18/2007   354.7
 8 4/19/2007   368.8
 9 4/20/2007   367.1
 10... ...

 and trying to convert it to time series using:

   t <- strptime(comm[1,1], "%m/%d/%Y")
   x <- ts(comm, start = as.POSIXct(t))

 Error in Math.difftime((end - start) * frequency + 1.01) :
   'floor' not defined for "difftime" objects

 Definitely I do something wrong but I can't find what is it!

At least two things:
  - ts() can only work with numeric time stamps (not POSIXct)
  - you try to create a regular series (although your data is
not: there is not an observation on each day)

I would recommend to create a zoo series with Date index:
  R library(zoo)
  R z <- zoo(comm[,2], as.Date(strptime(comm[,1], "%m/%d/%Y")))
  R z
  2007-04-10 2007-04-11 2007-04-12 2007-04-13 2007-04-16 2007-04-17
   361.2  370.1  357.2  362.3  363.5  368.7
  2007-04-18 2007-04-19 2007-04-20
   354.7  368.8  367.1

hth,
Z

 Can anyone help me with this?

 Thanks.

 Tomas






Re: [R] zoo merge() method

2007-04-12 Thread Achim Zeileis
On Thu, 12 Apr 2007 [EMAIL PROTECTED] wrote:

 attempted to get at this by merging the two time series (by union),
 interpolating the NAs, and finally, subtracting one vector from the
 other.  The problem is that I can not combine the two zoo time series
 objects using the merge() or cbind() functions.  I get the following
 error:

 Error in z[match0(index(a), indexes), ] <- a[match0(indexes, index(a))]
 :
 number of items to replace is not a multiple of replacement
 length

Usually, problems like this occur when the time stamps in one of your time
series are not unique. Maybe we should improve the error message by
explicitly trying to catch this error.

You can easily check this, e.g., via
  any(table(time(zoo_object)) > 1)

 The input/output from a recent R Console session might help,

Not really, the data itself would have been much more helpful...and even
better some simplified artificial data set.

hth,
Z
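The duplicate check can be illustrated with base R alone (made-up time stamps below); it works the same way on the index of a zoo series:

```r
## Four time stamps, one of which occurs twice
idx <- as.Date("2007-04-01") + c(0, 1, 1, 2)
any(table(idx) > 1)        # TRUE: duplicated time stamps present
## equivalent and usually faster:
any(duplicated(idx))       # TRUE
```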



Re: [R] Construct time series objects from raw data stored in csv files

2007-04-12 Thread Achim Zeileis
On Thu, 12 Apr 2007, tom soyer wrote:

 Hi,

 I have time series data stored in csv files (see below for an example of the
 data). I understand that in order to analyze my data in R, I need to first
 transform it into a ts object. Howeve, I could not find an example on how
 exactly to do that. Is ts the only function I need? What are the steps that
 I need to go through to build a time series object from raw data like this?

With the zoo package you can do
  library(zoo)
  z <- read.zoo("yourdata.csv", sep = ",")
  plot(z)

See
  vignette("zoo", package = "zoo")
for more information and also some more details about other time series
classes.
Z

 Thanks,

 Tom

 --- DATE,VALUE
 1921-01-01,19.000
 1921-02-01,18.400
 1921-03-01,18.300
 1921-04-01,18.100
 1921-05-01,17.700
 1921-06-01,17.600
 1921-07-01,17.700
 1921-08-01,17.700
 1921-09-01,17.500
 1921-10-01,17.500
 1921-11-01,17.400
 1921-12-01,17.300
 1922-01-01,16.900
 1922-02-01,16.900
 1922-03-01,16.700
 1922-04-01,16.700
 1922-05-01,16.700
 1922-06-01,16.700
 1922-07-01,16.800
 1922-08-01,16.600
 1922-09-01,16.600
 1922-10-01,16.700
 1922-11-01,16.800
 1922-12-01,16.900

   [[alternative HTML version deleted]]






Re: [R] When to use quasipoisson instead of poisson family

2007-04-10 Thread Achim Zeileis
On Tue, 10 Apr 2007, ronggui wrote:

 It seems that MASS suggests judging on the basis of
 sum(residuals(model, type = "pearson")^2)/df.residual(model). My question: Is
 there any rule of thumb for the cutpoint value?

 The paper "On the Use of Corrections for Overdispersion" suggests
 overdispersion exists if the deviance is at least twice the number of
 degrees of freedom.

There are also formal tests for over-dispersion. I've implemented one for
a package which is not yet on CRAN (code/docs attached), another one is
implemented in odTest() in package pscl. The latter also contains
further count data regression models which can deal with both
over-dispersion and excess zeros in count data. A vignette explaining the
tools is about to be released.

hth,
Z

 Are there any further hints? Thanks.

 --
 Ronggui Huang
 Department of Sociology
 Fudan University, Shanghai, China



\name{dispersiontest}
\alias{dispersiontest}
\title{Dispersion Test}
\description{
 Tests the null hypothesis of equidispersion in Poisson GLMs against
 the alternative of overdispersion and/or underdispersion.
}
\usage{
dispersiontest(object, trafo = NULL, alternative = c("greater", "two.sided", "less"))
}
\arguments{
  \item{object}{a fitted Poisson GLM of class \code{glm} as fitted
by \code{\link{glm}} with family \code{\link{poisson}}.}
 \item{trafo}{a specification of the alternative (see also details),
   can be numeric or a (positive) function or \code{NULL} (the default).}
 \item{alternative}{a character string specifying the alternative hypothesis:
   \code{greater} corresponds to overdispersion, \code{less} to
   underdispersion and \code{two.sided} to either one.}
}
\details{
The standard Poisson GLM models the (conditional) mean
\eqn{\mathsf{E}[y] = \mu}{E[y] = mu} which is assumed to be equal to the
variance \eqn{\mathsf{VAR}[y] = \mu}{VAR[y] = mu}. \code{dispersiontest}
assesses the hypothesis that this assumption holds (equidispersion) against
the alternative that the variance is of the form:
  \deqn{\mathsf{VAR}[y] \quad = \quad \mu \; + \; \alpha \cdot 
\mathrm{trafo}(\mu).}{VAR[y] = mu + alpha * trafo(mu).}
Overdispersion corresponds to \eqn{\alpha > 0}{alpha > 0} and underdispersion to
\eqn{\alpha < 0}{alpha < 0}. The coefficient \eqn{\alpha}{alpha} can be estimated
by an auxiliary OLS regression and tested with the corresponding t (or z) 
statistic
which is asymptotically standard normal under the null hypothesis.

Common specifications of the transformation function 
\eqn{\mathrm{trafo}}{trafo} are
\eqn{\mathrm{trafo}(\mu) = \mu^2}{trafo(mu) = mu^2} or \eqn{\mathrm{trafo}(\mu) 
= \mu}{trafo(mu) = mu}.
The former corresponds to a negative binomial (NB) model with quadratic 
variance function
(called NB2 by Cameron and Trivedi, 2005), the latter to a NB model with linear 
variance
function (called NB1 by Cameron and Trivedi, 2005) or quasi-Poisson model with 
dispersion 
parameter, i.e.,
  \deqn{\mathsf{VAR}[y] \quad = \quad (1 + \alpha) \cdot \mu = 
\mathrm{dispersion} \cdot \mu.}{VAR[y] = (1 + alpha) * mu = dispersion * mu.}
By default, for \code{trafo = NULL}, the latter dispersion formulation is used 
in
\code{dispersiontest}. Otherwise, if \code{trafo} is specified, the test is 
formulated
in terms of the parameter \eqn{\alpha}{alpha}. The transformation \code{trafo} 
can either
be specified as a function or an integer corresponding to the function 
\code{function(x) x^trafo},
such that \code{trafo = 1} and \code{trafo = 2} yield the linear and quadratic 
formulations
respectively.
}

\value{An object of class \code{htest}.}

\references{
Cameron, A.C. and Trivedi, P.K. (1990). Regression-based Tests for 
Overdispersion in the Poisson Model.  
\emph{Journal of Econometrics}, \bold{46}, 347--364.

Cameron, A.C. and Trivedi, P.K. (1998). \emph{Regression Analysis of Count 
Data}. 
Cambridge: Cambridge University Press.

Cameron, A.C. and Trivedi, P.K. (2005). \emph{Microeconometrics: Methods and 
Applications}.
Cambridge: Cambridge University Press.
}

\seealso{\code{\link{glm}}, \code{\link{poisson}}, \code{\link[MASS]{glm.nb}}}

\examples{
data(RecreationDemand)
fm <- glm(trips ~ ., data = RecreationDemand, family = poisson)

## linear specification (in terms of dispersion)
dispersiontest(fm)
## linear specification (in terms of alpha)
dispersiontest(fm, trafo = 1)
## quadratic specification (in terms of alpha)
dispersiontest(fm, trafo = 2)
dispersiontest(fm, trafo = function(x) x^2)

## further examples
data(DoctorVisits)
fm <- glm(visits ~ . + I(age^2), data = DoctorVisits, family = poisson)
dispersiontest(fm)

data(DebTrivedi)
fm <- glm(ofp ~ health + age + gender + married + faminc + privins,
  data = DebTrivedi, family = poisson)
dispersiontest(fm)
}
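The auxiliary OLS regression described under Details can be sketched in base R. The data below are simulated (equidispersed Poisson, so the statistic should be unremarkable), and all object names are illustrative rather than taken from the package:

```r
set.seed(1)
x <- rnorm(500)
y <- rpois(500, exp(0.5 + 0.5 * x))          # equidispersed Poisson data
fm <- glm(y ~ x, family = poisson)
mu <- fitted(fm)
## dispersion formulation VAR[y] = (1 + alpha) * mu:
## regress ((y - mu)^2 - y) / mu on a constant and test the intercept
aux <- ((y - mu)^2 - y) / mu
fit <- lm(aux ~ 1)
z_stat <- coef(fit) / sqrt(diag(vcov(fit)))  # asymptotically N(0, 1) under H0
```

For `trafo = 2`, the regressor would be `mu` instead of the constant, testing the quadratic (NB2-type) alternative.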

Re: [R] Linear model and time series

2007-04-02 Thread Achim Zeileis
On Sun, 1 Apr 2007, John C Frain wrote:

 This question is not as simple as might appear.

But it might as well be - I try to answer the questions people ask instead
of guessing what they wanted to ask but did not.

 As the data are time
 series one should be very concerned about the distribution of the
  residuals. Are the series stationary and, if not, are they integrated
  of the same order and cointegrated?

snip

 My recommendation to Andre would be to study a good book on
 time-series analysis.  One is not doing him a favour by recommending a
 procedure to him that may lead to spurious results when not applied
 properly.

All the points you raise above are valid, of course, but note that I did
not recommend a procedure. The original poster wrote down a linear model
that he wanted to estimate by OLS (aka lm()) and I just told him how he
can do that conveniently in R.

Whether or not the linear model is appropriate in this situation (and
there are time-series relationships where it is) I cannot say - but,
contrary to you, I assume that the poster knows what he is doing.

Z



Re: [R] Linear model and time series

2007-03-31 Thread Achim Zeileis
On Sat, 31 Mar 2007, Andre Jung wrote:

 Dear all,

 I have three timeseries Uts, Vts, Wts. The relation between the time
 series can be expressed as

 Uts = x Vts + y Wts + residuals

 How would I feed this to lm() to evaluate the unknowns x and y?

If the time series are aligned (and univariate) you can just do
  lm(Uts ~ Vts + Wts)
If not, have a look at the dynlm and/or dyn packages.
Z

 Thanks,
 andre






Re: [R] Tail area of sum of Chi-square variables

2007-03-29 Thread Achim Zeileis
Hi Klausch:

 I was wondering if there are any R functions that give the tail area
 of a sum of chisquare distributions of the type:
 a_1 X_1 + a_2 X_2
 where a_1 and a_2 are constants and X_1 and X_2 are independent 
 chi-square variables with different degrees of freedom.

Christian Kleiber (Cc on this reply) has code for computing the 
distribution function of general linear combinations of chi-squared 
variables. It's not yet on CRAN, but just ask him for a devel-snapshot.

Best wishes,
Z



Re: [R] aggregating data with Zoo

2007-03-28 Thread Achim Zeileis
On Wed, 28 Mar 2007, Alfonso Sammassimo wrote:

 Is there a way of aggregating 'zoo' daily data according to day of week? eg
 all Thursdays

Sure, the easiest way will probably differ depending on the time stamp 
class. One example might be this:
   ## small example with Date index
   z <- read.zoo(file.path(.find.package("zoo"), "doc", "demo1.txt"),
 sep = "|", format = "%d %b %Y")
   ## visualization
   plot(z)
   ## aggregate along week days (via POSIXlt representation)
   aggregate(z, as.POSIXlt(time(z))$wday, mean)

hth,
Z
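The POSIXlt weekday trick used above can be illustrated with base R alone (made-up daily data):

```r
## Two weeks of consecutive dates cover every day of the week twice
d <- as.Date("2007-03-01") + 0:13
wd <- as.POSIXlt(d)$wday          # 0 = Sunday, ..., 6 = Saturday
x <- seq_along(d)                 # toy daily observations
tapply(x, wd, mean)               # mean of x for each day of week
```

In the zoo call, `aggregate(z, as.POSIXlt(time(z))$wday, mean)` does exactly this grouping on the series index.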

 I came across the 'nextfri' function in the documentation but am unsure how
 to change this so any day of week can be aggregated.

 I have used POSIX to arrange the data (not as 'zoo' series) according to day
 of week, but am curious if I've missed if a similar option available with
 zoo.

 Thank you for any help,

 Alf Sammassimo
 Melbourne, Australia






Re: [R] help with zicounts

2007-03-09 Thread Achim Zeileis
Jaap:

 I have simulated data from a zero-inflated Poisson model, and would like
 to use a package like zicounts to test my code of fitting the model.
 My question is: can I use zicounts directly with the following simulated
 data?

I guess you can use zicounts, but personally I'm more familiar with 
zeroinfl() from package pscl (because I have written this function :)). 
With that you can easily do:

 beta.true <- 1.0
 gamma.true <- 1.0
 n <- 1000
 x <- matrix(rnorm(n), n, 1)
 expit <- plogis  # inverse logit; 'expit' was left undefined in the original post
 pi <- expit(x * beta.true)
 mu <- exp(x * gamma.true)
 y <- numeric(n) # blank vector
 z <- (runif(n) < pi) # logical: TRUE with prob pi_i, FALSE otherwise
 y[z] <- rpois(sum(z), mu[z]) # draw y_i ~ Poisson(mu_i) where z_i = TRUE
 y[!z] <- 0 # set y_i = 0 where z_i = FALSE

library(pscl)
zeroinfl(y ~ 0 + x | 0 + x)

which by default fits a ZIP (with log link and logit inflation).

hth,
Z



Re: [R] where can I find Durbin-Waston test tables for Confidence Level 2.5% or 0.5%?

2007-03-07 Thread Achim Zeileis
Michael:

 I am doing a two-sided DW test:

 H0: rho = 0
 H1: rho =/= 0

 My understanding is that most test statistics tables are one-sided. It's the
 way they created the table.

...because rho > 0 is the alternative of interest in most applications.

 So from online, by doing Googling, I found a bunch of DW tables for
 Confidence Level 5%.

Using tables for the DW test is difficult because its distribution
depends on the particular set of regressors used. The tables of DW just
give upper and lower bounds.

Back when the DW test was suggested, tables were the only way to make
application of the test feasible. Today, you would either use the exact 
combination of chi-square distributions or an asymptotic approximation 
(both implemented in dwtest() from lmtest) or a bootstrap approximation 
(implemented in durbin.watson() from car). For 278 observations, the 
normal approximation should be sufficient.

hth,
Z



Re: [R] Problem with party and ordered factors

2007-02-02 Thread Achim Zeileis
Christoph:

 I've got an ordered factor as response. As far as i know i have to use
 scores to be able to use this ordered factor.

If you want to exploit the ordering in the statistical tests (used for
variable selection in CTree), a natural approach is to use a
linear-by-linear test with scores assigned to the ordered levels of your
factor. That's what the example below does.

 But if i do so i get a tree
 which predicts all observations as the first level of my ordered factor.

That is not due to the factor being ordered. It results simply from the
fact that more than half of the observations have Never in the variable
ME.

 There i got the same problem. I execute the following code:
  data("mammoexp", package = "party")
  mtree <- ctree(ME ~ ., data = mammoexp, scores = list(ME = 1:3, SYMPT =
 1:4, DECT = 1:3))
  plot(mtree)

If you look at this picture, you can see that majority voting in each node
will result in the prediction Never.

 So now i'm stuck. Am i doing anything wrong?

Nothing.
If you want to see how the distribution in each node changes, you can
look at
  treeresponse(mtree)

 I'm using R 2.4.1 and all packages are up to date.

Not anymore, I just uploaded a new party version to CRAN ;-))

Best wishes,
Z



Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-01 Thread Achim Zeileis
Jeff,

you can do

 sum1: \sum_i\sum_j max(X_i,X_j)
 sum2: \sum_i\sum_j max(Y_i,Y_j)

sum(x * (2 * rank(x) - 1))

 sum3: \sum_i\sum_j max(X_i,Y_j)

sum(outer(x, y, pmax))

Probably, the latter can be speeded up even more...
Z
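A quick sanity check on made-up data that the rank shortcut for sum1 agrees with the brute-force double sum (the identity requires distinct values, which runif() gives almost surely):

```r
set.seed(1)
x <- runif(8)
y <- runif(8)
## sum1 via ranks versus brute force over all n^2 pairs
fast <- sum(x * (2 * rank(x) - 1))
slow <- sum(outer(x, x, pmax))
all.equal(fast, slow)             # TRUE
## sum3 has no rank shortcut; outer() does the n^2 work directly
sum3 <- sum(outer(x, y, pmax))
```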



Re: [R] Rbind for appending zoo objects

2007-01-30 Thread Achim Zeileis
On Tue, 30 Jan 2007, Shubha Vishwanath Karanth wrote:

 1.The above rbind function for the zoo objects doesn't take care
 of the column names while merging. Example: Column 'a' of y1 is appended
 with column 'b' of y2. Why is this so?

We do not check the column names at all. This should probably be changed.

 How do I get rid of this?

Sort the columns by hand before calling rbind().

 2.In the rbind function, I have given y2 first and then y1. But in
 the appended data, I see the data corresponding to y1 first and then of
 y2. Is this because of ordering of the index elements of the zoo
 objects?

Yes, recall that the second o in zoo means ordered.

Best,
Z



Re: [R] R packages

2007-01-30 Thread Achim Zeileis
Also have a look at the excellent Bayesian CRAN task view
  http://CRAN.R-project.org/src/contrib/Views/Bayesian.html
Z

On Tue, 30 Jan 2007, Giovanni Petris wrote:

 Have you tried a search of CRAN? Seaching r-help archives may result
 in useful information as well.

 As formulated, the questions are difficult to answer. There are
 packages that use MCMC, Gibbs Sampling, Metropolis Hastings for
 specific classes of models.

 Best,
 Giovanni Petris

  Date: Tue, 30 Jan 2007 20:55:11 +0530
  From: Shubha Vishwanath Karanth [EMAIL PROTECTED]
  Sender: [EMAIL PROTECTED]
  Precedence: list
  Thread-topic: R packages
  Thread-index: AcdEgtTn1ik2yhc/Qvav5FLBJea68Q==
 
  Hi,
 
 
 
   Does anybody know which R packages I need for the topics below?
 
 
 
  1.  Monte Carlo Markov chain (MCMC)
  2.  Gibbs Sampling
  3.  Metropolis Hastings
 
 
 
  Thanks in advance...
 
  Shubha
 
 
  [[alternative HTML version deleted]]
 
 
 






Re: [R] Reversal of Aggregation

2007-01-29 Thread Achim Zeileis


On Mon, 29 Jan 2007, Roland Rau wrote:

 Dear all,

 given I have a data.frame in a format like this

  mydf <- data.frame(age = rep(1:3, 5),
 year = c(rep(1996, 3), rep(1997, 3), rep(1998, 3),
   rep(1999, 3), rep(2000, 3)),
 income = 1:15)
 mydf


 Now I convert it to some 2D-frequency table like this:
  mymatrix <- tapply(X = mydf$income, INDEX = list(mydf$age, mydf$year),
 FUN = sum)
 mymatrix


 My question is:
 How can I go the opposite way, i.e. from 'mymatrix' to 'mydf'?
 Is there an elegant way?

You could do
  as.data.frame(as.table(mymatrix))
and then set appropriate column names. (The first two variables are also
coded as factors which might or might not be what you want in this
example.)

Z
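Applied to the example above, the round trip looks like this; the renaming and the factor-to-numeric conversion are the "appropriate column names" step:

```r
mydf <- data.frame(age = rep(1:3, 5),
                   year = c(rep(1996, 3), rep(1997, 3), rep(1998, 3),
                            rep(1999, 3), rep(2000, 3)),
                   income = 1:15)
mymatrix <- tapply(mydf$income, list(mydf$age, mydf$year), sum)
## back to long format
long <- as.data.frame(as.table(mymatrix))
names(long) <- c("age", "year", "income")
## age and year come back as factors; convert if numbers are needed
long$age <- as.numeric(as.character(long$age))
long$year <- as.numeric(as.character(long$year))
```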


 Thanks,
 Roland

   [[alternative HTML version deleted]]






Re: [R] SARIMA with dynlm

2007-01-23 Thread Achim Zeileis
Yannig:

 Does anyone have an exemple of how to fit a SARIMA model , with a MA
 part, with the package dynlm?

This is not yet possible. Currently, dynlm() just offers the functionality
of lm() for time series data, i.e., OLS for auto-regressive models. I hope
to interface arima() one day as well, but I haven't started working on
this. So currently, just use arima() directly.

hth,
Z



Re: [R] Definition of t-value

2007-01-13 Thread Achim Zeileis
On Sat, 13 Jan 2007 [EMAIL PROTECTED] wrote:

 My problem is that I do not know how to compute the standard error Sb of some 
 regression coefficient, when I have done nothing more than to use the lm 
 command in this manner:

 Out = lm(A~ data$B + data$C + data$D)

better make that
  Out <- lm(A ~ B + C + D, data = mydata)
and then
  summary(Out)
computes the usual t statistics. See the source code for what is exactly
computed. Several components of the summary also have their own extractor
function, e.g.
  coef(Out)
  vcov(Out)
extract the estimated coefficients and covariance matrix respectively.
Thus, you can compute the t statistics by hand via
  coef(Out) / sqrt(diag(vcov(Out)))

hth,
Z
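A quick check on simulated data (not the poster's) that the hand computation matches summary():

```r
set.seed(42)
mydata <- data.frame(B = rnorm(50), C = rnorm(50), D = rnorm(50))
mydata$A <- 1 + 2 * mydata$B - mydata$C + rnorm(50)
Out <- lm(A ~ B + C + D, data = mydata)
## t statistic = estimate / standard error
t_by_hand <- coef(Out) / sqrt(diag(vcov(Out)))
t_summary <- coef(summary(Out))[, "t value"]
all.equal(t_by_hand, t_summary)   # TRUE
```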



Re: [R] color of opposite sign values in filled.contour

2007-01-05 Thread Achim Zeileis
On Thu, 4 Jan 2007, Richard M. Heiberger wrote:

 Get the RColorBrewer package from CRAN

 Description: The packages provides palettes for drawing nice maps
shaded according to a variable.

The package vcd also offers the function diverge_hcl() that constructs 
diverging palettes (based on HCL colors) particularly aimed at 
filled.contour() or image() plots. See
   
http://epub.wu-wien.ac.at/dyn/openURL?id=oai:epub.wu-wien.ac.at:epub-wu-01_abd
for some more background information.
Z



Re: [R] Help re zinb model

2007-01-04 Thread Achim Zeileis
On Thu, 4 Jan 2007, [EMAIL PROTECTED] wrote:

 Remember that -2 * the difference in the likelihoods between the two 
 models is asymptotically chi-squared distributed, with degrees of 
 freedom equal to the difference in number of parameters between the 
 models. So you can just calculate that for your preferred and null 
 models, then use the pchisq function
 to test significance. Get the likelihoods from obj$maxlike.

The function lrtest() in package lmtest offers a flexible implementation 
of this which works for fitted models that provide a logLik() method. The 
zicounts() implementation does not, but zeroinfl() in package pscl does. E.g.,
you can do:

library(pscl)
data("teeth", package = "zicounts")
fm1 <- zeroinfl(dmft ~ gender + age | gender + age, data = teeth,
   dist = "negbin")
summary(fm1)
fm2 <- zeroinfl(dmft ~ 1, data = teeth, dist = "negbin")
summary(fm2)

library(lmtest)
lrtest(fm1, fm2)

hth,
Z



Re: [R] extend summary.lm for hccm?

2006-12-26 Thread Achim Zeileis
On Sun, 24 Dec 2006, John Fox wrote:

 If I remember Freedman's recent paper correctly, he argues that sandwich
 variance estimator, though problematic in general, is not problematic in the
 case that White described -- an otherwise correctly specified linear model
 with heteroscedasticity estimated by least-squares.

More generally, sandwich-type estimators are valid (i.e., estimate
the right quantity, although not necessarily precisely, as Frank pointed
out) in situations where the estimating functions are correctly specified
but remaining aspets of the likelihood (not captured in the estimating
functions) are potentially not.

In linear models, it is easy to see what this means: the mean function has
to be correctly specified (i.e., the errors have zero mean) but the
correlation structure of the errors (i.e., their (co-)variances) might
differ from the usual assumptions. In GLMs, in particular logistic
regression, it is much more difficult to see against which types of
misspecification sandwich-based inference is robust.

Freedman's paper stresses the point that many model misspecifications also
imply misspecified estimating functions (and in his example this is rather
obvious) so that consequently the sandwich-type estimators estimate the
wrong quantity.

Best wishes,
Z



Re: [R] extend summary.lm for hccm?

2006-12-25 Thread Achim Zeileis
On Sun, 24 Dec 2006, John Fox wrote:

 Dear Dirk and Ivo,

 It's true that the sandwich package has more extensive facilities for
 sandwich variance estimation than the hccm function in the car package, but
 I think that the thrust of Ivo's question is to get a model summary that
 automatically uses the adjusted standard errors rather than having to
 compute them and use them manually.

I've written the function coeftest() in package lmtest for this purpose.
With this you can
  coeftest(obj, vcov = sandwich)
or you can put this into a specialized summary() function as John
suggested (where you probably would want the F statistic to use the other
vcov as well). See also function waldtest() in lmtest,
linear.hypothesis() in car and
  vignette("sandwich", package = "sandwich")

Although this works, it is still a nuisance to use a different function
and not summary() directly. In addition, it would also be nice to plug in
different vcovs into confint() or predict() methods. Of course, one could
write different generics or overload the methods in certain packages.
However, I guess that many practitioners want to use different vcov
estimators - especially in the social and political sciences, and
econometrics etc. - so that this might justify that the original lm (and
glm) methods are extended to allow for plugging in different vcov
matrices. Maybe we could try to convince R-core to include something like
this?

Best wishes,
Z



Re: [R] Heteroscedasticity consistent standard errors for Spatial error models

2006-12-06 Thread Achim Zeileis
On Thu, 7 Dec 2006, Samarasinghe, Oshadhi Erandika wrote:

 Hello,

 Could anyone please tell me how to estimate Heteroscedasticity
 Consistent standard errors for a Spatial error model? All the functions
 I have looked at only works for lm objects.

I assume that you looked also at the sandwich package: The methods there
do not only work for lm objects but are object-oriented, appropriate
methods are already provided for a range of different object classes. So,
in principle, you can plug in other models as well, potentially including
spatial models if appropriate methods are provided. See
  vignette("sandwich-OOP", package = "sandwich")

Disclaimer: I'm not sure whether the spatial structure of spatial models
will be appropriately captured by the class of estimators implemented in
sandwich. But someone who knows spatial models and their HC covariances
should be able to figure that out from the vignette above. I'm also not
sure what specialized methods exist...

Best,
Z


 Thank you very much!

 - Oshadhi






Re: [R] problem with indexing a zoo object

2006-11-29 Thread Achim Zeileis
On Wed, 29 Nov 2006, Leeds, Mark (IED) wrote:

 My problem is the following : I create 2 zoo objects and then I try to
 subset one of them using logic. indicesthatpass is a vector of trues and

Nope, it's a matrix (because you want the coredata to be a matrix) but
should be a vector. (Maybe we could extend [.zoo here...)

 2) even if the answer to 1) is yes, there could be some other way better
 than what I am doing ? It took me a while to even get to this point
 because I was taking is.infinite of fwdret and abs of bckret rather than
 their respective coredatas and that was really causing a mess because of
 the different indexing that was getting returned.

Logical comparisons on zoo objects return zoo objects as documented in
various places (which you should know about by now) including ?zoo and the
package vignette. The documentation suggests various strategies (including
using coredata).
Z



Re: [R] Multivariate time-series

2006-11-13 Thread Achim Zeileis
On Mon, 13 Nov 2006 15:42:06 -0600 David Kaplan wrote:

 Hi all,
 
 I'm looking for R packages that estimate multivariate time-series
 models or vector-autoregression (VAR) time-series models.

The Econometrics task view at
  http://CRAN.R-project.org/src/contrib/Views/Econometrics.html
has in the Time-series modelling section:

For estimating VAR models, several methods are available: simple models
can be fitted by ar() in stats, more elaborate models are provided
in package vars, estVARXls() in dse and a Bayesian approach is
available in MSBVAR.
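For instance, a simple bivariate VAR can already be fitted with ar() from
base R alone (the data below are simulated, purely for illustration):

```r
## minimal sketch: fit a bivariate VAR by Yule-Walker via ar() in stats
set.seed(42)
n <- 200
x <- arima.sim(list(ar = 0.5), n)
y <- 0.3 * x + arima.sim(list(ar = 0.2), n)

fit <- ar(cbind(x, y), order.max = 2)  ## lag order selected by AIC
fit$order  ## chosen lag order
fit$ar     ## array of AR coefficient matrices (order x 2 x 2)
```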

hth,
Z

 Thanks
 
 David
 
 
 -- 
 ===
 David Kaplan, Ph.D.
 Professor
 Department of Educational Psychology
 University of Wisconsin - Madison
 Educational Sciences, Room, 1061
 1025 W. Johnson Street
 Madison, WI 53706
 
 email: [EMAIL PROTECTED]
 homepage:
 http://www.education.wisc.edu/edpsych/facstaff/kaplan/kaplan.htm
 Phone: 608-262-0836
 




Re: [R] as.zoo behavior (

2006-11-03 Thread Achim Zeileis
Mark:

 temp <- ewma(actualdiff, .05)

ok, because ewma() is a wrapper for filter() this returns an object of
class ts. Therefore you want to add the index again, but:

 rollmeandifflogbidask <- as.zoo(temp, index(actualdiff))

is not the right thing to do. as.zoo() is a coercion generic and its ts
method ignores all arguments except the first. Thus, index(actualdiff) is
*not* added and a zooreg object created (which is appropriate because
you supply an object of class ts, a regular series. See
  zoo:::as.zoo.ts
for the actual code).

You want to create a zoo series from scratch using the data from temp
and the index from actualdiff. Hence, the appropriate command is
  zoo(coredata(temp), index(actualdiff))

But let's go back to the ewma() function. The NA in temp looks suspicious
(and stems from another implicit coercion from zoo to ts). I guess you
want:

  ewma <- function(x, lambda = 1, init = 0) {
    rval <- filter(lambda * coredata(x),
                   filter = (1 - lambda),
                   method = "recursive",
                   init = init)
    rval <- zoo(coredata(rval), index(x))
    rval
  }

And then you can do
  ewma(actualdiff, 0.05)
and already get a nicely formatted zoo object back.

Z



Re: [R] install.views

2006-10-27 Thread Achim Zeileis
On Fri, 27 Oct 2006 19:36:38 +0900 Pierre D'Ange wrote:

 Since I had too much junk cluttering my computer, I recently re-set it
 back to factory settings, in the process eliminating R.  I now have to
 reload R and packages, and would like to do that via the install.views
 command.  I have successfully downloaded the package ctv but R does
 not recognize install.views.

Works for me.
 
 What am I doing wrong?

Did you load the package, i.e. did you say
  library(ctv)
?

Z



Re: [R] Up- or downsampling time series in R

2006-10-26 Thread Achim Zeileis
On Thu, 26 Oct 2006, Brandt, T. (Tobias) wrote:

 I have data that is sampled (in time) with a certain frequency and I would
 like to express this time series as a time series of a higher (or lower)
 frequency with the newly added time points being filled in with NA, 0, or
 perhaps interpolated. My data might be regularly or irregularly spaced. For
 example, I might have quarterly data that I would like to handle as a
 monthly time series with NAs filled in for the missing months.

Both can be done easily with zoo, examples for both are given in the
package vignettes and the man pages of the package.

1. extend to a finer grid (upsampling)
## generate some time series
z <- zoo(sample(1:3, 20, replace = TRUE),
  as.yearmon(seq(2000, by = 0.5, length = 20)))
## generate empty series on finer grid
z2 <- zoo(, seq(start(z), end(z), by = 1/12))
## merge (returns univariate series, by default filled with NAs)
merge(z, z2)

2. aggregate to a coarser grid (downsampling)
## transform to annual data
as.annual <- function(x) floor(as.numeric(x))
## average within years
aggregate(z, as.annual, mean)
## first observation within year
aggregate(z, as.annual, head, 1)

hth,
Z



Re: [R] Citation for tseries help pages for R

2006-10-23 Thread Achim Zeileis
On Mon, 23 Oct 2006, Jenny Vander Pluym wrote:

 Good Morning/Afternoon!

 I am editing a document and am not sure how to cite the Time series
 analysis and computational finance help pages that are accessible at
 this url: www.maths.lth.se/help/R/.R/library/tseries/html/00Index.html.
 Can anyone on this list help me with this?

Cite the package itself using the citation provided by
  citation("tseries")

Best,
Z

 Thank you for your time and have an excellent day!!

 Jenny VP

 --
 Jenny Vander Pluym
 NOAA, NOS, Center for Coastal Fisheries and Habitat Research
 Applied Ecology and Restoration Research Branch
 101 Pivers Island Rd.
 Beaufort, NC 28516-9722 office: 252.728.8777 fax: 252.728.8784






Re: [R] Recursive decreasing sequences

2006-10-20 Thread Achim Zeileis
On Fri, 20 Oct 2006 15:16:12 -0500 Marc Schwartz wrote:

  Range <- c(2:6)
 
 Gack
 
 Disregard the 'c' and parens there.  Left over from a first attempt
 at a solution using c(2, 6)...

Like this one:
  myseq4 <- function(start, end)
    unlist(lapply(start:end, function(x) x:end))
which is very similar to Marc's that was essentially
  myseq3 <- function(start, end) {
    rval <- start:end
    unlist(sapply(seq(along = rval), function(x) rval[x]:max(rval)))
  }

The problem with Julian's original suggestion
  myseq1 <- function(start, end) {
    rval <- integer(0)
    for(i in start:end) rval <- c(rval, i:end)
    return(rval)
  }
is that rval is recursively grown which becomes really slow (as
documented in various places). The program can already be improved
significantly if this is avoided by creating a sufficiently large
vector initially:
  myseq2 <- function(start, end) {
    rval <- integer((end - start + 1) * (end - start + 2)/2)
    j <- 1
    for(i in start:end) {
      rval[j:(j + end - i)] <- i:end
      j <- j + end - i + 1
    }
    return(rval)
  }

And on my machine I obtain the following timings:

R> system.time(myseq1(2, 1000))
[1] 7.150 1.110 8.312 0.000 0.000
R> system.time(myseq2(2, 1000))
[1] 0.060 0.000 0.059 0.000 0.000
R> system.time(myseq3(2, 1000))
[1] 0.050 0.000 0.048 0.000 0.000
R> system.time(myseq4(2, 1000))
[1] 0.020 0.010 0.028 0.000 0.000

Thus, the main problem was growing the return value recursively. My
last suggestion seems to be slightly faster than the previous ones...

hth,
Z



Re: [R] correlation b/w a continuous variable and a categorical variable

2006-10-13 Thread Achim Zeileis
On Fri, 13 Oct 2006 17:15:45 -0400 Weiwei Shi wrote:

 Dear Listers:
 
 I happen to have this question in mind, is there a way to evaluate the
 correlation between
 a continuous variable and a categorical variable (without discretizing
 the former)? My intuitive is using lda by considering the latter as
 response variable but not sure.

It depends what exactly you mean by "evaluate correlation". If you want
to test independence of two variables X and Y against some form of
association, you can generally use statistics based on
  sum h(Y) * g(X)
where h() and g() are suitable transformations of X and Y. Special
cases of this framework are tests for correlation of continuous
variables and Chi-squared type statistics for categorical variables.
This approach is implemented in the package coin, see
independence_test() and the package vignette.

hth,
Z

 thanks,
 
 weiwei
 
 -- 
 Weiwei Shi, Ph.D
 Research Scientist
 GeneGO, Inc.
 
 Did you always know?
 No, I did not. But I believed...
 ---Matrix III
 




Re: [R] correlation b/w a continuous variable and a categorical variable

2006-10-13 Thread Achim Zeileis
On Fri, 13 Oct 2006 18:17:10 -0400 Weiwei Shi wrote:

 I see.
 
 i think the question is, I did not have a clear idea of the
 correlation between them (if I insist no transformation). Otherwise,
 for a binary variable case, maybe a simple one-way t-test serves the
 purpose if I defined such correlation or dependency as the group mean
 difference.

...another special case of the general framework I outlined below. But
the man page and package vignette I already pointed you to, give you a
much better explanation of this.
Z


 thanks.
 
 On 10/13/06, Achim Zeileis [EMAIL PROTECTED] wrote:
  On Fri, 13 Oct 2006 17:15:45 -0400 Weiwei Shi wrote:
 
   Dear Listers:
  
   I happen to have this question in mind, is there a way to
   evaluate the correlation between
   a continuous variable and a categorical variable (without
   discretizing the former)? My intuitive is using lda by
   considering the latter as response variable but not sure.
 
  It depends what exactly you mean by evaluate correlation. If you
  want to test independence of two variables X and Y against some
  form of association, you can generally use statistics based on
sum h(Y) * g(X)
  where h() and g() are suitable transformations of X and Y. Special
  cases of this framework are tests for correlation of continuous
  variables and Chi-squared type statistics for categorical variables.
  This approach is implemented in the package coin, see
  independence_test() and the package vignette.
 
  hth,
  Z
 
   thanks,
  
   weiwei
  
   --
   Weiwei Shi, Ph.D
   Research Scientist
   GeneGO, Inc.
  
   Did you always know?
   No, I did not. But I believed...
   ---Matrix III
  
  
 
 
 
 -- 
 Weiwei Shi, Ph.D
 Research Scientist
 GeneGO, Inc.
 
 Did you always know?
 No, I did not. But I believed...
 ---Matrix III
 




Re: [R] ts vs zoo

2006-10-12 Thread Achim Zeileis
On Thu, 12 Oct 2006 12:26:42 +0200 Schweitzer, Markus wrote:

 thank you very much for the information.
 
 I guess I should have been more clear here.
 I was looking for the monthly or weekly trends within this one year
 period.
 
 to get there I now only took the zoo object x and made
 
 
 x <- as.ts(x)
 x <- ts(x, frequency = 7)  # to get 52 weeks (periods) with 7 days each
 
 - to get 12 periods e.g. months with 29,30 or 31 days, I guess I can
 only choose frequency=30
 
 I then can run stl
 It is just a pity that the labeling (jan 2005, feb 2005, ...) is
 gone.

But you can do the following
  x <- zoo(rnorm(365), seq(from = as.Date("2005-01-01"),
    to = as.Date("2005-12-31"), by = "1 day"))
  xt <- ts(as.ts(x), frequency = 7)
  xt_stl <- stl(xt, s.window = 28, t.window = 28)
  xz_stl <- as.zoo(xt_stl$time.series)
  time(xz_stl) <- time(x)
  plot(xz_stl)
i.e., convert the extracted series back to zoo and add the original
index for plotting.
Z
 

 So thank you for your hint with barplot and rollmean
 
 best regards, markus
  
 
 -Original Message-
 From: Achim Zeileis [mailto:[EMAIL PROTECTED] 
 Sent: Donnerstag, 12. Oktober 2006 12:15
 To: Schweitzer, Markus
 Cc: R-help@stat.math.ethz.ch
 Subject: Re: [R] ts vs zoo
 
 Markus,
 
 several comments:
 
   I have lots of data in zoo format and would like to do some time 
   series analysis. (using library(zoo), library(ts) )
 
 The ts package has been integrated into the stats package for a
 long time now...
 
   My data is usually from one year, and I try for example  stl() to 
   find some seasonalities or trends.
 
 As pointed out by Philippe, this is not what STL is made for. In STL
 you try to find seasonality patterns by loess smoothing the
 seasonality of subsequent years. If you have observations from just
 one year, there is just one seasonality pattern (at least if you look
for monthly or quarterly patterns).
 
   I have now accepted, that I might have to convert my series into
   ts () but still I am not able to execute the comand since stl()
   is not satisfied
 
 And there are reasons for this: you need to have a regular time series
 with a certain frequency so that STL is applicable. (One could argue
 that ts is not the only format for regular time series but typically
you can easily coerce back and forth between ts and zoo/zooreg.)
 
   x <- zoo(rnorm(365), as.Date("2005-01-01"):as.Date("2005-12-31"))
 
 I don't think that this is what you want. Look at time(x). I guess you
 mean
   x <- zoo(rnorm(365), seq(from = as.Date("2005-01-01"),
     to = as.Date("2005-12-31"), by = "1 day"))
 
   x <- as.ts(x)
   # x <- as.ts(x, frequency = 12)  # this has no effect, frequency is not
 
 Here, it seems to me that you want to aggregate to monthly data, this
 can be done via
   x2 <- aggregate(x, as.yearmon, mean)
 
 This is now (by default) a regular series with frequency 12
   frequency(x2)
 
 and hence it can be easily coereced to ts and back (with almost no
 loss of information):
   as.zoo(as.ts(x2))
 
 However, calling stl(as.ts(x2)) still complains that there are not
 enough periods because this is just a single year, i.e., only a single
 seasonality pattern. To look at this, you could do
barplot(x2)
 
 For looking at the trend you could use a simple running mean
   plot(x)
   lines(rollmean(x, 14), col = 2)
 or you could also use loess() or some other smoother...
 
 For more details on the zoo package, see
   vignette("zoo", package = "zoo")
 
 Best,
 Z




Re: [R] Heteroskedasticity test

2006-09-30 Thread Achim Zeileis
On Fri, 29 Sep 2006, Alberto Monteiro wrote:

 Is there any heteroskedasticity test in the package? Something
 that would flag a sample like

  x <- c(rnorm(1000), rnorm(1000, 0, 1.2))

The package lmtest contains several tests for heteroskedasticity, in 
particular the Breusch-Pagan test (and also the Goldfeld-Quandt test for 
known change point). Furthermore, some of the structural change tests in 
strucchange can be used to test for non-constant variances, e.g., the
Nyblom-Hansen test.
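As a hedged sketch for Alberto's example (assuming the lmtest package is
installed): the variance changes half-way through the sample, which the
Breusch-Pagan test can pick up if an (artificial) trend is supplied in the
variance formula.

```r
## Breusch-Pagan test against a trend in the error variance
library(lmtest)

set.seed(123)
x <- c(rnorm(1000), rnorm(1000, 0, 1.2))  ## variance jumps after observation 1000
fm <- lm(x ~ 1)                           ## constant-mean regression
i  <- seq_along(x)
bptest(fm, varformula = ~ i)              ## auxiliary regression of squared residuals on i
```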
Z

 Alberto Monteiro






Re: [R] Histogram of data with categorical varialbe

2006-09-15 Thread Achim Zeileis
On Fri, 15 Sep 2006 16:45:31 +0200 Alexandre Depire wrote:

 Hello,
 I have the following data:
 Km Sex
 250 1
 300 2
 290 2
 600 1
 450 2
 650 1
 .
 
 I would like to obtain one histogram where the data (or the part) of
 each sex is visible; it is like a cumulative histogram or spinogram.
 To be more comprehensible, I would like to know if the following
 graph is obtainable easily in R. It is the first graph on page 5 in
 the following document
 http://jasp.ism.ac.jp/~nakanoj/workshop04/TalkII.pdf

I think it's not available out of the box, but you can easily generate
this yourself. Using the space shuttle o-ring data as an example (see
also ?spineplot):

  fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1,
    1, 2, 1, 1, 1, 1, 1), levels = c(1, 2), labels = c("no", "yes"))
  temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
    70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

Then you can do the following:

  ## generate unconditional histogram of numeric variable P(x)
  col <- gray.colors(2)
  ht <- hist(temperature, freq = FALSE, col = col[2])
  ## using the same breaks, compute the conditional histogram P(x|group)
  br <- ht$breaks
  ht2 <- hist(temperature[fail == "yes"], plot = FALSE, freq = FALSE,
    breaks = br)
  ## and then add the rectangles using the joint densities
  ## P(x & group) = P(x|group) * P(group)
  rect(br[-length(br)], 0, br[-1], ht2$density * mean(fail == "yes"),
    col = col[1])

Furthermore, you can take a look at the iplots package which should
provide an interactive approach to this.
Z
 
 




[R] [R-pkgs] zoo: new version 1.2-0

2006-08-24 Thread Achim Zeileis
Dear useRs,

the new version 1.2-0 of the zoo package for dealing with regular and
irregular time series data is available from the CRAN mirrors.

This version includes two important changes/enhancements:

  - rapply() was re-named to rollapply() because from R 2.4.0 on,
base R provides a function rapply() for recursive (not rolling)
application of functions, which was already described in the Green
Book. zoo::rapply() currently still exists for backward
compatibility, however, it is flagged as deprecated and now
dispatches to rollapply() methods. We recommend changing existing
scripts from rapply() to rollapply().

  - xyplot() methods for zoo, ts, and its objects have been
added for creating trellis time series graphs. The functions
are still under development and suggestions for improvement are
welcome.

For general introductions to the package see:
  vignette("zoo", package = "zoo")
  vignette("zoo-quickref", package = "zoo")

Best wishes,
Z

___
R-packages mailing list
R-packages@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] [R-pkgs] sandwich: new version 2.0-0

2006-08-24 Thread Achim Zeileis
Dear useRs,

a new version 2.0-0 of the sandwich package for estimating sandwich
covariance matrices is available from the CRAN mirrors.

The tools for computing heteroskedasticity (and autocorrelation)
consistent covariance matrix estimators (also called HC
and HAC estimators, including the Eicker-Huber-White estimator)
have been generalized over the last releases from linear regression to
general parametric models. These new object-oriented features of the
sandwich package are also described in a paper published in the Journal
of Statistical Software (JSS) that accompanies this release. See
  http://www.jstatsoft.org/

The new JSS paper and the previous one (accompanying version 1.0-0) are
also available as package vignettes:
  vignette("sandwich", package = "sandwich")
  vignette("sandwich-OOP", package = "sandwich")

Best wishes,
Z



Re: [R] problem with tseries (was unable to restore saved data)

2006-08-11 Thread Achim Zeileis
On Wed, 9 Aug 2006 07:58:45 +0100 (BST) Prof Brian Ripley wrote:

 Loading the workspace is done before loading the standard set of
 packages. This appears to be a bug in tseries, for if its namespace
 needs 'time', it should import it from 'stats', and it is not even
 depending on 'stats'.

[...]
 
  loadNamespace("tseries")
 Error: object 'time' not found whilst loading namespace 'tseries'
 
 That is a matter for the 'tseries' maintainer (Cc:ed here).  If you
 try library(tseries) at that point you find
 
  library(tseries)
 Loading required package: zoo
 Error: object 'aggregate' not found whilst loading namespace 'zoo'
 Error: package 'zoo' could not be loaded
 
 so package zoo has a similar problem (maintainer Cc:ed).

Thanks for the pointer: I started revising my packages as to properly
declare dependencies, and also did the same for tseries. I'll try to
get them out on CRAN asap.

thx
Z



[R] sort() for time/date objects (was: more on date conversion differences in 2.2.1 vs 2.3.1)

2006-08-11 Thread Achim Zeileis
On Sat, 12 Aug 2006 09:25:10 +1000 paul sorenson wrote:

 With dates I get different results with 2.2.1 and 2.3.1.  From my 
somewhat naive point of view, the 2.2.1 behaviour seems more
 sensible.

Your example below does not really make the source of the problem
obvious. I tried to see where the problem comes from and the following
seems to happen:
  xtabs() calls factor() calls sort(unique()) for computing the levels
And in the last step sort() seems to strip off the class attribute,
leaving only the underlying numeric vector.

Of course, the functions never claimed to work for POSIXct objects,
hence you should call xtabs() appropriately, e.g. via
  xtabs(V2 ~ as.character(V1), data = t)

However, it might be desirable to have sort() working for time/date
objects (such as POSIXct, Date and date). sort() would just have to
preserve the class as it did in R 2.2.1.
Z

 Running the code below in 2.2.1:
 V1
 2006-08-01 2006-08-01
1 1
 
 With 2.3.1 I get:
 V1
 1154354400 1154440800
  1  1
 
 # testdate.R
 t <- read.csv2('testdate.csv', header = FALSE)
 t$V1 <- as.POSIXct(t$V1)
 print(t)
 x <- xtabs(V2 ~ V1, data = t)
 print(x)
 
 # testdate.csv
 2006-8-1;0;1
 2006-8-1;1;1
 2006-8-2;0;1
 2006-8-2;0;0
 2006-8-2;1;1
 




Re: [R] Variance-Covariance matrix from glm()

2006-08-04 Thread Achim Zeileis
On Fri, 4 Aug 2006, Daniel Jeske wrote:

 We are trying to find out how to get the variance-covariance matrix of the
 MLEs out of the glm function.  Can anyone help?

It can be extracted with the corresponding vcov() method.
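A short sketch using the standard example from ?glm:

```r
## variance-covariance matrix of the coefficient estimates of a glm fit
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

V <- vcov(fit)
sqrt(diag(V))  ## standard errors, as reported by summary(fit)
```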

Best,
Z



Re: [R] arules package: using image() deliveres unexpected results

2006-07-28 Thread Achim Zeileis
Markus:

 it seems like there is a problem with the image()-method in the package
 arules.

For problems with packages, please contact the maintainer or at least have
him in Cc as the posting guide tells you.

 Using an ordinary matrix works fine:
 image(matrix(rnorm(200), 10, 20), axes = FALSE, col = brewer.pal(9, "Blues"))

Note that image() is a generic function; the method called in your
example above is rather different from the methods below.

 delivers an image with blue colors and no axes.

 Using an object of the class associations (arules package) does not work:

 image(items(ta.eclat), axes = FALSE, col = brewer.pal(9, "Blues"))
 delivers an error message telling that the argument col matches to
 several formal arguments.

So maybe you should check what arguments are allowed for the method in
arules:
  help("image", package = "arules")
tells you to look that essentially levelplot() is called and
  help("levelplot", package = "lattice")
shows that there are two arguments starting with `col' (col.regions and
colorkey) and that there is no argument `axes'.

 In the source code of arules I couldn't find any definition of the
 image-function (especially no image.R-file.)

It's in itemMatrix.R, because it is the image() method for itemMatrix
objects. You can also see in
  getMethod("image", "itemMatrix")
that it calls the dgTMatrix method:
  getMethod("image", "dgTMatrix")
which calls levelplot().

A note to Michael (arules maintainer): the link from the image()
documentation in arules to Matrix is broken (it should point to
dgTMatrix-class) and maybe you should also link levelplot() in lattice
explicitly?

hth,
Z



Re: [R] Codes; White's heteroscedasticity test and GARCH models

2006-07-26 Thread Achim Zeileis
Spyros:

   I have just recently started using R and was wondering whether anybody
   had a code written for White's heteroscedasticity correction for
   standard errors.

See package sandwich, particularly functions vcovHC() and sandwich().

   Also, can anybody share a code for the GARCH(1,1) and GARCH-in-mean
   models for modelling regression residuals?

See function garch() in package tseries.
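A hedged sketch of the GARCH(1,1) part (the series below is simulated for
illustration; in practice you would pass the residuals of your fitted
regression; assumes the tseries package is installed):

```r
## fit a GARCH(1,1) model to (stand-in) regression residuals
library(tseries)

set.seed(1)
e <- rnorm(500)  ## in practice: residuals(your_regression_model)
fit <- garch(e, order = c(1, 1), trace = FALSE)
summary(fit)     ## coefficient estimates a0, a1, b1 and diagnostics
```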

Furthermore, the econometrics and finance task views might be helpful for
you:
  http://CRAN.R-project.org/src/contrib/Views/Econometrics.html
  http://CRAN.R-project.org/src/contrib/Views/Finance.html

hth,
Z



Re: [R] plain shading (not residuals) in mosaic plot

2006-07-20 Thread Achim Zeileis
On Thu, 20 Jul 2006, Kie Zuraw wrote:

 Thank you very much, Gabor Grothendieck and Muhammad Subianto. Both of
 these work perfectly. I think I was misunderstanding gp and gpar()
 before. Again, thank you both.

Two further additions to this:

  Maybe like this:
  mosaic(allmorph, direction = "v", pop = FALSE,
    gp = gpar(fill = c(grey(0.8), grey(0.4))))

This works because the fill pattern is expanded along the dependent
variable (last splitting variable) in the plot. In general you can supply
arbitrary patterns by supplying a fill pattern of the same dimension as
the original table, e.g.,

  mycol <- allmorph
  mycol[, 1] <- grey.colors(2)[2]
  mycol[, 2] <- grey.colors(2)[1]
  mycol[1, 2] <- grey(0)
  mosaic(allmorph, split_vertical = TRUE, gp = gpar(fill = mycol))

Furthermore, several high-level functions producing similar pictures are
available, in particular

  doubledecker(allmorph)
  doubledecker(allmorph, col = grey.colors(2)[2:1])

or

  spine(allmorph)

Note that the latter is not based on strucplot() and hence can not be
labeled with the labeling functions for mosaic().

Best,
Z

and provide commented, minimal, self-contained, reproducible code.


Re: [R] Robust standard errors in logistic regression

2006-07-04 Thread Achim Zeileis
On Tue, 4 Jul 2006 13:14:24 -0300 Celso Barros wrote:

 I am trying to get robust standard errors in a logistic regression.
 Is there any way to do it, either in car or in MASS?

Package sandwich offers various types of sandwich estimators that can
also be applied to objects of class glm, in particular sandwich()
which computes the standard Eicker-Huber-White estimate.

These robust covariance matrices can be plugged into various inference
functions such as linear.hypothesis() in car, or coeftest() and
waldtest() in lmtest.

See the man pages and package vignettes for examples.
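For instance (a sketch using a built-in data set for illustration; your model and data will differ):

```r
library(sandwich)
library(lmtest)

## logistic regression on a built-in data set (illustration only)
fm <- glm(am ~ hp + wt, data = mtcars, family = binomial)

## coefficient z tests with Eicker-Huber-White robust standard errors
coeftest(fm, vcov = sandwich(fm))
```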
Z

 Thanks for the help,
 
   Celso
 
 




[R] useR! 2006: presentation slides

2006-06-30 Thread Achim Zeileis
Dear useRs,

the useR! 2006 conference took place in Vienna two weeks ago: it was an
exciting and interesting meeting with about 400 useRs from all over the
world and more than 150 presentations.

Especially for those of you who could not make it to the conference, we
have made 4up PDF versions of the presentations slides available.

The slides for the keynote lectures are available at
  http://www.R-project.org/useR-2006/Keynotes/
and the user-contributed presentations at
  http://www.R-project.org/useR-2006/Presentations/

Finally, some materials for the panel discussion on Getting recognition
for excellence in computational statistics are provided at
  http://www.R-project.org/useR-2006/PanelDisc/
including a summary, the short presentation slides of the panelists and
the full discussion as a video.

A big thank you to everyone who contributed to the conference and...
best wishes from Vienna!

The organizing team
Achim, Torsten, David, Bettina, Fritz and Kurt.

___
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce



Re: [R] converting to time series object : ts - package:stats

2006-06-26 Thread Achim Zeileis
On Mon, 26 Jun 2006, Sachin J wrote:

 Hi,

   I am trying to convert a dataset (dataframe) into time series object
 using ts function in stats package. My dataset is as follows:

   df
   [1] 11.08 7.08  7.08  6.08  6.08  6.08  23.08 32.08 8.08  11.08 6.08  13.08 
 13.83 16.83 19.83 8.83  20.83 17.83
   [19] 9.83  20.83 10.83 12.83 15.83 11.83

Please provide a reproducible example. You just showed us the print output
for an object, claiming that it is an object of class data.frame which
is rather unlikely given the print output.

   I converted this into time series object as follows

   tsdata <- ts((df), frequency = 12, start = c(1999, 1))

which produces the right result for me if `df' is a vector or a
data.frame:

df <- c(11.08, 7.08, 7.08, 6.08, 6.08, 6.08, 23.08, 32.08, 8.08, 11.08,
6.08, 13.08, 13.83, 16.83, 19.83, 8.83, 20.83, 17.83, 9.83, 20.83,
10.83, 12.83, 15.83, 11.83)
ts(df, frequency = 12, start = c(1999, 1))
ts(as.data.frame(df), frequency = 12, start = c(1999, 1))

   The resulting time series is as follows:

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
 1999   2  15  15  14  14  14  12  13  16   2  14   5
 2000   6   8  10  17  11   9  18  11   1   4   7   3

   I am unable to understand why the values of df and tsdata does not match.

So are we because you didn't really tell us enough about df...

Best,
Z

 I looked at ts function and I couldn't find any data transformation. Am
 I missing something here? Any pointers would be of great help.

   Thanks in advance.

   Sachin


 -






Re: [R] Problems with weekday extraction from zoo objects

2006-06-23 Thread Achim Zeileis
On Fri, 23 Jun 2006 16:12:27 -0500 Kerpel, John wrote:

  SP500 <- read.zoo("SP500.csv", sep = ",")
 Error in read.zoo("SP500.csv", sep = ",") : 
 index contains NAs

Well, there are two problems with this: 1. the CSV is not
comma-separated (despite its siffix) and 2. the date format should be
specified.

 First ten records of SP500.csv:
 
 Date  OpenHighLow Close
 VolumeAdj. Close*
^^^
and I changed this name to Adj.Close which gives a syntactically
valid name in R.

Then I successfully employed:
  SP500 <- read.zoo("SP500.csv", format = "%d-%b-%y", header = TRUE)
  DGS10 <- read.zoo("DGS10.csv", format = "%m/%d/%Y", header = TRUE)
and then
  weekdays(time(SP500))
  weekdays(time(DGS10))

Z



Re: [R] timediff

2006-06-20 Thread Achim Zeileis
On Tue, 20 Jun 2006 12:38:11 +0200 COMTE Guillaume wrote:

  
 
 Hello,
 
  
 
 I've got 2 dates like these :
 
  
 
 2006-02-08 17:12:55
 
 2006-02-08 17:15:26
 
  
 
  
 
 I wish to get the middle of these two date :

There is a mean method for POSIXct objects:

  x <- as.POSIXct(c("2006-02-08 17:12:55", "2006-02-08 17:15:26"))
  mean(x)

Best,
Z

  
 
 2006-02-08 17 :14 :10
 
  
 
 Is there is a function in R to do that ?
 
  
 
 Thks all
 
 guillaume
 
 
 




[R] [R-pkgs] zoo: new version 1.1-0

2006-06-06 Thread Achim Zeileis
Dear useRs,

I've just uploaded the new version 1.1-0 of the zoo package for dealing
with regular and irregular time series data to the main CRAN site.
Source and binary packages should be available from the mirrors in the
next days.

There are two changes that are not fully backwards compatible:
  - The package now has a NAMESPACE and most S3 methods are not exported
explicitly anymore. If some function needs to be exported for
useRs/developeRs of other packages, please let us know.
  - By default read.zoo() does not try to produce regular series
anymore. The previous version could result in misleading
computations for daily financial data, affecting especially also the
zoo-quickref vignette. (Thanks to Ajay Shah for pointing this out.)
The old behaviour can be restored by setting `regular = TRUE'.

For general introductions to the package see:
  vignette("zoo", package = "zoo")
  vignette("zoo-quickref", package = "zoo")

Best wishes,
Z

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] lm() variance covariance matrix of coefficients.

2006-06-02 Thread Achim Zeileis
On Fri, 2 Jun 2006 18:24:57 -0300 (ADT) Rolf Turner wrote:

 summary(object)$cov.unscaled

As the name suggests: this is the unscaled covariance matrix!

Use the vcov() extractor method, i.e.,
  vcov(object)
which has methods not only for lm but also many other fitted model
objects.
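For example, the standard errors reported by summary() can then be recovered as:

```r
fm <- lm(dist ~ speed, data = cars)
vcov(fm)             # scaled variance-covariance matrix of the coefficients
sqrt(diag(vcov(fm))) # coefficient standard errors, matching summary(fm)
```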
Z



[R] useR! 2006: final program

2006-05-29 Thread Achim Zeileis
Dear useRs,

the final scientific program for useR! 2006, the second R user
conference taking place in Vienna June 15-17, is now available online at

http://www.R-project.org/useR-2006/

The registration is open until May 31, so you still have two days to
register for this exciting event where you can meet 400 other useRs,
attend 160 presentations and participate in some of the pre-conference
tutorials given by prominent members of the R community. 

We look forward to meeting you in Vienna!

The organizing team
Torsten, Achim, David, Bettina, Fritz and Kurt.




Re: [R] adding line to spinogram/histogram/etc solved

2006-05-17 Thread Achim Zeileis
Viktor,

what you said below was all correct for the specific situation you
looked at, to get more of a general overview, look at some of the
vignettes in the grid package
  vignette(package = grid)
particularly grid and viewport and then maybe at
  vignette(package = vcd)
which explains some of the ideas implemented in vcd.

Best wishes,
Z

On Wed, 17 May 2006 01:56:02 +0100 Viktor Tron wrote:

 Thanks loads, extremely useful.
 
 So just for the record again.
 You have to get into the viewport that contains the actual graphplot
 with the axes.
 So, how you do it:
 
 1. viewports have names.
 In case of histograms the relevant viewport is something like  
 plotA.panel.X.Y.vp
 where A is the plot index and X, Y are the x, y indexes of the panel
 (as usual, starting from bottom left)
 but you can try out the other names shown by the
 current.vpTree()
 command in case in trouble.
 In case of spinograms you have to use the pop=FALSE option for the
 graph viewport inside the root to be retained.
 Then it is just called spineplot.
 
 2. Once you figured out the name of the viewport you want to add plot
 to, issue
 seekViewport("spineplot")
 to get the focus on the viewport and then you can use the grid
 primitives the same way as you use lines, segments on normal plots.
 
 3. If you want to use the scale of the axes (naturally what you want)
 you have to tell grid to use the NATIVE scale of the viewport  
 (default.units="native").
 
 So something like this will do:
 
 grid.segments(,100,,100,gp=gpar(col="red"),default.units="native")
 
 to add a 100 items cut-off line across a count type histogram.
 (Note the left out parameters which resolve to start/end as x1 and x2)
 
 Brilliant, cheers Achim.
 V
 
 
 
 On Tue, 16 May 2006 19:09:14 +0100, Achim Zeileis  
 [EMAIL PROTECTED] wrote:
 
  On Tue, 16 May 2006 17:42:22 +0100 Viktor Tron wrote:
 
  Hello,
  Thanks for the hint.
  grid.segments seemed the closest I got.
  I did manage to draw (well fake) a line with it. I can only address
  the whole drawing frame, which means I can only adjust the position
  and length of the line
  by trial and error. I see no way to address the y axis scale of my
  spinogram/histogram.
  Is there a way?
 
  Yes, that's the wonderful thing about grid!
 
  Consider this example with data from vcd
spine(Fail ~ Temperature, data = SpaceShuttle)
  Then you can look at the viewport tree in which you can navigate:
current.vpTree()
  which leaves you here only with the ROOT node, hence you had
  troubles adjusting your lines. But looking at ?spine reveals that
spine(Fail ~ Temperature, data = SpaceShuttle, pop = FALSE)
  does *not* pop away the viewport tree which is here relatively
  simple current.vpTree()
   just shows viewport[ROOT]->(viewport[spineplot]).
 
  So you can hop into the main picture
 seekViewport("spineplot")
  (which you can also name differently) and do more or less sensible
  things, e.g.
grid.rect(gp = gpar(col = 2))
  adds a red box around the plot or
grid.lines(c(0, 1), c(0.3, 0.7), gp = gpar(col = 4))
  adds a blue line. Note that both x- and y-axis are on a probability
   scale, i.e., it plots P(Temperature <= x) vs. P(Fail = "no").
 
  To see a more elaborated example how these graphics can be re-used,
  look at example(mob) in library(party).
 
  Best,
  Z
 
  Not a huge problem, but I thought someone must have thought of
  adding lines to their spinograms or histograms before...
  V
 
 
  On Mon, 15 May 2006 14:13:00 +0100, Prof Brian Ripley
  [EMAIL PROTECTED] wrote:
 
   Package vcd is built on grid, not base graphics.
  
   On Mon, 15 May 2006, Viktor Tron wrote:
  
   Dear all,
   I wonder what's special about spinograms {vcd} that prevents me
   from using
   it the way I do with other plots.
  
   I do:
  
   spine(f.speaker.identity ~ x.log.lengthening,
   data=ms,breaks=45,gp=gpar(fill=c("red","green")),xlab="length
   difference
   (log ms)",ylab="speaker")
   curve(0*x,add=T)
   Error in plot.xy(xy.coords(x, y), type = type, col = col, lty =
   lty, ...) :
   plot.new has not been called yet
  
  
   OK, if I do
   curve(0*x,add=)
   spine(f.speaker.identity ~ x.log.lengthening,
   data=ms,breaks=45,gp=gpar(fill=c("red","green")),xlab="length
   difference
   (log ms)",ylab="speaker")
   curve(0*x,add=T)
  
   then the plot is what I want, but note that I had to use y=0 to
   get the line put at 0.5 so it is already suspicious.
   But then:
  
   dev.print(pdf, "mde_speakerration_by_lengthening.pdf")
   Error in dev.copy(device = function (file = ifelse(onefile,
   Rplots.pdf,
   :
   invalid graphics state
  
   Can anyone suggest a remedy?
  
   Use grid primitives to add to the plot.
  
 
 

Re: [R] adding line to spinogram

2006-05-16 Thread Achim Zeileis
On Tue, 16 May 2006 17:42:22 +0100 Viktor Tron wrote:

 Hello,
 Thanks for the hint.
 grid.segments seemed the closest I got.
 I did manage to draw (well fake) a line with it. I can only address
 the whole drawing frame, which means I can only adjust the position
 and length of the line
 by trial and error. I see no way to address the y axis scale of my  
 spinogram/histogram.
 Is there a way?

Yes, that's the wonderful thing about grid!

Consider this example with data from vcd
  spine(Fail ~ Temperature, data = SpaceShuttle)
Then you can look at the viewport tree in which you can navigate:
  current.vpTree()
which leaves you here only with the ROOT node, hence you had troubles
adjusting your lines. But looking at ?spine reveals that
  spine(Fail ~ Temperature, data = SpaceShuttle, pop = FALSE)
does *not* pop away the viewport tree which is here relatively simple
  current.vpTree()
just shows viewport[ROOT]->(viewport[spineplot]).

So you can hop into the main picture
  seekViewport("spineplot")
(which you can also name differently) and do more or less sensible
things, e.g.
  grid.rect(gp = gpar(col = 2))
adds a red box around the plot or
  grid.lines(c(0, 1), c(0.3, 0.7), gp = gpar(col = 4))
adds a blue line. Note that both x- and y-axis are on a probability
scale, i.e., it plots P(Temperature <= x) vs. P(Fail = "no"). 

To see a more elaborated example how these graphics can be re-used,
look at example(mob) in library(party).

Best,
Z

 Not a huge problem, but I thought someone must have thought of
 adding lines to their spinograms or histograms before...
 V
 
 
 On Mon, 15 May 2006 14:13:00 +0100, Prof Brian Ripley  
 [EMAIL PROTECTED] wrote:
 
  Package vcd is built on grid, not base graphics.
 
  On Mon, 15 May 2006, Viktor Tron wrote:
 
  Dear all,
  I wonder what's special about spinograms {vcd} that prevents me
  from using
  it the way I do with other plots.
 
  I do:
 
  spine(f.speaker.identity ~ x.log.lengthening,
  data=ms,breaks=45,gp=gpar(fill=c("red","green")),xlab="length  
  difference
  (log ms)",ylab="speaker")
  curve(0*x,add=T)
  Error in plot.xy(xy.coords(x, y), type = type, col = col, lty =
  lty, ...) :
 plot.new has not been called yet
 
 
  OK, if I do
  curve(0*x,add=)
  spine(f.speaker.identity ~ x.log.lengthening,
  data=ms,breaks=45,gp=gpar(fill=c("red","green")),xlab="length  
  difference
  (log ms)",ylab="speaker")
  curve(0*x,add=T)
 
  then the plot is what I want, but note that I had to use y=0 to
  get the line put at 0.5 so it is already suspicious.
  But then:
 
  dev.print(pdf, "mde_speakerration_by_lengthening.pdf")
  Error in dev.copy(device = function (file = ifelse(onefile,  
  Rplots.pdf,
  :
 invalid graphics state
 
  Can anyone suggest a remedy?
 
  Use grid primitives to add to the plot.
 
 




Re: [R] Can't there be a cd command?

2006-05-16 Thread Achim Zeileis
On Tue, 16 May 2006 16:50:36 -0500 Marc Schwartz (via MN) wrote:

 On Tue, 2006-05-16 at 17:49 -0300, Rolf Turner wrote:
  Duncan Murdoch wrote (amongst other things):
  
   Statistical computing is not easy, so how could R be?  Who has
   ever claimed it is?  Any package that makes statistical computing
   appear to be easy is probably giving you wrong answers half the
   time, or is extremely limited in scope.
  
  Well expressed, Duncan!  (A slam Dunc, one might say. :-) )
  
  cheers,
  
  Rolf
 
 In looking back at the archive, I don't see an indication of someone
 suggesting that Duncan's prose might be a candidate for the fortunes()
 package, unless Achim has already done so...  :-)

...I was waiting for someone to suggest it ;-) Done now. We're also
beginning to have enough new fortunes for a new release...will try to
put it out soon.

Best,
Z

 Regards,
 
 Marc
 




Re: [R] command completion?

2006-05-10 Thread Achim Zeileis
On Wed, 10 May 2006 13:26:32 -0700 Berton Gunter wrote:

 Is the following a fortunes package candidate?

...promoted to fortunes package member in the devel-version.

thx,
Z

 
  Others need to run under ESS.
  
  While this is a good things for Emacs lovers, the requirement 
  is rather 
  unwelcome for pagans!  :-)
  
  -- 
  François Pinard   http://pinard.progiciels-bpi.ca
  
  
 
 
 :-)
 
 -- Bert
 




Re: [R] Can't there be a cd command?

2006-05-10 Thread Achim Zeileis
On Wed, 10 May 2006 13:51:06 -0700 Berton Gunter wrote:

 ...another fortunes package candidate?

Hehe, yes, this time I was quicker ;-)

 I especially liked the sections
 beginning R is a 4 wheel drive SUV..., but a lot of it is great
 IMHO. Well said!

Indeed.

 Bestimmt!

Ja, das auch...
Z



Re: [R] Help on zoo and datetime series

2006-05-08 Thread Achim Zeileis
On Mon, 8 May 2006 22:09:45 +0200 [EMAIL PROTECTED] wrote:

 
 Hello,
 i would like to import this txt file:
 
 Giorno;PM10
 2006-01-01 10:10;10.3
 2006-02-02 20:22;50.3
 2006-03-03 23:33;20.1
 .
 
 
 As it's an irregular time series i use zoo as follow:
 
 require(zoo)
  z <- read.table("c:\\1.csv", sep=";", na.strings="-999", header=TRUE)
  q <- zoo(z$PM10, strptime(as.character(z$Giorno), "%Y-%m-%d %H:%M"))
 
 At this point this error message appears:
 
 Error in order(x, ..., na.last = na.last, decreasing = decreasing) : 
 type 'list' not implemented in 'orderVector1'

Your datetime vector is of class POSIXlt which is essentially a list.
Better use POSIXct:

q <- zoo(z$PM10,
  as.POSIXct(strptime(as.character(z$Giorno), "%Y-%m-%d %H:%M")))

However, it becomes even simpler via read.zoo():

z <- read.zoo("c:\\1.csv", sep = ";", header = TRUE,
  na.strings = "-999", tz = "CET")

hth,
Z

 
 I can't understand how can I import such a file. Is there anybody who
 can help me?
 
 Thanks
 
 Domenico
 
 
 




Re: [R] function for linear regression with White std. errors

2006-04-28 Thread Achim Zeileis
Brian:

 Thanks for the suggestion, but tomorrow I am teaching a little seminar
 for my department trying to convince people about how wonderful R is.
 These people are all Stata users, and they really like the idea that
 they only have to type , robust to get het. consistent std. errors.

With the package lmtest, you can do
  fm <- lm(...)
  coeftest(fm, vcov = hccm)
which does not seem to be too hard. Also look at the sandwich package
which provides a flexible framework and large collection of HC and HAC
estimators (more so than Stata afaik).

Also see
  Achim Zeileis (2004). Econometric Computing with HC and HAC
  Covariance Matrix Estimators. Journal of Statistical Software 11(10),
  1-17. URL http://www.jstatsoft.org/v11/i10/
for more examples.

Best wishes,
Z

 My pitch to a lot of them has been that R is a great and flexible tool
 which one doesn't have to be a programming whiz to use.  If they have
 to start writing functions to do something as basic as this, then my
 colleagues will probably stay in Stata-land.  I'm trying to prevent
 that!
 
 Regards,
 
 Brian
 
 2006/4/27, John Fox [EMAIL PROTECTED]:
  Dear Brian,
 
  How about sqrt(diag(hccm(mod)))? If that's too onerous, then you
  could define a function to do it, e.g.,
 
    wse <- function(mod) sqrt(diag(hccm(mod)))
 
  and then enter wse(mod). If you want a more complete, table-like
  summary of the model using White standard errors, then you could
  easily write a function to provide that.
 
  Regards,
   John
 
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
 
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED] On Behalf Of Brian
   Quinif Sent: Thursday, April 27, 2006 7:34 PM
   To: r-help@stat.math.ethz.ch
   Subject: [R] function for linear regression with White std. errors
  
   I would like to know if there is a function that will run a
   linear regression and report the White (heteroscedasticity
   consistent) std.
   errors.
  
   I've found the hccm() function in the car library, but that
   just gives me the White covariance matrix.  I'd like to be
   able to see the White std. errors without having to do much
   more work, if possible.
  
   Thanks,
  
   Brian
  
 
 
 




Re: [R] using betareg: problems with anova and predict

2006-04-18 Thread Achim Zeileis
On Mon, 17 Apr 2006, Paul Johnson wrote:

 Dear R-helpers:

 We have had fun using betareg to fit models with proportions as
 dependent variables.

 However, in the analysis of these models we found some wrinkles and
 don't know where is the best place to start looking for a fix.

 The problems we see (so far) are that

 1. predict ignores newdata

The reason for this is that predict.betareg() does not take a newdata
argument; you need to use the terms argument instead.

 2. anova does not work

...because there is no anova() method for betareg objects.

I think it is worth addressing both issues in betareg; I'll write to you
and Alexandre (the package maintainer) off-list.
Z

 Here is the small working example:

 
  x <- c(1, 3, 1, 4, 5, 3, 5, 2, 5, 2)
  y <- c(.3, .4, .4, .5, .7, .4, .3, .4, .3, .5)
  x2 <- c(4, 2, 1, 5, 1, 2, 4, 2, 1, 3)
  library(betareg)
  mybreg <- betareg(y ~ x)
  summary(mybreg)

  predict(mybreg, newdata=data.frame(x=c(2,2)))

  mybreg2 <- betareg(y ~ x + x2)
  summary(mybreg)
  anova(mybreg, mybreg2)

 ---
 Example output:

  predict(mybreg, newdata=data.frame(x=c(2,2)))
  [1] 0.3903155 0.4207632 0.3903155 0.4362319 0.4518258 0.4207632 0.4518258
  [8] 0.4054484 0.4518258 0.4054484
 ...
  anova(mybreg, mybreg2)
 Error in as.double.default(lapply(objects, df.residual)) :
   (list) object cannot be coerced to 'double'

 I have been digging in this a little bit and notice betareg does not
 return the log likelihood at the maximum likelihood estimates, but I
 am able to hack the file /usr/lib/R/library/betareg/R/betareg and save
 the value of the optim function and print that out, so at least I
 can do a likelihood-ratio test by hand in place of the anova.  But I'm
 stumped on why the predict ignores newdata.

 I'm using R-2.1.1 on Fedora Core Linux 5 with the betareg package from CRAN.

 In case you are interested in betareg, we have found this article to
 be extremely readable and helpful: FERRARI, S.L.P., CRIBARI-NETO, F.
 (2004). Beta regression for  modeling rates and proportions. Journal
 of Applied Statistics, v. 31, n. 7, p. 799-815.


 --
 Paul E. Johnson
 Professor, Political Science
 1541 Lilac Lane, Room 504
 University of Kansas





[R] useR! 2006 program online

2006-03-27 Thread Achim Zeileis
Dear useRs,

we are happy to inform you that the program for the 2nd R user conference
useR! 2006 is now available online from the conference Web page at
  http://www.R-project.org/useR-2006/program.html

We would like to thank the useR community for submitting so many
interesting abstracts about an astonishing variety of R applications. We
are looking forward to an exciting and diversified conference!

The useR! organization team and program committee




Re: [R] comparative density estimates

2006-03-24 Thread Achim Zeileis
On Fri, 24 Mar 2006 09:04:50 -0500 Michael Friendly wrote:

 Thanks, Achim
 
 The cdplot is quite interesting too, though it answers a slightly
 different question

Yes, certainly true.

 Here's a similar plot, fleshed out as my others:
[...]
 This also solves the little problem I had with offsetting
 the two rug plots (so as not to rely on color).

Yes, very informative addition.
 
 But I wonder why my main= title does not appear.

That's a bug. The `main' argument is processed almost everywhere, but
not passed on to the actual plot call, argh. Will provide a fix.
As a workaround you can use title().

Best,
Z

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] comparative density estimates

2006-03-23 Thread Achim Zeileis
Michael,

very nice and interesting plots!

One alternative idea to compare the proportion of milestone items
(that does not really answer the bandwith question) in Europe and North
America might be a conditional density plot. After running your R
source code, you could do:
  where <- factor(c(rep("North America", length(sub1)),
rep("Europe", length(sub2))))
  year <- c(sub1, sub2)
  cdplot(where ~ year, bw = "sj")
showing the decrease in the European proportion.

Internally, this first computes the unconditional density as in
  plot(density(year, bw = "sj"))
and then the density for Europe with the same bandwidth.
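A self-contained illustration of the same kind of plot with a built-in data set (not the milestones data):

```r
## conditional density of transmission type given car weight,
## using the Sheather-Jones bandwidth selector
cdplot(factor(am) ~ wt, data = mtcars, bw = "sj")
```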

Best wishes,
Z


On Thu, 23 Mar 2006 14:25:53 -0500 Michael Friendly wrote:

 I have two series of events over time and I want to construct a graph
 of the relative frequency/density of these events that allows their 
 distributions to
 be sensibly compared.  The events are the milestones items in my
 project on milestones in the history of data visualization [1], and I
 want to compare trends
 in Europe vs. North America.
 
 I decided to use a graph of two overlaid density estimates with rug 
 plots, but then
 the question arises of how to choose the bandwidth (BW) for the two 
 series to allow them
 to be sensibly compared, because the range of time and total
 frequency differ
 for the two series.  To avoid clutter on this list, I've placed the
 data and R code
 at
 http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears4.R
 
 I have two versions of this graph, one selecting an optimal BW for
 each separately
 and the other using the adjust= argument of density() to
 approximately equate
 the BW to the value determined for the whole series combined.  The
 two versions
 (done with SAS) are shown at
 
 http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears32.gif
  
 
 http://euclid.psych.yorku.ca/SCS/Gallery/milestone/Test/kde-bug/mileyears33.gif
  
 
 
 The densities in the first are roughly equivalent to the R code
  d1 <- density(sub1, from=1500, to=1990, bw="sj", adjust=1)
  d2 <- density(sub2, from=1500, to=1990, bw="sj", adjust=1)
 
 the second to
  d1 <- density(sub1, from=1500, to=1990, bw="sj", adjust=2.5)
  d2 <- density(sub2, from=1500, to=1990, bw="sj", adjust=0.75)
 
 The second graph seems to me to undersmooth the more extensive data
 from Europe and undersmooth the data from North America.
 
 - any comments or suggestions?
 - are there other methods I should consider?
 
 I did find overlap.Density() in the DAAG package, but perversely, it 
 uses a bw=
 argument to select a BW/grayscale plot.
 
 thanks,
 -Michael
 
 
 [1] http://www.math.yorku.ca/SCS/Gallery/milestone/
 
 -- 
 Michael Friendly Email: [EMAIL PROTECTED] 
 Professor, Psychology Dept.
 York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
 4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html
 Toronto, ONT  M3J 1P3 CANADA
 




Re: [R] Problems to get a ctree plot (library party) in a file via jpeg/png

2006-03-02 Thread Achim Zeileis
Carlos:

 I am using library party and I have found a curious/strange behaviour when
 trying to save the output of a ctree in a file via jpeg/png command.

thanks for reporting this. We've seen this problem before with the vcd
package. The reason is that the default background is usually
"transparent" whereas for jpeg() and png() devices it is "white". That has
the effect that the
  grid.rect()
command at the end of node_boxplot() not only draws the bounding rectangle
but also fills it with white, thus hiding the actual boxplots. The fix
is to use
  grid.rect(gp = gpar(fill = "transparent"))
instead. We'll commit a fixed party to CRAN asap.
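
Until the fixed package is on CRAN, a workaround on the user's side is
to give the jpeg()/png() device a transparent background so the white
fill never happens. A sketch (the ctree fit on airquality is just a toy
example for illustration):

library(party)   # assumed to be installed

# Toy fit, only so there is something to plot
airq <- subset(airquality, !is.na(Ozone))
ct <- ctree(Ozone ~ ., data = airq)

# bg = "transparent" keeps the final grid.rect() from hiding the
# node plots with a white fill
png("ctree.png", width = 800, height = 600, bg = "transparent")
plot(ct)
dev.off()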

Best wishes,
Z



Re: [R] Cross-validation in SVM

2006-02-23 Thread Achim Zeileis
On Thu, 23 Feb 2006, Amir Safari wrote:

 Calculation of cross-validation for SVM with those time series which
 include negative and positive values (for example, returns of a stock
 exchange index) must be different from a calculation of cross-validation
 with time series which include just absolute values (for example, a
 stock exchange index).

Not necessarily, depends on the type of data.

 How is it calculated for a return time series?

From the man page of svm():

   cross: if an integer value k > 0 is specified, a k-fold cross
          validation on the training data is performed to assess the
          quality of the model: the accuracy rate for classification
          and the Mean Squared Error for regression

i.e., MSE will be used.
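
A minimal sketch with e1071 (the data are simulated here purely for
illustration; note that for a regression fit with cross = k, the
returned object carries the fold-wise MSEs):

library(e1071)   # assumed to be installed

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
# A "return-like" response with both negative and positive values
y <- x[, 1] - 0.5 * x[, 2] + rnorm(100, sd = 0.1)

fit <- svm(x, y, cross = 10)   # 10-fold cross-validation
fit$MSE       # vector of the 10 fold-wise mean squared errors
fit$tot.MSE   # overall cross-validated MSE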
Z



Re: [R] Assumption in Bassic Tobit regression

2006-02-14 Thread Achim Zeileis
On Wed, 15 Feb 2006 05:15:28 +0700 bambang pramono wrote:

 Are the assumptions of Tobit regression the same as for OLS?:
 1. Normality
 2. No autocorrelation
 3. Homogeneity of variance
 4. No multicollinearity
 
 Please confirm!
 I need an answer because I have my yudisium/comprehensive/thesis examination.

This is the third time you have asked the same question and it did not
get more meaningful! Please consult an econometrics textbook (e.g. Greene
or Cameron & Trivedi) for the theoretical background of tobit models.

Concerning available R implementations of various tobit flavours, you
have already been pointed to a useful thread in the mailing list
archives.
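
For the archives: one standard way to fit a basic tobit model
(left-censored at zero) in R is survreg() from the survival package,
following the idiom from its own examples. A sketch with simulated data:

library(survival)

set.seed(123)
x <- rnorm(200)
ystar <- 1 + 2 * x + rnorm(200)   # latent variable
y <- pmax(ystar, 0)               # observed response, censored at zero

# Surv(y, y > 0, type = "left") marks the zeros as left-censored;
# dist = "gaussian" gives the classical tobit likelihood
fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")
summary(fit)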
Z




