[R] SOLVED: Lattice: How to do error bars

2009-08-12 Thread amvds
Thank you Gerrit,

Your suggestion proved right after all:

This does put error bars (dy) on the points. You only need to figure out
the proper y axis scaling.

 xyplot(y~x|f,data=xy,groups=dy,
        panel=function(x,y,groups,subscripts,...) {
          panel.xyplot(x,y,...)
          panel.segments(x,y+groups[subscripts],x,y-groups[subscripts],...)
        })

Regards,
Alex van der Spek
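
A self-contained version of the call above may help with the y axis scaling. This is only a sketch: the data frame is made up for illustration (the names xy, x, y, dy and f follow the call above), and deriving ylim from the bar extents is one possible way to pick the scaling, not necessarily what was used in this thread.

```r
library(lattice)

set.seed(42)
# Hypothetical data; only the column names come from the call above
xy <- data.frame(
  x  = rep(1:5, times = 2),
  y  = rnorm(10, mean = 10),
  dy = runif(10, 0.2, 1),
  f  = rep(c("a", "b"), each = 5)
)

# y limits wide enough to contain every error bar
yl <- range(xy$y - xy$dy, xy$y + xy$dy)

p <- xyplot(y ~ x | f, data = xy, groups = dy, ylim = yl,
            panel = function(x, y, groups, subscripts, ...) {
              panel.xyplot(x, y, ...)
              # groups holds dy for the whole data set; subscripts picks
              # out the rows that belong to the current panel
              panel.segments(x, y - groups[subscripts],
                             x, y + groups[subscripts])
            })
```

Calling print(p) draws the plot. The limits here are exactly the bar extents, so a small margin could be added if the bar ends should not touch the panel border.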

 Hello, Alex,

 not sure in this case, but I think you have to provide and use an argument
 named subscripts for the panel function. See ?xyplot.


 On Tue, 11 Aug 2009, Alex van der Spek wrote:

 I am trying to add 2 stdev error bars to lattice type plots:

 panel.ebar <- function(x,y,dy=NULL,...) {
  panel.xyplot(x,y,...)
  panel.segments(x,y-dy,x,y+dy,...)
 }

 Then:

 xyplot(y~x|fc,data=dat,dy=dat$dy,panel=panel.ebar)

 This adds error bars but they are not conditioned on the factor fc.

 xyplot(y+I(y-dy)+I(y+dy)~x|fc,data=dat)

 This produces 3 series of points in different colors, conditioned on fc
 as the data is now grouped per panel.

 Obviously I need a combination of the above two. I can't figure it out.

 Any help? Just point to docs if available. The R doc is excellent. Most
 of the time I can save myself.

 Thanks in advance!
 Alex van der Spek

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


   Regards  --  Gerrit

 -
 AOR Dr. Gerrit Eichner Mathematical Institute, Room 305 E
 gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
 Tel: +49-(0)641-99-32104  Arndtstr. 2, 35392 Giessen, Germany
 Fax: +49-(0)641-99-32109  http://www.uni-giessen.de/~gcb7
 -




[R] transform(_data,...) using strptime gives an error

2009-07-19 Thread amvds
I have timestamped data like this:

> sd[1:10,]
             Tstamp Density Mesh50 Mesh70 Mesh100 Mesh150 Mesh200
2  2009/02/27 07:00    30.5    0.7   10.7    21.4    32.8    41.6
3  2009/02/27 08:00    32.2    1.6   12.4    23.3    34.5    43.0
4  2009/02/27 09:00    32.7    4.8   13.0    24.0    35.1    43.5
5  2009/02/27 10:00    26.7    0.3    6.5    17.6    28.1    36.9
6  2009/02/27 11:00    26.6    0.9    6.6    17.0    28.6    37.9
7  2009/02/27 12:00    23.3    6.3    3.4    14.0    25.5    34.6
8  2009/02/27 13:00    25.2    1.1    5.1    15.4    27.3    36.8
9  2009/02/27 14:00    28.6    0.2    8.7    19.4    30.9    40.0
10 2009/02/27 15:00    28.0    0.6    8.0    18.6    30.2    39.3
11 2009/02/27 16:00    28.3    0.9    8.3    18.9    30.5    39.5

The timestamps are character vectors:

> str(sd)
'data.frame':   591 obs. of  7 variables:
 $ Tstamp : chr  "2009/02/27 07:00" "2009/02/27 08:00" "2009/02/27 09:00" "2009/02/27 10:00" ...
 $ Density: num  30.5 32.2 32.7 26.7 26.6 23.3 25.2 28.6 28 28.3 ...
 $ Mesh50 : num  0.7 1.6 4.8 0.3 0.9 6.3 1.1 0.2 0.6 0.9 ...
 $ Mesh70 : num  10.7 12.4 13 6.5 6.6 3.4 5.1 8.7 8 8.3 ...
 $ Mesh100: num  21.4 23.3 24 17.6 17 14 15.4 19.4 18.6 18.9 ...
 $ Mesh150: num  32.8 34.5 35.1 28.1 28.6 25.5 27.3 30.9 30.2 30.5 ...
 $ Mesh200: num  41.6 43 43.5 36.9 37.9 34.6 36.8 40 39.3 39.5 ...
 - attr(*, "na.action")=Class 'exclude'  Named int [1:58] 1 88 89 90 250 318 319 320 321 322 ...
  .. ..- attr(*, "names")= chr [1:58] "1" "88" "89" "90" ...

Trying to transform the timestamped character vector 'in place' using
transform gives this error message:

> sd <- transform(sd,Tstamp=strptime(Tstamp,format='%Y/%m/%d %H:%M'))
Error in `[<-.data.frame`(`*tmp*`, inx[matched], value = list(Tstamp = list( :
  replacement element 1 has 9 rows, need 591

Why? It beats me...

I do have a backup of course:

td <- strptime(sd$Tstamp,format='%Y/%m/%d %H:%M')
sd <- data.frame(Tstamp=td, sd[2:7])

This works fine but takes one more step. Am I missing something about
transform()?

Thanks in advance,
Alex van der Spek
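
A possible explanation, offered as a note and not confirmed anywhere in this thread: strptime() returns a POSIXlt object, which is stored internally as a list of components (seconds, minutes, hours and so on) rather than as one value per row. The "9 rows" in the error message is consistent with transform() seeing that list instead of a 591-element column. Converting to POSIXct gives an ordinary vector, which transform() should be able to assign. The small data frame and the tz = "UTC" setting below are assumptions made for illustration.

```r
# Small stand-in for the poster's data frame (values from the listing above)
sd <- data.frame(
  Tstamp  = c("2009/02/27 07:00", "2009/02/27 08:00", "2009/02/27 09:00"),
  Density = c(30.5, 32.2, 32.7),
  stringsAsFactors = FALSE
)

# as.POSIXct() yields one date-time value per row, so the replacement
# column has the same length as the data frame
sd <- transform(sd, Tstamp = as.POSIXct(Tstamp, format = "%Y/%m/%d %H:%M",
                                        tz = "UTC"))
str(sd)
```

Wrapping the original call as as.POSIXct(strptime(...)) should be equivalent.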



[R] Cluster analysis, defining center seeds or number of clusters

2009-06-11 Thread amvds
I use kmeans to classify spectral events in high and low 1/3 octave bands:

#Do cluster analysis
CyclA <- data.frame(LlowA,LhghA)
CntrA <- matrix(c(0.9,0.8,0.8,0.75,0.65,0.65), nrow = 3, ncol=2, byrow=TRUE)
ClstA <- kmeans(CyclA,centers=CntrA,nstart=50,algorithm="MacQueen")

This works well when the actual data shows 1,2 or 3 groups that are not
too close in a cross plot. The MacQueen algorithm will give one or more
empty groups which is what I want.

However, there are cases when the groups are closer together, less compact
or diffuse which leads to the situation where visually only 2 groups are
apparent but the algorithm returns 3 splitting one group in two.

I looked at the package 'cluster' specifically at clara (cannot use pam as
I have 1 observations). But clara always returns as many groups as you
ask for.

Is there a way to help find a seed for the initial cluster centers?
Equivalently, is there a way to find a priori the number of groups?

I know this is not an easy problem. I have looked at principal components
(princomp, prcomp) because there is a connection with cluster analysis. It
is not obvious to me how to program that connection though.

http://en.wikipedia.org/wiki/Principal_Component_Analysis
http://ranger.uta.edu/~chqding/papers/Zha-Kmeans.pdf
http://ranger.uta.edu/~chqding/papers/KmeansPCA1.pdf

Thanks in advance,
Alex van der Spek
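
One commonly used heuristic for choosing the number of groups, which is not proposed anywhere in this thread and is swapped in here purely as an illustration, is to run clara for several values of k and compare the average silhouette width. The sketch below uses synthetic data in place of LlowA and LhghA; the cluster locations, spreads and the range of k are all assumptions.

```r
library(cluster)

set.seed(1)
# Synthetic stand-in for the two band levels: two well-separated groups
CyclA <- data.frame(
  LlowA = c(rnorm(500, 0.90, 0.03), rnorm(500, 0.65, 0.03)),
  LhghA = c(rnorm(500, 0.80, 0.03), rnorm(500, 0.65, 0.03))
)

# Average silhouette width for each candidate number of clusters
ks <- 2:4
avg_width <- sapply(ks, function(k) {
  clara(CyclA, k, samples = 20)$silinfo$avg.width
})

# The k with the widest average silhouette is the candidate to inspect
best_k <- ks[which.max(avg_width)]
print(data.frame(k = ks, avg_width = avg_width))
```

The silhouette is not defined for a single cluster, so this cannot by itself detect the case where only one group is present.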



[R] Returning only a file path on Windows

2009-05-22 Thread amvds
I am choosing a file like this:

#Bring up file selection box
fn <- file.choose()
fp <- file.path(fn,fsep='\\')

Unfortunately, the file path contains the short file name and extension as
well. I had hoped to get only the path so I could make my own long
filenames (for output graphs) by concatenation with this file path.

Of course I can split the string and assemble the components from the
returned list:

fp <- strsplit(fp,'\\',fixed=TRUE)


But there must be a better way?

Thanks in advance,
Alex van der Spek
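
For what it is worth, base R has dirname() and basename() for exactly this split. The sketch below uses a made-up path instead of file.choose() so that it runs unattended; the path and the "_hist.pdf" suffix are hypothetical.

```r
# Hypothetical path; in an interactive session this would be file.choose()
fn <- "C:/data/run1/KDA.csv"

dir_part  <- dirname(fn)    # directory only
file_part <- basename(fn)   # file name with extension

# File name without its extension, for building output names
stem <- tools::file_path_sans_ext(file_part)

# New long file name in the same directory
out <- file.path(dir_part, paste0(stem, "_hist.pdf"))
print(out)
```

On Windows, normalizePath() can be applied afterwards if backslashes are wanted in the result.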



Re: [R] Dropping 'empty' panels from lattice

2009-04-28 Thread amvds
How would I paste factors c1...c10 to one grouping factor?

Can you give an example?

Thanks much!
Alex
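
A minimal sketch of what pasting the factors into one grouping factor could look like, using interaction() with drop=TRUE so that combinations that never occur get no level and therefore no panel. Only two factors are shown and the data are invented; the column names Oversized, c1 and c2 follow the quoted call, while grp is a hypothetical name.

```r
library(lattice)

# Invented data: the combination b.y never occurs
dat <- data.frame(
  Oversized = c(1.2, 3.4, 2.2, 5.1),
  c1 = factor(c("a", "a", "b", "b")),
  c2 = factor(c("x", "y", "x", "x"))
)

# One combined factor; drop = TRUE removes unused combinations
dat$grp <- interaction(dat$c1, dat$c2, drop = TRUE)
print(levels(dat$grp))

# A single conditioning variable now gives one panel per observed combination
p <- histogram(~ Oversized | grp, data = dat, nint = 21, type = "count")
```

factor(paste(dat$c1, dat$c2)) would give a similar result, and print(p) draws the plot.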


  amvds at xs4all.nl writes:


 I have 8 cofactors possibly affecting one and only one variable.

 I make conditional histograms:

 pdf(file="tst3.pdf",paper="special",width=36,height=36)

 histogram(~Oversized|dat$c1*dat$c2*dat$c5*dat$c6*dat$c7*dat$c8*
   dat$c9*dat$c10,nint=21,layout=c(32,8),data=dat,type="count")
 dev.off()

 This works (compliments to R developers!) but it does generate a large
 plot with many panels being 'empty', i.e. that combination of factors
 c1..c10 never occurs in this data set.

 Is there a way to automatically drop those empty panels?


 Since there is little hope that you get a useful arrangement in the
 2D-paper world, and neither in 4D relativistic space, I would
 suggest to make a new factor by pasting all cX factors, use that
 as the only grouping factor.

 Dieter



[R] Dropping 'empty' panels from lattice

2009-04-28 Thread amvds
I have 8 cofactors possibly affecting one and only one variable.

I make conditional histograms:

pdf(file="tst3.pdf",paper="special",width=36,height=36)
histogram(~Oversized|dat$c1*dat$c2*dat$c5*dat$c6*dat$c7*dat$c8*
  dat$c9*dat$c10,nint=21,layout=c(32,8),data=dat,type="count")
dev.off()

This works (compliments to R developers!) but it does generate a large
plot with many panels being 'empty', i.e. that combination of factors
c1..c10 never occurs in this data set.

Is there a way to automatically drop those empty panels?

I looked at the docs: there is a drop.unused.levels parameter for trellis
graphs but the docs say it is TRUE by default. I checked but could not make
much sense out of the list of possibilities.

Thanks!
Alex van der Spek



Re: [R] Convert data frame containing time stamps to time series

2009-04-09 Thread amvds
What is zoo? I cannot find anything about zoo in the documentation.

I did try as.ts() see below.

Thank you,
Alex van der Spek

 have you tried using zoo and then using the function as.ts()

 On Wed, Apr 8, 2009 at 11:56 AM,  am...@xs4all.nl wrote:
 Converting dates is getting stranger still. I am coercing a data frame
 into a ts as follows:


 tst1 <- as.POSIXct("1/21/09 5:01",format="%m/%d/%y %H:%M")
 tst2 <- as.POSIXct("1/28/09 3:40",format="%m/%d/%y %H:%M")
 tsdat <- as.ts(dat,start=tst1,end=tst2,frequency=1)

 This generates a ts object. But strangely enough the first column of
 that matrix starts at the numeric value of 841, counts up to 1139 and
 then starts at 1 again, only to count up from there. The restart at 1
 occurs at the first day 1/21/09 at 10:00:00.

 What is so special about that time? This phenomenon happens several
 times in the long file. But the restart count is always a different
 number. This creates a ramp with some bumps.

 Can anybody explain this?
 Thanks in advance,
 Alex van der Spek


 I read records using scan:

 dat <- data.frame(scan(file="KDA.csv",what=list(t="%m/%d/%y %H:%M",
   f=0,p=0,d=0,o=0,s=0,a=0,l=0,c=0),skip=2,sep=",",nmax=np,flush=TRUE,
   na.strings=c("I/OTimeout","ArcOff-line")))

 which results in:

 dat[1:5,]
              t     f    p  d  o   s    a  l c
 1 1/21/09 5:01 16151  8.2 76 30 282 1060 53 7
 2 1/21/09 5:02 16256  8.3 76 23 282 1059 54 7
 3 1/21/09 5:03 16150  8.4 76 26 282 1059 55 7
 4 1/21/09 5:04 16150  9.0 76 25 282 1051 57 6
 5 1/21/09 5:05 15543 10.4 76  7 282 1024 58 6

 I have been unable to find a way to convert this into a time series. I
 did read the manuals and came across a way to coerce a data frame to a
 ts object: as.ts()

 Trouble is I do not know how to keep the timestamps in column t in the
 data frame above. The t column is not strings. If I do:

 plot.ts(dat)

 I can see how the first graphics panel is indeed numbers not text. So I
 think scan converted the text correctly per the format string I put in.

 Much more difficult still: the data files I have contain invalid data,
 missing values and other irrelevant information. I filter this out
 using subset, which works brilliantly. However, how can I filter using
 subset and convert to a time series afterwards, since after subsetting
 there will be 'holes', i.e. missing records? Can a ts object deal with
 missing records? If so, how? Just point me to a document. I can and
 will put in the work to figure it out myself.

 Thank you!
 Alex van der Spek





 --
 Stephen Sefick

 Let's not spend our time and resources thinking about things that are
 so little or so large that all they really do for us is puff us up and
 make us feel like gods.  We are mammals, and have not exhausted the
 annoying little problems of being mammals.

   -K. Mullis





[R] Convert data frame containing time stamps to time series

2009-04-08 Thread amvds
I read records using scan:

dat <- data.frame(scan(file="KDA.csv",what=list(t="%m/%d/%y %H:%M",
  f=0,p=0,d=0,o=0,s=0,a=0,l=0,c=0),skip=2,sep=",",nmax=np,flush=TRUE,
  na.strings=c("I/OTimeout","ArcOff-line")))

which results in:

> dat[1:5,]
             t     f    p  d  o   s    a  l c
1 1/21/09 5:01 16151  8.2 76 30 282 1060 53 7
2 1/21/09 5:02 16256  8.3 76 23 282 1059 54 7
3 1/21/09 5:03 16150  8.4 76 26 282 1059 55 7
4 1/21/09 5:04 16150  9.0 76 25 282 1051 57 6
5 1/21/09 5:05 15543 10.4 76  7 282 1024 58 6

I have been unable to find a way to convert this into a time series. I did
read the manuals and came across a way to coerce a data frame to a ts
object: as.ts()

Trouble is I do not know how to keep the timestamps in column t in the
data frame above. The t column is not strings. If I do:

plot.ts(dat)

I can see how the first graphics panel is indeed numbers not text. So I
think scan converted the text correctly per the format string I put in.

Much more difficult still: the data files I have contain invalid data,
missing values and other irrelevant information. I filter this out
using subset, which works brilliantly. However, how can I filter using
subset and convert to a time series afterwards, since after subsetting
there will be 'holes', i.e. missing records? Can a ts object deal with
missing records? If so, how? Just point me to a document. I can and will
put in the work to figure it out myself.

Thank you!
Alex van der Spek
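
On the question of holes: a ts object assumes equally spaced observations and stores no timestamps, so one way to keep it consistent after subsetting is to put the remaining records on a regular time grid and leave NA where a record is missing. The sketch below is base R only; the four records, the one-minute spacing and the tz = "UTC" setting are assumptions made for illustration.

```r
# Hypothetical filtered data: the 5:03 record was removed by subset()
dat <- data.frame(
  t = c("1/21/09 5:01", "1/21/09 5:02", "1/21/09 5:04", "1/21/09 5:05"),
  f = c(16151, 16256, 16150, 15543),
  stringsAsFactors = FALSE
)
dat$t <- as.POSIXct(dat$t, format = "%m/%d/%y %H:%M", tz = "UTC")

# Regular one-minute grid spanning the data
grid <- seq(min(dat$t), max(dat$t), by = "min")

# Put each record on the grid; minutes without a record become NA
f_full <- dat$f[match(as.numeric(grid), as.numeric(dat$t))]

# Regular series with an explicit NA for the missing minute
tsdat <- ts(f_full, frequency = 1)
print(tsdat)
```

The timestamps themselves stay in grid. Packages for irregular series, such as the zoo package mentioned in the follow-up, keep the time index with the data instead.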



Re: [R] Convert data frame containing time stamps to time series

2009-04-08 Thread amvds
Converting dates is getting stranger still. I am coercing a data frame
into a ts as follows:


tst1 <- as.POSIXct("1/21/09 5:01",format="%m/%d/%y %H:%M")
tst2 <- as.POSIXct("1/28/09 3:40",format="%m/%d/%y %H:%M")
tsdat <- as.ts(dat,start=tst1,end=tst2,frequency=1)

This generates a ts object. But strangely enough the first column of that
matrix starts at the numeric value of 841, counts up to 1139 and then
starts at 1 again, only to count up from there. The restart at 1 occurs at
the first day 1/21/09 at 10:00:00.

What is so special about that time? This phenomenon happens several times
in the long file. But the restart count is always a different number.
This creates a ramp with some bumps.

Can anybody explain this?
Thanks in advance,
Alex van der Spek
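
A possible explanation, not confirmed anywhere in this thread: the string given as t= in what=list(...) only tells scan() that the field is character; it is not applied as a date format. In R versions of that era data.frame() then converted character columns to factors by default, and converting the data frame to a ts matrix replaces a factor by its integer level codes. Levels are sorted as text, so "1/21/09 10:00" sorts before "1/21/09 9:58" and gets a low code, which would produce the ramp that restarts at 10:00. The sketch below reproduces the effect with an explicit factor(); the four timestamps and tz = "UTC" are illustrative.

```r
# Four consecutive minutes around the 9:59 -> 10:00 change
t_chr <- c("1/21/09 9:58", "1/21/09 9:59", "1/21/09 10:00", "1/21/09 10:01")

# What a factor column looks like once it is turned into numbers:
# the codes follow the text sort order, not the time order
t_fac <- factor(t_chr)
print(levels(t_fac))
print(as.integer(t_fac))   # the 10:00 stamp gets code 1

# Parsing the text explicitly gives real date-times instead
t_ct <- as.POSIXct(t_chr, format = "%m/%d/%y %H:%M", tz = "UTC")
print(diff(t_ct))
```

If this is the cause, converting column t with as.POSIXct() right after reading should remove the bumps.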


 I read records using scan:

 dat <- data.frame(scan(file="KDA.csv",what=list(t="%m/%d/%y %H:%M",
   f=0,p=0,d=0,o=0,s=0,a=0,l=0,c=0),skip=2,sep=",",nmax=np,flush=TRUE,
   na.strings=c("I/OTimeout","ArcOff-line")))

 which results in:

 > dat[1:5,]
              t     f    p  d  o   s    a  l c
 1 1/21/09 5:01 16151  8.2 76 30 282 1060 53 7
 2 1/21/09 5:02 16256  8.3 76 23 282 1059 54 7
 3 1/21/09 5:03 16150  8.4 76 26 282 1059 55 7
 4 1/21/09 5:04 16150  9.0 76 25 282 1051 57 6
 5 1/21/09 5:05 15543 10.4 76  7 282 1024 58 6

 I have been unable to find a way to convert this into a time series. I did
 read the manuals and came across a way to coerce a data frame to a ts
 object: as.ts()

 Trouble is I do not know how to keep the timestamps in column t in the
 data frame above. The t column is not strings. If I do:

 plot.ts(dat)

 I can see how the first graphics panel is indeed numbers not text. So I
 think scan converted the text correctly per the format string I put in.

 Much more difficult still: the data files I have contain invalid data,
 missing values and other irrelevant information. I filter this out
 using subset, which works brilliantly. However, how can I filter using
 subset and convert to a time series afterwards, since after subsetting
 there will be 'holes', i.e. missing records? Can a ts object deal with
 missing records? If so, how? Just point me to a document. I can and will
 put in the work to figure it out myself.

 Thank you!
 Alex van der Spek
