[R] Estimation of skewness from quantiles of near-normal distribution
I have summary statistics from many sets (10,000's) of near-normal continuous data. From previously generated QQplots of these data I can visually see that most of them are normal with a few which are not normal. I have the raw data for a few (700) of these sets. I have applied several tests of normality, skew, and kurtosis to these sets to see which test might yield a parameter which identifies the sets which are visibly non-normal on the QQplot. My conclusions thus far has been that the skew is the best determinant of non-normality for these particular data. Given that I do not have ready access to the sets (10,000's) of data, only to summary statistics which have been calculated on these sets, is there a method by which I may estimate the skew given the following summary statistics: 0.1% 1% 5% 10% 25% 75% 90% 95% 99% 99.9% mean median N sigma N is usually about 900, and so I would discount the 0.1%, 1%, 99%, and 99.9% quantiles as unreliable due to noisiness in the distributions. I know that for instance there are general rules for calculated sigma of a normal distribution given quantiles, and so am wondering if there are any general rules for calculating skew given a set of quantiles, mean, and sigma. I am currently thinking of trying polynomial fits on the QQplot using the raw data I have and then empirically trying to derive a relationship between the quantiles and the skew. Thank you for any ideas. Leif Kirschenbaum Senior Yield Engineer Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Combining plaintext and plotmath expressions
I searched the archives and did not find a solution, so I pose this question to those well-versed in the use of plotmath and expressions. I have a list of strings in an external CSV file which I wish to use sometimes as plot axis labels and sometimes as plot titles. These strings combine plaintext and a few mathematical expressions (Greek letters, subscripts). Moreover, I sometimes need to concatenate other plaintext with these strings. Thus far I have written my string list to look like this: "X translation ("*mu*"m)" "Rotation ("*degree*")" "Mean ("*mu*"m)" "Vb1-Vb2 "*(Omega) H[2]O "Used" Max P # Defects I read in the strings, and use one at a time in the variable "parmlabel". I have another variable called "testlabel" which I also need to concatenate, and can be things like "distance" or "E-test". Sometimes I also concatenate another variable "oplabel" when the variable "op" is not zero-length. I have kludged a solution using this function: expr.label<-function(title=""){ ## single quote the character # so that it is not interpreted as a comment parmlabel<-gsub("#","'#'",cfg$parmlabel[icfg]) ## make sure that a string like "Vb1-Vb2" is not interpreted as subtraction text1<-gsub("-","*'-'*",testlabel) ## if need to prepend title test, do that, and make multiple spaces single if(nchar(title)) text1<-paste(gsub(" "," ",title)," ",testlabel,sep="") ## substitute to avoid interpretation of "for" text1<-gsub("for","f*or",text1,fixed=TRUE) ## substitute ~ for spaces to avoid errors in parsing text1<-gsub(" ","~",text1,fixed=TRUE) ## see if we have a parsable expression text2<-try(parse(text=parmlabel),silent=TRUE) ## if not parsable then concatenate another way if(class(text2)=="try-error" | length(text2)==0){ text2<-gsub(" ","~",parmlabel) exprtitle<-paste(text1,text2,sep="~") } else{ exprtitle<-paste('"',text1,'~"*',text2,sep='') } exprtitle<-paste(text1,text2,sep="~") ## if the variable op is not zero-length, then use oplabel if(nchar(op)>1) exprtitle=paste(exprtitle,'~',gsub(' ','~',oplabel),sep="") return(exprtitle) } As an example: > parmlabel<-'"Vb1-Vb2 "*(Omega)' > testlabel<-'E-test' > op<-'dev lab' > oplabel<-'Development Lab' > expr.label("SPC Chart for") I then use exprtitle in: plot(1:10,title= if(plotnum==1) parse(text=exprtitle) else NULL) However: this code seems convoluted and ungainly I don't always get the results I want Any suggestions? Leif Kirschenbaum Senior Yield Engineer Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How do I "normalise" a power spectral density
I have done a fair bit of spectral analysis, and hadn't finished collecting my thoughts for a reply, so hadn't replied yet. What exactly do you mean by normalize? I have not used the functons periodogram or spectrum, however from the description for periodogram it appears that it returns the spectral density, which is already normalized by frequency, so you don't have to worry about changing the appearance of your periodogram or power spectrum if you change the time intervals of your data. With a normal Fourier Transform, not only do you need to complex square the terms but you also need to divide by a normalizing factor to give power-per-frequency-bin, which the periodogram appears to do. However, if you look in various textbooks, the definition of the Fourier Transform (FT) varies from author to author in the magnitude of the prepended scaling factor. Since the periodogram is related to the FT (periodogram ultimately uses the function mvfft), without examination of the code for periodogram you cannot know the scaling factor, which is almost always one of the following: 1.0 1/2 1/(2*pi) SQRT(1/2) SQRT(1/(2*pi)) In fact, if you obtain an FT (or FFT or DFT) from a piece of electronics (say an electronic spectrum analyzer), the prepending factor can vary from manufacturer to manufacturer. Fortunately, there is a strict relationship between the variance of your signal and the integrated spectral density. If your time signal is x(t), the spectral density is S(f), and fc = frequency(x)/2 the Nyquist cutoff frequency, then this may be expressed as: variance(x(t)) = constant * {Integral from 0 to fc of S(f)} In R-code: let x be your time series, and "constant" be the unknown scaling factor (1/2, 1/2pi, etc.) p <- periodogram(x) var(x) == constant * sum(p[[1]])/length(p[[1]]) Or: constant = sum(p[[1]])/length(p[[1]])/var(x) and we find that the appropriate scaling constant is 1.0. As regards plotting versus period, the periodogram returns arrays of spectral amplitude and frequencies. The frequencies are in inverse units of the intervals of your time series. i.e. if your time series is 1-point per day, then the frequencies are in 1/day units. Thus if you wish to plot amplitude versus period in weeks you have a little math to do. I believe that plotting is usually versus frequency since most observers are interested in how things vary versus frequency: multiple evenly spaced peaks on a linear frequency scale indicates the presence of harmonics; this is not so simply seen in a plot versus period. ex. peaks of 10 Hz, 20 Hz, 30 Hz, 40 Hz,... in period would be at periods of 100 ms, 50 ms, 25 ms, 12.5 ms, and the peaks are not evenly spaced. Additionally, there are all kinds of typical responses versus frequency (1/f, 1/f^2) which are clearly seen in plots versus frequency as straight lines (log power vs log frequency), but would come out as curves in plots versus period. I can see how ecological studies may indeed be more interested in the periods. However, I would be wary of using the periodogram function, for if I calculate periodograms of the same sinewave but for differing lengths of the sample, the spectral density does not come out the same. All 4 of the plots produced by the code below should overlay, and yet as the time series becomes longer there appears to be an increasing offset of the magnitudes returned. (black-brown-blue-red) x0<-ts(data=sin(2*pi*1.1*(1:1000)/10),frequency=10) p0<-periodogram(x0) var(x0) x1<-ts(data=sin(2*pi*1.1*(1:1)/10),frequency=10) p1<-periodogram(x1) var(x1) x2<-ts(data=sin(2*pi*1.1*(1:10)/10),frequency=10) p2<-periodogram(x2) var(x2) x3<-ts(data=sin(2*pi*1.1*(1:100)/10),frequency=10) p3<-periodogram(x3) var(x3) plot(p3[[2]],p3[[1]],col="red",type="p",pch=19,cex=0.05,log="y", xlim=c(0.1,0.12),ylim=c(1e-30,1e-15)) points(p2[[2]],p2[[1]],type="l",col="blue") points(p1[[2]],p1[[1]],type="l",col="brown") points(p0[[2]],p0[[1]],type="o",col="black") > Message: 113 > Date: Mon, 30 Jan 2006 17:45:58 -0800 > From: Spencer Graves <[EMAIL PROTECTED]> > Subject: Re: [R] How do I "normalise" a power spectral density > analysis? > To: Tom C Cameron <[EMAIL PROTECTED]> > Cc: r-help@stat.math.ethz.ch > Message-ID: <[EMAIL PROTECTED]> > Content-Type: text/plain; charset=us-ascii; format=flowed > > Since I have not seen a reply to this post, I will > offer a comment, > even though I have not used spectral analysis myself and > therefore have > you intuition about it. First, from the definitions I read in the > results from, e.g., RSiteSearch("time series power spectral density") > [e.g., > http://finzi.psych.upenn.edu/R/library/GeneTS/html/periodogram > .html] and > "spectral analysis" in Venables and Ripley (2002) Modern Applied > Statistics with S (Springer), I see no reason why you > couldn't plot the > spectrum vs. the period rather than the frequency. Someone else
Re: [R] Scientific notation in plots
You could also write your numbers using SI suffixes. Below is a function I use for this purpose. The option "near" allows you to require that numbers between 0.001 and 1000 are written as decimal; i.e. "0.010" appears as "0.010" instead of "10.0m". ## ## num2SI converts numbers to SI suffixed numbers ## num2SI<-function(num,digits=4,near=TRUE){ power<-signif(log10(abs(num)),digits=6) power3=floor(power/3); power3[!is.finite(power3)]=0 powers=c("y","z","a","f","p","n","u","m","","k","M","G","T","P") powerstr=powers[power3+9] newnum=signif(num/10^(power3*3),digits=digits) newnum[is.nan(newnum)]=0 numstr=(paste(newnum,powerstr,sep="")) nearidx=near & (abs(num)>0.001) & (abs(num)<1000) nearidx[is.na(nearidx)]=FALSE if(sum(nearidx)) numstr[nearidx]=sprintf("%.*f",pmax(1,(digits-floor(power[nearidx])-1),na.rm=TRUE),num[nearidx]) numstr[!is.finite(num)] = paste(num[!is.finite(num)]) return(numstr) } Leif Kirschenbaum Senior Yield Engineer Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED] > Message: 6 > Date: Fri, 13 Jan 2006 12:49:54 +0100 (CET) > From: Oddmund Nordg??rd <[EMAIL PROTECTED]> > Subject: [R] Scientific notation in plots > To: r-help@stat.math.ethz.ch > Message-ID: <[EMAIL PROTECTED]> > Content-Type: TEXT/PLAIN; charset=ISO-8859-1 > > > Is it possible to use scientific notation of numbers on the > axis of plots > without using the xEy notation. That means: a beatiful 1x10^3 > instead of 1E3. > Logarithmic scale, in my case. > > Thank you very much! > > Oddmund > > ** > > Oddmund Nordg?rd > > Department of Haematology and Oncology > Stavanger University Hospital > P.O. Box 8100 > 4068 STAVANGER > Phone: 51 51 89 34 > Email: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Find last row (observation) for each combination of variables
Let's say I have a data.frame like A B C TS other columns 1 1 1 12345 1 1 1 56789 1 2 1 23456 1 2 2 23457 2 4 7 23458 2 4 7 34567 2 4 7 45678 and I want the last row for each unique combination of A/B/C, where by "last" I mean greatest TS. A B C TS other columns 1 1 1 56789 1 2 1 23456 1 2 2 23457 2 4 7 45678 I did this simply in SAS: proc sort data=DF; by A B C descending TS run; proc sort data=DF NODUPKEY; by A B C; run; I tried using "aggregate" to find the maximum TS for each combination of A/B/C, but it's slow. I also tried "by" but it's also slow. My current (faster) solution is: DF$abc<-paste(DF$A,DF$B,DF$C,sep="") abclist<-unique(DF$ABC) numtest<-length(abclist) maxTS<-rep(0,numtest) for(i in 1:numtest){ maxTS[i]<-max(DF$TS[DF$abc==abclist[i]],na.rm=TRUE) } maxTSdf<-data.frame(device=I(abc),maxTS=maxTS ) DF<-merge(DF,maxTSdf,by="abc",all.x=TRUE) DF<-Df[DF$TS==DF$maxTS,,drop=TRUE] DF$maxTS<-NULL This seems a bit lengthy for such a simple task. Any simpler suggestions? -Leif K. Leif Kirschenbaum Senior Yield Engineer Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] "Missing value representation in Excel before
I reproduce from memory my exhaustive look into this issue. RODBC uses the Microsoft ODBC DLL's developed by Microsoft. These DLL's perform an automatic determination of column type based on the contents of the first N rows of cells in each column, where N [0,16]. N may be set in the Windows system registry, and there are a few other things that may be set in the system registry which control how the DLL parses an Excel spreadsheet. Unfortunately, the Microsoft DLL's do not always pay attention to the registry settings and do not always interpret them in the same manner. The end result is that no matter what you do with RODBC, and no matter how the authors of RODBC re-write it, some Excel spreadsheets will always be unreadable via RODBC given particular insidious combinations of data in some columns of your spreadsheet. (until such time as Microsoft fixes their DLL bugs, I mean features) I have some faint recollection that the Microsoft DLL incorrectly parses a column with non-empty rows due to some formatting issue of those particular columns, which I was unable to cure by re-formatting the source worksheet. I have had to resort to using the gdata package which runs a Perl script "xls2csv.pl", which converts an Excel spreadsheet to CSV, for a few Excel spreadsheets which exhibit the particular anomalies preventing use of RODBC. Leif Kirschenbaum Senior Yield Engineer Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED] > Message: 21 > Date: Mon, 9 Jan 2006 18:06:49 +0100 > From: "Fredrik Lundgren" <[EMAIL PROTECTED]> > Subject: Re: [R] "Missing value representation in Excel before > extraction to R with RODBC" > To: "Prof Brian Ripley" <[EMAIL PROTECTED]>, "Petr Pikal" > <[EMAIL PROTECTED]> > Cc: R-help > Message-ID: <[EMAIL PROTECTED]> > Content-Type: text/plain; format=flowed; charset="iso-8859-1"; > reply-type=response > > Dear list, > > Well, those columns in Excel that starts with NA (actually 8 > NA's in my > case) is imported as all NA in R but if the columns starts > with at least > 3 cells with values (i.e not NA) the are imported correctly > to R. When > as.is=TRUE is used a simular conversion takes place but now > as all > and dates are represented as date-and-time. > Is there any way to get this correct even when the Excel > columns start > with several NA's? > > Sincerely > Fredrik > > > - Original Message - > From: "Prof Brian Ripley" <[EMAIL PROTECTED]> > To: "Petr Pikal" <[EMAIL PROTECTED]> > Cc: "Fredrik Lundgren" <[EMAIL PROTECTED]>; "R-help" > > Sent: Monday, January 09, 2006 9:36 AM > Subject: Re: [R] "Missing value representation in Excel before > extraction to R with RODBC" > > > > On Mon, 9 Jan 2006, Petr Pikal wrote: > > > >> Hi > >> > >> I believe it has something to do with the column identification > >> decision. When R decides what is in a column it uses only > some values > >> from the beginning of a file. > > > > Not R, Excel. Excel tells ODBC what the column types are. > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Wikis etc.
To avoid spam on the R wikis pages: If we assume that anyone who we would want to be empowered to modify the R wiki pages is an R-user, would it be possible to somehow incorporate a function into the next R release which provides a user with a key/password? A new R function would generate a day-of-the year dependent key: if you want to modify an R wiki page you need to enter the key for that day (This is not a proposal to make keys user specific: every R user worldwide would have the same key each day). Then only a person who has installed R would be able to run the function to get a key to modify R wiki pages. Of course anyone could read the wikis. I supposed that if we wanted, that the key provided could somehow encode the O/S and R version being run, and then the wiki page modified would note which O/S and version the annotator is running, however for ease of use I suggest that the key generated each day be short for simplicy in typing it in. I suppose a more complex solution would be for an R function to make a call to open a web-browser with a cookie or something set which thus allows the user to modify R wiki pages. Leif Kirschenbaum Senior Yield Engineer Reflectivity, Inc. (408) 737-8100 x307 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] A comment about R:
A few thoughts about R vs SAS: I started learning SAS 8 years ago at IBM, I believe it was version 6.10. I started with R 7 months ago. Learning curve: I think I can do everything in R after 7 months that I could do in SAS after about 4 years. Bugs: I suffered through several SAS version changes, 7.0, 7.1, 7.2, 8.0, 9.0 (I may have misquoted some version numbers). Every version change gave me headaches, as every version release (of an expensive commercially produced software set) had bugs which upset or crashed previously working code. I had code which ran fine under Windows 2000 and terribly under Windows XP. Most bugs I found were noted by SAS, but never fixed. With R I have encounted very few bugs, except for an occasional crash of R, which I usually ascribe to some bug in Windows XP. Help: SAS help was OK. As others have mentioned, there is too much. I even had the set of printed manuals on my desk (stretching 4 feet or so), which were quote impenetrable. I had almost no support from colleagues: even within IBM the number of advanced SAS users was small. With R this mailing list has been of great help: almost every issue I copy some program and save it as a "R hint " file. --> A REQUEST I would say that I would appreciate a few more program examples with the help pages for some functions. For instance, "?Control" tells me about "if(cond) cons.expr else alt.expr", however an example of if(i==1) { print("one") } else if(i==2) { print("two") } else if(i>2) { print("bigger than two") } at the end of that help section would have been very helpful for me a few months ago. Functions: Writing my own functions in SAS was by use of macros, and usually depended heavily on macro substitution. Learning SAS's macro language, especially macro substitution, was very difficult and it took me years to be able to write complicated functions. Quite different situation in R. Some functions I have written by dint of copying code from other people's packages, which has been very helpful. I wanted to generate arbitrary k-values (the k-multiplier of sigma for a given alpha, beta, and N to establish confidence limits around a mean for small populations). I had a table from a years old microfiche book giving values but wanted to generate my own. I had to find the correct integrals to approximate the k-values and then write two SAS macros which iterated to the desired level of tolerance to generate values. I would guess that there is either an R base function or a package which will do this for me (when I need to start generating AQL tables). Given the utility of these numbers, I was disappointed with SAS. Data manipulation: All SAS data is in 2-dimensional datasets, which was very frustrating after having used variables, arrays, and matrices in BASIC, APL, FORTRAN, C, Pascal, and LabVIEW. SAS allows you to access only 1 row of a dataset at a time which was terribly horribly incomprehensibly frustrating. There were so many many problems I had to solve where I had to work around this SAS paradigm. In R, I can access all the elements of a matrix/dataframe at once, and I can use >2 dimensional matrices. In fact, the limitations of SAS I had ingrained from 7.5 years has sometimes made me forget how I can do something so easily in R, like be able to know when a value in a column of a dataframe changes: DF$marker <- DF[1:(nrow(DF)-1),icol] != DF[2:nrow(DF),icol] This was hard to do in SAS...and even after years it was sometimes buggy, keeping variable values from previous iterations of a SAS program. One very nice advantage with SAS is that after data is saved in libraries, there is a GUI showing all the libraries and the datasets inside the libraries with sizes and dates. While we can save Rdata objects in an external file, the base package doesn't seem to have the same capabilities as SAS. Graphics: SAS graphics were quite mediocre, and generating customized labels was cumbersome. Porting code from one Windows platform to another produced unpredictable and sometimes unworkable results. It has been easier in R: I anticipate that I will be able to port R Windows code to *NIX and generate the same graphics. Batch commands: I am working on porting some of my R code to our *NIX server to generate reports and graphs on a scheduled basis. Although a few at IBM did this with SAS, I would have found doing this fairly daunting. -Leif - Leif Kirschenbaum, Ph.D. Senior Yield Engineer Reflectivity [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] how to specify dev.print target by a variable?
I want to do the following: DEVw=500 DEVh=350 fname="my_plot" dev.print(file=fname, device=FOO, width=DEVw, height=DEVh, bg="transparent") How do I do this such that I can specify FOO to be one of several choices? (GDD, PNG, postscript, etc.) If I make FOO a character variable, then "dev.print" complains. I tried a simpled "substitute" but didn't get it to work... I'm thinking it's going to involve a "do.call" and "substitute" but I'm not sure. Using: $platform[1] "i386-pc-mingw32" $arch[1] "i386" $os[1] "mingw32" $system[1] "i386, mingw32" $status[1] "" $major[1] "2" $minor[1] "2.0" $year[1] "2005" $month[1] "10" $day[1] "06" $"svn rev"[1] "35749" $language[1] "R" and also running the same code on: $platform[1] "i686-redhat-linux-gnu" $arch[1] "i686" $os[1] "linux-gnu" $system[1] "i686, linux-gnu" $status[1] "" $major[1] "2" $minor[1] "0.0" $year[1] "2004" $month[1] "10" $day[1] "04" $language[1] "R" -Leif S. Kirschenbaum, Ph.D. Yield Integration Engineer Reflectivity, Inc. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] qcc
If you examine the code for the function "violating.runs" in the qcc package (try "violating.runs") you can deconstruct it to find that the code flags runs where there are .qcc.options$run.length or more points in a row on one side of the center (process mean). However, the classic Shewhart rules dictate that a run of monotonically increasing or decreasing points of is also a run violation. I include here my modified version of the function "violating.runs" which also checks for these violations: ## ## Correct some typos in violating.runs from qcc package ## Added test for run.length of points monotonically increasing or decreasing ## The simplest way is to re-run the code but with "diffs" ## representing the sign of the difference from one point to the next ## violating.runs<-function (object, run.length = qcc.options("run.length")) { center <- object$center statistics <- c(object$statistics, object$new.statistics) cl <- object$limits violators <- numeric() for(i in 1:2){ diffs <- statistics - center if(i==2) { diffs <- c(0,diff(statistics)) ## need to decrement run.length since we're looking at differences between points run.length<-run.length-1 } diffs[diffs > 0] <- 1 diffs[diffs < 0] <- -1 runs <- rle(diffs) index.lengths <- (1:length(runs$lengths))[runs$lengths >= run.length] index.stats <- 1:length(statistics) vruns <- rep(runs$lengths >= run.length, runs$lengths) vruns.above <- (vruns & (diffs > 0)) vruns.below <- (vruns & (diffs < 0)) rvruns.above <- rle(vruns.above) rvruns.below <- rle(vruns.below) vbeg.above <- cumsum(rvruns.above$lengths)[rvruns.above$values]-(rvruns.above$lengths - run.length)[rvruns.above$values] vend.above <- cumsum(rvruns.above$lengths)[rvruns.above$values] vbeg.below <- cumsum(rvruns.below$lengths)[rvruns.below$values]-(rvruns.below$lengths - run.length)[rvruns.below$values] vend.below <- cumsum(rvruns.below$lengths)[rvruns.below$values] if (length(vbeg.above)) { for (i in 1:length(vbeg.above)) violators <- c(violators, vbeg.above[i]:vend.above[i]) } if (length(vbeg.below)) { for (i in 1:length(vbeg.below)) violators <- c(violators, vbeg.below[i]:vend.below[i]) } } ## ENDOF for i in 1:2 return(violators) } > 3)Is there any more criterias made somewhere ? The other criteria is found by the function "beyond.limits" which checks to see if any points are beyond the upper or lower control limits. I believe that Prof. Scrucca wrote the qcc package as a tool to demonstrate process control to his students taking his statistics classes. Hence the statistics are excellent. For example note the use of log-gamma and exponential functions in the function "sd.xbar" to calculate the constant "c4", which avoids the divison of extremely large [~ 1E20] numbers: c4 <- function(n) sqrt(2/(n-1)) * exp(lgamma(n/2) - lgamma((n-1)/2))) where a textbook might write it as: c4 <- function(n) sqrt(2/(n-1)) * (gamma(n/2) / gamma((n-1)/2))) Thus I suggest that although qcc provides an excellent basis for constructing statistical control reports, that you should be prepared to modify it for your purposes (as I am heavily doing). P.S. I change .qcc.options using syntax such as ".qcc.options$violating.runs<-7". Leif S. Kirschenbaum, Ph.D. Senior Yield Engineer Reflectivity, Inc. 408-737-8100 x307 408-737-8153 (Fax) > Date: Tue, 29 Nov 2005 02:08:57 +0200 > From: Tommi Viitanen <[EMAIL PROTECTED]> > Subject: [R] qcc > > violating.runs [deleted] > > that the criteria for the violating is 5 but > 1)I cannot find "5" in the code of the function. Where is the "5" ? > 2)What is the easiest way to change it ? > 3)Is there any more criterias made somewhere ? > > Yours sincerelly, Tommi Viitanen > [deleted] > -- > > Date: Tue, 29 Nov 2005 08:58:24 +0100 > From: Uwe Ligges <[EMAIL PROTECTED]> > Subject: Re: [R] qcc > To: Tommi Viitanen <[EMAIL PROTECTED]> > > > See ?violating.runs: > violating.runs(object, run.length = qcc.options("run.length")) > ^ ^^^ ^^^ ^^ ^^^ ^^^ > [deleted] > > Uwe Ligges __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Listing object sizes without RODBC objects: contributed function
I include my workaround for object.size failing on RODBC objects. Code adapted from Petr Pikal's code (http://tolstoy.newcastle.edu.au/R/help/04/06/1228.html). ls.objects<-function (pos = 1, order.by,...) { napply <- function(names, fn) sapply(names, function(x) fn(get(x,pos=pos))) names <- ls(pos = pos,...) obj.class <- napply(names, function(x) as.character(class(x))[1]) obj.drop <- match("RODBC",obj.class) obj.class <- obj.class[-obj.drop] names <- names[-obj.drop] obj.mode <- napply(names, mode) obj.type <- ifelse(is.na(obj.class), obj.mode, obj.class) obj.size <- napply(names, object.size) obj.dim <- t(napply(names, function(x) as.numeric(dim(x))[1:2])) vec <- is.na(obj.dim)[, 1] & (obj.type != "function") obj.dim[vec, 1] <- napply(names, length)[vec] out <- data.frame(obj.type, obj.size, obj.dim) names(out) <- c("Type", "Size", "Rows", "Columns") if (!missing(order.by)) out <- out[order(out[[order.by]]), ] out } Leif S. Kirschenbaum, Ph.D. Senior Yield Engineer Reflectivity, Inc. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] PowerPoint graph insertion
Yes: I have Tufte's monograph on my desk. (along with 4 statistics texts) Yes: I am not the biggest fan of PowerPoint. Yes: I am using R to generate charts, plots, trends, etc. I have to summarize them each week. When I consider how to organize this data my first thought is to generate an HTML file with links to the R-generated plots, which HTML file organizes the plots in the required order. However: * Each week we annotate one PowerPoint slide in the weekly presentation with action items -- we don't only use PP as a presentation tool. This is convenient, as then the action items resulting from particular data trends are associated in a single document with the plots of the data trends. * Other (non-R) users insert data into the weekly PP presentation: from other plotting software and images from various sources (microscope, SEM, TEM, etc.), which I cannot easily incorporate into a generated HTML file before-the-fact. * I'm not sure how to create an HTML file which allows one to page forward and backward through it easily, as with PowerPoint (a minor point: and there is probably a way to write HTML to respond to such) So: Can R insert plots into an existing PowerPoint presentation? (actually, I'd copy last week's presentation and then update with new plots) I'll guess that it cannot, as there probably is not a Microsoft supplied interface (ODBC or otherwise) with PowerPoint as there is with Excel. -Leif Kirschenbaum, Ph.D. Sr. Yield Integration Engineer (I'm a physicist) [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html