[R] Question About Repeat Random Sampling from a Data Frame
Good Morning:

I've read many, many posts on the r-help system and I feel compelled to admit quickly that I am relatively new to R. I do have several reference books around me, but I cannot count myself among the fortunate who seem to have strong programming intuition.

I have a data set consisting of 1637 observations of five variables: tensile strength, yield strength, elongation, hardness and a character indicator with three levels: (Y)es, (N)o, and (F)ail. My objective is to randomly sample various subsets from this data set and then evaluate these subsets using simple parameters, among them tests for normality, shape and skewness. The data set is ordered by the character variable prior to sampling, and the samples are weighted to mirror representation in an overall, physical process. I am sampling the data set using this code:

    sample <- dataset[sample(1:1637, 500, prob=c(rep(163.7/1637,513),rep(245.5/1637,197),rep(1227.8/1637,927)),replace = TRUE),]

What I would like to do is iterate this process to create many (say 500 or more) sampled sets of n=500 and then evaluate each set for the parameters of interest. I would actually be evaluating each variable within each subset for my characteristic of interest. I am familiar with sampling and saving single columns of data to do this sort of thing, but I am not sure how to accomplish this with a multiple-variable data set. For example, I am currently iterating this using a clunky process:

    mysamples <- list()
    for (i in 1:10){
      mysamples[[i]] <- dataset[
        sample(1:1637,100,prob=c(rep(163.7/1637,513),rep(245.5/1637,197),rep(1227.8/1637,927)),replace = TRUE),
      ]
    }

But this leaves me with the additional task of defining each mysamples[[i]] element and converting it to a form on which I can apply a standard statistical function like mean() or skewness() to the variable columns within each subset.
I have attempted to iteratively convert these lists using this code:

    mat <- matrix(nrow=100,ncol=5)
    for (i in 1:length(mysamples)) {mat[i] <- do.call('rbind',mysamples[i])}

but running the code generates the error message: number of items to replace is not a multiple of replacement length. I have tried unsuccessfully, by reading many, many helpful r-help emails on this error, to understand my probably obvious mistake.

Based on the small amount that I think I know about R, it seems to me that sampling the data frame and containing the samples in a list is likely a pretty inefficient way to do this task. Any help that any of you could provide to assist me in iteratively sampling the data frame, and storing the samples in a form on which I can apply other statistical tests, would be greatly appreciated.

Thank you very much for taking the time to consider my questions.

Adam

[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
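One way the iteration described above could be sketched is to keep the samples in a list (each element stays a data frame) and summarise the numeric columns with sapply(). This is only a minimal sketch under stated assumptions: the real `dataset` is not available, so a simulated stand-in is built here; the column names, the F/N/Y ordering of the 513/197/927 row blocks, and the small `skewness()` helper are all illustrative and not taken from the post.

```r
set.seed(42)

# Stand-in for the real data: 1637 rows ordered by the indicator.
# ASSUMPTION: block sizes 513 + 197 + 927 follow the weights in the post;
# the group order and all values are simulated for illustration.
n_grp <- c(513, 197, 927)
dataset <- data.frame(
  tensile    = rnorm(1637, 120, 5),
  yield      = rnorm(1637, 100, 5),
  elongation = rnorm(1637, 20, 2),
  hardness   = rnorm(1637, 60, 3),
  indicator  = rep(c("F", "N", "Y"), times = n_grp),
  stringsAsFactors = FALSE
)

# Per-row sampling weights, one block per group (sample() rescales them).
w <- rep(c(163.7, 245.5, 1227.8) / 1637, times = n_grp)

# Hypothetical helper: moment-based skewness, so no extra package is needed.
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Draw 500 samples of 100 rows each; every list element is a data frame.
n_samples <- 500
mysamples <- lapply(seq_len(n_samples), function(i) {
  dataset[sample(nrow(dataset), 100, replace = TRUE, prob = w), ]
})

# Summarise the four numeric columns of each sample (one row per sample).
num_cols <- c("tensile", "yield", "elongation", "hardness")
means <- t(sapply(mysamples, function(d) sapply(d[num_cols], mean)))
sds   <- t(sapply(mysamples, function(d) sapply(d[num_cols], sd)))
skews <- t(sapply(mysamples, function(d) sapply(d[num_cols], skewness)))

head(means)
```

Because each list element is already a data frame, no conversion to a matrix is needed before applying mean(), sd() or a skewness function to its columns.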
Re: [R] Question About Repeat Random Sampling from a Data Frame
Good Afternoon Dr. Winsemius: You ask some very good questions and make excellent points; my responses are below. I've tried to extract your questions and provide answers just to reduce the clutter. 1. You might want to provide statistical justification for the otherwise puzzling sampling strategy. I assume you mean my overall process of random sampling from a large data set. The data set is comprised of observations collected over four years. Although the basis for sampling would make a good four-frame Dilbert cartoon if it could be condensed enough, my answer begins with the unfortunate truth that there is a great divide between the technical and marketing groups at the business where I am employed. Many powerful marketing executives, some with technical backgrounds, feel that there is something fundamentally wrong with the manufacturing process because the data generated over the long term is not approximately normally distributed. My task was to examine this set of data, trying to keep the representation of Y, N and F approximately equal in the sample when compared to the large set, to determine if any subset exhibits the holy grail-like normal distribution characteristics. I don't feel that this is statistical justification, but it is the reason why I am doing this. 2. It would help if you explained what you are attempting here in ordinary English. There are 10 elements in mysamples, each of which is a 100 x 5 dataframe, and mat is just one 100 x 5 matrix, which you seem to be referencing incorrectly, given the fact that it has two, rather than one, dimension. Furthermore, those dataframes may not be of a uniform class, since you said you had character variable. Do you really want these all in a character type matrix, which would be what is likely to happen given R's requirement that matrix element be of only one class? What you say above suggests not. It seems from your response that I incorrectly assumed that a list is not the same as a data frame. 
I started down this path after reading the questions and answers to a similar problem where the r-help responder suggested a two step process and said that the list must be converted to another form in order to be available for analysis. And you are absolutely correct that I do not want each sample in a character type matrix. In plain English, I hope, I am simply trying to iterate the process of removing random samples from the large data set, and then saving these samples in a format that is available for simple analysis. For example, if I remove five hundred mysample sets, each of which is composed of a 100 x 5 sample of the large data set I am interested in determining the skewness, kurtosis, mean and standard deviation of each of the four numeric variables in each of the five hundred mysample sets. 3. Sorting out such problems is best done with smaller test objects. I was surprised to see...type character. I agree. I began to do this with a small test data set but it was late last evening and I realized that I should ask for help before proceeding on what I thought might be incorrect assumptions. I clearly misunderstood that a list needed to be converted to a data frame in order to be available for analysis. Thank you for taking the time to respond. The discussion and suggestions are very helpful. Adam From: David Winsemius dwinsem...@comcast.net Cc: r-help@r-project.org Sent: Mon, December 21, 2009 11:23:43 AM Subject: Re: [R] Question About Repeat Random Sampling from a Data Frame On Dec 21, 2009, at 10:12 AM, Adam Carr wrote: Good Morning: I've read many, many posts on the r-help system and I feel compelled to quickly admit that I am relatively new to R, I do have several reference books around me, but I cannot count myself among the fortunate who seem to strong programming intuition. 
Re: [R] Question About Repeat Random Sampling from a Data Frame
Thanks to both of you for the comments and suggestions. Over the next couple of days I plan to work through my simple problem using the help offered in this forum. From: David Winsemius dwinsem...@comcast.net To: Bert Gunter gunter.ber...@gene.com Sent: Mon, December 21, 2009 2:31:26 PM Subject: Re: [R] Question About Repeat Random Sampling from a Data Frame On Dec 21, 2009, at 1:01 PM, Bert Gunter wrote: Didn't read this thread in detail, so the following suggestion may just be nonsense... (caveat emptor), but: To sample from an data frame or matrix, sample from the row indices and then extract what you want from the sampled rows. Or sample directly from individual columns if that suffices. In general, ?sample on appropriate indices of object in question. Bert Gunter Genentech Nonclinical Biostatistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Adam Carr Sent: Monday, December 21, 2009 9:53 AM To: David Winsemius Cc: r-help@r-project.org Subject: Re: [R] Question About Repeat Random Sampling from a Data Frame Good Afternoon Dr. Winsemius: You ask some very good questions and make excellent points; my responses are below. I've tried to extract your questions and provide answers just to reduce the clutter. 1. You might want to provide statistical justification for the otherwise puzzling sampling strategy. I assume you mean my overall process of random sampling from a large data set. The data set is comprised of observations collected over four years. Although the basis for sampling would make a good four-frame Dilbert cartoon if it could be condensed enough, my answer begins with the unfortunate truth that there is a great divide between the technical and marketing groups at the business where I am employed. 
Many powerful marketing executives, some with technical backgrounds, feel that there is something fundamentally wrong with the manufacturing process because the data generated over the long term is not approximately normally distributed. My task was to examine this set of data, trying to keep the representation of Y, N and F approximately equal in the sample when compared to the large set, to determine if any subset exhibits the holy grail-like normal distribution characteristics. I don't feel that this is statistical justification, but it is the reason why I am doing this.

2. It would help if you explained what you are attempting here in ordinary English. There are 10 elements in mysamples, each of which is a 100 x 5 dataframe, and mat is just one 100 x 5 matrix, which you seem to be referencing incorrectly, given the fact that it has two, rather than one, dimensions. Furthermore, those dataframes may not be of a uniform class, since you said you had a character variable. Do you really want these all in a character type matrix, which is what is likely to happen given R's requirement that matrix elements be of only one class? What you say above suggests not.

It seems from your response that I incorrectly assumed that a list is not the same as a data frame. I started down this path after reading the questions and answers to a similar problem where the r-help responder suggested a two step process and said that the list must be converted to another form in order to be available for analysis.

A data.frame is a special type of list. You can also make lists of dataframes (just as you can make lists of lists), which I thought the first portion of your code would have done:

    mysamples <- list()
    for (i in 1:10){
      mysamples[[i]] <- dataset[
        sample(1:1637,100,
               prob=c(rep(163.7/1637,513),
                      rep(245.5/1637,197),
                      rep(1227.8/1637,927)),
               replace = TRUE),
      ]
    }

Each element in that list would have been a subset of your larger data.frame and would itself have been a data.frame.
And you are absolutely correct that I do not want each sample in a character type matrix. In plain English, I hope, I am simply trying to iterate the process of removing random samples from the large data set, and then saving these samples in a format that is available for simple analysis. For example, if I remove five hundred mysample sets, each of which is composed of a 100 x 5 sample of the large data set I am interested in determining the skewness, kurtosis, mean and standard deviation of each of the four numeric variables in each of the five hundred mysample sets. So make a small dataframe with variables (columns) of the same type as in your real data, maybe 25-30 rows in extent (not length, since for a dataframe, the length() function returns the number of columns). 3. Sorting out such problems is best done with smaller test objects. I was surprised to see...type character. I agree. I began to do this with a small test data set but it was late last evening and I realized that I should ask for help before proceeding on what I thought
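The points made in this exchange (a data frame is a list, length() counts columns, and a matrix holds only one type) can be checked on a small test object along the lines suggested. A minimal sketch, assuming made-up values and column names that only mimic the structure described (four numeric columns plus a character indicator):

```r
# Small test data frame; all values are invented for illustration.
small <- data.frame(
  tensile    = c(113, 116, 120, 124, 118),
  yield      = c(95, 98, 101, 104, 99),
  elongation = c(21, 20, 19, 18, 20),
  hardness   = c(60, 61, 63, 65, 62),
  indicator  = c("Y", "N", "F", "Y", "Y"),
  stringsAsFactors = FALSE
)

is.list(small)      # a data frame is a list of equal-length columns
length(small)       # number of columns, not rows
nrow(small)         # number of rows

# Coercing to a matrix forces a single type for every cell.
m <- as.matrix(small)
typeof(m)           # "character", because of the indicator column

# A list of data frames keeps each column's own type,
# so numeric summaries can be applied directly.
pieces <- list(small[1:3, ], small[3:5, ])
sapply(pieces, function(d) mean(d$tensile))
```

This is why the samples can stay in a list of data frames: the numeric columns remain numeric and are directly available to functions such as mean() and sd().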
[R] Help With Custom QQ Plot
Good Morning: I have attached a text file with one hundred thirty six observations. I would like to create a qq plot with the following features: 1. Observed values on the y-axis. 2. Normal approximation line on the plot. 3. X-axis with vertical reference lines at the following percentiles of the data: 1, 10, 20, 50, 80, 90 and 99. 4. Data appearing on the plot as distinct points. I assume that qqmath (lattice) is the best approach to this although I have not been able to sort out the proper syntax to yield the plot I'm after. I understand how to determine the quantiles of the data, and I can use qqmath() to generate a plot which has the observed values on the y-axis, and the plot is based on a normal distribution, but beyond this I'm struggling. I do not have the R Graphics text by or Visualizing Data by Bill Cleveland but I have several other R books (Crawley, Ugarte et al, Braun/Murdoch, Rizzo, etc) but coverage of the lattice package seems light. I very much appreciate any help that could be offered. Thank you. Adam 113 116 120 120 124 114 118 117 118 119 116 113 118 112 129 117 118 112 114 125 125 116 123 121 113 118 121 127 125 127 125 125 115 115 114 118 128 121 118 115 114 117 120 112 131 131 127 115 112 116 116 111 115 120 113 117 124 119 112 114 116 115 126 124 121 112 121 116 117 115 115 112 116 115 118 119 118 114 115 112 115.5 110 120 119 111 121 118 118 119 117 113 115 117 110 113 113 113 119 116 115 116 TUS 114 120 118 117 116 124 113 117 114 115 116 114 124 118 115 112 115 115 115 117 117 130 116 113 116 114 117 114 117 119 113 113 114 112 119 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help With Custom QQ Plot
Hello Dennis:

Thanks for the reply and for your help. I apologize for the errant TUS in the data.

Your statement about the quantiles of the data belonging to the vertical axis is correct of course and it helped me realize an error of mine: the quantiles plotted as vertical reference lines are from a fitted distribution based on the data. I have included an example of the plot in the attachment. I ran the code you sent and it is a very good start. I simply need to understand how to include the fitted normal data as the x-axis or as a set of vertical reference points.

About the distinct data points: you are correct about this as well. I realize that there are ties in the data. This is pretty typical for these kinds of material property measurements. I meant, but did not state clearly, that I wanted to plot the data as points and not as a line.

Thanks again for your help.

Adam

From: Dennis Murphy djmu...@gmail.com
To: Adam Carr adamlc...@yahoo.com
Sent: Mon, December 28, 2009 4:12:54 PM
Subject: Re: [R] Help With Custom QQ Plot

Hi:

This isn't precisely what you want, but it's a start. Both base graphics and lattice plot the normal quantiles on the horizontal axis and the observed values on the vertical axis, so it's the transpose of what you want. After reading in your data (I had to get rid of the stray TUS two-thirds of the way down the file) into an object I called qqdata, I did the following:

    qqq <- quantile(qqdata, c(.01, .1, .2, .5, .8, .9, .99))
    qqnorm(qqdata)
    qqline(qqdata)
    abline(h = qqq, lty = 'dotted')

This is all using base graphics. Use xlab, ylab and main in the qqnorm() call to adjust the labels. The author of lattice, Deepayan Sarkar, has published a book called Lattice, available from Springer. If you were to use it, the appropriate function would be qqmath, whose default theoretical distribution is the normal.
If you insist on having the theoretical quantiles on the vertical axis, then in R you would likely have to use ggplot2, but you would have to build up the plot from its elements. On Mon, Dec 28, 2009 at 6:49 AM, Adam Carr adamlc...@yahoo.com wrote: Good Morning: I have attached a text file with one hundred thirty six observations. I would like to create a qq plot with the following features: 1. Observed values on the y-axis. Check. 2. Normal approximation line on the plot. Check. 3. X-axis with vertical reference lines at the following percentiles of the data: 1, 10, 20, 50, 80, 90 and 99. If your data are on the Y-axis, the percentiles of the *data* would also have to be on the y-axis. This is shown on the plot. Check. 4. Data appearing on the plot as distinct points. qqnorm does what it can, but you have numerous tied values in your data. How do you expect them to be plotted as distinct points? You can jitter them, but that will have some impact on the corresponding quantiles and the position of the fitted 'normal approximation line' . I assume that qqmath (lattice) is the best approach to this although I have not been able to sort out the proper syntax to yield the plot I'm after. I understand how to determine the quantiles of the data, and I can use qqmath() to generate a plot which has the observed values on the y-axis, and the plot is based on a normal distribution, but beyond this I'm struggling. I do not have the R Graphics text by or Visualizing Data by Bill Cleveland but I have several other R books (Crawley, Ugarte et al, Braun/Murdoch, Rizzo, etc) but coverage of the lattice package seems light. I very much appreciate any help that could be offered. Thank you. Adam __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Example Plot for Recreation in R.pdf Description: Adobe PDF document
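Building on the base-graphics approach in this thread, the vertical reference lines could be placed on the theoretical (normal) axis rather than on the data axis. The following is a rough sketch, not a reproduction of the attached example plot: it assumes the percentiles are to be marked at qnorm() of the requested probabilities, uses only the first twenty posted observations as illustrative data, and writes to a temporary PDF so it also runs non-interactively.

```r
# Illustrative data only: the first twenty values from the posted file.
qqdata <- c(113, 116, 120, 120, 124, 114, 118, 117, 118, 119,
            116, 113, 118, 112, 129, 117, 118, 112, 114, 125)

probs <- c(0.01, 0.10, 0.20, 0.50, 0.80, 0.90, 0.99)
z     <- qnorm(probs)   # positions of the percentiles on the theoretical axis

# Draw to a temporary PDF file (an assumption made for portability).
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
qq <- qqnorm(qqdata, pch = 16, xaxt = "n", xlim = range(z) * 1.05,
             xlab = "Normal percentile", ylab = "Observed value")
qqline(qqdata)                                   # line through the quartiles
axis(1, at = z, labels = paste0(100 * probs, "%"))
abline(v = z, lty = "dotted")                    # vertical reference lines
dev.off()
```

Note that qqline() draws a line through the first and third quartiles, so it is a reference line rather than a fitted normal distribution in the least-squares sense. The observed values stay on the y-axis and are drawn as points, ties included.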
[R] Assistance with boot() Package
Good Evening R Community:

I believe I understand the basics of using the boot() bootstrap resampling function in the boot package. I have not had any trouble creating a boot object to which I apply the boot.ci() function to calculate one or all of the available confidence intervals. What I am not sure about is whether this set of functions can generate more than one confidence interval of one or all of the types available.

I have a large data set (n=133,456) from which I would like to remove random samples of different sizes and then calculate 95% confidence intervals for the mean, 10% trimmed mean and median. I would like to determine how often the confidence intervals generated by boot.ci() contain the mean, 10% trimmed mean and median of the large data set.

I have looked at some examples of using the boot() and boot.ci() functions to generate confidence intervals for the intercept and predictive variables from a regression model, but I have not been able to determine how I can generate more than one set of normal, basic, percentile and BCa confidence intervals using these two functions.

I am running R version 2.9.2 on an IBM T61 laptop. My OS is Win XP professional SP 3, and the machine has a 1.99 GHz processor with 2.99 GB of RAM. The version of the boot package I am running is 1.2-41.

Thanks in advance for taking the time to help me.

Adam
Re: [R] Assistance with boot() Package
Good Morning Prof. Ripley: Thanks very much for the help. I do have the Davison and Hinkley text but I continued to struggle with the proper syntax to make iteration work. The MASS package is already installed on my machine and I have the manual so I will check the example you mentioned. Thanks again. Adam From: Prof Brian Ripley rip...@stats.ox.ac.uk Cc: r-help@r-project.org Sent: Mon, January 4, 2010 1:46:51 AM Subject: Re: [R] Assistance with boot() Package On Sun, 3 Jan 2010, Adam Carr wrote: Good Evening R Community: I believe I understand the basics of using the boot() bootstrap resampling function in the boot() package. I have not had any trouble creating a boot.object to which I apply the boot.ci() function to calculate one or all of the available confidence intervals. What I am not sure about is if this set of functions can generate more than one confidence interval of one or all of the types available. I have a large data set (n=133,456) data set from which I would like to remove random samples of different sizes and then calculate 95% confidence intervals for the mean, 10% trimmed mean and median. I would like to determine how often the confidence intervals generated by boot.ci() contain the mean, 10% trimmed mean and median of the large data set. I have looked at some examples for using the boot() and boot.ci() functions to generate confidence intervals for the intercept and predictive variables from a regression model, but I do not, or cannot I suppose, determine how I can generate more than one set of normal, basic, percentile and BCa confidence intervals using these two functions. Well, it can be done easily. Studying the book for which 'boot' is support software would be a good start, but a hint is to look at the 'index' argument to boot.ci: basically boot() can be called with a 'statistic' which returns a vector, then boot.ci() called on each of the components of interest. There is an example in MASS (the book) on pp 225-6. 
I am running R version 2.9.2 on an IBM T61 laptop. My OS is Win XP professional SP 3, and the machine has a 1.99 GHz processor with 2.99 GB of RAM. The version of the boot package I am running is 1.2-41. Thanks in advance for taking the time to help me. Adam

PLEASE do -- no HTML mail for a start, and please update your R.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
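The hint about the 'index' argument can be illustrated roughly as follows: boot() is called once with a statistic that returns a vector, and boot.ci() is then called once per component. This is a sketch under assumptions, not the example from the book: the data are simulated, and the names `three_stats` and `ci_list` are made up for illustration.

```r
library(boot)  # recommended package shipped with R

set.seed(1)
x <- rexp(200, rate = 1 / 10)   # simulated, skewed stand-in data

# Statistic returning a vector: mean, 10% trimmed mean, median.
three_stats <- function(data, idx) {
  d <- data[idx]
  c(mean = mean(d), trimmed = mean(d, trim = 0.10), median = median(d))
}

b <- boot(x, statistic = three_stats, R = 999)

# One boot.ci() call per component of the statistic, selected with 'index'.
ci_list <- lapply(1:3, function(k) {
  boot.ci(b, conf = 0.95, type = c("norm", "basic", "perc"), index = k)
})
names(ci_list) <- names(b$t0)

ci_list$median$percent[4:5]   # lower and upper percentile limits for the median
```

Each element of `ci_list` is an ordinary boot.ci result, so the normal, basic and percentile limits for the mean, trimmed mean and median are all available from a single set of bootstrap replicates.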
[R] Boot() Package Question: Multiple Confidence Interval Output
Good Morning:

I posted an initial question a few days ago and I received some good advice from two R experts. I have re-examined the Davison-Hinkley text, paying close attention to the examples of boot() and boot.ci() in that text and the single example of a similar process in the MASS book (not the MASS package manual as I initially misunderstood). I think I understand how the stratified sampling works and how the index plays a role in the boot object created prior to running the boot.ci() function. The example in Davison-Hinkley on page 528 and continued on page 536 also uses an interesting additional argument to select subsamples from the data set for input to the boot() and boot.ci() functions. I admit I may be missing the connection between these examples and what I am trying to accomplish, so I apologize for the disconnect.

Here is example code from Kerby Shedden at the University of Michigan that generates multiple bootstrapped confidence intervals from a normal data set and then examines the multiple confidence intervals to determine how many contain the characteristic calculated from the source data set:

    ## Sample sizes.
    N = c(10,20,40,60)
    nrep = 1000   ## Number of simulation replications per sample size value.
    nboot = 1000  ## The number of bootstrap data sets.

    ## Coverage probabilities.
    CP = NULL

    for (j in 1:length(N)) {
      ## Keep track of how many times the interval covers the true value.
      nc = 0
      n = N[j]
      for (k in 1:nrep) {
        ## Simulate a data set.
        X = rnorm(n)
        ## Generate bootstrap data sets from X.
        ii = ceiling(n*runif(n*nboot))
        B = X[ii]
        B = array(B, c(nboot,n))
        ## Get the sample mean for each bootstrap data set.
        M = apply(B, 1, mean)
        M = sort(M)
        ## Get the confidence interval lower and upper bound.
        C = c(M[25], M[975])
        ## Check for coverage.
        if ( (C[1] < 0) & (C[2] > 0) ) { nc = nc+1 }
      }
      CP[j] = nc/nrep
    }

Viewing CP provides four proportions, one per sample size, giving the fraction of confidence intervals that contain the value of interest.
What I cannot determine how to do with boot() and boot.ci() is this: 1. Generate multiple bootstrap samples of univariate data using boot(). For example, five distinct sets of five hundred bootstrapped means. 2. Calculate one confidence interval for each distinct set of bootstrapped means using boot.ci(). If I restricted the type= to norm and basic the output would be five sets of norm and basic upper and lower confidence bounds based on the five bootstrapped data sets generated by boot(). I am running R version 2.9.2 on an IBM T61 laptop. My OS is Win XP professional SP 3, and the machine has a 1.99 GHz processor with 2.99 GB of RAM. The version of the boot() package I am running is 1.2-41. I wish I had better programming intuition because I realize that the path to obtain this kind of analysis may be obvious to many of you. I do very much appreciate any help I receive. Thanks again, Adam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
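The two steps listed above could be sketched with boot() and boot.ci() by drawing each sample, bootstrapping it, and keeping the interval limits in one row per sample. This is only an illustration under assumptions: the "large data set" is simulated here, the sizes (five samples of 50, 500 replicates each) are arbitrary, and `mean_stat` and `res` are made-up names.

```r
library(boot)

set.seed(2)
population <- rexp(10000, rate = 1 / 10)  # simulated stand-in for the large data set
true_mean  <- mean(population)

mean_stat <- function(data, idx) mean(data[idx])

n_sets <- 5      # number of distinct samples
n      <- 50     # size of each sample
R      <- 500    # bootstrap replicates per sample

# One row per sample: lower and upper limits of the norm and basic intervals.
res <- t(sapply(seq_len(n_sets), function(s) {
  smp <- sample(population, n)
  b   <- boot(smp, statistic = mean_stat, R = R)
  ci  <- boot.ci(b, conf = 0.95, type = c("norm", "basic"))
  c(norm_lo  = ci$normal[2], norm_hi  = ci$normal[3],
    basic_lo = ci$basic[4],  basic_hi = ci$basic[5])
}))

res <- as.data.frame(res)
res$norm_covers  <- res$norm_lo  <= true_mean & true_mean <= res$norm_hi
res$basic_covers <- res$basic_lo <= true_mean & true_mean <= res$basic_hi

res
sapply(res[c("norm_covers", "basic_covers")], mean)  # observed coverage proportions
```

Increasing `n_sets` turns the last line into an estimate of how often each interval type contains the value computed from the full data set; looping over several values of `n` would give the comparison across sample sizes.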
[R] Trouble Loading doBy and coin Packages
Good Evening R-Help Community: I have attached a file that contains the output from sessionInfo() and a summary of my Win XP system. I am running R 2.12.0 and using Tinn-R 2.3.6.2 as my interface. When I attempt to call either the doBy or coin packages R generates an error that I do not understand and have so far not been able to resolve by searching R resources. I exchanged a couple of emails with Soren Hojsgaard who does not think the doBy error is directly related to the package itself, and he suggested that I post this problem for input from others. When the doBy package is loaded, the following error appears in the Tinn-R log: Error in length(label) : could not find function .extendsForS3 Error: package/namespace load failed for 'doBy' When the coin package is called, this error appears in the Tinn-R log: Error in length(sig) : could not find function .extendsForS3 Error: package 'stats4' could not be loaded No functions in either package work, and when I attempt to call them the same errors are generated in the log. Any help or direction would be appreciated. 
Thanks very much,

Adam

sessionInfo() results:

R version 2.12.0 (2010-10-15)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grDevices datasets splines graphics stats tcltk utils
[8] methods base

other attached packages:
[1] mvtnorm_0.9-92 contrast_0.13 Design_2.3-0 svSocket_0.9-50
[5] TinnR_1.0.3 R2HTML_2.2 Hmisc_3.8-3 survival_2.36-1

loaded via a namespace (and not attached):
[1] cluster_1.12.3 grid_2.11.1 lattice_0.19-13 svMisc_0.9-60
[5] tools_2.11.1

System summary:

OS Name: Microsoft Windows XP Professional
Version: 5.1.2600 Service Pack 3 Build 2600
OS Manufacturer: Microsoft Corporation
System Name: LOR-LA200807011
System Manufacturer: LENOVO
System Model: 64635BU
System Type: X86-based PC
Processor: x86 Family 6 Model 15 Stepping 11 GenuineIntel ~1994 Mhz
BIOS Version/Date: LENOVO 7LETB7WW (2.17 ), 4/25/2008
SMBIOS Version: 2.4
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume1
Locale: United States
Hardware Abstraction Layer: Version = 5.1.2600.5512 (xpsp.080413-2111)
User Name: Adam_Carr
Time Zone: Eastern Standard Time
Total Physical Memory: 4,096.00 MB
Available Physical Memory: 1.99 GB
Total Virtual Memory: 2.00 GB
Available Virtual Memory: 1.96 GB
Page File Space: 8.69 GB
Page File: C:\pagefile.sys
Re: [R] Trouble Loading doBy and coin Packages
Hi Tal: No I have not tried this. I will do it this evening and we'll see what happens. Thanks for the suggestion. Adam From: Tal Galili tal.gal...@gmail.com Cc: r-help@r-project.org Sent: Thu, December 9, 2010 12:29:20 PM Subject: Re: [R] Trouble Loading doBy and coin Packages I Adam, Have you tried deleting the package files and then reinstalling them from a different CRAN mirror? Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- Good Evening R-Help Community: I have attached a file that contains the output from sessionInfo() and a summary of my Win XP system. I am running R 2.12.0 and using Tinn-R 2.3.6.2 as my interface. When I attempt to call either the doBy or coin packages R generates an error that I do not understand and have so far not been able to resolve by searching R resources. I exchanged a couple of emails with Soren Hojsgaard who does not think the doBy error is directly related to the package itself, and he suggested that I post this problem for input from others. When the doBy package is loaded, the following error appears in the Tinn-R log: Error in length(label) : could not find function .extendsForS3 Error: package/namespace load failed for 'doBy' When the coin package is called, this error appears in the Tinn-R log: Error in length(sig) : could not find function .extendsForS3 Error: package 'stats4' could not be loaded No functions in either package work, and when I attempt to call them the same errors are generated in the log. Any help or direction would be appreciated. Thanks very much, Adam __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[R] New Installs, Same Trouble Loading doBy and coin Packages
I tried Tal's suggestion of deleting the doBy and coin packages and then reinstalling them from a different mirror. The first install was from the Harvard mirror and the second was from the Case Western Univ. mirror. The new packages generate the same errors when I call them using the library() command. Also, I tried to load these packages using R and its script editor, thinking that the problem may have something to do with Tinn-R, but the same errors are generated on the R terminal when I use the library() function. Any help would be appreciated.

Again, the errors for these two packages:

    Error in length(label) : could not find function .extendsForS3
    Error: package/namespace load failed for 'doBy'

    library(coin)
    Loading required package: mvtnorm
    Loading required package: modeltools
    Loading required package: stats4   #This is odd. I cannot find any reference for this package. AC
    Error in length(sig) : could not find function .extendsForS3
    Error: package 'stats4' could not be loaded

- Forwarded Message
From: Adam Carr adamlc...@yahoo.com
To: Tal Galili tal.gal...@gmail.com
Cc: r-help@r-project.org
Sent: Thu, December 9, 2010 1:12:21 PM
Subject: Re: [R] Trouble Loading doBy and coin Packages

Hi Tal: No I have not tried this. I will do it this evening and we'll see what happens. Thanks for the suggestion. Adam

From: Tal Galili tal.gal...@gmail.com
Cc: r-help@r-project.org
Sent: Thu, December 9, 2010 12:29:20 PM
Subject: Re: [R] Trouble Loading doBy and coin Packages

Hi Adam, Have you tried deleting the package files and then reinstalling them from a different CRAN mirror? Tal

Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) --

Good Evening R-Help Community: I have attached a file that contains the output from sessionInfo() and a summary of my Win XP system. I am running R 2.12.0 and using Tinn-R 2.3.6.2 as my interface.
When I attempt to call either the doBy or coin packages, R generates an error that I do not understand and have so far not been able to resolve by searching R resources. I exchanged a couple of emails with Soren Hojsgaard, who does not think the doBy error is directly related to the package itself, and he suggested that I post this problem for input from others.

When the doBy package is loaded, the following error appears in the Tinn-R log:

    Error in length(label) : could not find function .extendsForS3
    Error: package/namespace load failed for 'doBy'

When the coin package is called, this error appears in the Tinn-R log:

    Error in length(sig) : could not find function .extendsForS3
    Error: package 'stats4' could not be loaded

No functions in either package work, and when I attempt to call them the same errors are generated in the log. Any help or direction would be appreciated.

Thanks very much,
Adam
[R] New R Install Worked to End Trouble Loading doBy and coin Packages
Hello Peter: The new R install seems to have worked. Both doBy and coin appear to load and run fine. Thanks for taking the time to help me.

Adam

From: Peter Ehlers ehl...@ucalgary.ca
Cc: r-help@r-project.org
Sent: Fri, December 10, 2010 7:13:17 AM
Subject: Re: [R] New Installs, Same Trouble Loading doBy and coin Packages

On 2010-12-10 03:43, Adam Carr wrote:

> I tried Tal's suggestion of deleting the doBy and coin packages and then reinstalling them from a different mirror. The first install was from the Harvard mirror and the second was from the Case Western Univ. mirror. The new packages generate the same errors when I call them using the library() command.
>
> Also, I tried to load these packages using R and its script editor, thinking that the problem may have something to do with Tinn-R, but the same errors are generated on the R terminal when I use the library() function. Any help would be appreciated.
>
> Again, the errors for these two packages:
>
>     Error in length(label) : could not find function .extendsForS3
>     Error: package/namespace load failed for 'doBy'
>
>     library(coin)
>     Loading required package: mvtnorm
>     Loading required package: modeltools
>     Loading required package: stats4    #This is odd. I cannot find any reference for this package. AC
>     Error in length(sig) : could not find function .extendsForS3
>     Error: package 'stats4' could not be loaded

I would remove and re-install R. 'stats4' is a base package and if that can't be loaded, your installation may be broken. Try

    require(stats4)

or

    help(package=stats4)

Peter Ehlers
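Peter's check can be extended slightly before going as far as a reinstall. The following is only a sketch, not something posted in the thread: the variable names `shipped` and `loads` are illustrative, and it assumes that packages distributed with R itself (here methods and stats4) should always load from a healthy installation.

```r
# Packages that ship with R itself; stats4 is the one named in the error.
shipped <- c("methods", "stats4")

# require() normally returns TRUE/FALSE rather than stopping,
# so each package is checked even if an earlier one fails.
loads <- vapply(
  shipped,
  function(p) suppressWarnings(require(p, character.only = TRUE, quietly = TRUE)),
  logical(1)
)
print(loads)

# Which R is running, and which library directories it searches.
print(R.version.string)
print(.libPaths())
```

If any entry of `loads` is FALSE, the problem lies with the R installation rather than with doBy or coin, which is consistent with a fresh install of R resolving it here.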
Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics
Neglected to reply to all. Sorry.

- Forwarded Message
From: Adam Carr adamlc...@yahoo.com
To: David Winsemius dwinsem...@comcast.net
Sent: Sat, January 1, 2011 8:58:26 AM
Subject: Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics

Hello David: Thanks for the reply and for the suggestion on an alternative character. I will try this today and see what happens.

As I searched for solutions to this, a more experienced graphics editor recommended a package called Xara Photo and Graphic Designer 6. This package, which has an open-source version for Linux, imported the PDF without any font interpretation difficulties. The text editing process required about thirty seconds.

Happy New Year,
Adam

From: David Winsemius dwinsem...@comcast.net
Cc: r-help@r-project.org
Sent: Thu, December 30, 2010 7:07:30 PM
Subject: Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics

You could try using the Symbol font's solid circle as pch, octmode 267, if I am reading the output from the TestChars function on the points help page correctly. BTW, I opened your document in GIMP and it shows q's as well.

--david.

On Dec 30, 2010, at 5:59 PM, Adam Carr wrote:

Good Evening: I am putting together a large report with plots created in R, V 2.12.0. Most of the plots are created using ggplot2 V0.8.9. I use R's pdf() command to export the plot to a pdf file. I am exporting the plots and attempting to edit the title text in Inkscape primarily because ggplot2 does not support superscript or subscript formatting in the title text. For the report I am working on these formats are essential.
I am running the R version mentioned above and Inkscape 0.48 on a Windows XP machine with the following system details:

OS Name: Microsoft Windows XP Professional
Version: 5.1.2600 Service Pack 3 Build 2600
System Type: X86-based PC
Processor: x86 Family 6 Model 15 Stepping 11 GenuineIntel ~1995 MHz
BIOS Version/Date: LENOVO 7LETB7WW (2.17 ), 4/25/2008
Total Physical Memory: 4,096.00 MB
Available Physical Memory: 1.62 GB
Total Virtual Memory: 2.00 GB
Available Virtual Memory: 1.96 GB
Page File Space: 8.69 GB

I do not think this is a ggplot2-specific problem. I use a simple version of the pdf() command to export the file that includes the file name and path only. The PDF actually looks fine; the problem is the restriction on text editing caused by Adobe's interpretation of the graphic.

I have attached two files to this email:

1. An R-exported pdf file exactly as it looks as opened in Adobe Reader V9. This file is named exportforinkscapeforum.pdf.
2. An example of the way the plot appears after I import it into Inkscape. This file is named Example of How Imported File Appears in Inkscape.pdf.

The problem I have is that when I import the pdf into Inkscape, the solid, filled circles on the plot are converted to the lower case letter q. I read about similar problems on R-help.org and other R-related sites, but the descriptions I found seemed to indicate that the lower case q was visible in the pdf file when opened with Adobe or other viewers. This does not seem to be my problem.

I posted this problem to the Inkscape forum and received a reply suggesting that Adobe is interpreting the solid, filled circles not as solid, filled circles but as font objects. The user who replied suggested that I look for the Zapf Dingbats font embedded in the PDF, and it is in fact there. This is the font Adobe is applying to my solid, filled circles. Apparently there are known issues with Inkscape's ability to import fonts via PDF, and the problem is documented on their bug list.
The Inkscape user asked if there was any way that R could be coerced to use actual circles or paths for the points. I am not aware of a way to do this, so any input from anyone here would be greatly appreciated.

To briefly return to my main problem: if there is another way to edit the main title text to include a superscripted character (in my particular case it is Unicode character 00AE, the registered trademark sign), I would appreciate the insight. Any help on this issue would be appreciated.

Adam

David Winsemius, MD
West Hartford, CT
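On the superscript question, one possible route that avoids editing the PDF afterwards is R's plotmath notation, in which `^` raises the text that follows it. This is a sketch under assumptions rather than an answer given in the thread: it uses base graphics, and "ProductName", the file name and the object names are placeholders. ggplot2 titles also accept plotmath expressions, but whether that behaves as needed in V0.8.9 is not confirmed here.

```r
# Write to a temporary file so the example leaves nothing behind in the working directory.
out <- file.path(tempdir(), "title-superscript.pdf")

# "\u00AE" is the registered trademark sign; ^ superscripts it, ~ adds a space.
ttl <- expression("ProductName"^"\u00AE" ~ "tensile strength")

pdf(out, width = 6, height = 4)
plot(1:10, pch = 16, main = ttl, xlab = "Sample", ylab = "Value")
dev.off()

print(file.exists(out))
```

Because the title is drawn by R, the superscript is part of the exported PDF and no text editing in Inkscape is needed for it.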
Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics
Hello Hadley: Thanks for the reply. My apologies for overlooking an easy fix. The symbols on the exported document look fine.

Adam

From: Hadley Wickham had...@rice.edu
Cc: r-help@r-project.org
Sent: Thu, December 30, 2010 7:51:09 PM
Subject: Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics

> The Inkscape user asked if there was any way that R could be coerced to use actual circles or paths for the points. I am not aware of a way to do this so any input from anyone here would be greatly appreciated.

    pdf(..., useDingbats = F)

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
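Spelled out as a complete call, Hadley's suggestion might look like the sketch below. The file name and plot are placeholders rather than anything from the original report, and `FALSE` is written in full because `F` is an ordinary variable that can be reassigned.

```r
# Temporary output path; replace with the real report file name.
out <- file.path(tempdir(), "points-as-paths.pdf")

# useDingbats = FALSE asks pdf() to draw small filled circles as paths
# instead of as characters from the ZapfDingbats font.
pdf(out, width = 6, height = 4, useDingbats = FALSE)
plot(1:10, pch = 16, main = "Filled circles without Dingbats")
dev.off()

print(file.exists(out))
```

The same `pdf()` call works for ggplot2 output: open the device, print the plot object, then call `dev.off()`.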
Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics
I am ashamed to admit that I missed the non-dingbat option. I too tried the alternative character approach and it seemed to work fine. I am using a rather straightforward set of colors and, as far as I could tell anyway, the colors appeared to be reproduced correctly. Thanks again for taking the time to help me.

From: David Winsemius dwinsem...@comcast.net
Cc: r-help@r-project.org
Sent: Sat, January 1, 2011 11:09:23 AM
Subject: Re: [R] pdf() Export Problem: Circles Interpreted as Fonts from ggplot2 Graphics

I thought Hadley's response was more definitive, but I did go on to test my alternate character strategy in ggplot and it did succeed. Whether you could get coloring or sizing that was appropriate I cannot say, since I figured the non-dingbatting option was more general.

--David