Re: [R] problem with read.table
On May 22, 2007, at 9:41 PM, Alex Tsoi wrote:

> Dear all,
>
> I am trying to use read.table to get the data from a tab-delimited file, and some of the data is shown below:
>
> [snip]
>
> and it means that whenever read.table reads ' , it skips the next line, until it reads ' again.
>
> Could anyone show me how to solve this kind of problem? I would greatly appreciate any suggestion. Thanks.

You might want to have a look at ?read.table for more details, but the following should do it:

    test <- read.table("data.txt", colClasses = "character", sep = "\t", quote = "\"")

Essentially, by default read.table sees both " and ' as quote delimiters. In your data, you only want " as a quote delimiter.

> Alex Tsoi

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
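A small self-contained sketch of the quote setting discussed above. The inline data and the column names `id` and `name` are made up for illustration; only the `quote = "\""` argument comes from the thread.

```r
# Hypothetical tab-delimited data containing apostrophes
txt <- "id\tname\n1\tO'Brien\n2\tSmith\n3\tD'Arcy\n"

# Treat only the double quote as a quote delimiter, so a lone '
# no longer swallows the following lines
good <- read.table(text = txt, header = TRUE, sep = "\t",
                   quote = "\"", colClasses = "character")
good
```

With the default `quote = "\"'"`, the two apostrophes would be paired up and the rows between them merged, which is the behaviour described in the question.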
Re: [R] Questions about bwplot
On May 21, 2007, at 8:39 AM, Klaus Nordhausen wrote:

> Dear R-experts,
>
> I have some questions about boxplots with lattice. My data is similar to the example below: I have two factors (Goodness of Fit and Algorithms) and data values, but the scales in the panels are quite different, so the normal boxplots produced by
>
>     set.seed(1)
>     GOF <- factor(rep(c("GOF1","GOF2","GOF3"), each=40))
>     Alg <- rep(factor(rep(c("A1","A2","A3","R1"), each=10)), 3)
>     Value <- c(runif(40), rnorm(40), rnorm(30,10,3), rnorm(10,20,3))
>     test.data <- data.frame(Alg=Alg, GOF=GOF, Value=Value)
>     library(lattice)
>     bwplot(Value ~ Alg | GOF, data = test.data, as.table=T, layout=c(1,3))
>
> are not very informative. Then I used
>
>     bwplot(Value ~ Alg | GOF, data = test.data, scale=list(relation="free"), as.table=T, layout=c(1,3))
>
> from which my first question arises: Is it possible to have no vertical space between the panels though they have different y-scales when using the argument scale=list(relation="free")?

Try this:

    bwplot(Value ~ Alg | GOF, data = test.data, scale=list(y=list(relation="free")), as.table=T, layout=c(1,3))

Sorry, I don't have any thoughts about your other two questions off the top of my head.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] how to convert a string vector to a numeric vector
On May 13, 2007, at 7:48 AM, Mihai Bisca wrote:

> Hello all,
>
> I'm new to R and I cannot find a simple answer to a simple question. If I have a character vector like v <- c('1/50','1/2','1/8'...) how can I convert it to a numeric vector like vn <- c(0.02,0.5,0.125...). I tried as.numeric in various ways and failed miserably. Currently I use a function like:
>
>     for (e in v) { if (e=='1/50') vn<-c(vn,0.02) ...}
>
> but that feels bad because it needs to be (humanly) modified every time a new fraction appears in v.

The problem is that as.numeric does not expect to see expressions that would need evaluation, like 1/50 above, but instead it expects to see numbers. Assuming the entries always have the form "a/b", with a and b numbers, you could use this:

    vn <- sapply(strsplit(v, "/"), function(x) as.numeric(x[1])/as.numeric(x[2]))

If your entries are allowed to be more general expressions, like "(1+5)/10" or simply "0.2" or whatnot, you could perhaps use:

    sapply(parse(text=v), eval)

But I prefer to avoid parse+eval whenever possible.

> Thanks in advance,
> --
> Mihai.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
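A runnable version of the strsplit approach on the three fractions from the question. Using `vapply` and `fixed = TRUE` is an editorial choice here, not something the thread prescribes; it only makes the result type explicit.

```r
v <- c("1/50", "1/2", "1/8")

# Split each "a/b" string at the slash, then divide numerator by denominator
parts <- strsplit(v, "/", fixed = TRUE)
vn <- vapply(parts,
             function(p) as.numeric(p[1]) / as.numeric(p[2]),
             numeric(1))
vn
```

Entries that are not of the form "a/b" (for example a plain "0.2") would give NA with this sketch, since the second piece is missing.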
Re: [R] sequential for loop
Hi Michael,

On Apr 20, 2007, at 12:31 AM, Michael Toews wrote:

> Hi all,
>
> I'm usually comfortable using the *apply functions for vectorizing loops in R. However, my particular problem now is using it in a sequential operation, which uses values evaluated in an offset of the loop vector. Here is my example using a for loop approach:
>
>     dat <- data.frame(year=rep(1970:1980,each=365), yday=1:365)
>     dat$value <- sin(dat$yday*2*pi/365) + rnorm(nrow(dat), sd=0.5)
>     dat$ca <- dat$cb <- 0 # consecutive above and below 0
>     for(n in 2:nrow(dat)){
>       if(dat$value[n] > 0)
>         dat$ca[n] <- dat$ca[n-1] + 1
>       else
>         dat$cb[n] <- dat$cb[n-1] + 1
>     }
>
> I'm inquiring if there is a straightforward way to vectorize this (or a similar example) in R, since it gets rather slow with larger data frames. If there is no straightforward method, no worries.

Would this do what you want:

    dat <- data.frame(year=rep(1970:1980,each=365), yday=1:365)
    dat$value <- sin(dat$yday*2*pi/365) + rnorm(nrow(dat), sd=0.5)
    positives <- dat$value > 0
    dat$ca <- cumsum(c(0, positives[-1]))
    dat$cb <- cumsum(c(0, !positives[-1]))

> Thanks in advance.
> +mt

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
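Note that the loop in the question resets each counter to 0 whenever the sign changes, while cumsum keeps a running total. If resetting counters are what is wanted, one possible sketch uses run lengths instead. The use of `rle` and `sequence` is a swapped-in technique, not part of the reply above, and `set.seed(1)` is added only to make the example repeatable.

```r
set.seed(1)
dat <- data.frame(year = rep(1970:1980, each = 365), yday = 1:365)
dat$value <- sin(dat$yday * 2 * pi / 365) + rnorm(nrow(dat), sd = 0.5)

# The loop starts at row 2, so row 1 always keeps the initial 0
positives <- dat$value[-1] > 0

# Position of each row within its run of equal signs: 1, 2, 3, ... then restart
run_pos <- sequence(rle(positives)$lengths)

dat$ca <- c(0, ifelse(positives, run_pos, 0))  # consecutive values above 0
dat$cb <- c(0, ifelse(positives, 0, run_pos))  # consecutive values not above 0
```

This assumes there are no NA values in `dat$value`; the original loop would stop with an error on an NA as well.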
Re: [R] Two sample t.test, order of comparisons
On Apr 18, 2007, at 8:46 AM, Helmut Schütz wrote:

> Dear group members,
>
> I want to compare response variables (logAUC) of two groups (treatment Test, Reference) of a subset (period == 1) in dataframe resp (below):
>
> [ snip ]
>
> The formula method of t.test
>
>     result <- t.test(logAUC ~ treatment, data = resp, subset = (period == 1), var.equal = FALSE, conf.level = 0.90)
>     result
>
> gives
>
>             Welch Two Sample t-test
>
>     data:  logAUC by treatment
>     t = 1.1123, df = 21.431, p-value = 0.2783
>     alternative hypothesis: true difference in means is not equal to 0
>     90 percent confidence interval:
>      -0.0973465  0.4542311
>     sample estimates:
>     mean in group Reference      mean in group Test
>                    3.562273                3.383831
>
> Now I'm interested in the confidence interval of Test - Reference rather than Reference - Test, which is what t.test gives. Do you know a more elegant way than the clumsy one I have tried?
>
>     as.numeric(exp(result$estimate[2]-result$estimate[1]))
>     as.numeric(exp(-result$conf.int[2]))
>     as.numeric(exp(-result$conf.int[1]))

First off, those three could probably be simplified slightly as:

    as.numeric(exp(-diff(result$estimate)))
    as.numeric(exp(-result$conf.int))

The simplest solution, I think, is to specify that resp$treatment should have the levels ordered in the way you like them, using this first:

    resp$treatment <- ordered(resp$treatment, levels=rev(levels(resp$treatment)))

Then the t.test will show things in the order you want them.

> Best regards,
> Helmut

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
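A sketch of the level-reversal idea on simulated data, since the poster's data frame was snipped. The data, the seed and the group sizes are invented for illustration, and a plain `factor()` is used here instead of `ordered()`; both determine the order in which t.test takes the groups.

```r
set.seed(42)
# Hypothetical stand-in for the snipped data frame 'resp'
resp <- data.frame(
  treatment = factor(rep(c("Reference", "Test"), each = 12)),
  logAUC    = c(rnorm(12, mean = 3.5, sd = 0.4),
                rnorm(12, mean = 3.4, sd = 0.4))
)

# Default level order is alphabetical: the interval is for Reference - Test
ref_minus_test <- t.test(logAUC ~ treatment, data = resp,
                         var.equal = FALSE, conf.level = 0.90)

# Reverse the levels: the interval is now for Test - Reference
resp$treatment <- factor(resp$treatment,
                         levels = rev(levels(resp$treatment)))
test_minus_ref <- t.test(logAUC ~ treatment, data = resp,
                         var.equal = FALSE, conf.level = 0.90)

ref_minus_test$conf.int
test_minus_ref$conf.int
```

The second interval is the first one with both the signs and the order of the endpoints flipped, which matches the manual negation in the question.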
Re: [R] Data Manipulation using R
On Apr 17, 2007, at 8:03 PM, Anup Nandialath wrote:

> Dear Friends,
>
> I have a data set with around 220,000 rows and 17 columns. One of the columns is an id variable which is grouped from 1000 through 9000. I need to perform the following operations.
>
> 1) Remove all the observations with id's between 6000 and 6999
>
> I tried using this method.
>
>     remdat1 <- subset(data, ID<6000)
>     remdat2 <- subset(data, ID>=7000)
>     donedat <- rbind(remdat1, remdat2)
>
> I checked the last and first entry and found that it did not have ID values 6000. Therefore I think that this might be correct, but is this the most efficient way of doing this?

The rbind is probably a bit unnecessary. I think all you are missing for both questions is the "or" operator, |. ( ?"|" ) Simply:

    donedat <- subset(data, ID < 6000 | ID >= 7000)

would do for this. Not sure about efficiency, but if the code is fast as it stands I wouldn't worry too much about it.

> 2) I need to remove observations within columns 3, 4, 6 and 8 when they are negative. For instance if the number in column 3 is -4, then I need to delete the entire observation. Can somebody help me with this too.

The following should do it (untested, not sure if it would handle NA's):

    toremove <- data[,3] < 0 | data[,4] < 0 | data[,6] < 0 | data[,8] < 0
    data[!toremove,]

If you want more columns than those 4, then we could perhaps look for a better line than the first line above.

> Thanks and Regards
> Anup

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
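One possible way to handle many columns and NA values at once is sketched below. The toy data frame `dat` and its column names are hypothetical, and treating NA as "not negative" is an assumption that may or may not suit the real data.

```r
# Hypothetical small data set standing in for the 220,000-row one
dat <- data.frame(
  ID = c(1000, 5999, 6000, 6500, 6999, 7000, 9000),
  a  = 1:7,
  x3 = c(1, -4, 2, 3, 4, NA, 5),
  x4 = c(0,  1, 1, 1, 1,  1, -1)
)
check_cols <- c("x3", "x4")   # columns that must not be negative

# Step 1: drop the IDs from 6000 through 6999
kept <- subset(dat, ID < 6000 | ID >= 7000)

# Step 2: count negative entries per row; NA is ignored (assumption)
has_negative <- rowSums(kept[, check_cols] < 0, na.rm = TRUE) > 0
result <- kept[!has_negative, ]
result
```

For the original question, `check_cols` would be `c(3, 4, 6, 8)`, since column positions work the same way as names here.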
Re: [R] Simplify simple code
On Apr 16, 2007, at 1:37 AM, Dong-hyun Oh wrote:

> Dear expeRts,
>
> I would like to simplify the following code.
>
>     youtput <- function(x1, x2){
>       n <- length(x1)
>       y <- vector(mode="numeric", length=n)
>       for(i in 1:n){
>         if(x1[i] >= 5 && x1[i] <= 10 && x2[i] >= 5 && x2[i] <= 10)
>           y[i] <- 0.631 * x1[i]^0.55 * x2[i]^0.65
>         if(x1[i] >= 10 && x1[i] <= 15 && x2[i] >= 5 && x2[i] <= 10)
>           y[i] <- 0.794 * x1[i]^0.45 * x2[i]^0.65
>         if(x1[i] >= 5 && x1[i] <= 10 && x2[i] >= 10 && x2[i] <= 15)
>           y[i] <- 1.259 * x1[i]^0.55 * x2[i]^0.35
>         if(x1[i] >= 10 && x1[i] <= 15 && x2[i] >= 10 && x2[i] <= 15)
>           y[i] <- 1.585 * x1[i]^0.45 * x2[i]^0.35
>       }
>       y
>     }
>
> Can anyone help me?

I hope someone comes up with something better, but here is one way:

    youtput <- function(x1, x2) {
      co1 <- matrix(c(0.631,0.794,1.259,1.585), c(2,2))
      co2 <- c(0.55,0.45)
      co3 <- c(0.65,0.35)
      p1 <- findInterval(x1, c(5,10,15))
      p2 <- findInterval(x2, c(5,10,15))
      return( diag(co1[p1,p2]) * x1^co2[p1] * x2^co3[p2] )
    }

It is not clear at all what you wanted to happen when x1 and/or x2 is not between 5 and 15, so I did not deal with those cases. The above command will choke in that case, and should be modified accordingly depending on what you want.

> Sincerely,
> ===
> Dong H. Oh
> Ph. D Candidate
> Techno-Economics and Policy Program
> College of Engineering, Seoul National University,
> Seoul, 151-050, Republic of Korea
> E-mail: [EMAIL PROTECTED]
> Mobile: +82-10-6877-2109
> Office: +82-2-880-9142
> Fax: +82-2-880-8389

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
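A variant sketch of the findInterval idea that avoids building the full `co1[p1, p2]` matrix by indexing with a two-column matrix. The name `youtput_vec`, the `rightmost.closed = TRUE` setting (so that exactly 15 counts as the second interval) and the input check are editorial assumptions, not part of the reply above.

```r
youtput_vec <- function(x1, x2) {
  # Assumption: inputs outside [5, 15] are rejected rather than guessed at
  stopifnot(all(x1 >= 5 & x1 <= 15), all(x2 >= 5 & x2 <= 15))

  co1 <- matrix(c(0.631, 0.794, 1.259, 1.585), nrow = 2)
  co2 <- c(0.55, 0.45)   # exponents for x1, by interval of x1
  co3 <- c(0.65, 0.35)   # exponents for x2, by interval of x2

  # 1 for [5, 10), 2 for [10, 15]
  p1 <- findInterval(x1, c(5, 10, 15), rightmost.closed = TRUE)
  p2 <- findInterval(x2, c(5, 10, 15), rightmost.closed = TRUE)

  # cbind(p1, p2) picks one coefficient per observation
  co1[cbind(p1, p2)] * x1^co2[p1] * x2^co3[p2]
}

youtput_vec(c(5, 7, 10, 12, 15), c(5, 10, 9, 15, 12))
```

Indexing with `cbind(p1, p2)` uses memory proportional to the number of observations, whereas `diag(co1[p1, p2])` first creates an n-by-n matrix.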
Re: [R] read.spss (package foreign) and SPSS 15.0 files
On Apr 16, 2007, at 10:41 AM, John Kane wrote:

> --- Charilaos Skiadas [EMAIL PROTECTED] wrote:
>
>> It is not an export option, it is a "save as" option. I don't have a 14 to check, but on a 15 I just go to File -> Save As, and change the "Save as type" field to Comma Delimited (*.csv). (I suppose tab delimited would be another option.) Then there are two check-boxes below the window that allow a bit further customizing; one of them is about using value labels where defined instead of data values.
>
> I'm now back on a machine with SPSS 14. No csv option that I can see. Perhaps an enhancement to v15.

I don't have a 14, but I did check a 13 today and you are correct, no csv option is there, which in my opinion is quite unacceptable for a statistical package that is on its 13/14th version. But there was an option for "Excel 97 and ...", and that seemed to allow using value labels instead of the values (again you have to check the corresponding box). So perhaps that would be an option.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] read.spss (package foreign) and SPSS 15.0 files
On Apr 14, 2007, at 7:06 AM, John Kane wrote:

> --- Adrian Dusa [EMAIL PROTECTED] wrote:
>
>> Charilaos Skiadas skiadas at hanover.edu writes:
>>
>>> [...] I save as csv format all the time, and it offers me a choice to use the labels instead of the corresponding numbers. So you shouldn't have to lose that labelling.
>>
>> This is interesting and I tried to do this as well; I don't have access to an SPSS 15 (only to version 14 for the moment) but I cannot find the option to save as CSV. Is it a version 15 feature?
>>
>> Thanks,
>> Adrian
>
> I cannot remember if I have been using 14 or 14, I think it was 14 and I'm not near the machine to check. There does not seem to be a csv export in 14 but it looks like you can achieve the same thing by using one of the Excel outputs and then dumping the file from there.

It is not an export option, it is a "save as" option. I don't have a 14 to check, but on a 15 I just go to File -> Save As, and change the "Save as type" field to Comma Delimited (*.csv). (I suppose tab delimited would be another option.) Then there are two check-boxes below the window that allow a bit further customizing; one of them is about using value labels where defined instead of data values.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Getting and using a function
On Apr 12, 2007, at 11:28 PM, Thomas L Jones wrote:

>     library (gam)
>
> all I get is an error message.
>
>     Error in library (gam) : there is no package called 'gam'
>
> Well, does this mean what it says, or does it mean something different? For example, does it mean that such-and-such computer program has not yet been downloaded?

It means there is no package called 'gam' on your computer at this moment. You need to download it first. You can probably do this through the application menus. I am on a Mac, but the menus should be similar. I have a "Packages & Data" menu that has a "Package Installer" item. Alternatively, I think you can use the install.packages command:

    install.packages("gam")

R has a bit of a learning curve, so you'll probably want to read some of the basic guides first, if you haven't yet. Have a look at these:

http://www.math.csi.cuny.edu/Statistics/R/simpleR/
http://cran.r-project.org/doc/manuals/R-intro.html
http://cran.r-project.org/other-docs.html
http://cran.r-project.org/manuals.html

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] if/else construct
Corinna,

On Apr 13, 2007, at 8:19 AM, Schmitt, Corinna wrote:

> Dear R-Experts,
>
> Since Monday I have been trying to write the right if/else construct for the program. I have not been successful yet. The order of the if-cases is important! The code runs with the 2 if-cases. An if/else construction would look better, so can anyone help me with the right if/else construction?

There are three possible values for deletingDecision:

1) "y"
2) "n"
3) Something else

If you are going to use an if/else construct, you had better make sure you know what you want to do with the third option. In my opinion, you would want it to be the same as 2, in which case you really want the deletingDecision == yes test first, and the other option in the else clause. Looking at ?"if", it should look something like this:

    if (deletingDecision == yes) {
      print("Yes!")
    } else {
      print("Not yes!")
    }

Or if you really want a third option:

    if (deletingDecision == yes) {
      print("Yes!")
    } else if (deletingDecision == no) {
      print("No!")
    } else {
      print("Other!")
    }

> Thanks, Corinna
>
> Program:
>
>     deletingDecision = userInput()
>     yes <- c("y")
>     no <- c("n")
>     noAnswer <- c("Current R workspace was not deleted!")
>     # first if
>     if (deletingDecision == no) {
>       print("Current R workspace was not deleted!")
>     }
>     # second if
>     if (deletingDecision == yes) {
>       rm(list=ls(all=TRUE))
>       print("Current R workspace was deleted!")
>     }

Hope this helps.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
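The three-way branch above can be wrapped in a small function to try out each case without touching the workspace. The function name `describe_decision` and the returned messages are hypothetical; the actual `rm(list = ls(all = TRUE))` call is deliberately left out of this sketch.

```r
describe_decision <- function(deletingDecision) {
  yes <- "y"
  no  <- "n"
  if (identical(deletingDecision, yes)) {
    # this is where rm(list = ls(all = TRUE)) would go
    "Current R workspace was deleted!"
  } else if (identical(deletingDecision, no)) {
    "Current R workspace was not deleted!"
  } else {
    # anything other than "y" or "n"
    "Unrecognised answer, workspace left untouched."
  }
}

describe_decision("y")
describe_decision("n")
describe_decision("maybe")
```

`identical()` is used instead of `==` so that an unexpected input such as `NA` or a vector of length two falls through to the last branch rather than raising an error in `if`.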
Re: [R] graphics question: tilted axis labels?
Your problem is different I think: it's the fact that LA$countries is a factor, and hence you see the factor levels instead of their labels. Try:

    # create data frame
    LA <- data.frame(countries=c("Chile", "Peru", "Bolivia"), values=c(10, 12, 13), stringsAsFactors = FALSE)
    # call barplot
    barplot(LA$values, names.arg=LA$countries)

On Apr 13, 2007, at 9:02 AM, Christoph Heibl wrote:

> I'm sorry, I did not provide any code. Here is a small example:
>
>     # create data frame
>     LA <- data.frame(countries=c("Chile", "Peru", "Bolivia"), values=c(10, 12, 13))
>     # call barplot
>     barplot(LA$values, names.arg=LA$countries)
>     # Country names are not plotted, but their index numbers instead.
>
> So again the question: How can I tilt the angles in order to make whole names fit?
>
> Thanks
> Christoph

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] graphics question: tilted axis labels?
On Apr 13, 2007, at 9:52 AM, Christoph Heibl wrote:

> Dear Charilaos,
>
> Thanks ... you were right. I now get the names. But the problem remains that the space (30 items) is insufficient to bear all the names and I am still looking for a way to accommodate them. Do you know of any solution?

Frankly, if you have a barplot with 30 items, I would rethink the situation if I were you. As an audience, I would find it hard to process such a graph. But it might just be me.

I personally think that tilting them 45, or even 90 degrees is not a very good idea presentation-wise, and opt instead to have the barplots be horizontal when something like this happens ( barplot(..., horiz=TRUE) ).

If you look at ?par, you'll find the options crt and srt, which don't seem to work on the axes, and also have a big warning about not expecting a 45 degree tilt to always work. You can use las to turn the labels 90 degrees if you really want that.

I think lattice and grid would perhaps allow you to do exactly what you want, though it might be somewhat more work.

Sorry, perhaps I was more critical than helpful. Best of luck with it.

PS: Why do drawing commands have different names for the "horizontal" attribute?
boxplot -> horizontal
barplot -> horiz

> Cheers,
> Christoph
>
> On 13.04.2007, at 15:27, Charilaos Skiadas wrote:
>
>>     # create data frame
>>     LA <- data.frame(countries=c("Chile", "Peru", "Bolivia"), values=c(10, 12, 13), stringsAsFactors = FALSE)
>>     # call barplot
>>     barplot(LA$values, names.arg=LA$countries)

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
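Two sketches for long labels, using the three-country example from the thread. The first is the horizontal layout suggested above. The second draws the labels separately with text() and srt, which is a commonly used base-graphics workaround rather than something proposed in the reply; the vertical offset of 0.3 is an arbitrary value chosen for this data.

```r
countries <- c("Chile", "Peru", "Bolivia")
values    <- c(10, 12, 13)

# Option 1: horizontal bars, with labels written horizontally (las = 1)
barplot(values, names.arg = countries, horiz = TRUE, las = 1)

# Option 2: vertical bars, labels drawn by hand at a 45 degree angle
mids <- barplot(values)            # barplot returns the bar midpoints
text(x = mids,
     y = par("usr")[3] - 0.3,      # just below the bottom of the plot region
     labels = countries,
     srt = 45, adj = 1,            # rotate, right-align at the anchor point
     xpd = TRUE)                   # allow drawing outside the plot region
```

With 30 long names, option 2 will probably also need a larger bottom margin, for example via `par(mar = ...)`.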
Re: [R] problems in loading MASS
Weiwei,

I have never had much success installing packages from within R.app on MacOSX, because the location where it is supposed to save things, /Library/Frameworks/, needs elevated privileges, which the app doesn't seem to try to get. So at best it ends up saving the package in some temporary location, and it has to be downloaded again the next time R is restarted. As a result, I have always downloaded the tgz file from my browser, then gone to that folder in the terminal and run:

    sudo R CMD INSTALL packagename.tgz

But perhaps I am doing something wrong and one can do this properly from within R.app. I would love to be wrong on this one.

    sessionInfo()
    R version 2.4.1 (2006-12-18)
    powerpc-apple-darwin8.8.0

    locale:
    en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

On Apr 12, 2007, at 6:04 PM, Weiwei Shi wrote:

> Hi, there:
>
> After I upgraded my R to 2.4.1, I tried to use MASS for the first time and got the following error message:
>
>     install.packages("MASS")
>     --- Please select a CRAN mirror for use in this session ---
>     trying URL 'http://cran.cnr.Berkeley.edu/bin/macosx/universal/contrib/2.4/VR_7.2-33.tgz'
>     Content type 'application/x-gzip' length 995260 bytes
>     opened URL
>     ==
>     downloaded 971Kb
>
>     The downloaded packages are in
>     /tmp/RtmpmAzBwa/downloaded_packages
>
>     library(MASS)
>     Error in dyn.load(x, as.logical(local), as.logical(now)) :
>       unable to load shared library '/Library/Frameworks/R.framework/Versions/2.4/Resources/library/MASS/libs/i386/MASS.so':
>       dlopen(/Library/Frameworks/R.framework/Versions/2.4/Resources/library/MASS/libs/i386/MASS.so, 6): Library not loaded: /usr/local/gcc4.0/i686-apple-darwin8/lib/libgcc_s.1.0.dylib
>       Referenced from: /Library/Frameworks/R.framework/Versions/2.4/Resources/library/MASS/libs/i386/MASS.so
>       Reason: image not found
>     Error: package/namespace load failed for 'MASS'
>
>     sessionInfo()
>     R version 2.4.1 (2006-12-18)
>     i386-apple-darwin8.8.1
>
>     locale:
>     en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
>     attached base packages:
>     [1] stats graphics grDevices utils datasets methods base
>
>     other attached packages:
>     randomForest        dprep
>           4.5-18          1.0
>
>     version
>                    _
>     platform       i386-apple-darwin8.8.1
>     arch           i386
>     os             darwin8.8.1
>     system         i386, darwin8.8.1
>     status
>     major          2
>     minor          4.1
>     year           2006
>     month          12
>     day            18
>     svn rev        40228
>     language       R
>     version.string R version 2.4.1 (2006-12-18)
>
> Thanks
>
> --
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
Re: [R] Boxplot names format
On Apr 11, 2007, at 5:31 AM, Jose Sierra wrote:

> Thank you very much Peter, it runs.
>
> Peter Danenberg wrote:
>
>>> I create a boxplot but the names are too long and I can't see them completely.
>>
>> If you're referring to labels on the x-axis, Jose, I'll sometimes rotate them and increase the bottom margin:

I personally prefer to just turn the boxplot horizontally, and use las=1 to display the labels on the y axis horizontally. Makes them easier to read in my opinion. Or perhaps I am committing a faux pas? Are boxplots considered harder to read if they are horizontal?

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

PS: Please don't hijack threads. Your original email was in response to a message called "Rserve and R to R communication" by Ramon Diaz-Uriarte. (Point 4 of the "Technical details of posting" section of the posting guide: http://www.r-project.org/posting-guide.html).

PS2: To whoever is responsible for the posting guide: The link in the above mentioned section referring to "General Instructions" is missing the .html, and is sending people to the non-existent http://www.r-project.org/mail#instructions (or is it perhaps just my browser?)
Re: [R] Reasons to Use R
A new fortune candidate perhaps?

On Apr 10, 2007, at 6:27 PM, Greg Snow wrote:

> Remember, everything is better than everything else given the right comparison.
>
> --
> Gregory (Greg) L. Snow Ph.D.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Memory management
Before you go down that road, I would recommend first seeing if it is really a problem. Premature code optimization is in my opinion never a good idea.

Also, reading the Details section of ?attach you will find this:

    The database is not actually attached. Rather, a new environment is created on the search path and the elements of a list (including columns of a data frame) or objects in a save file or an environment are copied into the new environment. If you use <<- or assign to assign to an attached database, you only alter the attached copy, not the original object. (Normal assignment will place a modified version in the user's workspace: see the examples.) For this reason attach can lead to confusion.

So in fact it is the attaching that has to do copying, not the other way around.

As for references, perhaps there is a better one, but searching for "pass" in Writing R Extensions I found the following on page 41:

    Some memory allocation is obvious in interpreted code, for example, y <- x + 1 allocates memory for a new vector y. Other memory allocation is less obvious and occurs because R is forced to make good on its promise of 'call-by-value' argument passing. When an argument is passed to a function it is not immediately copied. Copying occurs (if necessary) only when the argument is modified. This can lead to surprising memory use.

Perhaps a better source is section 4.3.3 of The R language definition, on Argument Evaluation.

On Apr 11, 2007, at 8:25 AM, yoo wrote:

> I guess I have more reading to do. Is there any website where I can read up on memory management, or specifically on what happens when we 'pass in' variables, and which strategy is better in which situation?
>
> Thanks~
> - y
>
> On Tue, 10 Apr 2007, yoo wrote:
>
>> Hi all, I'm just curious how memory management works in R... I need to run an optimization that keeps calling the same function with a large set of parameters... so then I start to wonder if it's better if I attach the variables first vs passing them in (coz that involves a lot of copying..)

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
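The call-by-value behaviour quoted above can be seen in a tiny sketch. The function name `add_one` is made up for illustration; the example only shows that the caller's object is left unchanged, not when exactly R performs the copy internally.

```r
add_one <- function(v) {
  # Modifying the argument changes only the function's own copy
  v[1] <- v[1] + 1
  v
}

x <- c(1, 2, 3)
y <- add_one(x)

x   # unchanged in the caller
y   # the modified copy returned by the function
```

A function that only reads its argument, as an objective function in an optimization typically does with its data, never triggers that copy, which is why passing large objects in is usually not a concern.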
Re: [R] Boxplot with quartiles generated from different algorithms
On Apr 11, 2007, at 7:59 AM, Pietrzykowski, Matthew (GE, Research) wrote:

> R users:
>
> I am trying to replicate the boxplot output I achieve with Minitab in R. I realize that R gives the user many more options on the algorithm used to calculate the IQR than Minitab, so I concentrated on type=6 when using the quantile() function in R. The problem I am having is setting the upper and lower limit of the whisker based on the nearest actual data that should be included. If the last sentence is unclear: setting the boxplot$stats rows 1 and 5 to the right values based on the IQR from the type=6 setting of the quantile function. Is there an easy way to do this for a data frame or matrix?

Seeing as no one else answered this (at least not on-list), I think I'll give it a go. If I understand your question correctly, you know how to find the values you want for boxplot$stats rows 1 and 5, and your question is how to get boxplot to accept them. If so, you should be able to simply do the following three steps:

    pl <- boxplot(...)
    pl$stats[1] <- ...
    bxp(pl)

I suppose the question that remains then is whether you can do this by a single direct call to boxplot. I had this question a couple of months ago, because I wanted to make the output of boxplot be what my students were expecting from what they had learned (Moore & McCabe), and wasn't able to find an answer. I'd love to find out if there is one.

> Many thanks,
> Matt

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
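A sketch of the three steps for a single numeric vector. The simulated data and the whisker rule (most extreme data points within 1.5 IQR of the type 6 quartiles) are assumptions made for illustration; whether this matches Minitab's output exactly is not verified here.

```r
set.seed(1)
x <- c(rnorm(50), 4, -5)          # hypothetical data with two extreme values

# Step 1: let boxplot compute its default statistics without drawing
pl <- boxplot(x, plot = FALSE)

# Step 2: replace them with quartiles from quantile(type = 6)
q   <- quantile(x, c(0.25, 0.5, 0.75), type = 6)
iqr <- q[3] - q[1]
lo  <- min(x[x >= q[1] - 1.5 * iqr])   # lower whisker: nearest actual data
hi  <- max(x[x <= q[3] + 1.5 * iqr])   # upper whisker: nearest actual data
pl$stats[, 1] <- c(lo, q, hi)

# Keep the outlier list consistent with the new whiskers
pl$out   <- x[x < lo | x > hi]
pl$group <- rep(1, length(pl$out))

# Step 3: draw from the modified statistics
bxp(pl)
```

For a data frame or matrix, the same replacement could be repeated for each column of `pl$stats`, with `pl$group` recording which column each outlier belongs to.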
Re: [R] Reasons to Use R
On Apr 9, 2007, at 1:45 PM, Greg Snow wrote:

> The licences keep changing, some have in the past but don't now, some you can get an additional licence for home at a discounted price. Some it depends on the type of licence you have at work (currently our SAS licence is such that the 3 people in my group can all have it installed, but at most 1 can be using it at any 1 time, how does that affect installing/using it at home).

Hm, this intrigues me. It would seem to me that the only way for SAS to check that only one of your colleagues uses it at any given time would be to contact some sort of online server. Does that mean that SAS can only be run when you have internet access? Or is it simply a clause in the license, without any runtime checks?

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Computing the rank of a matrix.
On Apr 6, 2007, at 7:39 AM, José Luis Aznarte M. wrote:

> Hi! Maybe this is a silly question, but I need the column rank (http://en.wikipedia.org/wiki/Rank_matrix) of a matrix and the R function 'rank()' only gives me the ordering of the elements of my matrix. How can I compute the column rank of a matrix? Is there not an R equivalent to Matlab's 'rank()'? I've been browsing for a time now and I can't find anything, so any help will be greatly appreciated. Best regards!

Surprisingly, a Google search for "r matrix rank" actually returns an R link:

http://tolstoy.newcastle.edu.au/R/help/05/05/4000.html

I suppose the point is that in R you usually need a bit more than just the rank, so instead you want an object that contains all that info and more. Like we have the various lm objects, so to speak. They do the hard work once, and then we can ask them more particular questions.

?qr

> --
> Jose Luis Aznarte M.            http://decsai.ugr.es/~jlaznarte
> Department of Computer Science and Artificial Intelligence
> Universidad de Granada          Tel. +34 - 958 - 24 04 67
> GRANADA (Spain)                 Fax: +34 - 958 - 24 00 79

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
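A minimal illustration of what ?qr offers for this question. The 3-by-3 example matrix is chosen here because its columns are linearly dependent; the rank reported by qr() depends on its numerical tolerance `tol`, so for nearly singular matrices the answer is a judgment call.

```r
# Columns (1,2,3), (4,5,6), (7,8,9): the third is 2 * second - first
m <- matrix(1:9, nrow = 3)

decomp <- qr(m)            # QR decomposition object, does the hard work once
matrix_rank <- decomp$rank # numerical rank estimated during the decomposition
matrix_rank
```

The same `decomp` object can then be reused with helpers such as `qr.Q()` and `qr.R()`, which is the "ask more particular questions" pattern mentioned above.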
Re: [R] read.spss (package foreign) and SPSS 15.0 files
On Apr 6, 2007, at 12:32 PM, John Kane wrote:

> I have simply moved to exporting the SPSS file to a delimited file and loading it. Unfortunately I'm losing all the labelling which can be time-consuming to redo. Some of the data has something like 10 categories for a variable.

I save as csv format all the time, and it offers me a choice to use the labels instead of the corresponding numbers. So you shouldn't have to lose that labelling.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] reading of a matrix
On Apr 5, 2007, at 6:50 AM, Schmitt, Corinna wrote:

> Dear R-experts,
> I still have problems with the reading of a matrix.
>
> Input: matrixData6.txt
>
>       A-Paar B-Paar C-Paar D-Paar E-Paar
>     A 1 3 5 7 9
>     B 2 4 6 8 10
>
> R-commands:
>
>     y=read.table(file="Z:/Software/R-Programme/matrixData6.txt")
>     y
>
> Result:
>
>       A.Paar B.Paar C.Paar D.Paar E.Paar
>     A 1 3 5 7 9
>     B 2 4 6 8 10
>
> If you look into the txt-file you can recognize that the column names are not the same. Why?

Look at ?read.table for details. Basically, the variable names are turned into syntactically valid names via make.names (?make.names). This is, I presume, so that you can later refer to them via df$A.Paar (df$A-Paar means something very different: df$A minus Paar). You can try adding check.names=FALSE to the read.table call, though I'm not sure it will do what you want.

> If I add in the txt-file the line "MyData:" in front of everything, followed by a newline, the R-command above responds with an error. How can I read the modified input and get the following result:

Hm, if you really want the MyData to show up in the result, then you will have to do some more hard work, since data frames don't really have room for that. But if you simply want "MyData:" to show up in the text file but not be read by R, then you would want to prepend the line with the comment character, #.

>     MyData:
>       A.Paar B.Paar C.Paar D.Paar E.Paar
>     A 1 3 5 7 9
>     B 2 4 6 8 10
>
> Another stupid question might be how I can change the column and row names after I made read.table? I want to have the following result, for example:

    names(yourdf) <- c("G","H","I","J","K")

or perhaps better:

    names(yourdf) <- LETTERS[6+1:5]

That's for the columns. Use ?rownames for the rows.

>     MyData:
>       G H I J K
>     M 1 3 5 7 9
>     N 2 4 6 8 10
>
> I studied all manuals I could find and the help but could not understand the examples or interpret it right for my case.
> Thanks for help, Corinna

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
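The three suggestions in the reply above (check.names=FALSE, a # comment line, and renaming afterwards) can be tried together in a minimal sketch. It assumes the file contents are pasted into a string called txt (a stand-in for matrixData6.txt) and that the text= argument of read.table is available, which it is in current versions of R; y1, y2 and y3 are illustrative names only.

```r
# Hypothetical in-memory copy of matrixData6.txt, with "MyData:" turned
# into a comment line so that read.table skips it.
txt <- "# MyData:
A-Paar B-Paar C-Paar D-Paar E-Paar
A 1 3 5 7 9
B 2 4 6 8 10"

# Default behaviour: header names pass through make.names(),
# so "A-Paar" becomes "A.Paar".
y1 <- read.table(text = txt)
names(y1)

# check.names = FALSE keeps the header exactly as written in the file.
y2 <- read.table(text = txt, check.names = FALSE)
names(y2)
y2$`A-Paar`          # non-syntactic names need backticks

# Renaming columns and rows after reading.
y3 <- y2
names(y3) <- LETTERS[6 + 1:5]    # G, H, I, J, K
rownames(y3) <- c("M", "N")
y3
```

The header is detected automatically here because the first non-comment line has one field fewer than the data lines, which also makes the first column the row names.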
Re: [R] Annotate a levelplot (using abline) - Difficulty with trellis.
On Apr 4, 2007, at 7:19 AM, Dan Bolser wrote:

> Hi,
> I am generating a beautiful plot with the 'levelplot' function over my square matrix of data. In order to help visualise the data I would like to draw a diagonal line on the matrix. Because the plot is actually a trellis object, I am having difficulty working out how to do this.
>
> I have been reading around, but I don't see any easy solution to the problem. (Most of the docs I have found are not of the type 'how to do it' but more like 'how to grok it'). After spending 1+ hour reading and trying various things I figure it's time to ask some people who know ;-)
>
> So far I have the following (which almost works!)...
>
>     levelplot( our.data, plot.xy = (abline(0,1,col="white")) )

1) Please always provide a reproducible example.
2) Normal drawing commands, like abline, can't be used in trellis graphics, and vice versa.
3) Look into ?panel.functions, in particular panel.abline. My understanding is that this is how you customize a graph: you provide your own panel function, which calls other panel functions or direct grid drawing commands.

Hope this helps.

> However the coordinate system / plot area being used are clearly not those of the square matrix. I guess I should point out that the axes of 'our.data' (the row and column names of the square matrix) are ordered categories of the form seq(2,9,0.5)
>
> Thanks for any help!
> Dan.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
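A custom panel function of the kind point 3 describes might look like the following sketch. The matrix our.data here is made-up stand-in data (the original was never posted), with row and column labels taken from seq(2, 9, 0.5) as the question describes; lattice is assumed to be installed.

```r
library(lattice)

# Hypothetical stand-in for 'our.data': a square matrix whose row and
# column names are the ordered categories seq(2, 9, 0.5).
lev <- seq(2, 9, 0.5)
our.data <- outer(lev, lev, function(a, b) -abs(a - b))
dimnames(our.data) <- list(lev, lev)

# The panel function first draws the standard levelplot, then overlays
# the diagonal using the lattice counterpart of abline().
p <- levelplot(our.data,
               panel = function(...) {
                 panel.levelplot(...)
                 panel.abline(0, 1, col = "white")
               })

# print(p)   # or type p at the prompt to draw the plot
```

For a matrix, levelplot uses the row and column indices as coordinates, so a line with intercept 0 and slope 1 runs through the diagonal cells.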
Re: [R] Reading user input
I think the problem the OP may be having is that the code will not work if you put it in a document window in R and then tell R to source the document. At least not with R.app in MacOSX. This is what happened when I did it:

    > source("/tmp/Rtmp0TQfA6/file10d63af1")
    enter the number of groups: unlink("/tmp/Rtmp0TQfA6/file10d63af1")

Then the value of ANSWER is:

    > ANSWER
    [1] "unlink(\"/tmp/Rtmp0TQfA6/file10d63af1\")"

If instead you select the lines and select the Execute option, then it does the right thing. (These options appear under the Edit menu in R.app; I don't know about other platforms.)

On Apr 4, 2007, at 7:36 AM, jim holtman wrote:

> That is exactly what the code does.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] substitute NA values
On Mar 30, 2007, at 10:56 AM, Gavin Simpson wrote:

> This is R, there is always a way (paraphrasing an R-Helper the name of whom I forget just now).

Can't resist, it's one of my favorite fortunes ;) That would be Simon 'Yoda' Blomberg:

    library(fortunes)
    fortune(109)

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] E-Mail/Post Threading (was: Bonferroni p-value greater t
On Mar 29, 2007, at 3:21 PM, Marc Schwartz wrote:

> Since most e-mail systems (list managers, MUA's, etc.) thread based upon the headers and not the subject, as described in the above references, unless you generate a completely new e-mail, your reply will be linked to the e-mail and thread to which you are replying.
>
> It's pretty much a dichotomous situation. Use 'reply' and you get linked to the old thread. Use a 'new' e-mail and you start a new thread.
>
> If you are truly moving in a new direction, I would be tempted to start a new thread and perhaps to make it easier for readers, include a reference/link to the post in question. That way, you keep your new e-mail in a separate thread, while 'virtually' linking it back to the original that raised your interest.

Perhaps moving a bit off topic here, but to elaborate a bit more on this: each thread is really a tree, where your message is a child of the message you responded to. Since typically each person responds to the last message in the thread, this often ends up being linear. But if, for instance, three people respond to the same original message, this creates three children of the root node. You can see this in action here, for instance: http://news.gmane.org/gmane.emacs.ess.general

It then depends on your software how this tree is shown. Most mail clients would just flatten it out into a single list, which is what we usually refer to as a thread, I guess. But the richer structure is there.

So based on this I would suggest simply responding to the message you want to, changing the subject appropriately.

> HTH,
> Marc

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Flipping a vector
On Mar 17, 2007, at 10:52 PM, [EMAIL PROTECTED] wrote:

> Hi all - A stupid question here, my apology. I would like to know how can I flip a vector in R? For example, I have a vector:
>
>     a = c(1,2,3)
>
> I would like my vector b to have the following value
>
>     b = c(1,2,3)
>
> But what operator I need to put to my original vector 'a' to obtain 'b'? Please let me know. Thank you.

I can only assume you wanted b=c(3,2,1). In that case, try rev(a)

> - adschai

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Rownames are always character?
On Mar 16, 2007, at 6:26 PM, Mike Prager wrote:

> Mike Prager [EMAIL PROTECTED] wrote:
>
>> Gurus,
>> Can I rely on the rownames() function, when applied to a matrix, always returning either NULL or an object of type character? It seems that row names can be entered as integers, but as of now (R 2.4.1 on Windows) the function returns character vectors, not numbers (which is my desired result).
>
> (To clarify my point on this Friday afternoon: the observed behavior is my desired result. I'm just asking, can I count on it?)

I would venture to guess that rownames() would always return something that you would then be able to use for indexing, to retrieve particular entries. The help page also implies that the return value will always be a character vector, or NULL:

    If do.NULL is FALSE, a character vector (of length NROW(x) or NCOL(x)) is returned in any case, prepending prefix to simple numbers, if there are no dimnames or the corresponding component of the dimnames is NULL.

I would think you can count on this about as much as you can count on the sum function to always add up its arguments, or something of that sort.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
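A quick check of the behaviour discussed above, as a sketch with a made-up matrix m: row names assigned as numbers come back as character strings, and the returned names can be used directly for indexing.

```r
m <- matrix(1:6, nrow = 2)
rownames(m)                  # NULL: no dimnames have been set yet

rownames(m) <- c(10, 20)     # numbers are coerced to character on assignment
rn <- rownames(m)
is.character(rn)
m[rn[2], ]                   # the returned name works as an index

# With do.NULL = FALSE a character vector is produced even when
# the matrix has no row names at all.
generated <- rownames(matrix(1:6, nrow = 2), do.NULL = FALSE, prefix = "row")
generated
```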
Re: [R] Density estimation graphs
On Mar 15, 2007, at 12:37 PM, Mark Wardle wrote:

> Dear all,
> I'm struggling with a plot and would value any help! I'm attempting to highlight a histogram and density plot to show a proportion of cases above a threshold value. I wanted to cross-hatch the area below the density curve. The breaks and bandwidth are deliberate integer values because of the type of data I'm looking at.
>
> I've managed to do this, but I don't think it is very good! It would be difficult, for example, to do a cross-hatch using this technique.

Don't know about a cross-hatch, but in general I use polygon for highlighting areas like that:

    allele.plot <- function(x, threshold=NULL, hatch.col='black',
                            hatch.border=hatch.col, lwd=par('lwd'), ...) {
      h <- hist(x, breaks=max(x), plot=F)
      d <- density(x, bw=1)
      plot(d, lwd=lwd, ...)
      if (!is.null(threshold)) {
        d.t <- d$x > threshold
        d.x <- d$x[d.t]
        d.y <- d$y[d.t]
        polygon(c(d.x[1],d.x,d.x[1]), c(0,d.y,0), col=hatch.col, lwd=1)
      }
    }

    # some pretend data
    s8 = rnorm(100, 15, 5)
    threshold = 19   # an arbitrary cut-off
    allele.plot(s8, threshold, hatch.col='grey', hatch.border='black')

Perhaps this can help a bit. Btw, what was d.l for?

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] abs(U) > 0 where U is a vector?
On Mar 14, 2007, at 7:05 AM, Alberto Monteiro wrote:

>     prod(U > 0)
>
> But this is not the most elegant solution, because there is a function to check if all [and another to check if any] component of a vector of booleans are [is] true: it's all(V) [resp. any(V)]. So:
>
>     all(U > 0)

Just for the record, there is actually a slight difference between the two calls, prod and all, in how they deal with NA's, which may or may not be important in the OP's case:

    > x <- c(-3,NA,2)
    > all(x>0)
    [1] FALSE
    > prod(x>0)
    [1] NA
    > x <- c(3,NA,2)
    > all(x>0)
    [1] NA
    > prod(x>0)
    [1] NA

These are of course all as expected, just something to keep in mind. And in any case, all is, as Alberto says, more elegant and semantically much clearer. (And not that it matters, but it is also somewhat faster.)

> Alberto Monteiro

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
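If the NA case matters in practice, one way to handle it is sketched below. The vectors are the ones from the message above; choosing between na.rm = TRUE and isTRUE() is an illustration of two possible policies, not something the thread prescribes.

```r
x <- c(-3, NA, 2)
y <- c(3, NA, 2)

all(x > 0)                  # FALSE: one known failure settles the answer
all(y > 0)                  # NA: the answer depends on the missing value

# Policy 1: ignore missing values altogether.
all(y > 0, na.rm = TRUE)    # TRUE

# Policy 2: treat "unknown" as "not all positive".
isTRUE(all(y > 0))          # FALSE
```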
Re: [R] rgeom for p (1-p)^(x-1)
On Mar 14, 2007, at 8:44 AM, Benjamin Dickgiesser wrote:

> Hi,
> is there a package available which lets me generate random data for the geometric distribution with density:
>
>     p(x) = p (1-p)^(x-1) ?
>
> rgeom uses the density p(x) = p (1-p)^x.

Why not just use rgeom, and then add 1 to all the values?

> Thank you,
> Benjamin

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
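The shift suggested above can be checked numerically. This is only a sketch: p = 0.3, the sample size, and the helper name dshift are arbitrary choices for illustration.

```r
set.seed(1)
p <- 0.3

# rgeom() counts failures before the first success (support 0, 1, 2, ...);
# adding 1 counts trials up to and including it (support 1, 2, 3, ...).
x <- rgeom(10000, p) + 1
min(x)
mean(x)        # should be close to 1/p

# Matching density for the shifted variable (hypothetical helper):
# P(X = k) = p (1 - p)^(k - 1) = dgeom(k - 1, p)
dshift <- function(k, p) dgeom(k - 1, p)
dshift(1:5, p)
```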
Re: [R] Adding bars to the right of existing ones using barplot
On Mar 13, 2007, at 3:34 PM, Jason Horn wrote:

> I'm trying to create a barplot that has two sets of data next to each other. I'm using barplot with the add=TRUE option, but this simply adds the second dataset on top of the first, obscuring it. How do I add the new data to the right on the existing barplot so that both sets are visible? I've been playing with all sorts of options and reading the list archives with no luck.

Something like this?

    a <- rnorm(3,2,1)
    b <- rnorm(3,2,1)
    barplot(rbind(a,b), beside=TRUE)

See ?barplot for more options (in particular, the height argument).

> Thank you anyone who has the answer.
> - Jason

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] altering prefix to multiple variables in different locations within a command file
On Mar 12, 2007, at 4:54 PM, Bob Green wrote:

> Hello,
> I am seeking advice regarding how I might add the prefix kc$ to variables in a series of commands. The complication is that there is a large number of variables with different commands. Examples of the variables in typical commands follow.

Maybe I've misunderstood what you want to do, but would with meet the case?

    > a
    Error: object "a" not found
    > x <- list(a=5)
    > with(x,a)
    [1] 5
    > x$a
    [1] 5

See ?with

> It is simple to use search and replace for common variables such as group but I would appreciate advice about whether there is a way to readily alter the remaining variables.
> Bob Green

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
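Applied to a data frame, the with() idea looks roughly like this. The data frame kc and its columns group and score are invented for the sketch, since the original commands were not included in the message.

```r
# Hypothetical data frame standing in for the poster's 'kc'.
kc <- data.frame(group = c("a", "b", "a", "b"),
                 score = c(1, 2, 3, 4))

# Inside with(), columns can be named without the kc$ prefix.
means <- with(kc, tapply(score, group, mean))
means

# Many modelling functions offer the same convenience via 'data'.
fit <- lm(score ~ group, data = kc)
coef(fit)
```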
Re: [R] Highlight overlapping area between two curves
On Mar 13, 2007, at 12:19 AM, Nguyen Dinh Nguyen wrote:

> Dear R helpers,
> I have a graph as following; I would like to highlight the overlapping area between the two curves. Do you know how to do this? Thank you in advance for your help.

Perhaps not exactly what you wanted, but it might give you some ideas:

    p <- seq(0.2,1.4,0.01)
    x1 <- dnorm(p, 0.70, 0.12)
    x2 <- dnorm(p, 0.90, 0.12)
    plot(range(p), range(x1,x2), type="n")
    lines(p, x1, col = "red", lwd=4, lty=2)
    lines(p, x2, col = "blue", lwd=4)
    polygon(c(p,p[1]), c(pmin(x1,x2),0), col="grey")

> Nguyen

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Help with paste()
On Mar 3, 2007, at 2:29 PM, Michael Kubovy wrote:

> Dear r-helpers,
> Could you please tell me what's missing:
>
>     rbind(paste('txt.est',1:24, sep = ''))
>
> txt.est1, ... txt.est24 are vectors that I wish to rbind.

The paste call just returns a vector of the strings "txt.est1" and so on. Then you tell it to rbind this vector with nothing else. You might want to try something like this, though I hope someone else comes up with a better solution:

    cmd <- paste("rbind(", paste('txt.est', 1:24, sep = '', collapse=", "), ")", sep="")
    eval(parse(text=cmd))

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
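One possible alternative to building a command string is to fetch the objects by name with mget() and pass the resulting list to rbind via do.call(). This is a sketch under the assumption that the vectors live in the global environment; only three short stand-in vectors are defined here instead of the poster's 24.

```r
# Hypothetical stand-ins for txt.est1 ... txt.est24.
txt.est1 <- 1:3
txt.est2 <- 4:6
txt.est3 <- 7:9

nms <- paste("txt.est", 1:3, sep = "")

# mget() looks the objects up by name and returns them as a named list;
# do.call() then calls rbind() with that list as its arguments.
combined <- do.call(rbind, mget(nms, envir = globalenv()))
combined
```

The row names of the result are the object names, which the eval(parse()) approach also provides.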
Re: [R] apply ? function doesnt create object
On Mar 3, 2007, at 4:28 PM, bunny , lautloscrew.com wrote:

Please use <- for assignments instead of = :

>     getans = function(x=qids,bnr=1,type="block") {
>       #generate name of matrix
>       matnam=paste("ans",type,as.character(bnr),sep="")
>       #display result matrix
>       show(assign(matnam,matrix(as.numeric(as.matrix(allans[(allans[, 3] %in% x), , drop = FALSE])),ncol=dim(allans)[2])))

You are assigning things twice here.

>       #create result matrix
>       assign(matnam,matrix(as.numeric(as.matrix(allans[(allans[, 3] %in% x), , drop = FALSE])),ncol=dim(allans)[2]))

The documentation for assign makes it pretty clear that the assignment happens by default in the current environment, so it will be local to the function unless you alter the call. The description there, along with the examples, and a study of environments, should provide you with the answer.

>       #print info
>       cat("the matrix",matnam,"contains answers to",type,as.character(bnr))
>     }

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
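The point about environments can be seen in a small sketch. The function names f_local and f_global and the object names ansblock1 and ansblock2 are made up for the illustration; in real code, returning the matrix from the function and assigning it at the call site is usually the simpler route.

```r
# assign() without 'envir' writes into the calling function's own frame,
# so the object disappears when the function returns.
f_local <- function(nm) {
  assign(nm, 1:3)
  invisible(NULL)
}

# Naming the target environment explicitly makes the object persist.
f_global <- function(nm) {
  assign(nm, 1:3, envir = globalenv())
  invisible(NULL)
}

f_local("ansblock1")
exists("ansblock1")    # FALSE

f_global("ansblock2")
exists("ansblock2")    # TRUE
```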
Re: [R] from function to its name?
On Mar 2, 2007, at 9:42 AM, Ido M. Tamir wrote:

> Hi,
> I can get from a string to a function with this name:
>
>     > f1 <- function(x){ mean(x) }
>     > do.call("f1", list(1:4))
>     > get("f1")
>
> etc... But how do I get from a function to its name?
>
>     > funcVec <- c(f1,median)
>     > funcVec
>     [[1]]
>     function(x){ mean(x) }

I suppose you could do: funcVec but that's probably not what you want ;).

> Can you do this with any object in R?

In what situation will you be wanting this name? I mean, how would you be given this object, but not know its name in advance? If it is passed as an argument in a function or something, then what would you consider to be its name? I.e. I don't really see where you would reasonably want to do something like this, without there being another way around it.

Btw, perhaps this does what you want:

    as.character(quote(f))

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
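Two common ways around the problem, sketched with made-up names (show_name and funcs are hypothetical): capture the argument expression with substitute() at the point where the function is passed in, or keep the functions in a named list so the names travel with them.

```r
f1 <- function(x) mean(x)

# substitute() returns the unevaluated argument expression;
# deparse() turns it into a string.
show_name <- function(f) deparse(substitute(f))
show_name(f1)

# A named list keeps each function together with its label.
funcs <- list(f1 = f1, median = median)
names(funcs)
res <- sapply(funcs, function(f) f(1:4))
res
```

Note that show_name() reports whatever expression the caller typed, so it only gives a useful answer when the function is passed by name.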
Re: [R] question about xtable and Hmisc
Sorry, meant for this to go to the whole list.

On Mar 1, 2007, at 3:29 PM, Charilaos Skiadas wrote:

> On Mar 1, 2007, at 2:52 PM, steve wrote:
>
>> Unfortunately, this applies to print.xtable, and not to latex. I want to know how to eliminate them using latex()
>
> 1) Why do you need to use latex() instead of print.xtable?
> 2) If you want to use latex(), then why are you using xtable at all, instead of latex(d) directly?

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
[R] R code for Statistical Models in S ?
I just acquired a copy of Statistical Models in S, I guess most commonly known as the white book, and realized to my dismay that most of the code is not directly executable in R. I was wondering if there is a source discussing the things that are different and what the new ways of calling things are.

For instance, the first obstacle was the solder.balance data set. I found a solder data set in rpart, which is very close to it except for the fact that the Panel variable is not a factor, but that's easily fixed.

The first problem is the next two calls, on pages 2 and 3. One is plot(solder.balance), which is supposed to produce a very different plot than it does in R (I actually don't know the name of the plot, which is part of the problem I guess). Then one is supposed to call plot.factor(skips ~ Opening + Mask), which I took to mean plot(skips ~ Opening + Mask, data=solder), and that worked, though I still haven't been able to make a direct call to plot.factor work (I keep getting a could not find function "plot.factor" error).

Anyway, I just wondered whether there is some page somewhere that discusses these little differences here and there, as I am sure there will be a number of other problems such as these along the way.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] R code for Statistical Models in S ?
On Mar 1, 2007, at 4:36 PM, Robert Duval wrote:

> You might want to start looking at the FAQ's http://cran.r-project.org/faqs.html in particular http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-and-S

Thanks. I must admit that I had not looked at the FAQ's, but I have now, and though it might become relevant a bit down the line, it doesn't seem to answer my questions, unless I've missed something in there. The only relevant bit I found was this phrase:

    Apart from lexical scoping and its implications, R follows the S language definition in the Blue and White Books as much as possible, and hence really is an “implementation” of S.

The question (one part of it at least) had to do with the data sets and functions used, and the fact that some of these data sets are not there in R. In the book they refer to a data package, for instance, which seems to contain different things than R's datasets package. So the question was whether the necessary data sets are available somewhere.

The second part was in particular about a call to plot, namely plot(solder.balance), which in S according to the white book is supposed to produce the graph at the top of page 3, for those having the book, the caption of the plot being: "A plot of the mean of skips at each of the levels of the factors in the solder experiment." I have now found out, thanks to A Handbook of Statistical Analyses Using R, that the corresponding call in R would be plot.design(solder).

I understand of course that not every difference between implementations in S and R should be documented, but I was hoping that other people who have already gone through this book would have documented these differences. I guess not, and I will be doing so now.

> robert

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Generating R plots through Perl
On Mar 1, 2007, at 6:28 PM, [EMAIL PROTECTED] wrote:

First off, if you are working in perl you might want to be aware of ruby and the r for ruby project: http://rubyforge.org/projects/r4ruby/

> Hello,
>
>     $R->send(qq (xVal <- c(1,2,3,4,5,6)));
>     $R->send(qq (yVal <- c(3,5,2,6,1,5)));
>     $R->send(qq (pdf("C:/Test Environment/R/perlPlotTest.pdf")));
>     $R->send(qq (plot(xVal, yVal)));
>     $R->send(qq (graphics.off()));

I don't really know how to write this in perl, but could you perhaps put the last three lines all in one call to $R->send, using dev.off() then? Don't know if it would make a difference, but that's the only thing I could think of. I'm guessing something like this:

    $R->send(qq (pdf("C:/Test Environment/R/perlPlotTest.pdf"); plot(xVal, yVal); dev.off()));

> As the code indicates, I am using R's pdf function to create a pdf file containing the plot of xVal and yVal. I am using the graphics.off() function rather than the dev.off() function as I get an error message of "simpleError in dev.off(): cannot shut down device 1 (the null device)" when dev.off() is used.
>
> Is there another way to generate and save a plot using the bridge connection that I described? If not, what would be an efficient way of generating and saving plots from within my Perl program? Any help would be greatly appreciated.
>
> Thank you,
> Ryan

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
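On the R side, the open/plot/close sequence being discussed is sketched below, run in a single session so that dev.off() closes the device that pdf() has just opened. The output path is a temporary file chosen for the sketch, not the poster's Windows path.

```r
xVal <- c(1, 2, 3, 4, 5, 6)
yVal <- c(3, 5, 2, 6, 1, 5)

# Hypothetical output location; any writable path would do.
out <- file.path(tempdir(), "perlPlotTest.pdf")

pdf(out)             # open the pdf device
plot(xVal, yVal)     # draw into it
dev.off()            # close the device; the file is complete after this

file.exists(out)
```

The "cannot shut down device 1 (the null device)" message quoted above is what dev.off() reports when no graphics device is open at the time it is called.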
Re: [R] someattributes
On Feb 26, 2007, at 10:35 AM, Thaden, John J wrote:

> I'd like to use someattributes(), as described in documentation for R version 2.4.1 (windows build)
>
>     > help(attributes)
>
> however, someattributes() does not seem to exist.
>
>     > someattributes()
>     Error: could not find function "someattributes"
>
> Is this true or am I doing something wrong?

My help shows it as moreattributes, not someattributes. (MacOSX, though it doesn't sound like it should be platform-specific.)

> -John

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] someattributes
On Feb 26, 2007, at 3:00 PM, Thaden, John J wrote:

> Thanks for correcting me. Actually, my windows R documentation says mostattributes(), but it makes no difference -- none of the three show up as function names or R objects.

That's because there is no mostattributes function; it only works as an assignment:

    ?"mostattributes<-"

Example:

    > x <- c(2,3,4)
    > mostattributes(x) <- list(foo="bar")
    > x
    [1] 2 3 4
    attr(,"foo")
    [1] "bar"

> -John

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
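The same point can be confirmed from the prompt. A small sketch: the replacement function exists under the name `mostattributes<-`, while no plain mostattributes object is defined in base R.

```r
x <- c(2, 3, 4)
mostattributes(x) <- list(foo = "bar")   # replacement form
attr(x, "foo")

# Only the replacement function is defined.
exists("mostattributes<-")
exists("mostattributes")

# The replacement function can also be called by its full name.
y <- `mostattributes<-`(c(5, 6), list(unit = "cm"))
attributes(y)
```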
Re: [R] returns from dnorm and dmvnorm
On Feb 26, 2007, at 3:03 PM, A Hailu wrote:

> Hi All,
> Why would calls to dnorm and dmvnorm return values that are above 1? For example,
>
>     > dnorm(0.3,mean=0, sd=0.1)
>     [1] 3.989423

Because dnorm gives you the density function, whose integral is the distribution function, which is likely what you want. Try:

    pnorm(0.3,mean=0, sd=0.1)

> This is happening on two different installations of R that I have. Thank you.
> Hailu

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
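A density can exceed 1 as long as the area under it is 1, which a short sketch makes concrete. Note that 3.989423 is the height of this density at its mean, i.e. dnorm(0, mean = 0, sd = 0.1); the variable names below are illustrative.

```r
# Height of the N(0, 0.1^2) density at its peak: larger than 1.
peak <- dnorm(0, mean = 0, sd = 0.1)
peak

# The area under the curve is still 1 (integrated over +/- 10 sd).
area <- integrate(dnorm, lower = -1, upper = 1, mean = 0, sd = 0.1)$value
area

# Probabilities come from the distribution function, pnorm():
# P(X <= 0.3) for X ~ N(0, 0.1^2), the same as pnorm(3) for a standard normal.
prob <- pnorm(0.3, mean = 0, sd = 0.1)
prob
```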
Re: [R] gsub: replacing a.*a if no occurence of b in .*
All these methods do assume that you don't have nested tags, like so:

    <tag><tag>foo</tag>useful stuff</tag>some garbage</tag>

For that you would really need a true parser. So I would double-check to make sure this doesn't happen.

Do you have any control over where those XML files are generated, though? It sounds to me like it might be easier to fix the utility generating those XML files, since it clearly is doing something wrong.

On Feb 24, 2007, at 11:07 AM, Gabor Grothendieck wrote:

> I assume <tag> is known. This removes any occurrence </tag>.*</tag> where .* does not contain <tag> or </tag>.
>
> The regular expression, re, matches </tag>, then does a greedy match (?U) for anything followed by </tag> but uses a zero width lookahead subexpression (?=...) for the second </tag> so that it can be rematched again.
>
> gsubfn in package gsubfn is like the usual gsub except that instead of replacing the match with a string it passes the match to function f and then replaces the match with the output of f. See the gsubfn home page: http://code.google.com/p/gsubfn/ and vignette.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] gsub: replacing a.*a if no occurence of b in .*
On Feb 24, 2007, at 11:37 AM, Gabor Grothendieck wrote:

> The _question_ assumed that, which is why the answers did too.

Oh yes, I totally agree: the file snippet the OP provided did indeed assume that, though nothing in the text of his question did, so I wasn't entirely clear whether the actual file that is going to be processed has this form or not. So I just wanted to make sure the OP is aware of this limitation, in case the actual file is more problematic.

But most importantly, I wanted to suggest a reevaluation, if possible, of the process that generates these XML's, and perhaps fixing that, instead of patching the problem after it has been created.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
[R] Cross-tabulations next to each other
I have the following relatively simple problem. Say we have three factors, and we want to create a cross-tabulation of one against each of the other two:

    x <- factor(rbinom(5, 1, 1/2))
    y <- factor(rbinom(5, 1, 1/2))
    z <- factor(rbinom(5, 1, 1/2))
    table(x,y)
    table(x,z)

This looks like:

       y
    x   0 1
      0 2 0
      1 1 2

       z
    x   0 1
      0 1 1
      1 2 1

I would like to get (surely this will look a mess in non-monospaced fonts):

        y   z
    x   0 1 0 1
      0 2 0 1 1
      1 1 2 2 1

Or something along those lines. Then I would like to convert this to a LaTeX table, in the obvious sort of way.

I couldn't find an answer with a quick look through the documentation. Are these two things already done, before I try to roll my own?

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Cross-tabulations next to each other
Hi Dimitris,

On Feb 22, 2007, at 10:27 AM, Dimitris Rizopoulos wrote:

> maybe cbind() is close to what you're looking for, e.g.,
>
>     tb1 <- table(x, y)
>     tb2 <- table(x, z)
>     cbind(tb1, tb2)

Yes, that was my first thought too, and it does place the values where I want them, but it completely destroys the names, which I'd like to keep, i.e. it doesn't treat it as a table any more.

In the resulting LaTeX table I would like to have a very top row with multicolumn titles, one for each factor, then a second row with the levels for each factor, and then below those the data. I could, I guess, add that stuff separately, but I was hoping someone had already done that part.

> Best,
> Dimitris

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
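One workaround for the lost names, offered only as a sketch and not as an existing function for this task: after cbind(), put the factor name back into each column label by hand. The seed, the explicit levels = 0:1, and the "y=0" label style are choices made for the illustration.

```r
set.seed(1)
# levels = 0:1 guarantees both levels appear even in a sample of 5.
x <- factor(rbinom(5, 1, 1/2), levels = 0:1)
y <- factor(rbinom(5, 1, 1/2), levels = 0:1)
z <- factor(rbinom(5, 1, 1/2), levels = 0:1)

tb1 <- table(x, y)
tb2 <- table(x, z)

# cbind() keeps the counts but drops the factor names, so rebuild
# column labels that say which factor each level belongs to.
both <- cbind(tb1, tb2)
colnames(both) <- c(paste("y", colnames(tb1), sep = "="),
                    paste("z", colnames(tb2), sep = "="))
both
```

The result is a plain matrix, so the two-level header wanted for the LaTeX table would still have to be added separately.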
[R] [MacOSX] Screencast for R and Sweave on TextMate
This will likely be of interest only (if at all) to MacOSX users. I use a particular editor called TextMate, which I find particularly suitable for pretty much any task I have to do, from Ruby to LaTeX to R. Its R support is probably not quite up to par with ESS yet, but it is in a decent state, I would say.

Anyway, the reason for this message is that I have prepared a small screencast showing how R and Sweave look and feel in TextMate, prompted by a thread in the MacOSX TeX mailing list. I thought it might be of interest to some people here also, so here is the link:

http://skiadas.dcostanet.net/afterthought/2007/02/21/r-and-sweave-in-textmate/

It's not much, and the sound is not all that great, but it mostly shows how things feel. Hoping this is not too off topic...

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

PS: Yes, it is not a free product, though it is the best money I have ever spent. No, I am not the developer, and do not profit from it. But I am actively involved in its support for LaTeX, R and Sweave.
[R] Proper way to typeset the symbol for R in LaTeX?
Hoping this is not off topic... I am in the process of writing some tutorials for my students for learning R, and naturally I'm using Sweave for this. So suddenly a question occurred to me: LaTeX has a recommended way of typesetting the TeX and LaTeX symbols, via the \TeX and \LaTeX commands. Is there a similar command for the R symbol, or in general are there any guidelines/recommendations on how to typeset the letter R when referring to the R language?

Haris
Re: [R] make check failure, internet.Rout.fail, Error in strsplit
On Feb 12, 2007, at 6:28 PM, Paul Lynch wrote:

> I'm trying to build R on RedHat EL4. The compile went fine, but a make check ran into a problem and produced a file internet.Rout.fail. Judging by the last part of that file, it was trying to run an R routine called httpget to retrieve the URL http://www.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat. The precise error it encountered was:
>
>     Error in strsplit(grep("Content-Length", b, value = TRUE), ":")[[1]] : subscript out of bounds
>
> So, it looks like the data it read from that URL was not what was expected. I tried mimicking the script's request of the header information for that URL, and got back the following header lines:
>
>     HTTP/1.1 200 OK
>     Date: Mon, 12 Feb 2007 23:22:06 GMT
>     Server: Apache/2.0.40 (Red Hat Linux)
>     Last-Modified: Fri, 19 May 1995 10:27:04 GMT
>     ETag: 7bc27-836-39a78e00
>     Accept-Ranges: bytes
>     Content-Type: text/plain; charset=ISO-8859-1
>     Content-length: 2102
>     Connection: Keep-Alive
>
> The script appears to be looking for a Content-Length field, but as you can see the returned header is Content-length with a lower-case l. I don't know R yet, so I'm not sure if the grep in the test code is case-sensitive or not, but if it is, that would seem to be the problem. But then, surely everyone would be hitting this error?

The grep is indeed case sensitive, as a quick test can show. However, the header I got back when I tried the above address had Length in it:

    HTTP/1.1 200 OK
    Date: Tue, 13 Feb 2007 01:40:48 GMT
    Server: Apache/2.0.40 (Red Hat Linux)
    Last-Modified: Fri, 19 May 1995 10:27:04 GMT
    ETag: 7bc27-836-39a78e00
    Accept-Ranges: bytes
    Content-Length: 2102
    Content-Type: text/plain; charset=ISO-8859-1
    X-Pad: avoid browser bug

(I used curl for this, if it makes a difference.) Hope this helps in some way.

> --Paul

Haris
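The "quick test" mentioned above could look something like the following sketch. The header vector b is made up for illustration (it mirrors the lower-case header Paul reported), and the ignore.case variant is only one possible way around the mismatch, not what the R test script itself does.

```r
# Made-up header lines; the lower-case "l" mirrors the header Paul reported
b <- c("HTTP/1.1 200 OK",
       "Content-Type: text/plain; charset=ISO-8859-1",
       "Content-length: 2102")

# grep() is case sensitive by default, so the capitalised pattern finds nothing.
# Taking [[1]] of strsplit() on that empty result is what raises
# "subscript out of bounds".
strict <- grep("Content-Length", b, value = TRUE)
length(strict)

# With ignore.case = TRUE either spelling of the header is accepted
loose <- grep("Content-Length", b, value = TRUE, ignore.case = TRUE)
strsplit(loose, ":")[[1]]
```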
Re: [R] merge words=data name
On Feb 11, 2007, at 8:57 PM, Mark W Kimpel wrote:

> Duncan, Both yours and Gabor's methods were far superior to mine. I am curious why you like Gabor's better than yours.

Don't know if the following is why Duncan prefers Gabor's method, but here is why I would avoid the eval version: in general eval is very dangerous to call, unless you have full control over the expression you are asking it to evaluate. For instance, imagine the following:

    txt <- "system('ls')"
    eval(parse(text=txt))

(replace 'ls' with 'dir' on a Windows system). With these two commands you will get a listing of everything in your home directory, or wherever the current path for R is. But suppose instead that the 'ls' was replaced by 'rm -rf *'. Then EVERYTHING in that directory will be DELETED, forever, NO questions asked (at least on a Unix-based system, perhaps even Windows with cygwin, I don't know; there is probably a similar call for Windows).

In other words, make sure you know EXACTLY what the thing you are evaluating is. I am not saying this is necessarily a danger here, but it brings up an important point that is good to be aware of.

Haris
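As a rough illustration of the point, and assuming the task in this thread was to reach an object whose name is assembled from strings, here is a minimal sketch. The object data1 and the name nm are hypothetical, and get() is offered as one alternative, not necessarily the method Gabor posted.

```r
# Hypothetical object, named only for this illustration
data1 <- c(1, 2, 3)
nm <- paste("data", 1, sep = "")

# get() only looks an object up by name; it never runs arbitrary code
via_get <- get(nm)

# eval(parse()) gives the same result here, but it would execute
# whatever code the string happened to contain
via_eval <- eval(parse(text = nm))

identical(via_get, via_eval)
```

If the string came from user input or a file, get() would fail with an error on anything that is not an object name, whereas eval(parse()) would simply run it.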
Re: [R] SAS, SPSS Product Comparison Table
On Feb 10, 2007, at 4:26 PM, Muenchen, Robert A (Bob) wrote:

> Surely R can't do for free what [fill in a SAS or SPSS product here] does? To try to address those, I've compiled a table that is organized by the product categories SAS and SPSS offer. Keep in mind that I still know far more about SAS and SPSS than I do about R, so I could really use some help with this. The table is below in tabbed form. I would appreciate it if the many R gurus out there would look it over and send suggestions. I'll add it as an appendix when it's done (well, as done as a moving target like this ever is!)

Great idea, this should come in handy! Here is a more readable version of Bob's table (don't know if I can post attachments like that to the list, so I figured I'll put it up like this):

http://skiadas.dcostanet.net/uploads/StatsComparisonTable.pdf

> Thanks, Bob

Haris
Re: [R] setting a number of values to NA over a data.frame.
Once again I forgot to reply to the whole list.

On Feb 9, 2007, at 8:39 AM, Charilaos Skiadas wrote:

> On Feb 9, 2007, at 8:13 AM, John Kane wrote:
>
> > The problem is that my dataframe has 1's in about 50% of the columns and I only want it to apply to a few specified columns. My explanation may not have been clear enough. Using your example, I want all values for tio2 set to 1 but not any values in al2o3, whereas zeta[zeta==1] <- NA is also changing al2o3[3] to NA.
>
> You need to index the zeta in zeta==1 in the same way as you do with the zeta outside. I think the point is that if you do zeta[,cols][zeta==1] <- NA, then the recycling of NA to obtain the correct number of elements is done based on the elements in zeta[,cols]. But since zeta==1 is a much longer vector than zeta[,cols], zeta[,cols][zeta==1] has a number of NA objects attached to its end, and hence now has a longer length than the recycled NA that is supposed to replace it. But perhaps someone more expert in the internals can explain it in greater detail, if the above is not right. In the meantime, the following seems to work:
>
>     y <- rbinom(20, 1, 1/2)
>     dim(y) <- c(5,4)
>     colnames(y) <- c("one", "two", "three", "four")
>     x <- as.data.frame(y)
>     cl <- c("two", "three")
>     x[,cl][x[,cl]==1] <- NA
>     x
>       one two three four
>     1   0   0    NA    0
>     2   0   0     0    0
>     3   1   0     0    0
>     4   0  NA     0    1
>     5   1   0     0    1
>
> > Thanks
>
> Haris

Haris
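For comparison, here is a small deterministic sketch of the same idea written column by column. The data frame and its values are made up for illustration; the lapply()/replace() form is just one alternative way to restrict the replacement to the chosen columns, not something proposed in the thread.

```r
# Made-up data: 1s appear in every column, but only two columns should change
x <- data.frame(one   = c(0, 1, 1),
                two   = c(1, 0, 1),
                three = c(0, 0, 1),
                four  = c(1, 1, 0))
cl <- c("two", "three")

# Replace 1 by NA within each selected column; other columns are never touched
x[cl] <- lapply(x[cl], function(col) replace(col, col == 1, NA))
x
```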
Re: [R] NEWBIE: @BOOK help?
On Feb 8, 2007, at 11:07 AM, Zembower, Kevin wrote:

> In Henric's recent post, he included this output:
>
>     @BOOK{R:Harrell:2001,
>       AUTHOR = {Frank E. Harrell},
>       TITLE = {Regression Modeling Strategies, with Applications to Linear Models, Survival Analysis and Logistic Regression},
>       PUBLISHER = {Springer},
>       YEAR = 2001,
>       NOTE = {ISBN 0-387-95232-2},
>       URL = {http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS}
>     }
>
> Can someone tell me how this is generated? I've noticed this in a few recent posts. I attempted:

It is BibTeX:

http://www.bibtex.org/
http://en.wikipedia.org/wiki/BibTeX

Haris
Re: [R] R in Industry
On Feb 8, 2007, at 12:48 PM, Ben Fairbank wrote:

> If my company came to depend heavily on a fairly obscure R package (as we are contemplating doing), what guarantee is there that it will be available next month/year/decade?

I know of none, nor would I expect one. I would imagine that if there was a package that really needed updating, then your company could hire an R programmer for a short time to fix whatever needs fixing, and that would be a much smaller expense than licensing an expensive package like those other ones out there. But perhaps I am completely wrong in this; I am relatively far from the industry world.

Haris
Re: [R] Scope
On Feb 8, 2007, at 4:52 PM, Roger Bivand wrote:

> Assigning to the global environment will overwrite objects unless one is careful, and even with years of experience only seems worth considering when no feasible alternative exists; on consideration, alternatives usually appear.

Or to paraphrase fortune(106): If the answer is global variables, then you should usually rethink the question.

Haris
[R] Reading expressions from character vectors
Greetings, I have a problem that I am sure is very straightforward, but I just can't wrap my head around it. I've read the help pages on text, plotmath, expression, substitute, but somehow I can't find the answer to this simple question. Basically, consider the following example:

    plot( NULL, xlim = c(0,2), ylim = c(0,2) )
    expressions <- expression( -infinity, infinity )
    text( c(0.5,1.5), 1.5, expressions )
    labels <- c( "-infinity", "infinity" )
    text( c(0.5,1.5), 0.5, as.expression(labels) )

I want the character vector labels to be interpreted as an expression vector, and so to appear just like the expressions vector. Is this possible? I mean yes, it is probably possible, but how? I suppose the problem is that the result of as.expression(labels) is expression("-infinity", "infinity") instead of expression(-infinity, infinity), as I would have liked. I just can't figure out how to convert it to the right thing.

Haris
Re: [R] Reading expressions from character vectors
I keep forgetting that this list doesn't default to reply-to-all ;). Sorry hadley, you'll get this twice.

On Feb 4, 2007, at 11:39 AM, Charilaos Skiadas wrote:

> On Feb 4, 2007, at 11:19 AM, hadley wickham wrote:
>
> >     text( c(0.5,1.5), 0.5, parse(text=labels))
> >
> > ? You need to parse the text to get to the expression
>
> I just love the response rate and speed of this list, it's one of the best lists I am subscribed to. Thank you all for your responses, it makes more sense now (though I'll probably still want to digest the whole thing in my head for a couple of days, to understand exactly what is going on under the hood ;) ).
>
> > Hadley
>
> Haris

Haris
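To see what is going on under the hood, one can compare the two conversions without plotting anything. This is a small sketch using the same labels vector as in the original question; the variable names as_is and parsed are made up here.

```r
labels <- c("-infinity", "infinity")

# as.expression() wraps the strings without parsing them:
# the elements are still character constants
as_is <- as.expression(labels)

# parse(text = ) reads each string as R code:
# the elements become a call (unary minus) and a symbol
parsed <- parse(text = labels)

class(as_is[[1]])
class(parsed[[1]])
class(parsed[[2]])
```

Since plotmath only treats symbols such as infinity specially when they appear as language objects, the parsed version is the one that text() draws as a mathematical symbol.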
Re: [R] Reference to dataframe and contents
On Feb 4, 2007, at 3:42 PM, Rene Braeckman wrote:

> My question is how to construct the equivalent of myDF$myCol that can be used as such. Or is there a better solution?

I think what you want is ?with and wrapping the whole work you want to do in a function.

> Thanks. The help and discussions on this forum are the best. Rene

Haris
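A minimal sketch of what that could look like, assuming the goal is to refer to a column whose name is only known at run time. The data frame, the helper col_mean and its arguments are hypothetical names chosen for the example.

```r
# Hypothetical data frame for illustration
myDF <- data.frame(myCol = c(2, 4, 6), other = c(1, 1, 1))

# Inside a function, [[ ]] accepts the column name as a string,
# which $ does not
col_mean <- function(df, col) {
  mean(df[[col]])
}
m1 <- col_mean(myDF, "myCol")

# with() evaluates an expression using the columns as if they were variables
m2 <- with(myDF, mean(myCol + other))
```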
Re: [R] Double labels and scales
On Feb 2, 2007, at 8:07 AM, Lorenzo Isella wrote:

> Dear All, Say you want to plot, on the same figure, two quantities, concentration and temperature, both as functions of the same variable. I'd like to be able to put a certain label and scale on the y axis on the left of the figure (referring to the temperature) and another label and scale for the concentration on the right. Any suggestion about what to do? I am sure it is trivial, but I could not find what I needed on the net. I found some reference about a plot.double function by Anne York, but I do not think that I need anything so elaborate. Many thanks

Sounds like you just need the following two commands probably:

    ?axis
    ?mtext

> Lorenzo

Haris
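One possible way to put those two commands together is sketched below. The data are invented, and the overlay via par(new = TRUE) is an assumption about how one might do it, not something spelled out in the reply. The plot is written to a temporary PDF only so that the sketch also runs non-interactively.

```r
# Invented data for illustration
x <- 1:10
temperature <- 20 + x
concentration <- 100 / x

out <- tempfile(fileext = ".pdf")
pdf(out)
op <- par(mar = c(5, 4, 4, 4) + 0.1)     # leave room on the right-hand side

# First quantity, with the usual left-hand axis
plot(x, temperature, type = "l", ylab = "Temperature")

# Second quantity drawn on top, without axes or labels of its own
par(new = TRUE)
plot(x, concentration, type = "l", lty = 2, axes = FALSE, xlab = "", ylab = "")

axis(side = 4)                            # scale for the second quantity
mtext("Concentration", side = 4, line = 3)

par(op)
dev.off()
```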
Re: [R] gsub regexp question
On Jan 27, 2007, at 3:41 PM, Phillimore, Albert wrote:

> Dear R Users, I am trying to use gsub to remove multiple cases of square brackets and their different contents in a character string. A sample of such a string is shown below. However, I am having great difficulty understanding regexp syntax. Any help is greatly appreciated. Ally
>
>     tree STATE_286000 [lnP=-12708.453945423369] = [R] ((15 [rate=0.009761226401396686]:7.040851727747465,17 [rate=0.011500289631135564]:7.040851727747465) [rate=0.010986570567484494]:2.257049446900292,(18 [rate=0.009123432243563103]:2.461289418776003,19 [rate=0.00981822432115329]:2.461289418776003)

Is this what you want? I tend to prefer perl regular expressions:

    str <- "tree STATE_286000 [lnP=-12708.453945423369] = [R] ((15[rate=0.009761226401396686]:7.040851727747465,17 [rate=0.011500289631135564]:7.040851727747465) [rate=0.010986570567484494]:2.257049446900292,(18 [rate=0.009123432243563103]:2.461289418776003,19 [rate=0.00981822432115329]:2.461289418776003)"
    gsub("\\[[^\\]]+\\]", "", str, perl=TRUE)
    [1] "tree STATE_286000 = ((15:7.040851727747465,17:7.040851727747465):2.257049446900292, (18:2.461289418776003,19:2.461289418776003)"

As an explanation, \\[ and \\] match the two square brackets you want. We need to escape the brackets with the backslashes because they have a special meaning in perl regular expressions. In perl regexps, [] stands for "match a single character that is like what we have in the brackets". For instance [ab] will match an a or a b, and [a-z] will match any lowercase letter. A ^ as the first character in there means "match all but what follows"; for instance [^a-z] means match anything but lowercase letters. So [^\\]] means match any character but a closing bracket. Finally the plus sign afterwards means "match at least one". So [^\\]]+ means match any non-empty sequence of characters that does not contain a closing bracket. So the whole thing now matches an opening bracket, followed by all characters until a corresponding closing bracket.

This will not work if you have nested pairs of brackets, [like [so]]. That is a tad more delicate, and we can discuss it if you really need to deal with it.

Haris
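For the nested case, one possible approach (a sketch, not something worked out in this thread) is to strip innermost bracket pairs repeatedly until nothing changes. The helper name strip_brackets is made up. Note that it uses * rather than +, so empty pairs [] are removed as well, and that unbalanced brackets are simply left in place.

```r
# Hypothetical helper: remove bracketed groups, including nested ones
strip_brackets <- function(s) {
  repeat {
    # An innermost pair: "[" followed by non-bracket characters, then "]"
    out <- gsub("\\[[^\\[\\]]*\\]", "", s, perl = TRUE)
    if (identical(out, s)) return(out)   # nothing left to remove
    s <- out
  }
}

nested <- strip_brackets("a [like [so]] b")
flat   <- strip_brackets("x[1]y[2[3]]")
```

The first pass turns "a [like [so]] b" into "a [like ] b", and the second pass removes the now innermost "[like ]".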
Re: [R] efficient code. how to reduce running time?
On Jan 21, 2007, at 8:11 PM, John Fox wrote:

> Dear Haris, Using lapply() et al. may produce cleaner code, but it won't necessarily speed up a computation. For example:
>
>     X <- data.frame(matrix(rnorm(1000*1000), 1000, 1000))
>     y <- rnorm(1000)
>     mods <- as.list(1:1000)
>     system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
>     [1] 40.53  0.05 40.61    NA    NA
>     system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
>     [1] 53.29  0.37 53.94    NA    NA

Interesting, on my system the results are quite different:

    system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
    [1] 192.035  12.601 797.094   0.000   0.000
    system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
    [1]  59.913   9.918 289.030   0.000   0.000

Regular MacOSX install with ~760MB memory.

> In cases such as this, I don't even find the code using *apply() easier to read. Regards, John

Haris
Re: [R] efficient code. how to reduce running time?
On Jan 22, 2007, at 10:39 AM, John Fox wrote:

> One thing that seems particularly striking in your results is the large difference between elapsed time and user CPU time, making me wonder what else was going on when you ran these examples.

Yes, indeed there were a lot of other things going on; this is the only machine I have and I use it continuously. I'll try to run another test tonight when the machine is not in use. It did seem a very striking difference though. But am I wrong in thinking that these measurements should be independent of what other applications are running at the same time, and should measure exactly the time in terms of CPU cycles needed to finish this task, regardless of how often the process got to use the CPU? I guess I was working under that assumption, which indeed makes the above comparison a very unfair one, because there was a lot more going on during the first system.time call. Still, the difference is quite large, which of course could simply have to do with the internals of the two commands, coupled with Prof. Ripley's comments about malloc in Mac OS X.

> Regards, John

Haris
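The distinction between the numbers that system.time() reports can be seen with a small sketch. Per the proc.time documentation, the user and system entries are CPU time charged to the process, while elapsed is wall-clock time, so only elapsed directly includes time spent waiting. The Sys.sleep() call below is just a stand-in for "the process is not using the CPU"; it does not reproduce the memory pressure that may have affected the timings above.

```r
# Sleeping uses almost no CPU, but wall-clock time still passes
timing <- system.time(Sys.sleep(0.2))

timing[["user.self"]]   # CPU time spent in R itself: close to zero
timing[["sys.self"]]    # CPU time spent in the operating system on R's behalf
timing[["elapsed"]]     # wall-clock time: roughly 0.2 seconds
```

On a busy machine the elapsed figure grows with whatever else is competing for the CPU, which fits the large gap between the user and elapsed columns noted above.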
[R] Questions about xtable and print.xtable
I have been using the wonderful xtable package lately, in combination with Sweave, and I have a couple of general questions along with a more particular one. I'll start with the particular question.

I basically have a 1x3 array with column names but no row names. I want to create a latex table with column setting set to |rrr|. I want the column names to appear, but the row names not to appear. The code I am trying is this:

    library(xtable)
    x <- matrix(c(1:3), c(1,3), dimnames=list(NULL,c(1:3)))
    tab <- xtable(x, align=||)
    print.xtable(tab, include.rownames=FALSE)
    print.xtable(tab)

The problem here is that the xtable call requires an align value that has one extra column setting, I suppose to account for a possible row name. However, the first print.xtable call seems to ignore the align argument set in the xtable call when include.rownames is included. Any workarounds will be most welcome.

More generally, I have the following questions:

1) Why are the include.rownames and include.colnames parameters not appearing in the xtable call, but only in the print.xtable call instead? Why do I need to specify n+1 arguments for things like align and digits, when I don't want the row names to be printed? In general, why are the align and digits arguments not settable in print.xtable, but only in xtable?

2) I like to enclose my tabular environments in a center environment, instead of a table environment. Unless I've missed it, I don't see how I can do that from within the xtable package. Is this really not possible, and if so why not? The latex.environments setting seems to only be allowed when floating=TRUE, which is exactly what I want to avoid. Any particular reason it is not allowed when floating=FALSE as well?

That's it really, thanks in advance for any responses.

Haris
Re: [R] efficient code. how to reduce running time?
On Jan 21, 2007, at 5:55 PM, miraceti wrote:

> Thank you all for looking at it. I'll fix the code to preallocate the objects. And I wonder if there is a way to call anova on all the columns at the same time. Right now I am calling (Y~V1, data) from V1 to V50 through a loop. I tried (Y~., data) but it gave me different values from the results I get when I call them separately, so I can't help but call them 25,000 times...

Have you looked at lapply, sapply, apply and friends?

Haris
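A sketch of what the lapply route might look like, on invented data with three predictors instead of fifty. The names data, predictors, fits and pvals are made up for the example. Note also why Y ~ . gives different numbers: it fits one model containing all predictors at once, so each term is tested after adjusting for the others, which is not the same as fifty single-predictor fits.

```r
set.seed(1)
# Invented data: a response Y and three candidate predictors
data <- data.frame(Y = rnorm(30), V1 = rnorm(30), V2 = rnorm(30), V3 = rnorm(30))
predictors <- setdiff(names(data), "Y")

# One single-predictor model per column;
# reformulate() builds the formulas Y ~ V1, Y ~ V2, ...
fits <- lapply(predictors, function(v) {
  lm(reformulate(v, response = "Y"), data = data)
})
names(fits) <- predictors

# Pull the p-value of the F test out of each anova table
pvals <- sapply(fits, function(m) anova(m)[1, "Pr(>F)"])
pvals
```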
Re: [R] Integration + Normal Distribution + Directory Browsing Processing Questions
On Jan 21, 2007, at 2:27 PM, Nils Hoeller wrote:

> Now I want R to read.table all files within a given directory and process them one by the other.

    ?list.files
    ?for

Haris
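Put together, those two pointers might be used roughly as follows. This is a sketch: the temporary directory, the file names and the "processing" step (counting rows) are all stand-ins invented so that the example is self-contained.

```r
# Set up a throwaway directory with two small tab-delimited files
d <- tempfile("tables")
dir.create(d)
write.table(data.frame(a = 1:2), file.path(d, "one.txt"), sep = "\t", row.names = FALSE)
write.table(data.frame(a = 3:5), file.path(d, "two.txt"), sep = "\t", row.names = FALSE)

# list.files() returns the matching file names; full.names = TRUE keeps the path
files <- list.files(d, pattern = "\\.txt$", full.names = TRUE)

results <- list()
for (f in files) {
  tab <- read.table(f, header = TRUE, sep = "\t")
  results[[basename(f)]] <- nrow(tab)   # stand-in for the real processing
}
results
```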
Re: [R] How to format R code in LaTex documents
On Jan 15, 2007, at 8:42 AM, Benjamin Dickgiesser wrote:

> Hi, I am planning on putting some R script in an appendix of a LaTeX document. Can anyone recommend me a way of how to format it? Is there a way to keep all line breaks without having to insert \\ in every single line?

I think the LaTeX environments lstlisting and/or verbatim might do what you want.

> Thank you! Benjamin

Haris
Re: [R] Controlling size of boxplot when it is added in a plot
Sorry, meant to send this to the list.

On Jan 14, 2007, at 4:28 PM, Charilaos Skiadas wrote:

> On Jan 14, 2007, at 5:50 AM, Chuck Cleland wrote:
>
> > Try setting the boxwex argument instead:
>
> Thanks Chuck, that does indeed seem to work pretty well. I'm not quite sure what the best way to determine an appropriate size for the boxplot would be, but the following kind of works, at least for the cases I tried, though I'm not entirely happy with it. And I'm sure I've made a bunch of errors along the way that someone more experienced in R could spot easily. Feel free to criticize the code. The boxwex argument of boxhist is, I guess, probably terribly named: one over it is supposed to be the size of the boxplot over the size of the histogram.
>
>     force.odd <- function(x) {
>       x + 1 - x %% 2
>     }
>     boxhist <- function(x, boxwex = 8, ...) {
>       hs <- hist(x, breaks = 20, plot = FALSE)
>       space <- force.odd(max(floor(hs$counts / boxwex), 1))
>       plot(hs, main = NULL, ylim = c(-space, max(hs$counts)), ...)
>       boxplot(x, horizontal = TRUE, axes = TRUE, add = TRUE, at = -space/2, boxwex = space)
>     }
>     x <- rweibull(300,1,1)
>     boxhist(x)
>
> Haris

Haris
[R] Controlling size of boxplot when it is added in a plot
Greetings, I am trying to add a boxplot to the bottom of a histogram, right between the histogram bars and the x axis. Here is the code I am using at the moment (the par line is probably not relevant for our discussion):

    hs <- hist(x, breaks = 20, plot = FALSE)
    par(mar = c(3,3,2,1))
    hist(x, breaks = 20, main = NULL, ylim = c(-2, max(hs$counts)))
    boxplot(x, horizontal = TRUE, axes = TRUE, add = TRUE, at = -1)

The problem is the following. As it is, the boxplot restricts itself to the -1 line. I would like it to occupy both the -1 and the -2 lines (I guess more generally I would like to control how much vertical space the embedded boxplot occupies). I tried to set the width parameter in the boxplot, but that seemed to have no effect at all.

On an OT note, I haven't seen this way of combining a histogram with a boxplot (perhaps I haven't looked really hard). I thought it would be useful for my students to see them next to each other, to develop a feeling for what histograms might correspond to what boxplots. Is there perhaps some reason why I should avoid showing those graphs to them like that, that I am not aware of? Or just a reason why I haven't seen them combined like this much?

TIA

Charilaos Skiadas
Department of Mathematics
Hanover College
P.O.Box 108
Hanover, IN 47243
Re: [R] How do I create an object in the Global environment from a function
On Dec 14, 2006, at 7:42 AM, Rainer M Krug wrote:

>     myfunc <- function() b <<- 34

I would add a warning here. It is generally not a good idea for a function to have side effects. In this case, if there is a globally defined value for b already, it will be overwritten. If this function is in a package, say, and someone else uses it, or you use it after a very long time and have forgotten its internals and the fact that it's messing with the Global Environment, this might lead to some bugs that are really hard to spot.

> Rainer

Haris
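The difference can be seen in a few lines. This is only a sketch: the names myfunc_global and myfunc_value are made up, and returning the value is the usual alternative to a side effect, not something Rainer's original problem necessarily allows.

```r
# Version with a side effect: silently overwrites any existing b
myfunc_global <- function() b <<- 34

# Version without a side effect: the caller decides where the value goes
myfunc_value <- function() 34

b <- 1
myfunc_global()
after_side_effect <- b     # b has been changed behind the caller's back

b <- 1
result <- myfunc_value()
after_return <- b          # b is untouched; the new value lives in result
```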
Re: [R] tests for NULL objects
On Nov 29, 2006, at 7:58 PM, Benilton Carvalho wrote:

> But that's okay, just a matter of adding an extra test (already done).

I'm guessing it would be something like:

    all(is.na(v) | v==2)

> When I asked for clarification about the reasons for this, I assumed that: if all(v) is TRUE ==> any(v) is TRUE; for all (logical) v... When actually: if all(v) is TRUE ==> any(v) is TRUE; for all (logical) v of length >= 1.

I like to think of it as: in order for the statement "all elements of v are TRUE" to be false, one would need to provide an element of v which is false. Since that is not possible if v has no elements, the statement "all elements of v are TRUE" can't be false, hence has to be true. Similarly, the statement "any(v) is TRUE" translates to "there is an element of v that is true". Since there are no elements in v, this can't possibly happen. So this statement is false.

As for the sums and products, the explanation that Martin linked to sums it up nicely. Basically, suppose we have two vectors, v and w, and v is actually numeric(0). Then sum(c(v,w)) ought to be sum(v) + sum(w) (since that would be the case for any other honest vectors). But since c(v,w) is really w, the only way this will happen is if sum(v) = 0. Same for products, since in that case prod(c(v,w)) = prod(v) times prod(w), which forces prod(v) = 1.

> Thank you, benilton

Haris
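These conventions are easy to check directly. The short sketch below uses an arbitrary vector w for the consistency argument; the variable names are chosen only for the example.

```r
v <- logical(0)

all_empty <- all(v)              # TRUE: no element can falsify "all are TRUE"
any_empty <- any(v)              # FALSE: there is no element that is TRUE

sum_empty  <- sum(numeric(0))    # 0, the identity for addition
prod_empty <- prod(numeric(0))   # 1, the identity for multiplication

# The consistency argument: concatenating an empty vector changes nothing
w <- c(2, 5)
sum_ok  <- sum(c(numeric(0), w))  == sum(numeric(0)) + sum(w)
prod_ok <- prod(c(numeric(0), w)) == prod(numeric(0)) * prod(w)
```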
Re: [R] Making a case for using R in Academia
As an addendum to all this, this is one of the responses I got from one of my colleagues:

> The problem with R is that our students in many social science fields are expected to know SPSS when they go to graduate school. Not having a background in SPSS would put these students at a disadvantage.

Is this really the case? Does anyone have any such statistics?

Charilaos Skiadas
Department of Mathematics
Hanover College
P.O.Box 108
Hanover, IN 47243
Re: [R] Making a case for using R in Academia
John (and everyone else),

On Nov 9, 2006, at 4:20 PM, John Fox wrote:

> Dear Charilaos, It's very difficult to give definitive answers to the questions that you pose because we don't have any good data (at least as far as I know) about how widely R is used.

Yes, it certainly isn't an easy question to answer, and I don't necessarily need complete data. The situation as presented to me by my colleagues in the Social Sciences is really that SPSS is the standard, so I am basically hoping for evidence to just shake this view (unless it is true, but I have to say I doubt it). I am more hoping for particular examples of cases in the Social Sciences where SPSS is far from the standard, and the programs and schools you mention below are exactly the sort of thing I was looking for!

For now unfortunately we will be sticking with SPSS, despite the considerable cost (which was mainly our problem at the moment, so SAS is not even being considered for that reason), but I am hoping to slowly build enough evidence of the extensive use of R for when all this comes up again. Even just a list of the universities and departments that use it would be very helpful, so if any of you would like to send such information about your departments or other departments you might know about, off the list, it would be extremely helpful to me. Perhaps it would be useful for such a list to exist somewhere online? (I guess you could say google, but I find it hard to use google to look up such information on R, for the obvious reason of the shortness of the name.)

> [snip]
>
> Among social scientists the picture is not as clear. My impression is that SPSS is used very widely for low-level methods courses taught to undergraduates, and not very extensively in the best social-science graduate programmes. I would expect that, at present, Stata use in social-science graduate programmes exceeds R, and that SAS and R would also be used fairly widely. In my opinion, these are the only reasonable choices -- I don't think that SPSS is sufficiently capable to compete with R, Stata, or SAS. There are, for example, several different packages used at the ICPSR Summer Program in Quantitative Methods for Social Research, but several relatively advanced courses now use R. Likewise, the Oxford Spring School, hosted by the Department of Politics and International Relations at Oxford, has mostly employed R and Stata.

Thanks, I will be looking into those. I basically just need to look at various universities and their social sciences departments, and see what they use there. As others suggested, I will be looking into the number of books and papers in R and how it is increasing every year. Once again thank you all for your comments, this has been a very helpful discussion for me, and it's a great pleasure to find such a helpful and friendly mailing list.

> Of course, my own preference is for R. Regards, John

Haris
[R] Making a case for using R in Academia
Hello, new to the list, first message. This question perhaps might be more appropriate to R-sig-teaching, and I'd be happy to take it there if this is not the right place for it.

I am teaching applied statistics at a small liberal arts college with limited resources, and we are currently using SPSS for our courses. Mainly the reason for this, as I understand it, is that this is what is used out in the real world, or at least this is our perception of it. I have only used R for my own stuff for about six months, and my training is not in statistics, so I am not very aware of what it can do in other disciplines, especially Sociology and Psychology.

I would like to make a case to the other departments here for using R instead, so I was hoping that there might be some resources out there that talk about the extent to which R is being used outside of academia, or in general any other resources that talk about R as a practical alternative to the other non-free statistical packages. Perhaps some statistics, or particular examples of use? Any links would be greatly appreciated. Thanks for any thoughts/input into this.

Charilaos Skiadas
Department of Mathematics
Hanover College
P.O.Box 108
Hanover, IN 47243