[R] auto clustering with Rgraphviz: possible?
I am working with about 600 nodes in an Rgraphviz graph. When plotted, the graph shows about 8 obvious clusters that are highly connected internally but share no connections with each other. I have a wrapper function that handles many tasks automatically for me, such as setting various node and edge attributes. What I would like to do is auto-generate plots for each of these independent clusters. Is there a way to programmatically identify these clusters and use this identification to create either subgraphs or clusters?

# For example
library(graph)
library(Rgraphviz)
g1_gz <- gzfile(system.file("GXL/graphExample-01.gxl.gz", package = "graph"), open = "rb")
g11_gz <- gzfile(system.file("GXL/graphExample-11.gxl.gz", package = "graph"), open = "rb")
g1 <- fromGXL(g1_gz)
g11 <- fromGXL(g11_gz)
g1_11 <- join(g1, g11)
plot(g1_11) # yields 2 obvious clusters plus 8 nodes with no edges

What I would like to be able to do is automatically identify the 2 clusters, so that they can be separately plotted.

Thanks,
Mark

Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, Mobile VoiceMail
(317) 399-1219 Skype No Voicemail please

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
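The clusters described in this question are connected components, so one way to approach it is to label components and then build one subgraph per label. The sketch below is a base-R illustration only: connected_components() is a hypothetical helper written for this example, and the toy node and edge vectors are made up. The Bioconductor graph package also offers connComp() and subGraph(), which may do the same job directly on a graph object; the sketch avoids that dependency so it can be run anywhere.

```r
# Hypothetical helper (not part of graph or Rgraphviz): label the connected
# components of an undirected graph given as node names plus an edge list.
connected_components <- function(nodes, from, to) {
  # Undirected adjacency list: every edge is recorded in both directions.
  adj <- split(c(to, from), c(from, to))
  comp <- setNames(rep(NA_integer_, length(nodes)), nodes)
  current <- 0L
  for (start in nodes) {
    if (!is.na(comp[[start]])) next      # already assigned to a component
    current <- current + 1L
    comp[[start]] <- current
    queue <- start
    while (length(queue) > 0) {          # breadth-first search
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- adj[[node]]          # NULL for isolated nodes
      new <- unique(neighbours[is.na(comp[neighbours])])
      comp[new] <- current
      queue <- c(queue, new)
    }
  }
  split(names(comp), comp)               # list of node-name vectors
}

# Toy data: two clusters and one node without edges.
nodes <- c("a", "b", "c", "d", "e", "f")
from  <- c("a", "b", "d")
to    <- c("b", "c", "e")

comps <- connected_components(nodes, from, to)
# Keep only components with more than one node, i.e. the plottable clusters.
clusters <- comps[lengths(comps) > 1]
print(clusters)
```

Each element of clusters is a vector of node names that could then be handed to whatever subgraph constructor the plotting wrapper expects.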
[R] ascii or regex code for alt-enter for Excel
I need to write a table that can be opened in Excel or OpenOffice such that there are newlines embedded within cells. After much Googling and futzing, I can't figure out how to do this. The way to do this within Excel is alt-Enter and I've tried '/n', '/n/r', '/r/n' per some web suggestions without luck. Anybody know what character or ASCII code to use for this?

Thanks,
Mark
Re: [R] ascii or regex code for alt-enter for Excel
Thanks to all for your helpful replies. In my initial email I did mistakenly write /n when I had correctly been using \n in my code. The key for me seems to be using the approach Bill suggested, i.e., writing to a binary file. If I simply do

write.csv(d, "d.text.csv", row.names = FALSE, col.names = FALSE)

then the newlines are not represented within the cell, but create new cells, which is the problem I was originally having.

I do wonder what ASCII character is represented in Windows with alt-Enter. I'm actually working in a Linux environment; it's my boss who uses Windows. I was trying to find a text-only output that would do the trick. From my web search I learned that using alt, especially with the number pad, allows for the entry of all sorts of unusual ASCII characters. I found a couple of tables of them, but none referenced alt-Enter. Someone on a MS help site suggested using the ASCII numerics 10 or 13, which I believe are just line feed and carriage return. Those didn't work for me. I guess I'll be content with having a working solution and leave the mystery unsolved.

Thanks Bill!
Mark

On Wed, Oct 20, 2010 at 2:13 PM, William Dunlap <wdun...@tibco.com> wrote:

From: William Dunlap
Sent: Wednesday, October 20, 2010 10:47 AM
To: Duncan Murdoch; Mark Kimpel
Cc: r-help@r-project.org
Subject: Re: [R] ascii or regex code for alt-enter for Excel

I think Excel wants a \n for newlines in a text cell entry but \r\n to separate rows of a csv file. You may have to open the file in binary mode and put in the \r\n at line ends by hand to achieve this from R, as it translates all \n's to \r\n's when writing them to a file. (\n is not the same as /n in R.)

I omitted an example:

d <- data.frame(nLines=c(3,2,1), entry=c("three\nline\nentry", "two line\nentry", "one line entry"))
theFile <- file("c:/temp/d.csv", open="wb") # write in binary mode
write.csv(d, theFile, eol="\r\n")
close(theFile)

Now open c:/temp/d.csv in Excel and you should see the multiline text entries. (Expand the cells and/or formula entry area to see all the lines in an entry.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Duncan Murdoch
Sent: Wednesday, October 20, 2010 10:29 AM
To: Mark Kimpel
Cc: r-help@r-project.org
Subject: Re: [R] ascii or regex code for alt-enter for Excel

You may need to ask an Excel expert or MS tech support. What character is Excel looking for? (Or it is possible that you have what you need, but used forward slashes when you should have used backslashes. The newline character is \n, not /n, in R.)

Duncan Murdoch
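Bill's suggestion can be checked without opening Excel by inspecting the bytes that end up in the file. The following is a minimal sketch under the assumption that a temporary file stands in for c:/temp/d.csv; it writes the same data frame through a binary-mode connection and counts carriage returns (ASCII 13) and line feeds (ASCII 10).

```r
# Same data as in Bill's example: embedded "\n" inside the text cells.
d <- data.frame(
  nLines = c(3, 2, 1),
  entry  = c("three\nline\nentry", "two line\nentry", "one line entry")
)

path <- tempfile(fileext = ".csv")   # stand-in for c:/temp/d.csv
con <- file(path, open = "wb")       # binary mode: no newline translation
write.csv(d, con, row.names = FALSE, eol = "\r\n")
close(con)

bytes <- readBin(path, what = "raw", n = file.size(path))
n_cr <- sum(bytes == as.raw(13))     # carriage returns: one per row ending
n_lf <- sum(bytes == as.raw(10))     # line feeds: row endings plus in-cell breaks
cat("CR:", n_cr, " LF:", n_lf, "\n")
```

With a header and three data rows there are four \r\n row endings, and the three extra line feeds are the in-cell breaks, which matches the distinction Bill describes between \n inside a cell and \r\n between rows.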
[R] model set up question
I need to compare gene expression differences between multiple line pairs of alcohol-preferring and non-preferring rat lines. I have 5 such line pairs; 3 are unrelated, but two were derived independently from the same parent stock. For each line, there are 10 samples. I'll be testing multiple genes, but for simplicity assume just one gene whose expression is measured as geneExpression.

Alcohol Preferring   Alcohol Non-Preferring   Line Pair   X or Non-X
Line 1a              Line 1b                  1           Non-X
Line 2a              Line 2b                  2           Non-X
Line 3a              Line 3b                  3           Non-X
Line X4a             Line X4b                 X4          X
Line X5a             Line X5b                 X5          X

If all the line pairs were independently derived, a model could be

geneExpression ~ Line.Pair + AlcoholPreference

with the factor of interest being AlcoholPreference. But there is the X factor: the 2 X line pairs are related to each other, whereas the others are unrelated to each other and also to the 2 X line pairs. We want to take into account the fact that there are really only 4 parent populations behind these 5 line pairs, so as to decrease the weight put on the X lines in the model. What would be the most appropriate approach to this, and how would the model be written?

Thanks,
Mark
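To make the structure in the table concrete, the design can be written out as a data frame in which each line pair is tagged with its parent stock. This is only a sketch of the data layout: Parent.Stock is a hypothetical column name, the expression values are simulated, and the fitted model is the simple fixed-effects formula from the post, not an answer to the weighting question. Treating Parent.Stock as a grouping factor in a mixed model would be one possible direction, but that is an assumption that nobody in this thread proposed.

```r
set.seed(42)

# 5 line pairs x 2 preference groups x 10 samples per line.
design <- expand.grid(
  sample            = 1:10,
  AlcoholPreference = c("P", "NP"),
  Line.Pair         = c("1", "2", "3", "X4", "X5"),
  stringsAsFactors  = FALSE
)

# Hypothetical column: the two X pairs share a single parent stock.
design$Parent.Stock <- ifelse(design$Line.Pair %in% c("X4", "X5"),
                              "X", design$Line.Pair)

# Simulated response, purely so that the formula below can be run.
design$geneExpression <- rnorm(nrow(design))

# The model from the post, which treats all 5 pairs as independent.
fit <- lm(geneExpression ~ Line.Pair + AlcoholPreference, data = design)

# 5 line pairs but only 4 parent stocks.
print(table(design$Parent.Stock, design$Line.Pair))
```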
[R] vectorizing ANOVA over a vectorized linear model
Is it possible to vectorize anova over the output of a vectorized lm? I have a gene expression matrix with each row being a gene and columns for samples. There are several factors with interactions. I can get p values by looping over the matrix with lm and anova, but I would like to make this as computationally efficient as possible. I am able to vectorize the lm command, but when I try to use anova on the resultant model object I get just one anova result. Is what I want to do possible? And, yes, I am quite conversant with Limma and other BioC packages; I have my reasons for wanting to use lm and anova.

Thanks,
Mark
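One way to get per-gene p values from a matrix-response lm is to compute the F test by hand from the residual sums of squares of a full and a reduced fit. The sketch below assumes simulated data and two made-up factors, batch and treatment; it tests a single term, and an interaction would be handled the same way by dropping it from the reduced model. The single result from anova() on the matrix fit is presumably a multivariate test across all genes, which is why it cannot be used directly here.

```r
set.seed(1)
n_genes <- 50
n_samples <- 12

# Genes in rows, samples in columns, as described in the post (simulated).
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
treatment <- factor(rep(c("a", "b"), each = 6))   # illustrative factor
batch <- factor(rep(c("x", "y", "z"), times = 4)) # illustrative factor

# lm() wants samples in rows and one response column per gene.
Y <- t(expr)
full    <- lm(Y ~ batch + treatment)
reduced <- lm(Y ~ batch)

# Residual sum of squares for every gene at once.
rss_full    <- colSums(residuals(full)^2)
rss_reduced <- colSums(residuals(reduced)^2)

df_num <- reduced$df.residual - full$df.residual
df_den <- full$df.residual

f_stat <- ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
p_val  <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

# Cross-check gene 1 against the usual one-gene-at-a-time anova().
single <- anova(lm(Y[, 1] ~ batch + treatment))
print(c(vectorized = unname(p_val[1]), looped = single[["Pr(>F)"]][2]))
```

Because anova() on a single lm is sequential, the row for the last term in the formula is the one that matches this full-versus-reduced comparison.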
Re: [R] vectorizing ANOVA over a vectorized linear model
Hadley,

Thanks for pointing me to some good articles. Unfortunately, I have already read Holger's, and my main concern is computational efficiency. The buzzword on this list regarding efficient code is vectorization. I am, frankly, surprised that there is a way to vectorize analysis of complex models but not to extract p values from them. Dieter's reply points one towards using lapply, which in my experience allows for compact code but not an increase in efficiency (one of Holger's examples demonstrates this). Anyway, I cannot see how to go from Holger's fairly simple examples to one that involves a complex model with several factors and interactions.

Limma, which does provide p values if contrasts are used, is blindingly fast, but I believe Gordon Smyth has hard-coded most of this excellent package in C. I was hoping to achieve something similar without the use of the moderated t-statistics that Limma uses. Looks like I am stuck using loops with mclapply. Thank goodness for my Core i7!

Mark

On Sun, Mar 7, 2010 at 2:08 PM, hadley wickham <h.wick...@gmail.com> wrote:

Hi Mark,

If efficiency is a concern you might want to read "Computing Thousands of Test Statistics Simultaneously in R" by Holger Schwender and Tina Müller, http://stat-computing.org/newsletter/issues/scgn-18-1.pdf. If you just want to do it, see the examples in http://had.co.nz/plyr/plyr-intro-090510.pdf.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
[R] R CMD SHLIB requesting makefile. Is a makefile required?
A few years ago I used the following to compile a shared object that I wanted to call from R, and it worked just fine.

R CMD SHLIB -o ~/my_C/R.shared.so/cocite.mat.so cocite.mat.c

Now when it is executed I receive the following error message:

make: *** No rule to make target `cocite.mat.o', needed by `/home/mkimpel/my_C/R.shared.so/cocite.mat.so'. Stop.

I've consulted R CMD SHLIB --help and R-exts.pdf, and neither indicates that a makefile is now required by R CMD SHLIB, yet Googling the error message leads me to lots of threads regarding makefiles. Has something changed? Am I doing something wrong? Following is my gcc version and sessionInfo().

Thanks,
Mark

kimpel-desktop ~/my_C: gcc --version
gcc (Ubuntu 4.4.1-4ubuntu8) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

sessionInfo()
R version 2.10.0 Patched (2009-10-27 r50222)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices datasets utils methods base
[R] problem with split eating giga-bytes of memory
I'm having trouble using split on a very large data set with ~1400 levels of the factor to be split. Unfortunately, I can't reproduce it with the simple self-contained example below. As you can see, splitting the artificial dataframe of size ~13MB results in a split dataframe of ~144MB, a roughly 10-fold increase in memory allocation for the split object. If split scales linearly, then my actual 52MB dataframe should be easily handled by my 12GB of RAM, but it is not. Instead, when I try to split selectSubAct.df on one of its factors with 1473 levels, my memory is slowly gobbled up (plus 3 GB of swap) until I cancel the operation.

Any ideas on what might be happening?

Thanks,
Mark

myDataFrame <- data.frame(matrix(LETTERS, ncol = 7, nrow = 399000))
mySplitVar <- factor(as.character(1:1400))
myDataFrame <- cbind(myDataFrame, mySplitVar)
object.size(myDataFrame) ## 12860880 bytes # ~ 13MB
myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
object.size(myDataFrame.split) ## 144524992 bytes # ~ 144MB
object.size(selectSubAct.df) ## 52,348,272 bytes # ~ 52MB

sessionInfo()
R version 2.10.0 Patched (2009-10-27 r50222)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices datasets utils methods base

loaded via a namespace (and not attached):
[1] tools_2.10.0
Re: [R] problem with split eating giga-bytes of memory
Charles,

I suspect you are correct regarding copying of the attributes. First off, selectSubAct.df is my real data, which turns out to be of the same dim() as myDataFrame below, but each column is made up of strings, not simple letters, and there are many levels in each column, which I did not properly duplicate in my first example. I have amended that below, and with the split the new object size is now not 10X the size of the original, but 100X. My real data is even more complex than this, so I suspect that is where the problem lies.

I need to search for a better solution to my problem than split, for which I will start a separate thread if I can't figure something out.

Thanks for pointing me in the right direction,
Mark

myDataFrame <- data.frame(matrix(paste("The rain in Spain", as.character(1:1400), sep = "."), ncol = 7, nrow = 399000))
mySplitVar <- factor(paste("Rainy days and Mondays", as.character(1:1400), sep = "."))
myDataFrame <- cbind(myDataFrame, mySplitVar)
object.size(myDataFrame) ## 12860880 bytes # ~ 13MB
myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
object.size(myDataFrame.split) ## 1,274,929,792 bytes ~ 1.2GB
object.size(selectSubAct.df) ## 52,348,272 bytes # ~ 52MB

On Tue, Dec 8, 2009 at 10:22 PM, Charles C. Berry <cbe...@tajo.ucsd.edu> wrote:

Each element of myDataFrame.split contains a copy of the attributes of the parent data.frame. And probably it does scale linearly. But the scaling factor depends on the size of the attributes that get copied, I guess.

Note:

only.attr <- lapply(myDataFrame.split, function(x) sapply(x, attributes))
(object.size(myDataFrame.split) - object.size(myDataFrame)) / object.size(only.attr)
1.03726179240978 bytes

On the line object.size(selectSubAct.df) ## 52,348,272 bytes # ~ 52MB: What was this??

Chuck

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
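Charles's point about copied attributes can be seen on a small scale: after split(), each piece of a data frame keeps the complete set of factor levels from the parent, however few rows the piece holds. The sketch below uses made-up sizes (100 labels, 10 groups) rather than the 399000-row example from the thread.

```r
n_levels <- 100

df <- data.frame(
  label = factor(paste("Rainy days and Mondays", seq_len(n_levels), sep = ".")),
  group = factor(rep(1:10, each = 10))
)

pieces <- split(df, df$group)
first <- pieces[[1]]

rows_in_piece   <- nrow(first)                       # rows actually present
levels_carried  <- nlevels(first$label)              # levels inherited from the parent
levels_required <- nlevels(droplevels(first)$label)  # levels the piece really uses

print(c(rows = rows_in_piece,
        carried = levels_carried,
        required = levels_required))
```

With 1400 pieces and several factor columns that each have many long level strings, that per-piece overhead is multiplied accordingly, which is consistent with the growth from 10X to 100X reported above.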
Re: [R] problem with split eating giga-bytes of memory
Hadley,

Just as you were apparently writing I had the same thought and did exactly what you suggested, converting all columns except the one that I want split to character. Executed almost instantaneously without problem. Thanks!

Mark

On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham <h.wick...@gmail.com> wrote:

Hi Mark,

Why are you using factors? I think for this case you might find characters are faster and more space efficient. Alternatively, you can have a look at the plyr package, which uses some tricks to keep memory usage down.

Hadley
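The effect of Hadley's suggestion can be measured with object.size() on a scaled-down version of the amended example. This is a sketch with assumed sizes (200 levels, 10000 rows) chosen so it runs quickly; the exact byte counts will differ by R version, so only the ratio is of interest. Note that stringsAsFactors is set explicitly because its default changed in R 4.0.0.

```r
n_levels <- 200
values <- paste("The rain in Spain", seq_len(n_levels), sep = ".")
m <- matrix(values, ncol = 7, nrow = n_levels * 50)

# Same content stored two ways: factor columns versus character columns.
df_factor <- data.frame(m, stringsAsFactors = TRUE)
df_char   <- data.frame(m, stringsAsFactors = FALSE)

split_var <- factor(rep(seq_len(n_levels), length.out = nrow(m)))

size_factor <- object.size(split(df_factor, split_var))
size_char   <- object.size(split(df_char, split_var))

# How many times larger the split of the factor version is reported to be.
ratio <- as.numeric(size_factor) / as.numeric(size_char)
print(ratio)
```

object.size() does not account for memory that may be shared between pieces, so the figure is an upper bound on what the factor version costs, not a precise measurement.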
Re: [R] problem with split eating giga-bytes of memory
Jim, could you provide a code snippet to illustrate what you mean?

Hadley, good point, I did not know that.

Mark

On Tue, Dec 8, 2009 at 11:00 PM, jim holtman <jholt...@gmail.com> wrote:

Also, instead of 'splitting' the data frame, I split the indices and then use those to access the information in the original dataframe.
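Jim's reply in the thread contains no code, so the following is only a guess at what he has in mind: split the row numbers by the grouping variable, then use each index vector to look things up in the original data frame. The column names and values are invented for the illustration.

```r
df <- data.frame(
  group = c("a", "b", "a", "c", "b", "a"),
  value = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# A list of integer row indices per group; no data frame is copied here.
idx <- split(seq_len(nrow(df)), df$group)

# Use the indices to reach back into the original data frame when needed.
group_means <- vapply(idx, function(i) mean(df$value[i]), numeric(1))
print(group_means)
```

The list of indices stays small regardless of how many columns or factor levels the data frame has, since only integer vectors are stored per group.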
Re: [R] package tm fails to remove "the" with remove stopwords
Thanks Ingo.

Mark

On Sun, Nov 15, 2009 at 11:05 AM, Ingo Feinerer feine...@logic.at wrote:

On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:
I am using code that previously worked to remove stopwords using package tm.

Thanks for reporting. This is a bug in the removeWords() function in tm version 0.5-1 available from CRAN:

require(tm)
myDocument <- c("the rain in Spain", "falls mainly on the plain", "jack and jill ran up the hill", "to fetch a pail of water")
text.corp <- Corpus(VectorSource(myDocument))
# text.corp <- tm_map(text.corp, stripWhitespace)
text.corp <- tm_map(text.corp, removeNumbers)
text.corp <- tm_map(text.corp, removePunctuation)
## text.corp <- tm_map(text.corp, stemDocument)
text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
dtm <- DocumentTermMatrix(text.corp)
dtm
dtm.mat <- as.matrix(dtm)
dtm.mat

    Terms
Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
   1     0     0    0    0    0      0    0     0    1   0     1   1     0
   2     1     0    0    0    0      1    0     1    0   0     0   0     0
   3     0     0    1    1    1      0    0     0    0   1     0   0     0
   4     0     1    0    0    0      0    1     0    0   0     0   0     1

The function removeWords() fails to remove patterns at the beginning or at the end of a line. This bug is fixed in the latest development version on R-Forge, and the fix will be included in the next CRAN release. Please see https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tm&view=markup for a list of all bug fixes and changes between each tm version.

Best regards, Ingo Feinerer
--
Ingo Feinerer
Vienna University of Technology
http://www.dbai.tuwien.ac.at/staff/feinerer
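The symptom described above (words removed in the middle of a line but not at its start or end) can be illustrated without tm. The following is only a base-R sketch, not tm's implementation of removeWords(); the three-word stop list and the variable names are assumptions made for illustration. It matches whole words with word boundaries, so position in the line does not matter.

```r
# Illustrative documents taken from the thread; stop.words is a tiny
# hypothetical list, not the output of tm's stopwords("english").
docs <- c("the rain in Spain", "falls mainly on the plain")
stop.words <- c("the", "in", "on")

# \\b marks a word boundary, so "in" is not removed from "Spain",
# "plain" or "mainly", and a word at the start or end of a line still matches.
pattern <- paste0("\\b(", paste(stop.words, collapse = "|"), ")\\b")
cleaned <- gsub(pattern, "", docs, perl = TRUE)

# Collapse the gaps left behind and trim leading/trailing blanks.
cleaned <- trimws(gsub(" +", " ", cleaned))
print(cleaned)
```

Applied before a document-term matrix is built, this kind of corpus-level removal also covers the multi-word case raised elsewhere in the thread.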
Re: [R] package tm fails to remove "the" with remove stopwords
Sam,

Thanks for the example. Removing stop words after the DocumentTermMatrix has been created works fine if one is working with single words, but what if one is creating a dtm of possible combinations of words? Wouldn't one want to remove them from the corpus?

Mark

On Thu, Nov 12, 2009 at 12:04 PM, Sam Thomas sam.tho...@revelanttech.com wrote:

I'm not sure what's wrong with your approach, but this seems to strip "the":

require(tm)
params <- list(minDocFreq = 1, removeNumbers = TRUE, stemming = TRUE, stopwords = TRUE, weighting = weightTf)
myDocument <- c("the rain in Spain", "falls mainly on the plain", "jack and jill ran up the hill", "to fetch a pail of water")
text.corp <- Corpus(VectorSource(myDocument))
dtm <- DocumentTermMatrix(text.corp, control = params)
dtm
dtm.mat <- as.matrix(dtm)
dtm.mat
[R] package tm fails to remove "the" with remove stopwords
I am using code that previously worked to remove stopwords using package tm. Even manually adding "the" to the list does not work to remove "the". This package has undergone extensive redevelopment with changes to the function syntax, so perhaps I am just missing something. Please see my simple example, output, and sessionInfo() below. Thanks!

Mark

require(tm)
myDocument <- c("the rain in Spain", "falls mainly on the plain", "jack and jill ran up the hill", "to fetch a pail of water")
text.corp <- Corpus(VectorSource(myDocument))
# text.corp <- tm_map(text.corp, stripWhitespace)
text.corp <- tm_map(text.corp, removeNumbers)
text.corp <- tm_map(text.corp, removePunctuation)
## text.corp <- tm_map(text.corp, stemDocument)
text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
dtm <- DocumentTermMatrix(text.corp)
dtm
dtm.mat <- as.matrix(dtm)
dtm.mat

    Terms
Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
   1     0     0    0    0    0      0    0     0    1   0     1   1     0
   2     1     0    0    0    0      1    0     1    0   0     0   0     0
   3     0     0    1    1    1      0    0     0    0   1     0   0     0
   4     0     1    0    0    0      0    1     0    0   0     0   0     1

R version 2.10.0 Patched (2009-10-27 r50222)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
[1] chron_2.3-33 RWeka_0.3-23 tm_0.5-1

loaded via a namespace (and not attached):
[1] grid_2.10.0 rJava_0.8-1 slam_0.1-6 tools_2.10.0
[R] how to model a numeric factor as a non-ordinal factor
I am analyzing an experiment in which time is a factor, represented by numbers indicating time since last treatment, but in this particular case there is no reason to think that time has a numeric meaning in the sense that 24 would be greater than 6. We have no idea which genes will be increasing or decreasing at different times. So I have the following model (to be applied over many genes):

mod <- lm(gene.expression ~ Treatment + Time)

If I wanted to just hack my way around this I could just paste a character to all of the times, but I'm curious as to the right way to do this. What would be the correct syntax or transformation? Would

Time <- factor(as.character(Time))

do it? I want to make sure that it does not get coerced back to numeric.

Thanks, Mark
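A small sketch of the difference between the two codings, on simulated data. The data frame, its values, and the names mod.numeric, mod.factor and TimeF are assumptions made for illustration; only the model formula comes from the question. Once a variable is a factor, lm() builds one indicator column per non-reference level and does not coerce it back to numeric.

```r
# Simulated data; all names and values here are illustrative assumptions.
set.seed(1)
dat <- data.frame(
  Treatment = rep(c("control", "drug"), each = 12),
  Time      = rep(c(6, 24, 48), times = 8)
)
dat$gene.expression <- rnorm(nrow(dat))

# Numeric Time: a single slope coefficient, i.e. a linear trend is assumed.
mod.numeric <- lm(gene.expression ~ Treatment + Time, data = dat)

# factor() on a numeric vector already stores the values as character
# labels, so the as.character() step is optional.
dat$TimeF <- factor(dat$Time)
mod.factor <- lm(gene.expression ~ Treatment + TimeF, data = dat)

n.coef.numeric <- length(coef(mod.numeric))  # intercept, Treatment, slope
n.coef.factor  <- length(coef(mod.factor))   # intercept, Treatment, 2 time levels
time.levels    <- levels(dat$TimeF)
print(names(coef(mod.factor)))
```

Writing the term as factor(Time) directly inside the formula has the same effect without creating a new column.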
[R] help with the use of mtext to create main title over multiple plots
I'm trying to use mtext to create a main title over multiple plots. Below is a simple self-contained example and my sessionInfo (I should note I've also tried this with R-2.8.1 with the same results). When I execute the code chunk below, I get the plots, but no title. I've tried this using the screen driver, pdf, and postscript. I've used different sizes of paper. I suspect I am making an elementary error but searching the help files and help archives hasn't provided me an answer.

Thanks for any help, Mark

# setwd("~/Desktop")
pdf("my.test.plots.pdf", paper = "letter")
par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(1:6, 1:6)
}
mtext(text = "my test plots", side = 3, outer = TRUE)
dev.off()
#

R version 2.10.0 Under development (unstable) (2009-09-21 r49771)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] car_1.2-15

loaded via a namespace (and not attached):
[1] tools_2.10.0
Re: [R] help with the use of mtext to create main title over multiple plots
Thanks Tony (and others). Setting oma corrects the problem.

Mark

On Mon, Oct 12, 2009 at 1:41 PM, Tony Plate tpl...@acm.org wrote:

Try playing around with the oma setting in par() -- it sets the outer margins, which by default are zero. The following shows the mtext label for me, using the windows device:

par(mfrow = c(2, 2))
par("oma")
[1] 0 0 0 0
par(oma = c(0, 0, 2, 0))
for (i in 1:4) plot(0:1, 0:1)
mtext(text = "my test plots", side = 3, outer = TRUE)
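Putting the oma fix into the original pdf example gives roughly the following. This is a sketch only: it writes to a temporary file rather than the Desktop path in the question, and out.file and oma.used are names introduced here for illustration.

```r
# Write to a temporary file so the example has no side effects elsewhere.
out.file <- tempfile(fileext = ".pdf")
pdf(out.file, paper = "letter")

# Reserve two lines of outer margin at the top; the default is c(0, 0, 0, 0),
# which leaves no room for text drawn with outer = TRUE.
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
for (i in 1:4) {
  plot(1:6, 1:6)
}
oma.used <- par("oma")

# With a non-zero top outer margin the title is now visible.
mtext(text = "my test plots", side = 3, outer = TRUE)
dev.off()
```

par() settings apply per device, so oma has to be set after pdf() is opened, not before.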
[R] problems with strsplit using a split of ' \\\ ' : a regex problem
I have a vector of gene symbols, some of which have multiple aliases. In the case of an alias, they are separated by ' \\\ '. Here is a real world example, which would represent one element of my vector:

Eif4g2 /// Eif4g2-ps1 /// LOC678831

What I would like to do is input the vector into a function and output a vector with just the first alias of each element (or, if there are no aliases, just the one symbol). So I wrote a simple little function to do this:

get.first.id.func <- function(vec, splitter){
  vec.lst <- strsplit(vec, splitter)
  first.func <- function(vec1){vec1[1]}
  vec.out <- sapply(vec.lst, first.func)
  vec.out
}

For a trivial example, this works:

a <- c("a_b", "c_d")
get.first.id.func(a, "_")
[1] "a" "c"

I am running into problems, however, with the real world split of ' \\\ '. I'm not even able to construct a sample vector of my own! Here is what I get:

a <- c('a \\\ b', 'a \\\ b')
a
[1] "a \\ b" "a \\ b"
a <- c('a b', 'a b')
a
[1] "a b" "a b"

I KNOW this is related to R's peculiarities with \ escapes, but I don't have the expertise to know how to get around it. I would be very interested to learn:

1. how to construct a vector such that a == c('a \\\ b', 'a \\\ b')
2. how to properly input my split into my function so that I get the split desired.

Thanks regex experts!

Mark
Re: [R] problems with strsplit using a split of ' \\\ ' : a regex problem
Thanks Henrique. I had actually tried using 6 back-slashes but didn't know to use 'cat' to see the non-escaped representation (see below to see my original confusion). Your strsplit, of course, works great. Thanks again!

a
[1] a \\ b a \\ b
cat(a)
a \\\ b a \\\ b

Mark

On Thu, Aug 27, 2009 at 9:15 PM, Henrique Dallazuanna www...@gmail.com wrote:

You need an escape before each backslash:

a <- c('a \\ b', 'a \\ b')
cat(a, "\n")

You can write in this form:

strsplit(a, .*\\.* )

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
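A sketch of both cases, since the sample element in the question uses forward slashes while the text talks about backslashes. Forward slashes need no escaping at all; a literal backslash is typed as two backslashes in R source. The helper get.first.id and the sample vectors are assumptions made for illustration, and fixed = TRUE is a swapped-in choice that treats the separator as plain text instead of a regular expression.

```r
# Hypothetical helper: keep the text before the first separator.
get.first.id <- function(vec, splitter) {
  parts <- strsplit(vec, splitter, fixed = TRUE)  # fixed = TRUE: no regex
  vapply(parts, function(p) p[1], character(1))
}

# Forward slashes, as in the "Eif4g2 /// ..." example: nothing to escape.
symbols <- c("Eif4g2 /// Eif4g2-ps1 /// LOC678831", "Actb")
first.fwd <- get.first.id(symbols, " /// ")

# Literal backslashes: each one is written as "\\" in R source code,
# so three real backslashes need six typed ones.
b <- "a \\\\\\ b"
n.chars <- nchar(b)   # counts the characters actually stored
cat(b, "\n")          # cat() shows the stored string without escapes
first.back <- get.first.id(b, " \\\\\\ ")
```

print() re-escapes each stored backslash for display, which is why the printed form looks doubled while cat() shows what is really in the string.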
[R] help with regular expressions in R
I'm having trouble achieving the results I want using a regular expression. I want to eliminate all characters that fall within square brackets as well as the brackets themselves, returning an empty string (""). I'm not sure if it's R's use of double slash escapes or something else that is tripping me up. If I only use one slash I get

1: '\[' is an unrecognized escape in a character string
2: '\]' is an unrecognized escape in a character string
3: unrecognized escapes removed from "\[*.\]"

Below is my self-contained code followed by sessionInfo(). Thanks in advance for your help. I'm going to be doing a lot of text mining in the near future. I have an excellent O'Reilly book on regexes. What is the best reference for R's special treatment of these animals?

Mark

myCharVec <- c("[the rain in spain]", "(the rain in spain)")
gsub('\\[*.\\]', '', myCharVec)
#what I get
# [1] "[the rain in spai" "(the rain in spain)"
#what I want
# [1] "" "(the rain in spain)"

sessionInfo()
R version 2.10.0 Under development (unstable) (2009-08-12 r49193)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
[1] RWeka_0.3-20 tm_0.4

loaded via a namespace (and not attached):
[1] grid_2.10.0 rJava_0.6-3 slam_0.1-3
Re: [R] help with regular expressions in R
Well, I guess I'm not quite there yet. What I gave earlier was a simplified example, and did not accurately reflect the complexity of the task. This is my real world example. As you can see, what I need to do is delete an arbitrary number of characters, including the brackets and parens enclosing them, multiple times within the same string. Help?

myCharVec <- "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link 145.30.05] amounts (2d) gross income (magi) here. (2e)"
myCharVec
[1] "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link 145.30.05] amounts (2d) gross income (magi) here. (2e)"
myCharVec <- gsub('\\[.*\\]', '', myCharVec)
myCharVec
[1] "medicare  amounts (2d) gross income (magi) here. (2e)"
myCharVec <- gsub('\\(.*\\)', '', myCharVec)
myCharVec
[1] "medicare  amounts "
#what I want
# medicare ssa . 2008 amounts gross income here.

Mark

On Thu, Aug 20, 2009 at 11:39 AM, William Dunlap wdun...@tibco.com wrote:

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Kimpel
Sent: Thursday, August 20, 2009 8:31 AM
To: r-help@r-project.org
Subject: [R] help with regular expressions in R
...
myCharVec <- c("[the rain in spain]", "(the rain in spain)")
gsub('\\[*.\\]', '', myCharVec)

Change the '*.' to '.*'. Your expression matches 0 or more left square brackets, followed by 1 character, followed by a right square bracket. '\\[.*\\]' matches a left square bracket, followed by 0 or more characters, followed by a right square bracket.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
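The difference between '*.' and '.*' described above can be checked on the two-element vector from the original question. This is a minimal sketch; the names x, wrong and right are introduced here for illustration.

```r
x <- c("[the rain in spain]", "(the rain in spain)")

# "\\[*.\\]": zero or more "[", then exactly one character, then "]".
# The only match in the first element is "n]", so just that is removed.
wrong <- gsub("\\[*.\\]", "", x)

# "\\[.*\\]": one "[", then any run of characters, then "]".
# The whole bracketed text matches and is removed.
right <- gsub("\\[.*\\]", "", x)

print(wrong)
print(right)
```

The quantifier always applies to the token immediately before it, which is why swapping the two characters changes the meaning so much.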
Re: [R] help with regular expressions in R
Thanks guys. I've pulled my O'Reilly book and will begin reviewing it.

Mark

On Thu, Aug 20, 2009 at 12:37 PM, Phil Spector spec...@stat.berkeley.edu wrote:

Mark -

It looks like you're running into the greediness of regular expressions. When R sees .* it tries to find the longest match, which also grabs some of the stuff you want. You can either replace .* with something like [^\\])]* (i.e. zero or more of any character *except* ] or ) ), or use perl=TRUE, which allows the question mark (?) to mean the shortest match instead of the longest. Here's what I'd use:

gsub('[\\[(].*?[\\])]','',myCharVec,perl=TRUE)

In English: substitute the shortest string starting with [ or ( and ending with ] or ) with nothing.

Hope this helps.

- Phil
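The greedy and non-greedy behaviour can be compared on the real-world string from the thread. This is a sketch: the lazy pattern is the one suggested in the reply, while the negated-class alternative, the whitespace clean-up step, and the variable names are additions made here for illustration.

```r
s <- paste("medicare [link 220.30.05] ssa (1-800-772-1213). 2008",
           "[link 145.30.05] amounts (2d) gross income (magi) here. (2e)")

# Greedy: ".*" runs from the first "[" to the LAST "]" and swallows "ssa".
greedy <- gsub("\\[.*\\]", "", s)

# Lazy: with perl = TRUE, ".*?" stops at the first closing bracket or paren.
lazy <- gsub("[\\[(].*?[\\])]", "", s, perl = TRUE)

# Alternative without perl = TRUE: negated character classes cannot run
# past the closing delimiter, so greediness does no harm.
negated <- gsub("\\[[^]]*\\]|\\([^)]*\\)", "", s)

# Tidy up the double spaces left where text was removed.
tidy <- trimws(gsub(" +", " ", lazy))
print(tidy)
```

The negated-class form also keeps "[" paired with "]" and "(" with ")", whereas the lazy pattern accepts either closer after either opener.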
[R] reading in MS Word files
I am familiar with packages that read and write Excel files on both Windows and Linux platforms. Do any packages provide similar functionality for MS Word files? I have a lot of text processing to do and the text is embedded in ~200 different Word files (.doc format Office 2003). All I need to do is read, not write.

Thanks, Mark
Re: [R] reading in MS Word files
Thanks guys. As I wanted to do a little preprocessing before importing into tm (the files have all sorts of stuff in them that I don't need), I used a system call to invoke Abiword and do the batch conversions.

Mark

On Tue, Aug 18, 2009 at 10:56 AM, Ingo Feinerer feine...@logic.at wrote:

On Tue, Aug 18, 2009 at 12:00:07PM +0200, Mark Kimpel wrote:
I am familiar with packages that read and write Excel files on both Windows and Linux platforms. Do any packages provide similar functionality for MS Word files? I have a lot of text processing to do and the text is embedded in ~200 different Word files (.doc format Office 2003). All I need to do is read, not write.

See readDOC in package tm. E.g., something like

Corpus(DirSource(aDirectoryContainingTheWordFiles), readerControl = list(reader = readDOC))

Note that you need antiword (http://www.winfield.demon.nl/) in your path such that readDOC can use it.

Best regards, Ingo
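For converting files outside tm, a shell-out from R might look like the sketch below. It assumes the external program antiword is installed and on the PATH; the helper names antiword.cmd and read.doc.text and the example path are hypothetical. The command is built separately from running it, so the quoting can be inspected first.

```r
# Hypothetical helper: build the shell command for one .doc file.
# shQuote() protects paths that contain spaces.
antiword.cmd <- function(path) {
  paste("antiword", shQuote(path))
}

# Hypothetical helper: run the command and return the text, one element
# per line. Stops early if antiword cannot be found.
read.doc.text <- function(path) {
  if (!nzchar(Sys.which("antiword"))) {
    stop("antiword not found on PATH")
  }
  system(antiword.cmd(path), intern = TRUE)
}

# Only the command string is built here; nothing is executed.
cmd <- antiword.cmd("reports/my file.doc")
print(cmd)
```

For a directory of files, list.files(dir, pattern = "\\.doc$", full.names = TRUE) combined with lapply(files, read.doc.text) would cover the batch case.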
Re: [R] using package tm to find phrases
Thanks, the pointer to the tokenizer helped.

Mark

On Thu, Aug 13, 2009 at 6:11 PM, Ingo Feinerer feine...@logic.at wrote:

On Thu, Aug 13, 2009 at 03:36:22PM -0400, Mark Kimpel wrote:
I am using the package tm for text-mining of abstracts and would like to use it to find instances of gene names that may contain white space. For instance, "gene regulatory protein 1". The default behavior of tm is to parse this into 4 separate words, but I would like to use the class constructor dictionary to define phrases such as just mentioned. Is this possible? If so, how?

Yes.

* In case you only need to find instances, you could use full text search on your corpus, e.g.

R> tmIndex(yourCorpus, "gene regulatory protein 1")

would return the indices of all documents in your corpus containing this phrase.

* If you need tokens (in a term-document matrix) of length 4, you could use an n-gram tokenizer (n = 4). See e.g., http://tm.r-forge.r-project.org/faq.html#Bigrams. Then you can use the dictionary argument to store only your selection of gene names. I.e., something like

R> yourTokenizer <- function(x) RWeka::NGramTokenizer(x, Weka_control(min = 4, max = 4))
R> TermDocumentMatrix(crude, control = list(tokenize = yourTokenizer, dictionary = yourDictionary))

where yourDictionary contains the gene names (a character vector suffices) to be included in the term-document matrix.

* If you want to extract arbitrary patterns of different length that could match some gene names (and build a dictionary from that), you need some custom functionality. Regular expressions might be a good starting point ...

Best regards, Ingo
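The n-gram idea in the second bullet can be sketched in base R, without RWeka or tm, to show what a 4-gram tokenizer plus a dictionary filter does. The function ngram.tokens, the sample abstract, and the one-entry dictionary are all assumptions made for illustration; this is not how RWeka::NGramTokenizer is implemented.

```r
# Hypothetical helper: split text on whitespace and return every run of
# n consecutive words as one token.
ngram.tokens <- function(x, n) {
  words <- strsplit(tolower(x), "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

abstract <- "Expression of gene regulatory protein 1 was increased"
phrases  <- c("gene regulatory protein 1")  # hypothetical dictionary

tokens <- ngram.tokens(abstract, 4)

# Keep only the n-grams that appear in the dictionary of phrases.
hits <- tokens[tokens %in% phrases]
print(hits)
```

Phrases of different lengths would need one pass per length, or a regular-expression search as mentioned in the last bullet.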
[R] using package tm to find phrases
I am using the package tm for text-mining of abstracts and would like to use it to find instances of gene names that may contain white space. For instance, "gene regulatory protein 1". The default behavior of tm is to parse this into 4 separate words, but I would like to use the class constructor dictionary to define phrases such as just mentioned. Is this possible? If so, how?

Thanks, Mark
[R] problem with heatmap.2 in package gplots generating non-finite breaks
I have written a wrapper for heatmap.2 called heatmap.w.row.and.col.clust which auto-generates breaks using

    breaks <- round((c(seq(from = (-20 * stddev), to = (20 * stddev))) / 20), digits = 2)  # (stddev in this case = 2.5)

This has always worked well in the past but now I am getting an error that non-finite breaks are being generated. Drilling down, it seems that my wrapper is generating finite breaks but for some reason heatmap.2 is putting a NaN into the first and last positions in the vector. Is it obvious using the breaks my wrapper has generated why this should be so? My sessionInfo() follows.

Thanks, Mark

Browse[1]> c

Enter a frame number, or 0 to exit

1: heatmap.w.row.and.col.clust(iqa.corp.sparse.rem)
2: heatmap.func.R#29: heatmap.2(as.matrix(dataframe), col = color.palette, bre
3: image(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt = "n", y
4: image.default(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt

Selection: 1
Called from: eval(expr, envir, enclos)
Browse[1]> ls()
[1] "breaks"             "col.labels"         "color.palette"
[4] "dataframe"          "dendrogram.options" "remove.mean"
[7] "row.labels"         "stddev"
Browse[1]> breaks
  [1] -2.50 -2.45 -2.40 -2.35 -2.30 -2.25 -2.20 -2.15 -2.10 -2.05 -2.00 -1.95
 [13] -1.90 -1.85 -1.80 -1.75 -1.70 -1.65 -1.60 -1.55 -1.50 -1.45 -1.40 -1.35
 [25] -1.30 -1.25 -1.20 -1.15 -1.10 -1.05 -1.00 -0.95 -0.90 -0.85 -0.80 -0.75
 [37] -0.70 -0.65 -0.60 -0.55 -0.50 -0.45 -0.40 -0.35 -0.30 -0.25 -0.20 -0.15
 [49] -0.10 -0.05  0.00  0.05  0.10  0.15  0.20  0.25  0.30  0.35  0.40  0.45
 [61]  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00  1.05
 [73]  1.10  1.15  1.20  1.25  1.30  1.35  1.40  1.45  1.50  1.55  1.60  1.65
 [85]  1.70  1.75  1.80  1.85  1.90  1.95  2.00  2.05  2.10  2.15  2.20  2.25
 [97]  2.30  2.35  2.40  2.45  2.50
Browse[1]> is.finite(breaks)
  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
 [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
Browse[1]> c

Enter a frame number, or 0 to exit

1: heatmap.w.row.and.col.clust(iqa.corp.sparse.rem)
2: heatmap.func.R#29: heatmap.2(as.matrix(dataframe), col = color.palette, bre
3: image(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt = "n", y
4: image.default(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, xaxt

Selection: 2
Called from: eval(expr, envir, enclos)
Browse[1]> heatmap.func.R
Error during wrapup: object 'heatmap.func.R' not found
Browse[1]> ls()
 [1] "add.expr"      "breaks"        "cellnote"      "cexCol"
 [5] "cexRow"        "col"           "colInd"        "colsep"
 [9] "ColSideColors" "Colv"          "ddc"           "ddr"
[13] "dendrogram"    "densadj"       "denscol"       "density.info"
[17] "di"            "distfun"       "hcc"           "hclustfun"
[21] "hcr"           "hline"         "iy"            "key"
[25] "keysize"       "labCol"        "labRow"        "lhei"
[29] "linecol"       "lmat"          "lwid"          "main"
[33] "margins"       "max.breaks"    "max.raw"       "max.scale"
[37] "min.breaks"    "min.raw"       "min.scale"     "mmat"
[41] "na.color"      "na.rm"         "nbr"           "nc"
[45] "ncol"          "notecex"       "notecol"       "nr"
[49] "op"            "retval"        "revC"          "rm"
[53] "rowInd"        "rowsep"        "RowSideColors" "Rowv"
[57] "scale"         "scale01"       "sepcolor"      "sepwidth"
[61] "sx"            "symbreaks"     "symkey"        "symm"
[65] "tmpbreaks"     "trace"         "tracecol"      "vline"
[69] "x"             "xlab"          "x.scaled"      "x.unscaled"
[73] "ylab"          "z"
Browse[1]> tmpbreaks
  [1]   NaN -2.45 -2.40 -2.35 -2.30 -2.25 -2.20 -2.15 -2.10 -2.05 -2.00 -1.95
 [13] -1.90 -1.85 -1.80 -1.75 -1.70 -1.65 -1.60 -1.55 -1.50 -1.45 -1.40 -1.35
 [25] -1.30 -1.25 -1.20 -1.15 -1.10 -1.05 -1.00 -0.95 -0.90 -0.85 -0.80 -0.75
 [37] -0.70 -0.65 -0.60 -0.55 -0.50 -0.45 -0.40 -0.35 -0.30 -0.25 -0.20 -0.15
 [49] -0.10 -0.05  0.00  0.05  0.10  0.15  0.20  0.25  0.30  0.35  0.40  0.45
 [61]  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00  1.05
 [73]  1.10  1.15  1.20  1.25  1.30  1.35  1.40  1.45  1.50  1.55  1.60  1.65
 [85]  1.70  1.75  1.80  1.85  1.90  1.95  2.00  2.05  2.10  2.15  2.20  2.25
 [97]  2.30  2.35  2.40  2.45   NaN
Browse[1]> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-05-31 r48697)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base
Re: [R] problem with heatmap.2 in package gplots generating non-finite breaks
Never mind, the problem seems to be that I have ignored the warning

    Using scale="row" or scale="column" when breaks are specified can produce unpredictable results. Please consider using only one or the other.

I just stop specifying the breaks and it works fine.

Mark

Mark W. Kimpel MD

On Tue, Jul 21, 2009 at 10:28 AM, Mark Kimpel mwkim...@gmail.com wrote:

> I have written a wrapper for heatmap.2 called heatmap.w.row.and.col.clust which auto-generates breaks using
>
>     breaks <- round((c(seq(from = (-20 * stddev), to = (20 * stddev))) / 20), digits = 2)  # (stddev in this case = 2.5)
>
> This has always worked well in the past but now I am getting an error that non-finite breaks are being generated. Drilling down, it seems that my wrapper is generating finite breaks but for some reason heatmap.2 is putting a NaN into the first and last positions in the vector. Is it obvious using the breaks my wrapper has generated why this should be so? My sessionInfo() follows.
>
> Thanks, Mark
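The warning quoted in this thread asks for either explicit breaks or scaling inside heatmap.2, not both. One way to keep explicit breaks is to do the row scaling by hand first. The following is a minimal sketch under stated assumptions: the matrix `m` is made-up data, the clamping step is my own addition, and the thread does not establish why heatmap.2 produced the NaN values, so this is not offered as an explanation of them.

```r
stddev <- 2.5  # value quoted in the original post

# Breaks as the wrapper appears to build them: -2.5 to 2.5 in steps of 0.05.
breaks <- round(seq(from = -20 * stddev, to = 20 * stddev) / 20, digits = 2)

# Made-up data standing in for the real expression matrix.
set.seed(1)
m <- matrix(rnorm(40, mean = 5, sd = 3), nrow = 8)

# Scale each row by hand (mean 0, sd 1) so heatmap.2 is not asked to both
# scale the data and honour user-supplied breaks.
m_scaled <- t(scale(t(m)))

# Clamp to the break range so that every cell falls inside a colour bin.
m_clamped <- pmin(pmax(m_scaled, min(breaks)), max(breaks))

# With gplots installed, the plot call would then look something like:
# gplots::heatmap.2(m_clamped, scale = "none", breaks = breaks,
#                   col = colorRampPalette(c("blue", "white", "red"))(length(breaks) - 1))
```

The plotting call is left commented out so the sketch runs without gplots; note that the colour vector needs one entry fewer than the number of breaks.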
[R] can't get rJava to install on Linux
Having difficulties getting rJava to install on my Debian Squeeze box. Perused the R-help list and tried some things that have worked for others but not for me. Below is the output of my attempted build, R CMD javareconf -e, and sessionInfo(). Note I tried the R CMD javareconf also as root, restarted R after each of these, all no help.

* Installing *source* package ‘rJava’ ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for an ANSI C-conforming const... yes
checking whether time.h and sys/time.h may both be included... yes
configure: checking whether gcc -std=gnu99 supports static inline...
yes
checking Java support in R... present:
interpreter : '/usr/bin/java'
archiver    : '/usr/bin/jar'
compiler    : '/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javac'
header prep.: '/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javah'
cpp flags   : ''
java libs   : '-L/usr/lib/../lib/gcj-4.3-90 -L/usr/lib/jni -ljvm'
configure: error: One or more Java configuration variables are not set.
Make sure R is configured with full Java support (including JDK). Run
R CMD javareconf
as root to add Java support to R.
If you don't have root privileges, run
R CMD javareconf -e
to set all Java-related variables and then install rJava.
ERROR: configuration failed for package ‘rJava’
* Removing ‘/home/mkimpel/R_HOME/site-library-2.9.0/rJava’

The downloaded packages are in
‘/tmp/Rtmpfp9kiG/downloaded_packages’
Warning message:
In install.packages("rJava") :
  installation of package 'rJava' had non-zero exit status

./R CMD javareconf -e
Java interpreter : /usr/bin/java
Java version     : 1.5.0
Java home path   : /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre
Java compiler    : /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javac
Java headers gen.: /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javah
Java archive tool: /usr/bin/jar
Java library path: /usr/lib/../lib/gcj-4.3-90:/usr/lib/jni
JNI linker flags : -L/usr/lib/../lib/gcj-4.3-90 -L/usr/lib/jni -ljvm
JNI cpp flags    :

The following Java variables have been exported:
JAVA_HOME JAVA JAVAC JAVAH JAR JAVA_LIBS JAVA_CPPFLAGS JAVA_LD_LIBRARY_PATH

> sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

loaded via a namespace (and not attached):
[1] tcltk_2.9.1 tools_2.9.1

Mark W. Kimpel MD
Re: [R] can't get rJava to install on Linux
Switching to Sun definitely did not help, still no build with rJava, below is the output of R CMD javareconf.

mkimpel-m90 /home/mkimpel/bin# ./R CMD javareconf
*** JAVA_HOME is not a valid path, ignoring
Java interpreter : /usr/bin/java
Java version     : 1.6.0_14
Java home path   : /usr/lib/jvm/java-6-sun-1.6.0.14/jre
/home/mkimpel/R_HOME/R-2.9.1/R-build/lib64/R/bin/javareconf: line 150: /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javac: No such file or directory
Java compiler    : not functional
Java headers gen.: /usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/../bin/javah
Java archive tool: /usr/bin/jar
Java library path: /usr/lib/../lib/gcj-4.3-90:/usr/lib/jni
JNI linker flags : -L/usr/lib/../lib/gcj-4.3-90 -L/usr/lib/jni -ljvm
JNI cpp flags    :

Updating Java configuration in /home/mkimpel/R_HOME/R-2.9.1/R-build/lib64/R
Done.

Mark W. Kimpel MD

On Tue, Jul 7, 2009 at 9:43 PM, Godmar Back god...@gmail.com wrote:

> This is just a guess: looks like you have GNU's Java version in your path (aka gcj). Perhaps rJava relies on Sun's Java version. If so, install Sun's Java first. apt-get install sun-java6-jdk might do it.
>
> - Godmar
>
> On Tue, Jul 7, 2009 at 9:28 PM, Mark Kimpel mwkim...@gmail.com wrote:
> > Having difficulties getting rJava to install on my Debian Squeeze box. Perused the R-help list and tried some things that have worked for others but not for me. Below is the output of my attempted build, R CMD javareconf -e, and sessionInfo(). Note I tried the R CMD javareconf also as root, restarted R after each of these, all no help.
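The javareconf output in this thread reports a Sun JRE as the Java home while the compiler and header tool still point at the old gcj paths, one of which no longer exists. A quick way to see which tools are actually present under a candidate JDK directory is sketched below. `check_java_tools()` is a hypothetical helper written for this illustration, not part of R or rJava, and whether stale paths are the real cause of the failed build is an assumption, not something the thread confirms.

```r
# Hypothetical helper (not part of R or rJava): report which JDK tools exist
# under a given Java home directory.
check_java_tools <- function(java_home,
                             tools = c("java", "javac", "javah", "jar")) {
  paths <- file.path(java_home, "bin", tools)
  data.frame(tool = tools,
             path = paths,
             exists = file.exists(paths),
             stringsAsFactors = FALSE)
}

# Inspect whatever JAVA_HOME currently points at, if it is set at all.
java_home <- Sys.getenv("JAVA_HOME", unset = NA)
if (!is.na(java_home) && nzchar(java_home)) {
  print(check_java_tools(java_home))
}
```

If `javac` is reported missing for the directory that R should be using, pointing JAVA_HOME at a full JDK before rerunning `R CMD javareconf` would be the obvious thing to try.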
[R] help with dealing with integer(0) returns from grep used within a conditional loop
I am using grep to locate colnames to automate a report build and have run into a problem when a colname is not found. The use of integer(0) in a conditional statement seems to be a no-no as it has length 0. Below is a self-contained trivial example. I would like to get something like NA or -1 for the position when it is not found OR learn a way to use integer(0) or some cast of it in a logical statement. Example, output, and sessionInfo follow.

Thanks, Mark

vec.1 <- c("a", "b", "c")
vec.2 <- c("a", "c", "d")
for (i in 1:length(vec.1)){
  for (j in 1:length(vec.2)){
    print(paste("i:", i, " j:", j, sep = ""))
    pos <- grep(vec.1[i], vec.2[j])
    if (pos > 0){
      print("pos identified")
    }
    else{
      print("pos not found")
    }
  }
}

Output:

> vec.1 <- c("a", "b", "c")
> vec.2 <- c("a", "c", "d")
> for (i in 1:length(vec.1)){
+   for (j in 1:length(vec.2)){
+     print(paste("i:", i, " j:", j, sep = ""))
+     pos <- grep(vec.1[i], vec.2[j])
+     if (pos > 0){
+       print("pos identified")
+     }
+     else{
+       print("pos not found")
+     }
+   }
+ }
[1] "i:1 j:1"
[1] "pos identified"
[1] "i:1 j:2"
Error in if (pos > 0) { : argument is of length zero
No suitable frames for recover()

> sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tcltk     stats     graphics  grDevices datasets  utils     methods
[8] base

other attached packages:
 [1] sfsmisc_1.0-7       KEGG.db_2.2.11      GO.db_2.2.11
 [4] rat2302.db_2.2.11   GOstats_2.10.0      RSQLite_0.7-1
 [7] DBI_0.2-4           graph_1.22.2        Category_2.10.1
[10] AnnotationDbi_1.6.1 qvalue_1.18.0       limma_2.18.2
[13] affy_1.22.0         Biobase_2.4.1

loaded via a namespace (and not attached):
 [1] affyio_1.12.0        annotate_1.22.0      genefilter_1.24.2
 [4] GSEABase_1.6.1       preprocessCore_1.6.0 RBGL_1.20.0
 [7] splines_2.9.1        survival_2.35-4      tools_2.9.1
[10] XML_2.5-3            xtable_1.5-5

Mark W. Kimpel MD
Re: [R] help with dealing with integer(0) returns from grep used within a conditional loop
Thanks, embarrassed that I didn't think of that myself :)

Mark W. Kimpel MD

On Sat, Jul 4, 2009 at 2:47 PM, Allan Engelhardt all...@cybaea.com wrote:

> On 04/07/09 18:56, Mark Kimpel wrote:
> > I am using grep to locate colnames to automate a report build and have run into a problem when a colname is not found. The use of integer(0) in a conditional statement seems to be a no-no as it has length 0. Below is a self-contained trivial example. I would like to get something like NA or -1 for the position when it is not found OR learn a way to use integer(0) or some cast of it in a logical statement. Example, output, and sessionInfo follow.
> >
> > Thanks, Mark
> >
> > vec.1 <- c("a", "b", "c")
> > vec.2 <- c("a", "c", "d")
> > for (i in 1:length(vec.1)){
> >   for (j in 1:length(vec.2)){
> >     print(paste("i:", i, " j:", j, sep = ""))
> >     pos <- grep(vec.1[i], vec.2[j])
> >     if (pos > 0){
>
> Try:
>
>     if ( length(pos) > 0 ) {
>
> Allan.
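For reference, the length test suggested above can be written alongside two alternatives that avoid integer(0) altogether. This is a small sketch using the vectors from the original post; the variable names `found_len`, `found_any`, and `pos` are mine.

```r
vec.1 <- c("a", "b", "c")
vec.2 <- c("a", "c", "d")

# Option 1 (the fix suggested in the thread): test the length of grep()'s result.
found_len <- sapply(vec.1, function(p) length(grep(p, vec.2)) > 0)

# Option 2: grepl() returns one logical per element of vec.2, so any()
# always yields a single TRUE or FALSE.
found_any <- sapply(vec.1, function(p) any(grepl(p, vec.2, fixed = TRUE)))

# Option 3: for exact column names, match() gives a position or NA, which is
# the "NA when not found" behaviour asked for in the original post.
pos <- match(vec.1, vec.2)
```

`match()` compares whole strings rather than patterns, so it only suits the case where the column name is known exactly.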
[R] problem with scan recognizing newline '\n'
I'm using R to do some file processing in Linux and am trying to read in the output of

    find . -type f -print > ~/Music_Archives_search_problem/ls.output.find.txt

This command yields a text file in which each line is the full path name of one file in the directory and its subdirectories. Unfortunately, there seem to be some special characters that interfere with scan recognizing '\n' as newline. At least that's what I assume the problem is, but I can't identify which those might be or how to correct the problem.

Below is my code and the problem output followed by sessionInfo(). This is executed in a loop, with i starting from zero. I also tried with 'allowEscapes = TRUE', but that made no difference. As you can see, the first FLAC file is followed by a '\n', which is ignored. This seems to happen about once in every 20 file names, so it does work properly most of the time. Also, when the file is opened in emacs, the newlines are recognized.

> current.line <- scan("~/Music_Archives_search_problem/ls.output.find.txt", skip = i, nlines = 1, what = 'character', sep = "@", allowEscapes = FALSE)
[1] "./Christian/Christian Gospel/Chanticleer/Chanticleer - How Sweet the Sound; Spirituals Traditional Gosp - 04 - Soon One Mornin Medley; Soon One Mornin-What You Gon Do When the flac\n./Christian/Christian Gospel/Chanticleer/Chanticleer - How Sweet the Sound; Spirituals Traditional Gosp - 05 - Didnt It Rain.flac"

> sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

Mark W. Kimpel MD
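No resolution is recorded here for the scan question. One possible cause, offered only as an assumption, is quote handling: when `sep` is set to something other than a newline, `scan()` by default treats `'` and `"` as quote characters, so an apostrophe in a file name could make the following newline part of a quoted field. The sketch below uses made-up file names and shows two ways of reading a listing line by line with no quote processing.

```r
# Made-up file listing with apostrophes in some of the names.
paths <- c("./a/Soon One Mornin' Medley.flac",
           "./a/Didn't It Rain.flac",
           "./b/plain name.flac")
listing <- tempfile(fileext = ".txt")
writeLines(paths, listing)

# readLines() does no quote processing: one element per line of the file.
by_lines <- readLines(listing)

# scan() with quoting switched off and newline as the separator gives the
# same result here.
by_scan <- scan(listing, what = "character", sep = "\n", quote = "",
                quiet = TRUE)
```

Reading the whole file once with `readLines()` and then indexing the result would also avoid rescanning the file with `skip = i` on every pass of the loop.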
Re: [R] 'options=utils::recover' not working in .Rprofile or within R
    options(error=utils::recover)

does indeed work, at least with the new install of R-devel (to be 2.10.0) that I am running right now. I was sure I checked this with 2.9.0 last night, but I am probably mistaken.

One point: the ?options help page is misleading in that the example is

    Note that these need to specified as e.g. 'options=utils::recover' in startup files such as '.Rprofile'.

Since the use of utils:: is a new requirement, I think stemming from when utils is loaded, this help page should be corrected as the example is confusing/incorrect.

So, stick with what is in the first line above and, for now, ignore the help page.

Mark

Mark W. Kimpel MD

On Sat, May 30, 2009 at 10:49 PM, David Winsemius dwinsem...@comcast.net wrote:

> You are wiping out all of the default options with that approach. Try (after restarting R to get the other options back to what they should be):
>
>     op=options()  # so you can reset back to baseline
>     options(error=utils::recover)  # do not think the utils:: is needed
>
>     > my.func <- function(x){ y <- x + 12
>     + nonsense
>     + y }
>     > my.func(14)
>     Error in my.func(14) : object "nonsense" not found
>
>     Enter a frame number, or 0 to exit
>
>     1: my.func(14)
>
>     Selection:
>
> On May 30, 2009, at 10:24 PM, Mark Kimpel wrote:
> > Duncan, I've pared down my .Rprofile so that it has just the options line, started R from terminal (instead of using ESS-emacs) and I still have the problem. Am I specifying the options incorrectly? I believe I took this directly from the help page.
>
> Not what the examples look like on my machine.
>
> > See my output of .Rprofile, the code example that doesn't work as we think it ought, and my sessionInfo().
> >
> > Thanks, Mark
> >
> > Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.
> >
> > > read.table("~/.Rprofile")
> >                       V1
> > 1 options=utils::recover
> > > my.func <- function(x){
> > +   y <- x + 12
> > +   nonsense
> > +   y
> > + }
> > > my.func(14)
> > Error in my.func(14) : object 'nonsense' not found
> > > sessionInfo()
> > R version 2.9.0 (2009-04-17)
> > x86_64-unknown-linux-gnu
> >
> > locale:
> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
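The distinction at the heart of this thread is between calling `options()` with an `error` argument and assigning to a variable that happens to be named `options`. A minimal sketch of the working form follows; the variable names `old` and `handler_set` are mine, and the handler is restored at the end so the snippet has no lasting effect on the session.

```r
# A line one might place in ~/.Rprofile. The utils package is not yet attached
# when .Rprofile is read, hence the explicit utils:: prefix:
#   options(error = utils::recover)

# options() returns the previous values, so they can be restored later.
old <- options(error = utils::recover)
handler_set <- identical(getOption("error"), utils::recover)

# By contrast, writing `options = utils::recover` is an ordinary assignment:
# it creates a variable called `options` and leaves the error handler as it was.

options(old)  # put the previous error handler back
```

After the first call, an error at top level in an interactive session should bring up the frame-selection menu shown in the replies above.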
Re: [R] 'options=utils::recover' not working in .Rprofile or within R
Duncan, I've pared down my .Rprofile so that it has just the options line, started R from terminal (instead of using ESS-emacs) and I still have the problem. Am I specifying the options incorrectly? I believe I took this directly from the help page. See my output of .Rprofile, the code example that doesn't work as we think it ought, and my sessionInfo().

Thanks, Mark

Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.

> read.table("~/.Rprofile")
                      V1
1 options=utils::recover
> my.func <- function(x){
+   y <- x + 12
+   nonsense
+   y
+ }
> my.func(14)
Error in my.func(14) : object 'nonsense' not found
> sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Mark W. Kimpel MD

On Sat, May 30, 2009 at 5:44 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

> [Sent this before completing my last sentence; here's another attempt]
>
> On 29/05/2009 11:45 PM, Mark Kimpel wrote:
> > For years I have been using options(error = recover) either in .Rprofile or from within R for debugging purposes. The functionality of this appears to have changed and I can't recover it (no pun intended) using the ?options help page. How can I get the old functionality back, particularly from within .Rprofile? A specific line entry would be appreciated. An example, the help page, and sessionInfo() follow.
> >
> > Thanks, Mark
>
> I don't think there were any substantial changes in 2.9.0, so I would guess that you have a local object named recover or options, and it is causing your problems. When I run options(error=recover) and your two lines below, I get this output:
>
>     > options(error=recover)
>     > b.func <- function(x) {y <- x + 2; nonsense; y}
>     > b.func(3)
>     Error in b.func(3) : object 'nonsense' not found
>
>     Enter a frame number, or 0 to exit
>
>     1: b.func(3)
>
>     Selection: 0
>
> which is what you wanted. To put this into your .Rprofile, you need to use utils::recover (the utils package hasn't been attached yet). That also works for me.
>
> There have been changes to recover in R-devel (to become 2.10.0), and will likely be more, but what you did shouldn't appear much different than what I showed from 2.9.0 above. If you had sourced the code from a file, 2.10.0 should tell you which line of the file contained the error.
>
> Duncan Murdoch
>
> > b.func <- function(x) {y <- x + 2; nonsense; y}
> > b.func(3)
> > Error in b.func(3) : object 'nonsense' not found
> > ## in the past this would be a menu with numbers for what level I want to go to (in this case just 1)
> >
> > This help page states:
> >
> > 'error': either a function or an expression governing the handling of non-catastrophic errors such as those generated by 'stop' as well as by signals and internally detected errors... The functions 'dump.frames' and 'recover' provide alternatives that allow post-mortem debugging. Note that these need to specified as e.g. 'options=utils::recover' in startup files such as '.Rprofile'.
> >
> > sessionInfo()
> > R version 2.9.0 (2009-04-17)
> > x86_64-unknown-linux-gnu
> >
> > locale:
> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] stats     graphics  grDevices datasets  utils     methods   base
[R] 'options=utils::recover' not working in .Rprofile or within R
For years I have been using options(error = recover) either in .Rprofile or from within R for debugging purposes. The functionality of this appears to have changed and I can't recover it (no pun intended) using the ?options help page. How can I get the old functionality back, particularly from within .Rprofile? A specific line entry would be appreciated. An example, the help page, and sessionInfo() follow.

Thanks, Mark

> b.func <- function(x) {y <- x + 2; nonsense; y}
> b.func(3)
Error in b.func(3) : object 'nonsense' not found
## in the past this would be a menu with numbers for what level I want to go to (in this case just 1)

This help page states:

'error': either a function or an expression governing the handling of non-catastrophic errors such as those generated by 'stop' as well as by signals and internally detected errors... The functions 'dump.frames' and 'recover' provide alternatives that allow post-mortem debugging. Note that these need to specified as e.g. 'options=utils::recover' in startup files such as '.Rprofile'.

> sessionInfo()
R version 2.9.0 (2009-04-17)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

Mark W. Kimpel MD
[R] question about the Y of R article in the latest R news
I found the article "The Y of R" in the latest R News to be very interesting. It is certainly challenging me to learn more about how R works "under the hood", as the author states. What is less clear to me is whether this approach is primarily for teaching purposes or has a real-world application. What is meant by "fragility of reliance on the function name defined as a global variable" as a downside to the classical recursive formulation of function s? How can that impact the average R programmer?

Beyond that, empiricist that I am, I decided to put the examples to the test. My source code and output are below, but the bottom line consists of 3 observations:

- The Y function approach using csum is consistently slower on my machine than the s function approach
- The Y function using csum gives a recursion error with high input values, just like the s function does
- The Y function in fact reaches the limit of recursion BEFORE the s function does

Given that it is slower, is more cumbersome to write, and has a lower nesting limit than the classical approach, I wonder about its utility for the average programmer (or somewhat below average programmer like me).
Okay, here's my code, output, and sessionInfo()

s <- function(n) {
  if (n == 1) return(1)
  return(s(n-1)+n)
}
Y <- function(f) {
  g <- function(h) function(x) f(h(h))(x)
  g(g)
}
csum <- function(f) function(n) {
  if (n < 2) return(1);
  return(n+f(n-1))
}
recurs.time <- matrix(0, ncol = 3, nrow = 100)
Y.time <- matrix(0, ncol = 3, nrow = 100)
for (i in 1:100) recurs.time[i,] <- unclass(system.time(a <- s(996)))[1:3]
ave.recurs.time <- colSums(recurs.time)
ave.recurs.time
for (i in 1:100) Y.time[i,] <- unclass(system.time(b <- Y (csum)(996)))[1:3]
ave.Y.time <- colSums(Y.time)
ave.Y.time
u <- s(1000)
u
v <- Y (csum)(1000)
v
sessionInfo()

> s <- function(n) {
+ if (n == 1) return(1)
+ return(s(n-1)+n)
+ }
> Y <- function(f) {
+ g <- function(h) function(x) f(h(h))(x)
+ g(g)
+ }
> csum <- function(f) function(n) {
+ if (n < 2) return(1);
+ return(n+f(n-1))
+ }
> recurs.time <- matrix(0, ncol = 3, nrow = 100)
> Y.time <- matrix(0, ncol = 3, nrow = 100)
> for (i in 1:100) recurs.time[i,] <- unclass(system.time(a <- s(996)))[1:3]
> ave.recurs.time <- colSums(recurs.time)
> ave.recurs.time
[1] 0.356 0.004 0.355
> for (i in 1:100) Y.time[i,] <- unclass(system.time(b <- Y (csum)(996)))[1:3]
> ave.Y.time <- colSums(Y.time)
> ave.Y.time
[1] 0.652 0.000 0.640
> u <- s(1000)
> u
[1] 500500
> v <- Y (csum)(1000)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
> v
Error: object v not found
No suitable frames for recover()
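One possible reading of the third observation is that each level of recursion through Y costs more nested calls than a direct self-call. The sketch below is not from the article: it uses sys.nframe() as a rough proxy for nesting depth, and the depth_direct and depth_via_Y names are made up for illustration. Frame count is related to, but not the same thing as, the limit governed by options(expressions=).

```r
# Y as defined in the post above.
Y <- function(f) {
  g <- function(h) function(x) f(h(h))(x)
  g(g)
}

# Hypothetical probes: instead of summing, report how many function
# frames are on the call stack when the base case is reached.
depth_direct <- function(n) if (n < 2) sys.nframe() else depth_direct(n - 1)
depth_via_Y  <- Y(function(f) function(n) if (n < 2) sys.nframe() else f(n - 1))

d1 <- depth_direct(50)
d2 <- depth_via_Y(50)

# TRUE when the Y version sits deeper in the call stack for the same n.
d2 > d1
```

If the Y version does sit deeper, raising the ceiling with options(expressions = ...) would postpone the error for both versions rather than change their relative order.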
[R] how to list variables enclosed in an environment
I'm having trouble with a Bioconductor package: a variable expected in an environment does not seem to be there. As part of my investigation of the problem (most likely on my end) I'd like to list the variables contained in an environment. If you have an environment loaded, let's call it 'pkgEnv', how does one find out what it contains? Mark
Re: [R] how to list variables enclosed in an environment
Never mind, I got the brilliant idea to ls(pkgEnv) and of course it worked.
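For anyone landing on this thread later, a small self-contained sketch of the ls() approach. The environment and the names stored in it ("alpha", ".hidden") are invented here purely for illustration; a real package environment would be inspected the same way.

```r
# Stand-in for the package environment from the question.
pkgEnv <- new.env()
assign("alpha", 1, envir = pkgEnv)
assign(".hidden", 2, envir = pkgEnv)

# Names that do not start with a dot.
visible <- ls(pkgEnv)

# Every name, including dot-prefixed ones.
everything <- ls(pkgEnv, all.names = TRUE)

# Test for one specific variable without searching enclosing environments.
has_beta <- exists("beta", envir = pkgEnv, inherits = FALSE)
```

The inherits = FALSE argument matters when a variable seems to be missing: without it, exists() would also report a match found in a parent environment.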
[R] XML_1.98-0 fails to build on Debian Lenny with gcc 4.3.2 and R-beta 2.8.0
Subject pretty much says it all. I wonder if there is some code in XML that the new gcc doesn't like? See output below:

* Installing *source* package 'XML' ...
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
checking for sed... /bin/sed
checking for xml2-config... no
Cannot find xml2-config
ERROR: configuration failed for package 'XML'
** Removing '/home/mkimpel/R_HOME/site-library-2.8.0/XML'
Re: [R] XML_1.98-0 fails to build on Debian Lenny with gcc 4.3.2 and R-beta 2.8.0
Dirk,

Please let me know when you will support nightly builds of R-devel with all R and BioConductor packages. I would also need your help syncing my home setup with the one that I use on a remote Linux cluster using RHEL 5. For my needs, the packages on these two setups need to be exactly the same. I would also need your help being able to load either the current release of R or the current R-devel with 2 different site-libraries. If you are set up to help with this, let me know and I'll get started.

Lastly, do you think non-Debian Linux users would NOT benefit from the answers to my question? It does seem that the problem was my lack of understanding of the need for the libxml2-dev libraries, not a Debian-specific issue. Or do you believe that if I were using another distribution this problem would not have occurred? I don't think so, but if you believe that to be the case, it would be of interest.

Ready to get started when you are,
Mark

On Tue, Oct 14, 2008 at 3:00 PM, Dirk Eddelbuettel wrote:
> On Tue, Oct 14, 2008 at 02:34:57PM -0400, Mark Kimpel wrote:
>> Subject pretty much says it all. I wonder if there is some code in XML
>> that the new gcc doesn't like? See output below:
>
> You are wondering wrongly.
>
>> * Installing *source* package 'XML' ...
>> checking for gcc... gcc
>> checking for C compiler default output file name... a.out
>> checking whether the C compiler works... yes
>> checking whether we are cross compiling... no
>> checking for suffix of executables...
>> checking for suffix of object files... o
>> checking whether we are using the GNU C compiler... yes
>> checking whether gcc accepts -g... yes
>> checking for gcc option to accept ISO C89... none needed
>> checking how to run the C preprocessor... gcc -E
>> checking for sed... /bin/sed
>> checking for xml2-config... no
>> Cannot find xml2-config
>> ERROR: configuration failed for package 'XML'
>> ** Removing '/home/mkimpel/R_HOME/site-library-2.8.0/XML'
>
> You seem to be
>
> a) missing the libxml2-dev package for Debian: sudo apt-get install libxml2-dev
> b) once again ignoring the fact that XML is available for you as a binary Debian package via 'sudo apt-get install r-cran-xml'
> c) also ignoring the fact that, should you still insist on building it yourself, 'sudo apt-get build-dep r-cran-xml' would do step a) for you
> d) forgetting that we repeatedly recommended r-sig-debian as a more suitable mailing list to you.
>
> Stunned, Dirk
>
> --
> Three out of two people have difficulties with fractions.
[R] using assign with lists
I am performing many permutations on a data-set with each permutation producing a variable number of results. I thought that the best way to keep track of all this in one object would be with a list ('res.lst'). To address these variable results for each permutation I attempted to construct this list using 'assign'. There is even more nesting than indicated below, but this is a simple example that, if addressed, will answer my question. The code chunk below clearly does not produce the desired results because, instead of assigning a new vector to the list, it creates a new variable 'res.lst$contrast.i.j'. In the last two lines I show what I really want to happen. Can I use assign in this context by using it differently?

Thanks,
Mark

res.lst <- list()
for (i in 1:2){
  for (j in 1:2){
    assign(paste("res.lst$contrast", i, j, sep = "."), paste(i, j, sep = "."))
  }
}
res.lst
ls(pattern = "res.lst..?")
res.lst$contrast.5.5 <- 5.5
res.lst
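One way to get the behaviour described in the last two lines, without assign(), is to build the element name as a string and index the list with [[ ]]. This is a sketch of that idiom rather than an answer taken from the thread; the key variable is introduced here only for illustration.

```r
res.lst <- list()
for (i in 1:2) {
  for (j in 1:2) {
    # Build the element name, then assign into the list by name.
    key <- paste("contrast", i, j, sep = ".")
    res.lst[[key]] <- paste(i, j, sep = ".")
  }
}

# The elements now live inside the list, not in the workspace.
names(res.lst)
res.lst$contrast.1.2
```

Because [[ ]] accepts any R object on the right-hand side, each element can hold a vector of a different length, or another list when deeper nesting is needed.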
[R] efficient use of lm over a matrix vs. using apply over rows
I have a large matrix, each row of which needs lm applied. I am certain that I read an article in R News about this within the last year or two that discussed the application of lm to matrices, but I'll be darned if I can find it with Google. Probably using the wrong search terms. Can someone steer me to this article or just tell me if this is possible and, if so, how to do it? My simplistic attempts have failed. Mark
Re: [R] efficient use of lm over a matrix vs. using apply over rows
Sorry for the vagueness of my question; your interpretation, however, was spot on. Correct me if I am wrong, but my impression is that apply is a more compact way of writing a for loop, but that R handles the two the same way computationally. In the article I seem to remember, there was a significant increase in speed with your second approach, presumably because function calls are avoided in R and the heavy lifting is done in C. I will use your second approach anyway, but can I expect increased computational efficiency with it and, if so, is my reasoning in the prior sentence correct?

BTW, it appears as though my own attempt was almost correct, but I did not transpose the matrix. In genomics, our response variables (genes) are the rows and the predictor values are the column names. The BioConductor packages I routinely use are very good at hiding this and it just didn't come to mind.

Mark

On Sun, Oct 5, 2008 at 10:28 AM, Duncan Murdoch wrote:
> On 05/10/2008 10:08 AM, Mark Kimpel wrote:
>> I have a large matrix, each row of which needs lm applied. I am certain
>> that I read an article in R News about this within the last year or two
>> that discussed the application of lm to matrices, but I'll be darned if I
>> can find it with Google. Probably using the wrong search terms. Can
>> someone steer me to this article or just tell me if this is possible and,
>> if so, how to do it? My simplistic attempts have failed.
>
> You don't give a lot of detail on what you mean by applying lm to a row of
> a matrix, but I'll assume you have fixed predictor variables, and each row
> is a different response vector. Then you can use apply() like this:
>
> x <- 1:10
> mat <- matrix(rnorm(200), nrow=20, ncol=10)
> resultlist <- apply(mat, 1, function(y) lm(y ~ x))
> resultcoeffs <- apply(mat, 1, function(y) lm(y ~ x)$coefficients)
>
> resultlist will contain a list of 20 different lm() results; resultcoeffs
> will be a matrix holding just the coefficients.
>
> lm() also allows the response to be a matrix, where the columns are
> considered different components of a multivariate response. So if you
> transpose your matrix you can do it all in one call:
>
> resultmulti <- lm(t(mat) ~ x)
>
> The coefficients of resultmulti will match resultcoeffs.
>
> Duncan Murdoch
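The efficiency question can be checked empirically. The sketch below reuses the setup from the quoted reply, confirms that the two approaches give the same coefficients, and times a few repetitions of each. The repetition count of 20 and the coef_by_row, coef_multi and same names are arbitrary choices for illustration, and no timing result is claimed here since it will vary by machine and R version.

```r
set.seed(1)
x <- 1:10
mat <- matrix(rnorm(200), nrow = 20, ncol = 10)

# One lm() call per row versus a single multivariate-response call.
coef_by_row <- apply(mat, 1, function(y) lm(y ~ x)$coefficients)
coef_multi  <- coef(lm(t(mat) ~ x))

# Compare the numbers only; dimnames may differ between the two results.
same <- isTRUE(all.equal(unname(coef_by_row), unname(coef_multi)))
same

# Rough timing of each approach.
system.time(for (k in 1:20) apply(mat, 1, function(y) lm(y ~ x)$coefficients))
system.time(for (k in 1:20) lm(t(mat) ~ x))
```

With a matrix of realistic size (thousands of rows), the per-row version repeats the formula parsing and model-frame construction for every row, which is one plausible source of any difference the timings show.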
[R] creating overall title for plots made with par(mfrow=c(2,2))
I'm making some plots on the same page and would like to include an overall title instead of individual main titles as they are similar and their x and y axis labels are sufficient to distinguish them. Is there a way to assign an overall main to this page of plots? Mark
Re: [R] creating overall title for plots made with par(mfrow=c(2, 2))
Ouch! I had searched the archives and read over ?par and ?plot, but sure missed the post of today. Thanks, Mark

On Tue, Aug 5, 2008 at 7:27 PM, Marc Schwartz wrote:
> on 08/05/2008 06:05 PM Mark Kimpel wrote:
>> I'm making some plots on the same page and would like to include an
>> overall title instead of individual main titles as they are similar and
>> their x and y axis labels are sufficient to distinguish them. Is there a
>> way to assign an overall main to this page of plots? Mark
>
> Mark,
>
> Hate to do this, but see this post from Prof. Ripley from earlier today:
>
> https://stat.ethz.ch/pipermail/r-help/2008-August/169974.html
>
> :-)
>
> Regards,
> Marc Schwartz
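For readers who cannot follow the link, one common base-graphics idiom for an overall title is to reserve an outer margin with par(oma = ...) and write into it with mtext(outer = TRUE). This is offered as a sketch of that idiom, not as a summary of the linked post; the file name, the random data and the title text are placeholders.

```r
# Draw to a temporary PDF so the example runs without a screen device.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

# 2 x 2 layout with two lines of outer margin reserved at the top.
old_par <- par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
for (k in 1:4) {
  plot(rnorm(20), rnorm(20), xlab = paste("x", k), ylab = paste("y", k))
}

# One title for the whole page, written into the outer margin.
mtext("Overall title", side = 3, outer = TRUE, cex = 1.2)

par(old_par)
dev.off()
```

The oma value controls how much room the shared title gets; increase the third element if a larger cex or a two-line title is needed.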
Re: [R] setting editor environment variable EDITOR either when configuring R for installation or in .Rprofile
Marc,

That is exactly what I was looking for, although I am surprised that this variable does not appear to be configurable during installation from source as are some of the other environmental variables.

Mark

On Wed, Jul 30, 2008 at 2:04 PM, Marc Schwartz wrote:
> on 07/30/2008 12:54 PM Mark Kimpel wrote:
>> I'm running R on Linux and use emacs as my editor. When doing
>> edit(vignette("foo.vignette")) I would like to invoke emacs rather than
>> the default vi. I am able to manually set this by editing
>> $R_HOME/etc/Renviron but would like to avoid doing this with each
>> install. I assume this can be accomplished with a flag to ./configure or
>> in .Rprofile but I can't find the syntax in R-admin. Editor is not listed
>> as an environment variable in appendix B of that manual. So, help is
>> appreciated as I've probably missed something. Mark
>
> Mark,
>
> See ?options for 'editor'. This is also referenced in ?edit, where the
> default value for the 'editor' argument is getOption("editor").
>
> Thus, in your .Rprofile, put:
>
> options(editor="emacs")
>
> HTH,
> Marc Schwartz
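A short sketch of how the suggested option can be set, inspected and undone in a running session before committing it to .Rprofile. It assumes an "emacs" executable is on the PATH; the old and chosen variable names are illustrative only.

```r
# Candidate line for ~/.Rprofile; options() returns the previous value.
old <- options(editor = "emacs")

# edit() and related functions take their default editor from this option.
chosen <- getOption("editor")
chosen

# Undo the change (only needed when experimenting interactively).
options(old)
```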
[R] source code for R-dev packages
Where is the link on www.r-project.org or CRAN to download source code for development versions of packages? This is straightforward for BioConductor packages but I can't seem to find it for R packages. Mark
Re: [R] help with cube3d cube size
Duncan and Ben,

Thanks for giving me some tips as to how I can best investigate packages. I've been using ESS-Emacs and the help pages are HTML. Looks like I need to figure out how to enable HTML help.

As for a draft vignette, sure, I would be happy (and honored) to contribute as best I can. I would need a lot of help from the two of you, but it would be a good learning experience for me and give me a chance to give something back to the community which has been so good to me.

Speaking of vignettes, I wonder if there should be a vignette written called "navigating new functions and packages for the R beginner", or whether it should be included as a chapter in "An Introduction to R", which focuses on R-base functionality. I'm just musing here, but I often think that if I need to learn something perhaps others do too. One thing at a time, however, so I'll work on a draft and send it to you for comments and input.

Ben, if you get such a list of packages with functions that depend on rgl, send it to me.

Mark

On Thu, Jun 26, 2008 at 7:02 AM, Duncan Murdoch wrote:
> Mark Kimpel wrote:
>> Thanks for the pointers. I think the package is great, just want to use
>> it to its full potential without driving the list crazy with questions.
>> Below is my output to help and sessionInfo(). I don't see the scaling
>> functions, although I now see that they can be retrieved with ?matrices.
>> You guys are the experts as to what should go where, but for someone
>> unfamiliar with how rgl works, having them print out with
>> help(package = rgl) would make some of these functions more obvious to
>> the newbie. Mark
>
> The issue here is that several of those are documented on the same page,
> and that index lists the names of help pages, not all of the aliases that
> point there. The index in the HTML help version (or the CHM help in
> Windows) repeats each alias, so you'll see things like
>
> scale3d       Work with homogeneous coordinates
> scaleMatrix   Work with homogeneous coordinates
>
> as well as the entry for matrices as below. I can see arguments for both:
> repetition is bad, full coverage is good. The index in the pdf document is
> another approach: it lists all the aliases, and tells you which topic
> documents them. All of this is common to all R packages; rgl isn't doing
> anything special to produce these lists.
>
> Duncan Murdoch
>
>> aspect3d               Set the aspect ratios of the current plot
>> axes3d                 Draw boxes, axes and other text outside the data
>> ellipse3d              Make an ellipsoid
>> grid3d                 Add a grid to a 3D plot
>> matrices               Work with homogeneous coordinates
>> par3d                  Set or Query RGL Parameters
>> par3dinterp            Interpolator for par3d parameters
>> persp3d                Surface plots
>> play3d                 Play animation of rgl scene
>> plot3d                 3D Scatterplot
>> points3d               add primitive set shape
>> qmesh3d                3D Quadrangle Mesh objects
>> r3d                    Generic 3D interface
>> rgl-package            3D visualization device system
>> rgl.bbox               Set up Bounding Box decoration
>> rgl.bg                 Set up Background
>> rgl.bringtotop         Assign focus to an RGL window
>> rgl.clear              scene management
>> rgl.light              add light source
>> rgl.material           Generic Appearance setup
>> rgl.postscript         export screenshot
>> rgl.primitive          add primitive set shape
>> rgl.setMouseCallbacks  User callbacks on mouse events
>> rgl.snapshot           export screenshot
>> rgl.spheres            add sphere set shape
>> rgl.surface            add height-field surface shape
>> rgl.texts              add text
>> rgl.user2window        Convert between rgl user and window coordinates
>> rgl.viewpoint          Set up viewpoint
>> select3d               Select a rectangle in an RGL scene
>> spin3d                 Create a function to spin a scene at a fixed rate
>> sprites3d              add sprite set shape
>> subdivision3d          generic subdivision surface method
>> surface3d              add height-field surface shape
>>
>> > sessionInfo()
>> R version 2.7.1 (2008-06-23)
>> i686-pc-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] rgl_0.79 graph_1.18.1
>>
>> loaded via a namespace (and not attached):
>> [1] cluster_1.11.11 tools_2.7.1
>>
>> On Wed, Jun 25, 2008 at 5:25 PM, Duncan Murdoch wrote:
>>> Mark Kimpel wrote:
>>>> Ben and Duncan, Thanks for your helpful suggestions. I'm having some
>>>> difficulty navigating this really good package using my normal learning
>>>> techniques. When I do 'help(package = rgl)' it seems only a very
Re: [R] help with cube3d cube size
Thanks for the pointers. I think the package is great, just want to use it to its full potential without driving the list crazy with questions. Below is my output to help and sessionInfo(). I don't see the scaling functions, although I now see that they can be retrieved with ?matrices. You guys are the experts as to what should go where, but for someone unfamiliar with how rgl works, having them print out with help(package = rgl) would make some of these functions more obvious to the newbie.

Mark

aspect3d               Set the aspect ratios of the current plot
axes3d                 Draw boxes, axes and other text outside the data
ellipse3d              Make an ellipsoid
grid3d                 Add a grid to a 3D plot
matrices               Work with homogeneous coordinates
par3d                  Set or Query RGL Parameters
par3dinterp            Interpolator for par3d parameters
persp3d                Surface plots
play3d                 Play animation of rgl scene
plot3d                 3D Scatterplot
points3d               add primitive set shape
qmesh3d                3D Quadrangle Mesh objects
r3d                    Generic 3D interface
rgl-package            3D visualization device system
rgl.bbox               Set up Bounding Box decoration
rgl.bg                 Set up Background
rgl.bringtotop         Assign focus to an RGL window
rgl.clear              scene management
rgl.light              add light source
rgl.material           Generic Appearance setup
rgl.postscript         export screenshot
rgl.primitive          add primitive set shape
rgl.setMouseCallbacks  User callbacks on mouse events
rgl.snapshot           export screenshot
rgl.spheres            add sphere set shape
rgl.surface            add height-field surface shape
rgl.texts              add text
rgl.user2window        Convert between rgl user and window coordinates
rgl.viewpoint          Set up viewpoint
select3d               Select a rectangle in an RGL scene
spin3d                 Create a function to spin a scene at a fixed rate
sprites3d              add sprite set shape
subdivision3d          generic subdivision surface method
surface3d              add height-field surface shape

> sessionInfo()
R version 2.7.1 (2008-06-23)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rgl_0.79 graph_1.18.1

loaded via a namespace (and not attached):
[1] cluster_1.11.11 tools_2.7.1

On Wed, Jun 25, 2008 at 5:25 PM, Duncan Murdoch wrote:
> Mark Kimpel wrote:
>> Ben and Duncan, Thanks for your helpful suggestions. I'm having some
>> difficulty navigating this really good package using my normal learning
>> techniques. When I do 'help(package = rgl)' it seems only a very small
>> subset of functions available show up.
>
> I think the full list shows up there, if you're using a current version.
> What specific function is missing?
>
>> Perusing the rgl.pdf downloaded from CRAN demonstrates the same lack of
>> documentation.
>
> All of the functions intended for users are documented, and they show up
> in rgl.pdf.
>
>> There is no vignette. In addition, I have found at least one other
>> package with 3d functions (emdbook::curve3d()).
>
> A vignette would be nice, but there isn't one. Our paper from useR 2007 is
> the most recent reference (see
> http://www.r-project.org/conferences/useR-2007/program/presentations/murdoch.pdf);
> it cites Daniel's 2003 thesis and the 2003 paper about the package. After
> those, the NEWS file lists some recent additions.
>
> emdbook makes use of rgl and some other 3d engines, as does misc3d.
> scatterplot3d does its own drawing. rggobi is a completely different
> interactive package.
>
>> What is the best resource for learning about all the foo3d() and lower
>> level functionality that rgl and its dependents provide? I saw a book at
>> BN just last week on openGL. Would that be helpful?
>
> It might, but probably not. rgl is intended to be a higher level R-style
> interface to the things described in a book like that. So if you have a
> particular question about how to do something, you'd never find it there.
> On the other hand, if you want to know if something is possible, then that
> might be a place to look for ideas.
>
> Duncan Murdoch
>
>> Mark
>>
>> On Tue, Jun 24, 2008 at 10:54 PM, Ben Bolker wrote:
>>> Mark Kimpel mwkimpel at gmail.com writes:
>>>> I'm using the command below on an open3d() object to create a shaded
>>>> cube. Changes to myScalingFactor do not effect changes in the size of
>>>> the cube. What is the correct approach? Mark
>>>
>>> how about scale3d() ?
>>>
>>> shade3d(translate3d(scale3d(cube3d(),5,5,5),-6,1
[R] question on rgl.surface
I'd like to use rgl.surface (or some other function if more appropriate) to create a horizontal and a vertical transparent grey slice (plane) running through both the x and y origins and extending across the z axis, i.e. the 3-d equivalent of the normal 2-d coordinate axes we are all familiar with. The examples for rgl.surface are a bit more complex than what I need and I am having trouble understanding them. Here is the code I came up with, but which obviously doesn't work.

require(rgl)
set.seed(123)
d3.mat <- matrix(runif(30, min = -5, max = 5), ncol = 3, nrow = 10)
open3d()
plot3d(x = d3.mat, type = "s", col = "blue", size = 0.33, xlab = "x", ylab = "y", zlab = "z")
x <- 0
y <- 0
z <- matrix(c(floor(min(d3.mat[,1])), ceiling(max(d3.mat[,1])), floor(min(d3.mat[,3])), ceiling(max(d3.mat[,3]))), nrow = 2, ncol = 2)
rgl.surface(x, z, y, color = "grey", back = "lines")

What am I doing wrong? And, while I'm at it, there is another minor question I have, which is how can I exaggerate the size difference in the spheres between front and back?

Thanks,
Mark
Re: [R] trouble installing Rmpi on 64-bit Ubuntu 8.04 with openmpi
Thanks to all for the advice. Took me a bit to get back to this, but the following worked just fine for me with my 64-bit Ubuntu 8.04 OS: R CMD INSTALL Rmpi_0.5-5.tar.gz --configure-args=--with-mpi=/usr/lib64/openmpi On Thu, Jun 5, 2008 at 3:23 AM, Paul Hewson [EMAIL PROTECTED] wrote: Or (more simply?) install it from the bash prompt using: R CMD INSTALL Rmpi_0.5-5.tar.gz --configure-args=--with-mpi=/opt/openmpi/include/ (or whatever your path to openmpi might be). I missed this as well and confused myself for a bit, but it is mentioned in the R news article on Rmpi H.Yu (2002) Rmpi: Parallel Statistical Computing in R Vol 2(2) page 10-14 Our cluster has a variety of forms of mpi running, and I also had to follow the readme in the Rmpi library carefully to make sure it didn't spawn stray lam-mpi processes as well. Best Paul -=-=-==-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Paul Hewson Lecturer in Statistics University of Plymouth Drake Circus Plymouth PL4 8AA tel ++44(0)1752 232778 email [EMAIL PROTECTED] web http://www.plymouth.ac.uk/staff/phewson From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of tub78 [EMAIL PROTECTED] Sent: 04 June 2008 21:11 To: r-help@r-project.org Subject: Re: [R] trouble installing Rmpi on 64-bit Ubuntu 8.04 with openmpi The problem here is that the compiler cannot find the include files for mpi. Notice that the first checks that fail are: checking mpi.h usability... no checking mpi.h presence... no checking for mpi.h... no One solution is to create a file named ~/.R/Makevars with the following line: PKG_CPPFLAGS = -I/opt/openmpi/include ... where /opt/openmpi/include/ contains the necessary mpi.h include file. Then, try to recompile. Hope this helps, - Stu On May 6, 12:52 pm, Mark Kimpel [EMAIL PROTECTED] wrote: Subject pretty much says it all. I am running 64-bit Ubuntu 8.04, i.e. Hardy Heron, have openmpi installed, and get the following error message with attempted install of Rmpi. sessionInfo() follows. 
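The diagnosis in this thread is that configure cannot find mpi.h. Before reinstalling, it can help to check which candidate directory actually contains that header. The following is a minimal sketch in R; the helper name find_mpi_header is hypothetical, and apart from /opt/openmpi/include (mentioned above) the candidate paths are assumptions that will differ between systems.

```r
# Hypothetical helper: return the candidate directories that contain mpi.h
find_mpi_header <- function(dirs) {
  dirs[file.exists(file.path(dirs, "mpi.h"))]
}

# Candidate include directories; all but /opt/openmpi/include are guesses
candidates <- c("/opt/openmpi/include",
                "/usr/lib64/openmpi/include",
                "/usr/include/openmpi",
                "/usr/local/include")

# Prints the directories (possibly none) in which mpi.h was found
print(find_mpi_header(candidates))
```

A directory reported by the helper is the kind of location that the PKG_CPPFLAGS line or the --with-mpi configure argument discussed above needs to point at (the latter usually at the parent directory rather than the include directory itself).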
Re: [R] strange (to me) p-value distribution
Wolfgang,

Thank you for both the explanation and the beautiful R code to demonstrate your point. Even after seeing the empirical evidence, however, I couldn't get the underlying mechanism into my head. I tweaked your code a bit to make the batch effect even bigger, to the point where, ah ha, the distribution of the values is no longer approximately normal but is clearly bimodal (see the additional histograms).

I went back to my original data and looked at the histogram of logged expression values. Although not as clear cut, the distribution is not normal and indeed there is a hint of several humps corresponding to different batches. I need to see what effect including batch in my model has. Lesson relearned about the importance of visualizing data before starting an analysis.

My slightly tweaked code is below, in case anyone wants to look at it.

Mark

    library(genefilter)
    nr <- 31000
    nc <- 20
    x.1 <- matrix(rnorm(nr * nc), nrow = nr, ncol = nc)
    fact <- factor(1:nc %% 2)
    ## transpose so that the factor is recycled over columns, not rows
    sapply(split(t(x.1), fact), mean)
    sapply(split(t(x.1), fact), sd)
    rt1 <- rowttests(x.1, fact)

    ## add a batch effect
    x.2 <- x.1
    x.2[, 1:10] <- x.2[, 1:10] + pi
    sapply(split(t(x.2), fact), mean)
    sapply(split(t(x.2), fact), sd)
    rt2 <- rowttests(x.2, fact)

    par(mfrow = c(2, 2))
    hist(x.1, breaks = 50, col = "mintcream")
    hist(x.2, breaks = 50, col = "mistyrose")
    hist(rt1$p.value, breaks = 100, col = "mintcream")
    hist(rt2$p.value, breaks = 100, col = "mistyrose")

On Sat, Jun 7, 2008 at 6:40 PM, Wolfgang Huber wrote:

Dear Mark,

try out the example code below. Such a p-value distribution often occurs if you have batch effects, i.e. if the between-group variability is in fact less than the within-group variability. In the example below, I do, for each row of x, a t-test between the values in the even and odd columns; for rt2, a batch effect has been added to columns 1:10.
Hope this helps,
Wolfgang

    library(genefilter)
    nr = 31000
    nc = 20
    x = matrix(rnorm(nr*nc), nrow=nr, ncol=nc)
    rt1 = rowttests(x, factor(1:nc %% 2))

    ## add a batch effect
    x[, 1:10] = x[, 1:10] + pi/2
    rt2 = rowttests(x, factor(1:nc %% 2))

    par(mfrow=c(2,1))
    hist(rt1$p.value, breaks=100, col="mistyrose")
    hist(rt2$p.value, breaks=100, col="mistyrose")

Wolfgang Huber, EBI/EMBL, Cambridge UK
http://www.ebi.ac.uk/huber
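The batch-effect mechanism discussed in this thread can also be reproduced without genefilter. The following is a minimal sketch in base R under some assumptions: it uses 2000 rows rather than 31000 to keep it quick, a fixed seed, and a hand-written helper, row_t_pvalues, which is hypothetical and only stands in for rowttests() by computing an equal-variance two-sample t-test for each row.

```r
set.seed(1)
nr  <- 2000                 # fewer rows than in the thread, for speed
nc  <- 20
grp <- factor(1:nc %% 2)    # alternating group labels across columns

# Hypothetical helper: equal-variance two-sample t-test for every row
row_t_pvalues <- function(x, g) {
  g  <- as.factor(g)
  a  <- x[, g == levels(g)[1], drop = FALSE]
  b  <- x[, g == levels(g)[2], drop = FALSE]
  na <- ncol(a)
  nb <- ncol(b)
  # pooled within-group variance per row
  sp2 <- ((na - 1) * apply(a, 1, var) + (nb - 1) * apply(b, 1, var)) /
    (na + nb - 2)
  tstat <- (rowMeans(a) - rowMeans(b)) / sqrt(sp2 * (1 / na + 1 / nb))
  2 * pt(-abs(tstat), df = na + nb - 2)
}

# Pure noise: p-values should be roughly uniform
x      <- matrix(rnorm(nr * nc), nrow = nr, ncol = nc)
p_null <- row_t_pvalues(x, grp)

# Shift columns 1:10; both groups receive half of the shifted columns,
# so only the within-group variance is inflated
x_batch <- x
x_batch[, 1:10] <- x_batch[, 1:10] + pi
p_batch <- row_t_pvalues(x_batch, grp)

# Fraction of rows called significant at 0.05, without and with the batch effect
print(c(null = mean(p_null < 0.05), batch = mean(p_batch < 0.05)))
```

Because the shift inflates the denominator of the t-statistic without moving the group means apart, the p-values in the second case pile up towards 1, which matches the skew described in the original question.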
[R] strange (to me) p-value distribution
I'm working with a genomic data-set with ~31k end-points and have performed an F-test across 5 groups for each end-point. The QA measurements on the individual micro-arrays all look good. One of the first things I do in my work-flow is take a look at the p-value distribution. It is my understanding that, if the findings are due to chance alone, the p-value distribution should be uniform. In this case the histogram, even with 1000 break points, starts low on the left and climbs almost linearly to the right. In other words, it is very skewed towards high p-values.

I understand that this could be happening by chance alone, but the same behavior is seen in the two contrasts of interest I looked at, and I have seen it in a couple of our other genomic, high-dimensional experiments as well. I might also add that I looked at the actual numbers of genes with p-val < X and indeed, for each X < 0.05, there are far fewer sig. genes than one would expect by chance.

I can't figure out what is causing this and, if there is a cause, I'd like to be able to tell the experimenter if it indicates a technical factor. I've had other experiments where the p-value dist approximates normal and of course those that have nice spikes at low p-values, indicating we have some significant genes.

I'm addressing this here rather than to BioC because I suspect there is some basic statistical mechanism that could explain this. Is there?

Mark
Re: [R] rggobi is crashing R-2.7.0
Hard as it is for me to imagine, the ggobi windows stay open and functional while R (in emacs) has crashed in the background after throwing the error messages. As a perhaps naive Linux user, I thought that if a parent process crashed, any processes it spawned would crash too. I guess not in this case. Ubuntu is currently distributing graphviz 2.16.1.

Thanks, Mark

On Wed, May 7, 2008 at 12:14 AM, Michael Lawrence wrote:

On Tue, May 6, 2008 at 6:06 PM, Mark Kimpel wrote:

R crashed just after the warnings were issued, but ggobi kept running (if that makes sense).

I am not sure if that makes sense; GGobi should exit when the R process does.

I have Graphviz and Rgraphviz installed and use Rgraphviz regularly without a problem, so I'm not sure why it didn't load. Mark

Which version of graphviz? I am not sure which version the Ubuntu binary expects, but this may be a binary compatibility issue. That said, I am not sure if the GraphLayout plugin is the reason for R crashing...

On Tue, May 6, 2008 at 4:37 PM, Michael Lawrence wrote:

On Tue, May 6, 2008 at 10:32 AM, Mark Kimpel wrote:

I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the interactive graph displays but R crashes. See my sessionInfo() and a short example below. Ggobi and rggobi installed without complaints. Mark

    sessionInfo()
    R version 2.7.0 Patched (2008-05-04 r45620)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] rggobi_2.1.9 RGtk2_2.12.5-3 graph_1.18.0

    loaded via a namespace (and not attached):
    [1] cluster_1.11.10 tools_2.7.0

    a <- matrix(rnorm(1000), nrow = 10)
    g <- ggobi(a)

    ** (R:25146): CRITICAL **: Error on loading plugin library plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared object file: No such file or directory
    ** (R:25146): CRITICAL **: Error on loading plugin library plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared object file: No such file or directory
    ** (R:25146): CRITICAL **: can't locate required plugin routine addToToolsMenu in GraphLayout

It's not clear to me - did R crash or did you just receive these warnings? These warnings are due to a missing graphviz, so the GraphLayout plugin fails to load.
Re: [R] rggobi is crashing R-2.7.0
Uninstalling and reinstalling ggobi via Synaptic solved the problem, at least for the demo data mtcars. Rotation works fine. No crashes on exit. Thanks for the good advice.

Mark

On Wed, May 7, 2008 at 2:12 PM, Paul Johnson wrote:

On Tue, May 6, 2008 at 12:32 PM, Mark Kimpel wrote:

I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the interactive graph displays but R crashes. See my sessionInfo() and a short example below. Ggobi and rggobi installed without complaints. Mark

    sessionInfo()
    R version 2.7.0 Patched (2008-05-04 r45620)
    x86_64-unknown-linux-gnu

In the R 2.7 release notes, there is a comment about a change in the GUI libraries, and it says that one must recompile everything that relies on R. If your R 2.7 was an upgrade, not a fresh install, that could explain why this is happening. If there's some old library or R package sitting around, it could account for this.

The part that concerned me about the R release note is that it doesn't give a very clear guide on how far back in the toolchain we are supposed to go. Certainly, ggobi has to be rebuilt from scratch. But do any of the things on which ggobi depends need recompilation as well?

pj

Paul E. Johnson, Professor, Political Science, 1541 Lilac Lane, Room 504, University of Kansas
[R] trouble installing Rmpi on 64-bit Ubuntu 8.04 with openmpi
Subject pretty much says it all. I am running 64-bit Ubuntu 8.04, i.e. Hardy Heron, have openmpi installed, and get the following error message with attempted install of Rmpi. sessionInfo() follows.

Mark

    checking for ANSI C header files... yes
    checking for sys/types.h... yes
    checking for sys/stat.h... yes
    checking for stdlib.h... yes
    checking for string.h... yes
    checking for memory.h... yes
    checking for strings.h... yes
    checking for inttypes.h... yes
    checking for stdint.h... yes
    checking for unistd.h... yes
    checking mpi.h usability... no
    checking mpi.h presence... no
    checking for mpi.h... no
    Try to find libmpi.so or libmpich.a
    checking for main in -lmpi... yes
    checking for openpty in -lutil... yes
    checking for main in -lpthread... yes
    configure: creating ./config.status
    config.status: creating src/Makevars
    ** libs
    gcc -std=gnu99 -I/home/mkimpel/R_HOME/R-patched/R-build/lib64/R/include -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DUNKNOWN -fPIC -I/usr/local/include -fpic -g -O2 -c conversion.c -o conversion.o
    In file included from conversion.c:18:
    Rmpi.h:1:17: error: mpi.h: No such file or directory
    In file included from conversion.c:18:
    Rmpi.h:14: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'mpitype'
    make: *** [conversion.o] Error 1
    chmod: cannot access `/home/mkimpel/R_HOME/site-library-2.7.0/Rmpi/libs/*': No such file or directory
    ERROR: compilation failed for package 'Rmpi'
    ** Removing '/home/mkimpel/R_HOME/site-library-2.7.0/Rmpi'

    The downloaded packages are in /tmp/RtmppcK0FI/downloaded_packages
    Warning message:

    sessionInfo()
    R version 2.7.0 Patched (2008-05-04 r45620)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] graph_1.18.0

    loaded via a namespace (and not attached):
    [1] cluster_1.11.10 tcltk_2.7.0 tools_2.7.0
[R] rggobi is crashing R-2.7.0
I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the interactive graph displays but R crashes. See my sessionInfo() and a short example below. Ggobi and rggobi installed without complaints.

Mark

    sessionInfo()
    R version 2.7.0 Patched (2008-05-04 r45620)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] rggobi_2.1.9 RGtk2_2.12.5-3 graph_1.18.0

    loaded via a namespace (and not attached):
    [1] cluster_1.11.10 tools_2.7.0

    a <- matrix(rnorm(1000), nrow = 10)
    g <- ggobi(a)

    ** (R:25146): CRITICAL **: Error on loading plugin library plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared object file: No such file or directory
    ** (R:25146): CRITICAL **: Error on loading plugin library plugins/GraphLayout/plugin.la: libgvc.so.3: cannot open shared object file: No such file or directory
    ** (R:25146): CRITICAL **: can't locate required plugin routine addToToolsMenu in GraphLayout
Re: [R] [BioC] RCurl loading problem with 64 bit linux distribution
_Jv_RegisterClasses 00208008 d __CTOR_END__ 00208000 d __CTOR_LIST__ 00208018 d __DTOR_END__ 00208010 d __DTOR_LIST__ 7e58 r __FRAME_END__ 00208020 d __JCR_END__ 00208020 d __JCR_LIST__ 0020aee0 A __bss_start w __cxa_finalize@@GLIBC_2.2.5 6e20 t __do_global_ctors_aux 3890 t __do_global_dtors_aux 00208680 d __dso_handle w __gmon_start__ U __stack_chk_fail@@GLIBC_2.4 U __strdup@@GLIBC_2.2.5 0020aee0 A _edata 0020aef0 A _end 6e58 T _fini 3188 T _init 5c30 T addFormElement 6050 T buildForm 3870 t call_gmon_start U calloc@@GLIBC_2.2.5 4f90 T checkEncoding 0020aee0 b completed.6183 6a30 T createNamedEnum U curl_easy_cleanup@@CURL_GNUTLS_3 U curl_easy_duphandle@@CURL_GNUTLS_3 U curl_easy_getinfo@@CURL_GNUTLS_3 U curl_easy_init@@CURL_GNUTLS_3 U curl_easy_perform@@CURL_GNUTLS_3 U curl_easy_setopt@@CURL_GNUTLS_3 U curl_easy_strerror@@CURL_GNUTLS_3 U curl_escape@@CURL_GNUTLS_3 U curl_formadd@@CURL_GNUTLS_3 U curl_formfree@@CURL_GNUTLS_3 U curl_free@@CURL_GNUTLS_3 U curl_global_cleanup@@CURL_GNUTLS_3 U curl_global_init@@CURL_GNUTLS_3 U curl_multi_add_handle@@CURL_GNUTLS_3 U curl_multi_fdset@@CURL_GNUTLS_3 U curl_multi_init@@CURL_GNUTLS_3 U curl_multi_perform@@CURL_GNUTLS_3 U curl_multi_remove_handle@@CURL_GNUTLS_3 U curl_slist_append@@CURL_GNUTLS_3 U curl_slist_free_all@@CURL_GNUTLS_3 U curl_unescape@@CURL_GNUTLS_3 U curl_version@@CURL_GNUTLS_3 U curl_version_info@@CURL_GNUTLS_3 U fprintf@@GLIBC_2.2.5 38e0 t frame_dummy U free@@GLIBC_2.2.5 4590 T getBinaryDataFromR 43f0 T getCURLPointerRObject 53e0 T getCurlError 5560 T getCurlInfoElement 5900 T getCurlPointerForData 3c40 T getMultiCURLPointerRObject 3ec0 T getRStringsFromNullArray 4190 T makeCURLPointerRObject 47b0 T makeCURLcodeRObject 3d80 T makeMultiCURLPointerRObject U malloc@@GLIBC_2.2.5 U memcpy@@GLIBC_2.2.5 002080a0 d names.7400 00208688 d p.6181 U realloc@@GLIBC_2.2.5 U select@@GLIBC_2.2.5 U sprintf@@GLIBC_2.2.5 U stderr@@GLIBC_2.2.5 U strcpy@@GLIBC_2.2.5 U strlen@@GLIBC_2.2.5 U strncpy@@GLIBC_2.2.5 mkimpel-m90 
~/bin/curl-7.18.1:

On Tue, May 6, 2008 at 10:36 PM, Martin Morgan wrote:

Hi Mark... A couple of shots in the dark, as no one else seems to be leaping in...

The symbol Curl_base64_encode should be defined in /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so. What does

    nm /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so

say? Mine says

    3980 T Curl_base64_encode

with the 'T' indicating that the symbol is defined (make sure nm spits out a bunch of lines before concluding that Curl_base64_encode is not defined).

I retrieved the RCurl source, and one thing I notice is that RCurl/src/curl_base64.c has the 'execute' bit set, and perhaps a sane system would not compile it. Try

    % chmod -x RCurl/src/curl_base64.c

and then

    % R CMD INSTALL RCurl

Martin

Mark Kimpel writes:

I'm having the same problem on Ubuntu 64-bit Hardy Heron. A bunch of security patches from Ubuntu came out and I installed them today. It was after that that I first noted the problem (affycoretools, which I use all the time, won't load). Below is my initial output; what follows is my reinstallation output, followed by the same error messages as obtained initially.

I wonder if a security patch has changed Curl? Or did RCurl just change? I have been using R-2.7.0 since half-way through its development cycle and this is a new problem for me.

Mark

    require(RCurl)
    Loading required package: RCurl
    Error in dyn.load(file, DLLpath = DLLpath, ...) :
      unable to load shared library '/home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so':
      /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so: undefined symbol: Curl_base64_encode

    install.packages("RCurl")

    sessionInfo()
    R version 2.7.0 Patched (2008-05-04 r45620)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8
Re: [R] [BioC] RCurl loading problem with 64 bit linux distribution
Duncan,

I now have two versions of libcurl on my system: the Ubuntu-installed 7.18.0 and my newly compiled-from-source 7.18.1 (which I installed after my problems began with RCurl). I was afraid to uninstall 7.18.0 because Synaptic wanted to uninstall half of my system if I did so via my package manager. I must not have my PATH set up correctly, because when I do curl --version, I get:

    curl 7.18.1 (x86_64-unknown-linux-gnu) libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.1
    Protocols: tftp ftp telnet dict ldap ldaps http file https ftps
    Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz

So curl is new and libcurl is a version older. This probably isn't ideal, but it may help us figure out what is going on, because I ran nm against both versions of libcurl.

For libcurl 7.18.1:

    mkimpel-m90 /usr/local/lib: nm libcurl.so | grep Curl_base64_encode
    9e30 T Curl_base64_encode

For libcurl 7.18.0:

    mkimpel-m90 /usr/lib: nm libcurl.so | grep Curl_base64_encode
    nm: libcurl.so: no symbols

It looks like RCurl is still trying to link against 7.18.0. Do I interpret your comments to mean that, if I set up my PATH correctly so that the newer version is found first, things might work? Regardless, I hope this is somewhat diagnostic. If I have screwed up the setup too much to make sense of, perhaps one of the other guys with problems could also furnish the info.

BTW, I did cc you on my post to R-help this evening (see below).

Thanks for your help and all your development efforts,
Mark

from: Mark Kimpel
to: Loyal Goff
date: Tue, May 6, 2008 at 9:18 PM

On Wed, May 7, 2008 at 12:43 AM, Duncan Temple Lang wrote:

Hi all

I'm glad this made it to R-help (or R-devel) so that I saw it, as this is the sort of problem that should be at least CC'ed to the package maintainer.
Yes, there was a change to RCurl yesterday, with one of the changes being to synchronize code between libcurl and RCurl regarding base64 encoding, which was causing a segfault with recent versions of libcurl.

The latest RCurl does not include the code for Curl_base64_encode, which was in the curl_base64.c file. The intent was to link against the one in libcurl, but what your reports suggest is that on some systems this is not available from libcurl.so. Can you confirm this with the nm output from libcurl.so:

    nm libcurl.so | grep Curl_base64_encode

Precisely where libcurl.so (or libcurl.so.digit...) is will vary, but it is probably in /usr/local/lib/, and you can see by using

    curl-config --libs

and seeing if there is -Ldirectory/path in the output, which will tell you where it is likely to be. If the symbol (Curl_base64_encode) is not there, there will be no output!

If that is the case, we will have to go back to having our own copy of that routine, and so we will end up with two versions, one for the old and one for the new, and the configuration will endeavor to determine which is appropriate.

HTH
D.

Mark Kimpel wrote:
| Martin,
|
| Well, thanks for jumping in! We need all the help we can get ;)
|
| I changed the execute bit as you suggested and recompiled, no luck, still the same error message.
|
| Below is the output you wanted me to look at; it's a bit beyond me, so I include both a brief grep summary and then the whole enchilada. I do note that my output is different from yours, but I'm not sure how to interpret it.
|
| I also thought about removing curl from my system, but when starting to do so with Synaptic, it looked like if I removed libcurl I would trash an awful lot of my system. I did download and install the latest curl 7.18.1 on top of the other one, put /usr/local/ at the start of my PATH, reinstalled RCurl, and still the same error message comes up.
|
| So, what does it mean that the output of nm is different on our systems and is it important?
|
| Thanks, Mark
|
| mkimpel-m90 ~/bin/curl-7.18.1: nm /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so | grep base64_encode
| U Curl_base64_encode
| 3910 T R_base64_encode
| mkimpel-m90 ~/bin/curl-7.18.1: nm /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so | grep Curl_base64_encode
| U Curl_base64_encode
| mkimpel-m90 ~/bin/curl-7.18.1: nm /home/mkimpel/R_HOME/site-library-2.7.0/RCurl/libs/RCurl.so
| U CDR
| 00208aa0 d CallEntries
| 00208c00 D CurlErrorNames
| 0020aac0 D CurlInfoNames
| 00209740 D CurlOptionNames
| U Curl_base64_decode
| U Curl_base64_encode
| U INTEGER
| U LENGTH
| U LOGICAL
| 0020aee8 B OptionMemoryManager
| U PRINTNAME
| U RAW
| 3f70 T
[R] interactive rotatable 3d scatterplot
I would like to create a 3d scatterplot that is interactive in the sense that I can spin it on its axes to better visualize some PCA results I have. What are the options in R? I've looked at RGL and perhaps it will suffice but it wasn't apparent from the documentation I found. Any demo scripts available for a package that will work?

Mark
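One possible starting point for this kind of plot is the rgl package mentioned in the question. The following is a minimal sketch, not a tested recipe for the poster's data: it assumes the built-in iris data as a stand-in for the PCA input and assumes rgl is installed; the plotting call is skipped when R is not interactive or rgl is unavailable.

```r
# PCA on a stand-in data set (iris is an assumption, not the poster's data)
pc     <- prcomp(iris[, 1:4], scale. = TRUE)
scores <- pc$x[, 1:3]   # first three principal components

# Open a rotatable 3D scatterplot only if rgl is available and a display exists
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(scores[, 1], scores[, 2], scores[, 3],
              xlab = "PC1", ylab = "PC2", zlab = "PC3",
              col = as.integer(iris$Species), size = 5)
}
```

In an rgl window the plot can be rotated by dragging with the mouse, which is the spinning behaviour asked about.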
[R] merge with rownames?
Can merge be tricked into merging via rownames as opposed to via contents of a particular column? I have two data.frames with overlapping, but out of order, rownames, but no column contents in common and would like to merge without cbinding the rownames to the data.frames.

Mark
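For what it is worth, base R's merge() accepts by = "row.names" (or by = 0) for exactly this situation. The following is a minimal sketch with two made-up data frames; df1, df2 and their contents are illustrative assumptions, not the poster's data.

```r
# Two toy data frames with overlapping, out-of-order row names
df1 <- data.frame(a = 1:4, row.names = c("g1", "g2", "g3", "g4"))
df2 <- data.frame(b = c(10, 30, 50), row.names = c("g3", "g1", "g5"))

# by = "row.names" matches rows on their row names
m <- merge(df1, df2, by = "row.names")

# merge() returns the matched names in a column called Row.names;
# move them back into the row names and drop the column
rownames(m) <- as.character(m$Row.names)
m$Row.names <- NULL
print(m)
```

Only row names present in both inputs are kept by default; passing all = TRUE to merge() would keep unmatched rows as well, filled with NA.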
Re: [R] Journal for R
As a medical researcher, I keep tabs on the journals Bioinformatics and BMC Bioinformatics. If your package is for 'omics, those are good journals to look at.

Mark

On Sun, Mar 30, 2008 at 10:02 AM, Peter Dalgaard wrote:

Gavin Simpson wrote:

On Sun, 2008-03-30 at 15:01 +0200, Christophe Genolini wrote:

Hi the list

I made up a new statistical procedure. I will publish it in a medical journal, but there will be only the way of using it, no calculation or algorithm detail. So is there a journal (I mean a scientific journal) with a selection committee to which I could submit an article describing the details of a package?

There may be others, perhaps dedicated to a particular area of study (e.g. Computers and Geosciences, and Ecological Modelling might be appropriate places for the broad area of environmetrics), but the Journal of Statistical Software is an obvious choice for what you describe. www.jstatsoft.org

HTH
G

Also, let me remind you and others that R News is also peer reviewed (although not yet indexed). It specifically solicits short to medium length articles of the following kinds:

* Changes in R: new features of the latest release
* Changes on CRAN: new add-on packages, manuals, binary distributions, mirrors,...
* Add-on packages: short introductions to or reviews of R extension packages
* Programmer's Niche: nifty hints for programming in R (or S)
* Hints for newcomers: Explaining sides of R that might not be so obvious from reading the manuals and FAQs.
* Applications: Examples of analyzing data with R

Lengthier papers are probably better placed in the JSS, JCGS (J. Computational and Graphical Statistics), or CSDA (Computational Statistics and Data Analysis), depending on contents. For in-depth descriptions of packages, JSS has become quite popular in recent years.

-pd (Current associate editor for R News, past AE for JSS)

Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
Re: [R] Rating R Helpers
I'll throw one more idea into the mix. I agree with Bill that a rating system for respondents is probably not that practical and not of the highest importance. It also seems like a recipe for creating inter-personal problems that the list doesn't need. I do like Bill's idea of a review system for packages, which could be incorporated into my idea that follows...

What I would find useful would be some sort of tagging system for messages. I can't count the times I've remembered seeing a message that addresses a question I have down the road but, when I Google for it, I can't find it. It would be so nice, for example, to reliably be able to find all messages related to a certain package or package function posted within the last X days.

This could be implemented as simply as asking posters to provide keywords at the end of a message, but it would be great if they could somehow be pulled out of a message and stored in a DB. For instance, keywords could be surrounded by a sequence of special characters, which a parser could then extract and store in a DB along with the message.

Of course, this would be work to set up, but how many of our experts who so kindly give of their time get exasperated when similar questions keep popping up on the list? Also, if we had a web-accessible DB, the responses, not the responders, could be rated as to how well a reply takes care of an issue. Thus, over time, a sort of auto-wiki could be born.

I can think of more uses for this as well. For example, a developer could quickly check to see what usability problems or suggestions have cropped up for an individual package.

Mark

On Dec 1, 2007 2:21 AM, [EMAIL PROTECTED] wrote:

This seems a little impractical to me. People respond so much at random and most only tackle questions with which they feel comfortable. As it's not a competition in any sense, it's going to be hard to rank people in any effective way. But suppose you succeed in doing so, then what?
To me a much more urgent initiative is some kind of online user review system for packages, even something as simple as the one Amazon.com has for customer reviews of books. I think the need for this is rather urgent, in fact. Most packages are very good, but I regret to say some are pretty inefficient and others downright dangerous. You don't want to discourage people from submitting their work to CRAN, but at the same time you do want some mechanism that allows users to relate their experience with it, good or bad.

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary): +61 7 3826 7304
Mobile: +61 4 8819 4402
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
Sent: Saturday, 1 December 2007 6:13 AM
To: R Help
Subject: [R] Rating R Helpers

Since R is open source and help may come from varied levels of experience on R-Help, I wonder if it might be helpful to construct a method that can be used to rate those who provide help on this list. This is something that is done on other comp lists, like http://www.experts-exchange.com/.

I think some of the reasons for this are pretty transparent, but I suppose one reason is that one could decide to implement the advice of those with superior or expert levels. In other words, you can trust the advice of someone who is more experienced more than someone who is not. Currently, there is no way to discern who on this list is really an R expert and who is not. Of course, there is R core, but most people don't actually know who these people are (at least I surmise that to be true).

If this is potentially useful, maybe one way to begin the development of such ratings is to allow the original poster to rate the level of help from those who responded.
Maybe something like a very simple questionnaire on a Likert-like scale that the original poster would respond to upon receiving help, which would lead to the accumulation of points for the responders. Higher points would result in higher levels of expertise (e.g., novice, ..., wizaRd).

Just a random thought. What do others think?

Harold

--
Mark W. Kimpel MD