Re: [R] Loading problem with XML_1.9
Well, as you mention at the end of your mail, several people have already suggested different ways to solve the problem. You might search the Web for how to install a 64-bit version of libxml2. Using xmlTreeParse(, useInternalNodes = TRUE) is one way to reduce memory consumption, as is using the handlers argument. And if size is really the issue, you should consider the SAX model, which is very memory efficient and is available via the xmlEventParse() function in the XML package. It even provides the concept of branches, which gives a hybrid of SAX and DOM-style parsing.

However, to solve the problem of the xmlMemDisplay symbol not being found, you can look for where it is used and remove it. It is in src/DocParse.c in the routine RS_XML_MemoryShow(). You can remove the line xmlMemDisplay(stderr) or indeed the entire routine. Then re-install and reload the package.

D.

Luo Weijun wrote:
> Hello Dr. Lang and all,
> I posted this message to the R-help mailing list, but have not solved my problem so far. Could you help me look at it? I have a loading problem with XML_1.9 under 64-bit R 2.3.1 for Mac OS X, which I got from http://R.research.att.com/. XML_1.9 works fine under 32-bit R 2.5.0.
> I thought it could be an installation problem, so I tried install.packages and biocLite. Every time the package installed fine, except for the warning messages below:
> ld64 warning: in /usr/lib/libxml2.dylib, file does not contain requested architecture
> ld64 warning: in /usr/lib/libz.dylib, file does not contain requested architecture
> ld64 warning: in /usr/lib/libiconv.dylib, file does not contain requested architecture
> ld64 warning: in /usr/lib/libz.dylib, file does not contain requested architecture
> ld64 warning: in /usr/lib/libxml2.dylib, file does not contain requested architecture
> Here are the error messages I get when XML is loaded:
> library(XML)
> Error in dyn.load(x, as.logical(local), as.logical(now)) :
>   unable to load shared library '/usr/local/lib64/R/library/XML/libs/XML.so':
>   dlopen(/usr/local/lib64/R/library/XML/libs/XML.so, 6): Symbol not found: _xmlMemDisplay
>   Referenced from: /usr/local/lib64/R/library/XML/libs/XML.so
>   Expected in: flat namespace
> Error: .onLoad failed in 'loadNamespace' for 'XML'
> Error: package/namespace load failed for 'XML'
> Session information
> sessionInfo()
> Version 2.3.1 Patched (2006-06-27 r38447)
> powerpc64-apple-darwin8.7.0
> attached base packages:
> [1] methods stats graphics grDevices utils datasets
> [7] base
> Prof Brian Ripley also suggested that the cause could be that I do not have a 64-bit version of libxml2 installed. If that is the problem, where do I get it, and where and how do I install it?
> The reason I need to use 64-bit R is that I hit a memory limit with the 32-bit version of R when I load some very large XML trees (the data file is about 800M). Martin suggested that I use the 'handlers' argument of xmlTreeParse. I tried it with useInternalNodes=T, but I still got the memory problem with the 32-bit version of R.
> Please tell me what I can do now. Thank you so much!
> Weijun
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
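The SAX approach mentioned in the reply above can be sketched roughly as follows. This is only a minimal illustration, assuming the XML package is installed and loads on your platform. The helper name count_elements() and the tiny inline document are made up for the example; a real 800M file would be passed by file name instead of as text.

```r
# Count opening tags with an event-driven (SAX) parse, so the full
# document tree is never built in memory.
# count_elements() and the sample document are hypothetical.
count_elements <- function(xml_source, as_text = FALSE) {
  counts <- list()
  handlers <- list(
    # Called once for every opening tag the parser encounters
    startElement = function(name, attrs, ...) {
      old <- if (is.null(counts[[name]])) 0 else counts[[name]]
      counts[[name]] <<- old + 1
    }
  )
  XML::xmlEventParse(xml_source, handlers = handlers,
                     asText = as_text, addContext = FALSE)
  counts
}

# Only run the demonstration if the XML package is available
if (requireNamespace("XML", quietly = TRUE)) {
  doc <- "<root><item>1</item><item>2</item><other/></root>"
  str(count_elements(doc, as_text = TRUE))
}
```

The handler keeps only a running tally, so memory use stays roughly constant regardless of file size. That is the property that makes SAX attractive for very large inputs.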
Re: [R] random sampling with some limitive conditions?
The method can produce one new data set, but I do not think it is random. I used the new random data to compute the index I want, and I got the same value as with the original data sites. I tried again and again, and the result is always the same. So I think I need to find a different random sampling method.

On 7/7/07, Daniel Nordlund [EMAIL PROTECTED] wrote:
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED] On Behalf Of Zhang Jian
> Sent: Saturday, July 07, 2007 12:31 PM
> To: r-help
> Subject: [R] random sampling with some limitive conditions?
>
> > I want to generate thousands of random data sets by randomizing the presence-absence data. One important limitation is that the row and column sums must be fixed. For example, the data tst is as follows:
> >
> > site1 site2 site3 site4 site5 site6 site7 site8
> >     1     0     0     0     1     1     0     0
> >     0     1     1     1     0     1     0     1
> >     1     0     0     0     1     0     1     0
> >     0     0     0     1     0     1     0     1
> >     1     0     1     0     0     0     0     0
> >     0     1     0     1     1     1     1     1
> >     1     0     0     0     0     0     0     0
> >     0     0     0     1     0     1     0     1
> >
> > sum(tst[1,]) = 3, sum(tst[,1]) = 4, and so on. When I randomize the data, the first row sum must equal 3, and the first column sum must equal 4. The same rule needs to apply to each row and column. How can I get the new random sampling data? I have no idea. Thanks.
>
> You could reorder your table by stepping through it a column at a time, and for each column randomly deciding to swap the current column with a column that has the same column total. Repeat this process for each row, i.e. for each row, randomly choose a row with the same row total to swap with. Here is some example code which is neither efficient nor general, but does demonstrate the basic idea. You will need to decide if this approach meets your needs.
>
> # I created a data file with your table (8x8) and read from it
> sites <- read.table("c:/R/R-examples/site_random_sample.txt", header=TRUE)
> sites
>
> # get row and column totals
> colsums <- apply(sites,2,sum)
> rowsums <- apply(sites,1,sum)
>
> # randomly swap columns
> for(i in 1:8) {
>   if (runif(1) > .5) {
>     # candidate columns with the same total as column i
>     cand <- which(colsums==colsums[i])
>     # index into cand, so a single candidate n is not treated as 1:n by sample()
>     swapcol <- cand[sample(length(cand),1)]
>     temp <- sites[,swapcol]
>     sites[,swapcol] <- sites[,i]
>     sites[,i] <- temp
>   }
> }
>
> # randomly swap rows
> for(i in 1:8) {
>   if (runif(1) > .5) {
>     cand <- which(rowsums==rowsums[i])
>     swaprow <- cand[sample(length(cand),1)]
>     temp <- sites[swaprow,]
>     sites[swaprow,] <- sites[i,]
>     sites[i,] <- temp
>   }
> }
> sites
>
> Hope this is helpful,
> Dan
>
> Daniel Nordlund
> Bothell, WA USA
Re: [R] pgup/pgdown in R Graphics Window under Linux
Dear Prof. Ripley,

Prof Brian Ripley [EMAIL PROTECTED] wrote on 05.07.2007 21:46:20:
> > Dear S-users.
> This is the help forum for R users

Indeed. (How embarrassing not to be able to spell a one-letter word correctly...)

> > How do I change pages on an X11 graphics device under linux?
> It is baffling, rather than easy. What did you find in your homework that told you that the X11() device had 'pages' and responded to those keys?

My first experience with R was on a Windows box (which accepts page-up/-down for flipping pages in a graphics device). Then I read the statements on platform independence found on R-project.org ("R is a free software [...]. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS."), and assumed that "runs" implied that most important features are implemented independently of the underlying OS.

My homework included a rather large number of variations on the following theme: http://www.google.com/search?q=linux+r-project+xyplot+pgup
Finally, I turned to the r-help mailing list for help...

After reading your reply, I am surprised that the current implementation of the X11 device apparently renders Z in xyplot(..., layout(X,Y,Z)) quite useless? (I'm sure you'll not hesitate to correct me if I'm wrong?)

KR, PMD.
Re: [R] pgup/pgdown in R Graphics Window under Linux
Hi Deepayan,

Deepayan Sarkar [EMAIL PROTECTED] wrote on 06.07.2007 02:05:02:
> On 7/5/07, Paul Matthias Diderichsen [EMAIL PROTECTED] wrote:
> > library(lattice)
> > xyplot(speed~dist|speed, data=cars, layout=c(3,3))
> If this is your use case, you might be interested in http://cran.r-project.org/src/contrib/Descriptions/plotAndPlayGTK.html

Thanks a lot for the pointer; this package seems to be very useful when coding your own plots. However, it is not exactly my use case. It was rather an example to illustrate that the X11 graphics device is apparently not too useful for multi-page plots.

The motivation for my question was that I want to use xpose4 (http://xpose.sourceforge.net/) under Linux. Xpose is a program that provides functions for producing diagnostic plots for population PKPD model evaluation. I am not able to rewrite the entire package, wrapping every call to multi-page plot functions with plotAndPlayGTK. That is why I was hoping that there might be some obscure configuration option for X11 (which seems not to be the case, cf. Prof Ripley's reply) or an alternative graphics device that runs under Linux.

Thank you for your suggestion anyway!

KR, PMD.
Re: [R] one question about the loop
jim holtman [EMAIL PROTECTED] wrote:
> Is this what you want?
> t(combn(5,2))

Well, it seems nice, but which library does it come from? I tried help.search("combn"), but that did not give me any useful information...

Christophe
[R] longitudinal data
Hello all, I want to analyze data that looks like this: Id var1 var2 var3.. 1 0 1 0 1 0 1 1 2 2 2 2 Not all ids have the same number of observations. As a first step I want to count how many people are in the survey, how many have 1 in var1, etc. How do I do that? Thank you, Sigalit.
Re: [R] one question about the loop
combinat
http://cran.r-project.org/src/contrib/Descriptions/combinat.html

[EMAIL PROTECTED] wrote:
> jim holtman [EMAIL PROTECTED] wrote:
> > Is this what you want?
> > t(combn(5,2))
> Well, it seems nice, but which library does it come from? I tried help.search("combn"), but that did not give me any useful information...
> Christophe
Re: [R] calculating p-values of columns in a dataframe
Thomas Pujol wrote:
> I have a dataframe (mydf) that contains differences of means. I wish to test whether these differences are significantly different from zero. Below, I calculate the t-statistic for each column. What is a good method to calculate/look up the p-value for each column?
>
> mydf=data.frame(a=c(1,-22,3,-4),b=c(5,-6,-7,9))
> mymean=sapply(mydf, mean)
> mysd=sapply(mydf, sd)
> mynn=sapply(mydf, function(x) {sum ( as.numeric(x) >= -Inf) })
> myse=mysd/sqrt(mynn)
> myt=mymean/myse
> myt

You can do the whole lot with

L <- lapply(mydf, t.test)

or, if you only want the t statistics and p-values now:

sapply(L, "[", c("statistic", "p.value"))

If you want to follow your initial approach quickly, you can calculate the two-sided p-values from the distribution function of the t distribution with 3 degrees of freedom (for your data) with

2 * pt(-abs(myt), df = nrow(mydf) - 1)

Uwe Ligges
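As a quick check that the two routes in the reply above agree, here is a small sketch using the data from the question. The variable names from_t_test and by_hand are chosen for this illustration only.

```r
mydf <- data.frame(a = c(1, -22, 3, -4), b = c(5, -6, -7, 9))

# Route 1: one-sample t-test of each column against zero
tests <- lapply(mydf, t.test)
from_t_test <- sapply(tests, function(tt) unname(tt$p.value))

# Route 2: the same p-values computed by hand from the t statistics
n <- nrow(mydf)
tstat <- sapply(mydf, function(x) mean(x) / (sd(x) / sqrt(n)))
by_hand <- 2 * pt(-abs(tstat), df = n - 1)

# Show both side by side
print(cbind(from_t_test, by_hand))
```

The hand computation is only valid when every column has the same number of non-missing values; t.test() handles the degrees of freedom per column by itself.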
[R] Problems with e1071 and SparseM
Hello all,

I am trying to use the svm method provided by e1071 (Version: 1.5-16) together with a matrix provided by the SparseM package (Version: 0.73), but it fails with this message:

model <- svm(lm, lv, scale = TRUE, type = 'C-classification', kernel = 'linear')
Error in t.default(x) : argument is not a matrix

although lm was created beforehand with read.matrix.csr (from the e1071 package). I also tried simply converting a normal matrix to a SparseM matrix and passing that, but I get the same error again.

According to the manual of svm(), this is supposed to work though:

x: a data matrix, a vector, or a sparse matrix (object of class 'matrix.csr' as provided by the package 'SparseM').

R version used: R version 2.4.0 Patched (2006-11-25 r39997)

Does anyone know how I can use sparse matrices with e1071? This would be really important because the matrix is simply too large to write out in dense form.

Best regards,
Chris

--
Christian Holler
System Administrator
Chair of Prof. Dr. W.J. Paul
Saarland University Germany
Building E1 3, Room 3.20
phone: +49 - 681 / 302 - 5537
fax: +49 - 681 / 302 - 4290
[R] how to revert to an older limma version?
Dear Sirs,

How can I revert to an older limma version? Typing install.packages("limma") in R gives a list of mirrors. How can I install the version I want after I obtain and untar the file (e.g., limma_2.9.1.tar.gz)?

I am running R 2.5.0 on a Linux machine (CentOS 5). When using limma, it will not go past the read.maimages command. I get this error:

Error in readGenericHeader(fullname, columns = columns, sep = sep) :
  Specified column headings not found in file
In addition: Warning message:
input string 1 is invalid in this locale in: grep(pattern, x, ignore.case, extended, value, fixed, useBytes)

I was told by a colleague that this may be due to my limma version. I am trying to use limma 2.10.5 and he uses 2.9.1. Could this be the reason?

Thanks in advance
Maya
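On the narrow question of installing from a downloaded tarball, a minimal sketch follows. It assumes the tarball is a source package and is left packed (not untarred); install_local_tarball() is a hypothetical helper name, and whether an older limma actually builds against a given R version is not something this sketch can promise.

```r
# Install a package from a local source tarball instead of a repository.
# install_local_tarball() is a hypothetical helper for illustration.
install_local_tarball <- function(tarball) {
  stopifnot(file.exists(tarball))
  # repos = NULL tells install.packages() that 'pkgs' is a file path
  install.packages(tarball, repos = NULL, type = "source")
}

# Example call (not run here; the file must exist in the working directory):
# install_local_tarball("limma_2.9.1.tar.gz")
# packageVersion("limma")
```

From a shell, `R CMD INSTALL limma_2.9.1.tar.gz` does the same job.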
Re: [R] how to revert to an older limma version?
Maya Bercovich [EMAIL PROTECTED] writes:
> Dear Sirs,

Please post to the Bioconductor list (see http://bioconductor.org for instructions).

> How can I revert to an older limma version? Typing install.packages("limma") in R gives a list of mirrors. How can I install the version I want after I obtain and untar the file (e.g., limma_2.9.1.tar.gz)? I am running R 2.5.0 on a Linux machine (CentOS 5). When using limma it will not go past the read.maimages command. I get this error:
> Error in readGenericHeader(fullname, columns = columns, sep = sep) :
>   Specified column headings not found in file
> In addition: Warning message:
> input string 1 is invalid in this locale in: grep(pattern, x, ignore.case, extended, value, fixed, useBytes)

Please provide a short section of code that shows how you invoke the function. It is very hard to tell what the cause of your problem is without this. read.maimages has a 'columns' argument. Do you supply it? If so, does it contain the correct column names for the file type you are reading? For instance, is the capitalization correct? The help page for read.maimages provides some guidance.

> I was told by a colleague that this may be due to my limma version. I try to use limma 2.10.5 and he uses 2.9.1

Specific Bioconductor package versions work with specific R versions. Instead of using install.packages, use

source("http://bioconductor.org/biocLite.R")
biocLite("limma")

to get the right version for your R. For R 2.5.0, limma 2.10.5 is the correct version. The limma author is very responsive to bug reports, so seek additional help and, if necessary, report bugs rather than reverting to previous versions.

> Could this be the reason?

Please always provide the output of the command sessionInfo() to give a concise summary of your system.

> Thanks in advance
> Maya

--
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org
Re: [R] how to revert to an older limma version?
That sounds as if you are running in a UTF-8 locale and your colleague is not. We do ask for the results of sessionInfo(), which would have helped. I suggest you try an English 8-bit locale and see what happens.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
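One way to check the locale from within R, and to try an 8-bit one, is sketched below. The locale name "en_US.ISO-8859-1" is an assumption: valid names differ between systems, and the one you need may not be installed.

```r
# Inspect the character-set part of the current locale
print(Sys.getlocale("LC_CTYPE"))
is_utf8 <- isTRUE(l10n_info()[["UTF-8"]])
cat("Running in a UTF-8 locale:", is_utf8, "\n")

# Try to switch to an 8-bit English locale. The name below is an
# assumption and may not exist on a given machine.
# Sys.setlocale() returns "" (with a warning) if the request fails.
switched <- suppressWarnings(Sys.setlocale("LC_CTYPE", "en_US.ISO-8859-1"))
if (identical(switched, "")) {
  cat("Locale not available here; 'locale -a' in a shell lists the installed ones\n")
}
```

Setting the LANG or LC_ALL environment variable before starting R is an alternative that affects the whole session from the start.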
Re: [R] one question about the loop
It is part of the standard 'utils' package that comes with R:

?combn
help.search('combination')

On 7/8/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> jim holtman [EMAIL PROTECTED] wrote:
> > Is this what you want?
> > t(combn(5,2))
> Well, it seems nice, but which library does it come from? I tried help.search("combn"), but that did not give me any useful information...
> Christophe

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
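For readers following the thread, here is a short sketch of what the suggested call produces and how to ask R where a function lives. It assumes a current R release, where combn() ships in utils.

```r
# All 2-element combinations of 1:5, one combination per row
pairs <- t(combn(5, 2))
print(pairs)

# Two ways to find out which package provides combn()
print(find("combn"))
print(environmentName(environment(combn)))
```

The result has choose(5, 2) rows, which makes it a convenient replacement for a nested loop over all pairs i < j.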
Re: [R] longitudinal data
The question is, how is the missing data accounted for? Is this a CSV file where the missing data is left blank? If it is just separated by white space, how do you know that var1 is missing if var2 is there? If it is the case that just the initial values are there, then you can use fill=TRUE in read.table, which will supply NAs for the trailing values in uneven rows.

You need to provide a reproducible script/data so that we have a better chance of answering your questions.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
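Once the data are read in, the counting step itself might look like the sketch below. The data frame survey is made up for illustration and assumes a long format with one row per observation; it is not the poster's data.

```r
# Hypothetical long-format data: several rows per person
survey <- data.frame(
  Id   = c(1, 1, 2, 2, 2, 3),
  var1 = c(0, 1, 0, 0, 0, 1),
  var2 = c(1, 1, 0, NA, 0, 1)
)

# Number of distinct people in the survey
n_people <- length(unique(survey$Id))

# For each person: did var1 equal 1 on at least one occasion?
ever_var1 <- tapply(survey$var1 == 1, survey$Id, any, na.rm = TRUE)
n_ever_var1 <- sum(ever_var1)

# Number of observations per person (ids may have different counts)
obs_per_id <- table(survey$Id)

print(n_people)
print(n_ever_var1)
print(obs_per_id)
```

Whether "has 1 in var1" should mean "at least once" or "at every visit" depends on the study; swapping any for all gives the stricter reading.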
Re: [R] random sampling with some limitive conditions?
Any methods or advices about the random sampling method? I have no idea. Thanks a lot.
[R] xmlOutputBuffer vs xmlOutputDOM
Hi,

I am trying to use the XML package to write some data (pretty large amounts of data) into XML files. I experimented with a few variations, using xmlOutputBuffer and xmlOutputDOM. xmlOutputDOM provides neatly formatted, indented output, but takes very long. xmlOutputBuffer is incompatible (in my experience) with the saveXML function, so I hacked around it by sending its $value() to cat. This unfortunately loses all proper formatting, and gives me an XML file with new lines after every tag or entry, and with no indenting at all.

However, xmlOutputDOM takes very long. I am outputting rather large files, and where xmlOutputBuffer takes about 10-15 seconds, xmlOutputDOM takes about 20 minutes.

Am I using xmlOutputDOM in some wrong way? Is there a way to get proper formatting out of xmlOutputBuffer? Either of these solutions would be useful, as I see no advantage to using one over the other for just outputting lots of data (1 fields at minimum).

Below is my code, followed by the times reported on a sample run:

library(XML)
buffer <- xmlOutputBuffer()
buffer2 <- xmlOutputDOM()
buffer$addTag("outside", close = FALSE)
buffer2$addTag("outside", close = FALSE)
for(i in 1:1000) {
  buffer$addTag("tag", i)
  buffer2$addTag("tag", i)
}
buffer$closeTag()
buffer2$closeTag()
system.time(cat(buffer$value(), file = "foo2.xml"))
system.time(saveXML(buffer2$value(), file = "foo.xml"))

Times reported (xmlOutputDOM is more than 100x slower):

system.time(cat(buffer$value(), file = "foo2.xml"))
   user  system elapsed
  0.004   0.000   0.001
system.time(saveXML(buffer2$value(), file = "foo.xml"))
   user  system elapsed
  0.476   0.024   0.516

I am using R version 2.5.1 and XML package version 1.9-0.

Yours sincerely,
Arjun Ravi Narayan
[R] Extracting S code from a C program
There is a C program called GPS ('gamma poisson shrinker') at ftp://ftp.research.att.com/dist/gps/. "The algorithms in GPS are based on S-Plus programs written by William DuMouchel with support from Columbia University and AT&T Labs." My question is: is there a relatively easy way to extract some of the S code from this Windows program? Thanks.
[R] random sampling with some limitive conditions?
If I understand your problem, this might be a solution. Assign independent random numbers for row and column and use the corresponding ordering to assign the row and column indices. Thus row and column assignments are independent and the row and column totals are fixed. If rr and cc are respectively the desired row and column totals, with sum(cc)==sum(rr), then

n = sum(cc)
row.assign = rep(1:length(rr),rr)[order(runif(n))]
col.assign = rep(1:length(cc),cc)[order(runif(n))]

If you want many such sets of random assignments to be generated at once, you can use a few more rep() calls in the expressions to generate multiple sets in the same way. (Do you actually want the assignments or just the tables?)

Of course there are many other possible solutions, since you have not fully specified the distribution you want.

Alan Zaslavsky
Harvard U
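To see what these assignments look like as a table, the sketch below cross-tabulates them using the row and column totals of the example matrix from the thread. The seed and the factor() wrapping are additions for this illustration.

```r
set.seed(1)                        # arbitrary seed, for repeatability only
rr <- c(3, 5, 3, 3, 2, 6, 1, 3)    # row totals of the example data
cc <- c(4, 2, 2, 4, 3, 5, 2, 4)    # column totals of the example data
stopifnot(sum(rr) == sum(cc))
n <- sum(cc)

row.assign <- rep(seq_along(rr), rr)[order(runif(n))]
col.assign <- rep(seq_along(cc), cc)[order(runif(n))]

# Cross-tabulate the n (row, column) pairs; factor() keeps empty levels
tab <- table(factor(row.assign, levels = seq_along(rr)),
             factor(col.assign, levels = seq_along(cc)))
print(tab)
print(rowSums(tab))
print(colSums(tab))
```

The margins always match rr and cc, but nothing prevents the same (row, column) pair from being drawn more than once, so a cell can hold a count above 1. For strictly presence-absence (0/1) tables an extra constraint is therefore needed.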
Re: [R] Extracting S code from a C program
On 08/07/2007 12:07 PM, francogrex wrote:
> There is a C program called GPS ('gamma poisson shrinker') at ftp://ftp.research.att.com/dist/gps/. "The algorithms in GPS are based on S-Plus programs written by William DuMouchel with support from Columbia University and AT&T Labs." My question is: is there a relatively easy way to extract some of the S code from this Windows program? Thanks.

No. From the description, there is no S code there to extract: it has been translated to C, and you don't even have the source code. I'd recommend contacting Dr. DuMouchel to see if he is willing to let you have his S code.

Duncan Murdoch
[R] generating a data frame with a subset from another data frame
R gurus,

I have a data set that looks something like this:

  Site  Species  DBH   #Vines
  G     PLOC     45.9  4
  G     ACNE     23.3  1
  G     ACNE     12.0  0
  G     FRAM     35.9  5
  G     AEGL     11.2  2
  N     PLOC     77.3  12
  N     JUNI     78.6  7
  N     ACNE     18.9  1
  N     ACNE     15.7  3
  N     ACRU     35.5  4
  H     ACSA2    24.1  6
  H     ULAM     35.2  7

There are 730 individual trees (22 species) from four sites in the actual data set. I would like to create a second data frame that contains just the most common species (mainly ACNE, PLOC, ULAM, FRAM, and ACSA2). Here are some of my attempts:

  study.1<-subset(study,study$Species=c("ACNE","PLOC","FRAM","ULAM","ACSA2"))
  Error: syntax error

  study.1<-study[study$Species==,c("ACNE","PLOC","FRAM","ULAM","ACSA2")]
  Error: syntax error

  study.1<-study[c("ACNE","PLOC","FRAM","ULAM","ACSA2"),]
  #This one appeared to work, but upon inspection, it just copied the entire study data frame instead of just copying the data I wanted, as study.1$Species had a length of 22 (the same as the original) instead of the desired length of 5.

I've already consulted a book on R as well as spent the last three hours searching the R-help archives. There must be a way to get the subset I desire but it is not obvious to me. Thanks in advance for your help.

Jim Milks
Graduate Student
Environmental Sciences Ph.D. Program
Wright State University
3640 Colonel Glenn Hwy
Dayton, OH 45431
Re: [R] random sampling with some limitive conditions?
It is not right. My data are presence-absence data, and I want to generate thousands of random presence-absence data sets with the same numbers of rows and columns as the original data. In addition, every row sum and column sum of the new data must equal the corresponding sum in the original data. For example, the data sites:

  site1 site2 site3 site4 site5 site6 site7 site8
      1     0     0     0     1     1     0     0
      0     1     1     1     0     1     0     1
      1     0     0     0     1     0     1     0
      0     0     0     1     0     1     0     1
      1     0     1     0     0     0     0     0
      0     1     0     1     1     1     1     1
      1     0     0     0     0     0     0     0
      0     0     0     1     0     1     0     1

  apply(sites,2,sum)
  site1 site2 site3 site4 site5 site6 site7 site8
      4     2     2     4     3     5     2     4

  apply(sites,1,sum)
  [1] 3 5 3 3 2 6 1 3

Suppose I get the new data sites.random:

  site1 site2 site3 site4 site5 site6 site7 site8
      1     0     0     0     1     1     0     0
      1     1     1     1     0     1     0     0
      1     0     0     0     1     0     1     0
      0     0     0     1     0     1     0     1
      0     0     1     0     0     0     0     1
      0     1     0     1     1     1     1     1
      1     0     0     0     0     0     0     0
      0     0     0     1     0     1     0     1

  apply(sites.random,2,sum) # the same as the original data
  site1 site2 site3 site4 site5 site6 site7 site8
      4     2     2     4     3     5     2     4

  apply(sites.random,1,sum) # the same as the original data
  [1] 3 5 3 3 2 6 1 3

How can I get the new random data? Thanks.
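One way to keep every cell at 0 or 1 while preserving all margins is a "checkerboard swap": pick two rows and two columns at random and, if the 2 x 2 submatrix is (1 0 / 0 1) or (0 1 / 1 0), flip it. This is a swapped-in technique, not something proposed in the thread, and the function name swap_randomize and the number of swap attempts are assumptions made for illustration. It is only a sketch: it does not address how many swaps are needed, or whether the resulting matrices are sampled uniformly.

```r
# Presence-absence matrix from the message (8 rows x 8 sites).
sites <- matrix(c(1,0,0,0,1,1,0,0,
                  0,1,1,1,0,1,0,1,
                  1,0,0,0,1,0,1,0,
                  0,0,0,1,0,1,0,1,
                  1,0,1,0,0,0,0,0,
                  0,1,0,1,1,1,1,1,
                  1,0,0,0,0,0,0,0,
                  0,0,0,1,0,1,0,1),
                nrow = 8, byrow = TRUE,
                dimnames = list(NULL, paste0("site", 1:8)))

# Hypothetical helper: repeated checkerboard swaps on a 0/1 matrix.
swap_randomize <- function(m, n_swaps = 1000) {
  for (k in seq_len(n_swaps)) {
    i <- sample(nrow(m), 2)            # two distinct rows
    j <- sample(ncol(m), 2)            # two distinct columns
    sub <- m[i, j]
    # A checkerboard has equal diagonal entries, equal off-diagonal
    # entries, and the two differ; flipping it keeps all four margins.
    if (sub[1, 1] == sub[2, 2] &&
        sub[1, 2] == sub[2, 1] &&
        sub[1, 1] != sub[1, 2]) {
      m[i, j] <- 1 - sub
    }
  }
  m
}

set.seed(42)                           # illustrative seed
sites.random <- swap_randomize(sites)

rowSums(sites.random)                  # same as rowSums(sites)
colSums(sites.random)                  # same as colSums(sites)
```

Calling swap_randomize() repeatedly, for example inside replicate(), would give the "thousands" of randomized matrices asked for.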
[R] Efficient matrix slices
Indexing matrices by subsets of rows and columns is quite convenient, but it seems to take time linear in the size of the matrix (even for a small slice of the matrix):

  > dim(y)
  [1]  732 1332
  > length(which(a[1,]==1))
  [1] 4
  > length(which(b[1,]==1))
  [1] 12
  > proc.time(y[which(a[1,]==1),which(b[1,]==1)])
  [1]  32.596   1.809 510.928   0.000   0.000
  > proc.time(sum(y))
  [1]  33.082   1.914 547.469   0.000   0.000

Does anybody know how matrix slices are actually implemented in R?

Thanks a lot,
Gabriel
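A note on the timings above: proc.time() reports the cumulative CPU and elapsed time of the whole R session and does not time an expression, which would explain why the two readings are nearly identical. system.time(expr) is the usual way to time one expression. The sketch below uses a random matrix of the same dimensions; the matrix contents and the chosen row and column indices are assumptions, since the original y, a and b are not available.

```r
set.seed(1)                                    # illustrative seed
y <- matrix(rnorm(732 * 1332), nrow = 732)     # stand-in for the poster's matrix

rows <- c(5, 100, 250, 700)                    # 4 rows, as in the post
cols <- sample(ncol(y), 12)                    # 12 columns, as in the post

# system.time() evaluates the expression and returns user/system/elapsed
# times for that expression only.
t_slice <- system.time(s <- y[rows, cols])
t_sum   <- system.time(total <- sum(y))

print(t_slice)
print(t_sum)
dim(s)
```

No expected timings are given here because they depend on the machine; the point is only that the two measurements are taken separately.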
[R] random effect variance per treatment group in lmer
All,

How does one specify a model in lmer such that, say, the random effect for the intercept has a different variance per treatment group? Thus, in the model equation, we'd have b_ij represent the random effect for patient j in treatment group i, with variance depending on i, i.e., var(b_ij) = tau_i. I didn't see this in the docs or in Pinheiro & Bates (section 5.2 is specific to modelling within-group errors). The sample repeated measures code below is for a single random effect variance, where the random effect corresponds to patient.

cheers,
dave

  z <- rnorm(24, mean=0, sd=1)
  time <- factor(paste("Time-", rep(1:6, 4), sep=""))
  Patient <- rep(1:4, each = 6)
  drug <- factor(rep(c("D", "P"), each = 6, times = 2)) ## P = placebo, D = Drug
  dat.new <- data.frame(time, drug, z, Patient)
  fm = lmer(z ~ drug + time + (1 | Patient), data = dat.new )
Re: [R] generating a data frame with a subset from another data frame
?"%in%"

I think what you want is:

  study.1 <- subset(study, study$Species %in% c("ACNE","PLOC","FRAM","ULAM","ACSA2"))

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
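A small self-contained illustration of the %in% approach follows. The data frame is a toy stand-in built from the Site and Species columns shown in the question (the other columns are left out), and the name common is introduced here for readability. The last step is an assumption about what the poster saw: a factor keeps all of its original levels after subsetting, which may be why Species still appeared to have 22 entries; rebuilding the factor drops the unused levels.

```r
# Toy version of the poster's data (Site and Species only).
study <- data.frame(
  Site    = c("G", "G", "G", "G", "G", "N", "N", "N", "N", "N", "H", "H"),
  Species = factor(c("PLOC", "ACNE", "ACNE", "FRAM", "AEGL", "PLOC",
                     "JUNI", "ACNE", "ACNE", "ACRU", "ACSA2", "ULAM"))
)

common <- c("ACNE", "PLOC", "FRAM", "ULAM", "ACSA2")

# %in% returns one TRUE/FALSE per row, which is what subset() needs.
study.1 <- subset(study, Species %in% common)

nlevels(study.1$Species)     # still counts the levels of the full data

# Re-create the factor so that only the levels actually present remain.
study.1$Species <- factor(study.1$Species)

nrow(study.1)
levels(study.1$Species)
```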
[R] Windows Binary for ncdf package
Dear Sir,

There is no Windows binary version of the package ncdf in the latest release, R 2.5.1. I don't have any information about the older versions. Please guide me in this regard.

Thank you

--
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
Email: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [R] xmlOutputBuffer vs xmlOutputDOM
Hi Arjun

Have you tried using xmlTree()? It uses an opaque C representation of the document, and I expect it will serialize the contents relatively rapidly. The interface for creating the tree is intended to be the same as, or at least similar to, that of xmlOutputDOM. The intent is that the representations are easily interchangeable.

xmlOutputDOM is slow because it represents the tree in R as a list of lists. You might also use xmlHashTree(), which uses a more efficient representation in R. But the nature of the C representation (and not just the fact that it uses C code) will probably speed things up considerably.

A question that comes to mind is why you really care about pretty printing of the resulting document if it is very large. Will a human read it? If so, and it is just for verifying that it is correct, read it back into R and validate the contents programmatically.

D.

Arjun Ravi Narayan wrote:
> Hi, I am trying to use the XML package to write some data (pretty large amounts of data) into XML files. I experimented with a few variations, using xmlOutputBuffer and xmlOutputDOM. xmlOutputDOM provides neatly formatted, indented output, but takes very long. xmlOutputBuffer is incompatible (in my experience) with the saveXML function, and so I hacked around it by outputting its $value() to cat. This unfortunately makes it lose all proper formatting, and so gives me an XML file with new lines after every tag or entry, and with no indenting at all.
>
> However, xmlOutputDOM takes very long - I am outputting rather large files, and where xmlOutputBuffer takes about 10-15 seconds, xmlOutputDOM takes about 20 minutes. Am I using xmlOutputDOM in some wrong way? Is there a way to get proper formatting out of xmlOutputBuffer? Either of these solutions would be useful, as I see no advantage to using one over the other for just outputting lots of data (1 fields at minimum)
>
> Below is my code, and after that, an output of the times that were reported on a sample run:
>
>   library(XML)
>   buffer <- xmlOutputBuffer()
>   buffer2 <- xmlOutputDOM()
>   buffer$addTag("outside", close = FALSE)
>   buffer2$addTag("outside", close = FALSE)
>   for(i in 1:1000) {
>     buffer$addTag("tag", i)
>     buffer2$addTag("tag", i)
>   }
>   buffer$closeTag()
>   buffer2$closeTag()
>   system.time(cat(buffer$value(), file = "foo2.xml"))
>   system.time(saveXML(buffer2$value(), file = "foo.xml"))
>
> Times reported: the xmlOutputDOM is more than 100x slower.
>
>   system.time(cat(buffer$value(), file = "foo2.xml"))
>      user  system elapsed
>     0.004   0.000   0.001
>   system.time(saveXML(buffer2$value(), file = "foo.xml"))
>      user  system elapsed
>     0.476   0.024   0.516
>
> I am using R version 2.5.1, and XML package version 1.9-0
>
> Yours sincerely,
> Arjun Ravi Narayan
Re: [R] Windows Binary for ncdf package
amna khan wrote:
> Dear Sir There is no window binary version of package ncdf in the latest release of R 2.5.1. i dont have any information about the old versions. Please guid in thie regard Thank you

See the ReadMe in the CRAN repository. It tells you that ncdf does not build out of the box on Windows and is hence not available on CRAN. Nevertheless, Brian kindly provides the binary in his CRAN (extras) repository (URL http://www.stats.ox.ac.uk/pub/RWin). Just type

  install.packages("ncdf")

and it will be installed ...

Uwe Ligges
[R] [R-pkgs] Scagnostics - scatterplot diagnostics
The scagnostics package implements the graph theoretic scagnostics described by Leland Wilkinson, Anushka Anand and Robert Grossman (http://www.ncdm.uic.edu/publications/files/proc-094.pdf), building on an old idea of Tukey's to define indices of interestingness to help guide the search for interesting features in the pair-wise scatterplots of a highly multivariate dataset.

The scagnostics package currently only supports two methods, one which computes the scagnostics for a pair of variables, and the other for all pairs of variables in a data.frame.

If you are attending the JSM, there is a session on scagnostics. Details are available at http://tinyurl.com/324yb5

(The package has just been added to CRAN, it may be a couple of days before it is available on your local mirror)

Regards,
Hadley

_______________________________________________
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Windows Binary for ncdf package
-------- Original Message --------
Subject: [R] Windows Binary for ncdf package
From: amna khan [EMAIL PROTECTED]
To: R-help@stat.math.ethz.ch, R-help@stat.math.ethz.ch
Date: 08.07.2007 19:52

> Dear Sir There is no window binary version of package ncdf in the latest release of R 2.5.1. i dont have any information about the old versions. Please guid in thie regard Thank you

It is there!

  install.packages("ncdf")
  trying URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.5/ncdf_1.6.zip'
  Content type 'application/zip' length 231140 bytes
  opened URL
  downloaded 225Kb

  package 'ncdf' successfully unpacked and MD5 sums checked

Maybe you should try another mirror.

Stefan

  version
                 _
  platform       i386-pc-mingw32
  arch           i386
  os             mingw32
  system         i386, mingw32
  status
  major          2
  minor          5.1
  year           2007
  month          06
  day            27
  svn rev        42083
  language       R
  version.string R version 2.5.1 (2007-06-27)

-=-=-
... Money: There's nothing in the world so demoralizing as money. (Sophocles)
Re: [R] Loading problem with XML_1.9
Thanks, Dr. Lang,

I used xmlEventParse() with the branches concept as you suggested. It really works, and the memory issue is gone. Now I can query large XML files from within R.

But here is another problem: it is too slow (a simple query had not finished after 1.5 h). The number of relevant records is very limited, but the whole XML file has more than 500 thousand similarly structured records, and the parser has to go through all of them to find the matches. Attached is part of the XML file with two records. I am trying to retrieve the content of the moleculeName nodes from molecule records whose name nodes bear specific gene names.

Is it possible to locate records based on node content (or xmlValue) rather than node names (since they are the same in all records) first, and then parse the XML record locally? Would a query based on XPath be faster in this case? I understand that the XML package has a facility for XPath-based queries, getNodeSet(), but that requires reading the whole XML tree into memory first, which is not feasible for my large XML file. Or can I call XML::XPath statements using your R-Perl interface package? Any suggestions/thoughts?

Thank you!
Weijun

Part of my XML file:

<molecule>
<prov><im><imid>20</imid></im></prov><moleculeID>119043</moleculeID>
<moleculeType>protein<prov><im><imid>20</imid></im></prov></moleculeType>
<organismID>10090<prov><im><imid>20</imid></im></prov></organismID>
<id><prov><im><imid>20</imid></im></prov><idType>GI</idType><idValue>6677981</idValue></id>
<name>SKD1<prov><im><imid>20</imid></im></prov></name>
<name>Vps4b<prov><im><imid>20</imid></im></prov></name>
<name>8030489C12Rik<prov><im><imid>20</imid></im></prov></name>
<description><distribution><value>Mouse homologue of yeast Vacuolar protein sorting 4 (Vps4); Suppressor of potassium transport defect 1. Member of mammalian class E Vps proteins involved in endosomal transport; AAA-type ATPase.<prov><im><imid>20</imid></im></prov></value><value>Mouse homologue of yeast Vacuolar protein sorting 4 (Vps4); Suppressor of potassium transport defect 1. Member of mammalian class E Vps proteins involved in endosomal transport; AAA-type ATPase.<prov><im><imid>20</imid></im></prov></value></distribution></description>
<orthologue>
<method><methodID>337974</methodID><methodName>miClust80</methodName></method>
</orthologue>
<variant>
<prov><im><imid>20</imid></im></prov><variantID>0</variantID>
</variant>
<interaction><interactionRef>201581</interactionRef><moleculeRef>89434</moleculeRef><moleculeName>SBP1</moleculeName>
<selfVariantRef>0</selfVariantRef><partnerVariantRef>0</partnerVariantRef></interaction>
<interaction><interactionRef>201582</interactionRef><moleculeRef>17953</moleculeRef><moleculeName>mVps2</moleculeName>
<selfVariantRef>0</selfVariantRef><partnerVariantRef>0</partnerVariantRef></interaction>
</molecule>
<molecule>
<prov><im><imid>30</imid></im></prov><moleculeID>116226</moleculeID>
<moleculeType>protein<prov><im><imid>30</imid></im></prov></moleculeType>
<organismID>9606<prov><im><imid>30</imid></im></prov></organismID>
<id><prov><im><imid>30</imid></im></prov><idType>HGNC</idType><idValue>9859</idValue></id>
<name>RAP1GDS1<prov><im><imid>30</imid></im></prov></name>
<name>GDS1<prov><im><imid>30</imid></im></prov></name>
<name>MGC118859<prov><im><imid>30</imid></im></prov></name>
<name>MGC118861<prov><im><imid>30</imid></im></prov></name>
<variant>
<prov><im><imid>30</imid></im></prov><variantID>0</variantID>
</variant>
<interaction><interactionRef>93569</interactionRef><moleculeRef>116280</moleculeRef><moleculeName>RAC1</moleculeName>
<selfVariantRef>0</selfVariantRef><partnerVariantRef>0</partnerVariantRef></interaction>
<interaction><interactionRef>104132</interactionRef><moleculeRef>103040</moleculeRef><moleculeName>RHOA</moleculeName>
<selfVariantRef>0</selfVariantRef><partnerVariantRef>0</partnerVariantRef></interaction>
<interaction><interactionRef>121818</interactionRef><moleculeRef>74726</moleculeRef><moleculeName>MBIP</moleculeName>
<selfVariantRef>0</selfVariantRef><partnerVariantRef>0</partnerVariantRef></interaction>
</molecule>

--- Duncan Temple Lang [EMAIL PROTECTED] wrote:
> Well, as you mention at the end of the mail, several people have given you suggestions about how to solve the problem using different approaches. You might search on the Web for how to install a 64 bit version of libxml2? Using xmlTreeParse(, useInternalNodes = TRUE) is an approach to reducing the memory consumption as is using the handlers argument. And if size is really the issue, you should consider the SAX model which is very memory efficient and made available via the xmlEventParse() function in the XML package. And it even provides the concepts of branches to provide a hybrid of SAX and DOM-style parsing together.
>
> However, to solve the problem of the xmlMemDisplay symbol not being found, you can look for where that is used and remove it. It is in src/DocParse.c in the routine RS_XML_MemoryShow(). You can remove the line xmlMemDisplay(stderr) or indeed the entire routine. Then re-install and reload the package.
>
> D.
>
> Luo Weijun wrote:
> > Hello Dr. Lang and all, I posted this message in R-help mail list, but haven't solved my problem so far. Therefore, could you help me look at it? I have loading problem with XML_1.9 under 64 bit R2.3.1 for Mac OS X, which I got from http://R.research.att.com/.
[R] Writing Excel (.xls) files on non-Windows OSs using Perl
Hi all, There have been quite a few threads in the recent months pertaining to the ability to directly write native Excel (.xls) files from R. For example, exporting R matrices and/or data frames to an Excel file, with perhaps the ability to create multiple tabs (worksheets) within a single file, with one tab/sheet per R object. There exists the xlsReadWrite package on CRAN by Hans-Peter Suter, which is restricted to Windows, since it utilizes the non-FOSS MS Office API to write the Excel formats. I recently had the need, under Linux (FC6/F7) to create an Excel file containing multiple worksheets, each worksheet containing an 'exported' data frame from R. While one could export the data frames to delimited files (ie. using write.table() ) and then open those files from Excel (or OO.org's Calc), it was rather tedious to do so with a larger number of R objects. Since I would now have the need to engage in this process with some level of frequency, the preceding approach would not be time efficient. I thus embarked on a mini-project to create a Perl script utilizing openly available functions from CPAN and then facilitate the calling of the script directly from R. I am posting the Perl code here for the benefit of others who may have similar requirements. Please note that I am providing this 'as is' and don't have any plans to substantively modify or enhance the code. It does what I need it to do. Feel free to modify for other needs as may be required. The basic calling schema is: WriteXLS.pl [--CSVpath] [--CSVfiles] ExcelFileName Where: CSVpath = Path to the csv files created in R, typically done using write.table() CSVfiles = globbed file name specification (ie. *.csv) ExcelFileName = FULL name of Excel .xls file to create When the Excel file is created, a new worksheet (tab) will be created for each CSV file imported. 
The worksheet name will be the basename (no path or extension) of the CSV file, up to the first 31 characters, which is a limitation for Excel worksheet names.

Note of course that Excel has certain (version specific) limitations with respect to file formats. I list the MS link below for Excel 2007. Similar specs are available for earlier versions:

  http://office.microsoft.com/en-us/excel/HP100738491033.aspx

Finally, note that I use 'Spreadsheet::WriteExcel::Big', as the regular version of the Perl package has a constraint where the ENTIRE Excel file cannot be larger than 7 Mb, which was a problem for my application.

Here is the Perl code:

  #!/usr/bin/perl -w

  use strict;
  use Spreadsheet::WriteExcel::Big;
  use Getopt::Long;
  use File::Glob;
  use File::Basename;
  use Text::CSV_XS;

  # Initialize and get command line arguments
  my $CSVPath = '.';
  my $CSVFiles = "*.csv";

  GetOptions ('CSVpath=s' => \$CSVPath,
              'CSVfiles=s' => \$CSVFiles);

  my $ExcelFileName = $ARGV[0];

  # Create Excel XLS File
  print "Creating Excel File: $ExcelFileName\n\n";
  my $XLSFile = Spreadsheet::WriteExcel::Big->new($ExcelFileName);

  # Glob file path and names
  my @FileNames = <$CSVPath/$CSVFiles>;

  foreach my $FileName (@FileNames) {

    print "Reading: $FileName\n";

    # Open CSV File
    my $csv = Text::CSV_XS->new();
    open (CSVFILE, "$FileName") || die "ERROR: cannot open $FileName. $!\n";

    # Create new sheet with filename prefix
    # ($base, $dir, $ext) = fileparse ($FileName, '\..*');
    my $FName = (fileparse ($FileName, '\..*'))[0];

    # Only take the first 31 chars, which is the
    # limit for a worksheet name
    my $SheetName = substr($FName, 0, 31);

    print "Creating New WorkSheet: $SheetName\n\n";

    my $WorkSheet = $XLSFile->add_worksheet($SheetName);

    # Rows and columns are zero indexed
    my $Row = 0;

    # Write to Sheet
    while (<CSVFILE>) {

      if ($csv->parse($_)) {
        my @Fields = $csv->fields();

        my $Col = 0;

        foreach my $Fld (@Fields) {
          $WorkSheet->write($Row, $Col, $Fld);
          $Col++;
        }
        $Row++;
      }
    }

    close CSVFILE;
  }

A 'typical' sequence for the use of the code from within R might be:

  # Create a character vector of R objects to be exported
  RObjects <- c("VectorOfRObjectNames", ...)

  # Now loop through the vector, creating CSV files
  # In this case, export to a 'CSVFILES' sub-directory
  for (i in RObjects) {
    write.table(get(i), file = paste("CSVFILES/", i, ".csv", sep = ""),
                sep = ",", quote = TRUE, na = "", row.names = FALSE)
  }

  # Now call the Perl script from within R, presuming
  # that the script is in the current default directory
  system("./WriteXLS.pl --CSVPath CSVFILES RExport.xls")

This process has worked for me, given the current functional requirements for my project. I hope that this is of some help to others.

Regards,
Marc Schwartz
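The R side of the workflow, together with the worksheet naming rule described above (basename without path or extension, first 31 characters), can be sketched as follows. This only illustrates the CSV export step and previews the sheet names; it does not call the Perl script. The helper sheet_name, the example object names and the use of a temporary directory are assumptions made for the illustration.

```r
# Hypothetical helper mirroring the naming rule in the Perl script:
# strip the directory, strip everything from the first dot, keep 31 chars.
sheet_name <- function(path) {
  base <- sub("\\..*$", "", basename(path))
  substr(base, 1, 31)
}

# Export a few data frames as CSV files into a scratch 'CSVFILES' directory.
out_dir <- file.path(tempdir(), "CSVFILES")
dir.create(out_dir, showWarnings = FALSE)

objs <- list(
  cars_data = head(cars),
  a_very_long_object_name_that_exceeds_the_limit = head(women)
)

for (nm in names(objs)) {
  write.table(objs[[nm]],
              file = file.path(out_dir, paste0(nm, ".csv")),
              sep = ",", quote = TRUE, na = "", row.names = FALSE)
}

# Preview the worksheet names the CSV files would be given.
files  <- list.files(out_dir, pattern = "\\.csv$", full.names = TRUE)
sheets <- vapply(files, sheet_name, character(1), USE.NAMES = FALSE)
sheets
```

Checking the names in advance is mainly useful for spotting two long object names that would collide once truncated to 31 characters.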
[R] need some help on Inverse Gaussian distribution
Dear Sir,

I am a lecturer in the Department of Mathematics, University of Dhaka, Bangladesh. http://www.univdhaka.edu/department/facultyMembers.php?bodyid=MATst=25perPage=25.

I saw that you are the author of the R functions for inverse Gaussian distributions. I tried them in my R, but they are not working. I would highly appreciate it if you could kindly let me know how I can use those functions (d, p, q, r for the inverse Gaussian distribution).

Sincerely,
Sharif Mozumder
Lecturer
Department of Mathematics
University of Dhaka
Bangladesh.
Re: [R] Several quick questions
At 8:45 AM -0400 7/7/07, Sébastien wrote:
> Dear R users,
>
> Here are a couple of quick questions, for which I was unable to find any answer in the list archives and in the help:
>
> 1- Are there any R equivalents of the VB functions Cint, CStr, etc... (for non VB users, these functions transform the category of a specified variable and smartly adapt the value of this variable) ? I have tried to use the as.numeric, as.factor and as.vector commands but the result is not exactly what I want ([1] 1, 3, 5, 6)
>
>   a<-as.factor(cbind(1,3,5,6)) # creates a dummy factor
>   a
>   [1] 1 3 5 6
>   Levels: 1 3 5 6
>   a<-as.vector(as.numeric(a))
>   a
>   [1] 1 2 3 4

Does this give what you want?

  a <- factor(c(1,3,5,6))
  a
  [1] 1 3 5 6
  Levels: 1 3 5 6
  as.numeric(format(a))
  [1] 1 3 5 6
  as.numeric(as.character(a)) ## an alternative
  [1] 1 3 5 6

--- remainder omitted ---

> Thanks in advance for your help.
> Sebastien

--
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
[EMAIL PROTECTED]
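The reason as.numeric() on a factor gives 1 2 3 4 is that a factor stores integer codes that point into its levels, and as.numeric() returns those codes. A short sketch of the conversions discussed above follows; the third variant, indexing the numeric levels by the factor, is a commonly used alternative that is not in the original reply.

```r
a <- factor(c(1, 3, 5, 6))

codes  <- as.numeric(a)                  # internal integer codes: 1 2 3 4
values <- as.numeric(as.character(a))    # original values via the labels
via_levels <- as.numeric(levels(a))[a]   # convert the levels once, then index

codes
values
via_levels
```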
[R] patch to enhance sound module for 96 kHz/24 bit sample sizes
Greetings Matthias,

Thanks again for your sound module. I did not ever manage to find the time to play with phase equations, but I found I needed the module for a new project involving bats. I needed to do some work @ 96 kHz/24 bit sample size, and found the limitations of the sound package stop at 48 kHz and 16 bit samples.

Here's a patch to bring things up to 96/24. Sorry I cannot test 192/24. I am copying r-help in case others have more advanced equipment and an interest in testing it out. Hope this helps!

BTW, if you are curious about the bats, you can check here: http://blogs.cnet.com/8301-13507_1-9738110-18.html?tag=more I will be writing a follow-up that uses sound and seewave in the next few days.

[EMAIL PROTECTED] Desktop]$ diff -ru sound-orig/ sound
diff -ru sound-orig/man/bits.Rd sound/man/bits.Rd
--- sound-orig/man/bits.Rd 2006-02-20 12:50:53.0 -0500
+++ sound/man/bits.Rd 2007-07-08 19:36:08.0 -0400
@@ -12,13 +12,13 @@
 }
 \arguments{
 \item{s}{ a Sample object, or a string giving the name of a wav file. }
- \item{value}{ the number of bits per sample, 8 or 16. }
+ \item{value}{ the number of bits per sample, 8, 16, or 24. }
 }
 \details{
 The replacement form can be used to reset the sampling quality of a Sample object, that is the number of bits per sample (8 or 16). Here, filenames are not accepted.
 }
 \value{
- For \code{bits}, the bits parameter (number of bits per sample) of the Sample object, 8 or 16.
+ For \code{bits}, the bits parameter (number of bits per sample) of the Sample object, 8, 16, or 24.
 For \code{setBits}, a Sample object with the new \code{bits} parameter.
 }
Only in sound/man: bits.Rd~
diff -ru sound-orig/man/loadSample.Rd sound/man/loadSample.Rd
--- sound-orig/man/loadSample.Rd 2006-02-20 12:57:00.0 -0500
+++ sound/man/loadSample.Rd 2007-07-08 19:35:31.0 -0400
@@ -11,7 +11,8 @@
 \item{filecheck}{ logical. If FALSE, no check for existance and read permission of the file will be performed. }
 }
 \details{
-All kinds of wav files are supported: mono / stereo, 8 / 16 bits per sample, 1000 to 48000 samples/second.
+All kinds of wav files are supported: mono / stereo, 8 / 16 / 24 bits per sample, 1000 to 96000 samples/second,
+but no compressed formats are supported.
 }
 \value{
 the Sample object that is equivalent to the wav file.
Only in sound/man: loadSample.Rd~
diff -ru sound-orig/man/nullSample.Rd sound/man/nullSample.Rd
--- sound-orig/man/nullSample.Rd 2006-02-20 12:56:37.0 -0500
+++ sound/man/nullSample.Rd 2007-07-08 19:37:03.0 -0400
@@ -7,8 +7,8 @@
 \usage{nullSample(rate=44100, bits=16, channels=1)
 }
 \arguments{
- \item{rate}{ the sampling rate, between 1000 and 48000. }
- \item{bits}{ the sample quality (number of bits per sample), 8 or 16. }
+ \item{rate}{ the sampling rate, between 1000 and 96000. }
+ \item{bits}{ the sample quality (number of bits per sample), 8, 16, or 24. }
 \item{channels}{ 1 for mono, or 2 for stereo. }
 }
 \value{
Only in sound/man: nullSample.Rd~
diff -ru sound-orig/man/rate.Rd sound/man/rate.Rd
--- sound-orig/man/rate.Rd 2006-02-20 12:59:34.0 -0500
+++ sound/man/rate.Rd 2007-07-08 19:39:22.0 -0400
@@ -12,7 +12,7 @@
 }
 \arguments{
 \item{s}{ a Sample object, or a string giving the name of a wav file. }
- \item{value}{ an integer between 1000 and 48000 giving the sampling rate. }
+ \item{value}{ an integer between 1000 and 96000 giving the sampling rate. }
 }
 \details{
 The replacement form can be used to reset the sampling rate. Here, filenames are not accepted.
@@ -26,7 +26,7 @@
 }
 \author{ Matthias Heymann }
-\note{ Common sampling rates are between 8000 and 44100 (CD quality). The sampling rate of DAT recorders is 48000. Not every rate is guaranteed to be supported by every wav file player.
+\note{ Common sampling rates are between 8000 and 44100 (CD quality). The sampling rate of DAT recorders is 48000. DVD Audio supports rates up to 96000 (and perhaps 192000, though this has not been tested). Not every rate is guaranteed to be supported by every wav file player.
 Future versions may use a different algorithm for sampling rate conversion to achieve a better sound quality for the returned sample.
 }
Only in sound/man: rate.Rd~
diff -ru sound-orig/man/Sample.Rd sound/man/Sample.Rd
--- sound-orig/man/Sample.Rd 2006-02-20 12:59:24.0 -0500
+++ sound/man/Sample.Rd 2007-07-08 19:39:52.0 -0400
@@ -14,7 +14,7 @@
 \arguments{
 \item{sound}{ a \code{channels(s)} x \code{sampleLength(s)} matrix or a vector of doubles describing the waveform(s) of the sample. }
 \item{rate}{ the sampling rate (number of samples per second). }
- \item{bits}{ the sampling quality (the number of bits per sample), 8 or 16. }
+ \item{bits}{ the sampling quality (the number of bits per sample), 8, 16, or 24. }
 \item{s}{ an R object to be tested.}
 \item{argname}{ a string giving
[R] ca.jo
Dear R users,

I'm using ca.jo for a VECM model. Is there a way to get standard errors or p-values, so that I can see whether the estimated coefficients are statistically significant?

Thank you.

Yours,
Yihsu

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Several quick questions
On 7/7/07, Sébastien [EMAIL PROTECTED] wrote:
> Dear R users,
>
> Here are a couple of quick questions, for which I was unable to find any answer in the list archives or in the help:

[...]

> 2- When a log scale is called in a graph, the label takes a format like 10^n.

That's true for lattice, but not traditional graphics, as far as I know.

> Is there a way to come back to a regular number format like 1, 10, 100... without having to create a custom axis?

Depends on what you mean by "custom axis". You don't need to manually choose the tick positions etc., but you still need to define the rules that determine how they are calculated. See example(axis.default) for an example where the tick positions remain the same (as the defaults), but the labels change. The slightly different rule used in traditional graphics is available through the axTicks() function, which basically boils down to this:

    logTicks <- function (lim, loc = c(1, 5)) {
        ## powers of ten spanning (and slightly exceeding) the limits
        ii <- floor(log10(range(lim))) + c(-1, 2)
        main <- 10^(ii[1]:ii[2])
        ## candidate ticks: every multiplier in 'loc' times every power of ten
        r <- as.numeric(outer(loc, main, "*"))
        ## keep only the candidates that fall inside the limits
        r[lim[1] <= r & r <= lim[2]]
    }

where 'lim' is the limits in the original scale. So we have

    logTicks(c(1, 100))
    [1]   1   5  10  50 100

    logTicks(c(1, 100), loc = c(2, 5, 10))
    [1]   1   2   5  10  20  50 100

> 3- In lattice graphics, how does the default value of the axs argument influence the values of limits? This question should be considered in the following context. The help states that a 4% extension is applied by default to the axis range in base graphics. So, I have tried to apply this 4% extension to create some custom lattice graphics. I worked on a dataset in which the independent variable ranged from 0 to 120, so I basically customized my axis using limits=c(-4.8,124.8). The results of the graphics with and without the limits command were not identical...

The extension is user-settable in lattice, and defaults to 7% (I think this value came from Trellis specs, but I don't remember the exact details).

    lattice.getOption("axis.padding")
    $numeric
    [1] 0.07

    $factor
    [1] 0.6

-Deepayan
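
The difference between the 4% and 7% extensions discussed above can be checked with a little arithmetic. The following is only a sketch: it assumes that both base graphics and lattice pad each end of the axis by the stated fraction of the data range, and the helper name pad_limits is made up for this illustration. It uses extendrange() from grDevices rather than any lattice internals.

```r
# Data range from the question: the independent variable runs from 0 to 120.
data_range <- c(0, 120)

# Hypothetical helper: extend both ends of a range by 'fraction' of its width.
# Assumption: this mirrors how the default axis padding is applied.
pad_limits <- function(r, fraction) {
  grDevices::extendrange(r = r, f = fraction)
}

base_limits    <- pad_limits(data_range, 0.04)  # 4% rule of base graphics
lattice_limits <- pad_limits(data_range, 0.07)  # lattice default for numeric axes

print(base_limits)
print(lattice_limits)
```

Under these assumptions the 4% rule reproduces the limits c(-4.8, 124.8) from the question, while the 7% default gives c(-8.4, 128.4), which would explain why the two plots were not identical.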
Re: [R] transform excel data into graph
There are numerous ways of importing data from Excel. One is to save the sheet as a .csv file and use the read.csv function. Or, you can copy to the clipboard and use read.delim("clipboard", header = TRUE).

Are you looking at a bar graph where the lessons have the names nested below them on the x axis, and the numbers on the y?

As these are all introductory elements of using R, going through the numerous intro R manuals available online is your best bet. Try http://cran.r-project.org/manuals.html and also look under Contributed Documentation at the above site.

cross123 wrote:
> Hello everyone,
>
> I have a set of data in the following form, which are stored in an Excel file:
>
>           nick    john    peter
> lesson1   0.465   0.498   0.473
> lesson2   0.422   0.44    0.134
> lesson3   0.45    0.35    0.543
> lesson4   0.590   0.64    0.11
> lesson5   0.543   0.5     0.32
>
> What I want to do is a 2d-graph plot where I will have the name of the student on the X-axis and the name of the lesson on the Y-axis, and the number from each pair will be used to construct the plot.
> I am a newbie with R and I don't know which package I should use, nor the commands with which to import my data into R so that the plot can be created...
> Any help would be greatly appreciated.
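
As a rough sketch of the read.csv route suggested above: the inline text below stands in for a CSV file exported from Excel (a real session would pass a file name instead), the column name "lesson" is invented for the example, and the last row assumes the garbled value in the post was meant to be 0.5 and 0.32. A grouped bar chart is only one possible reading of the plot being asked for.

```r
# Stand-in for the exported CSV file (hypothetical contents and header name).
csv_text <- "lesson,nick,john,peter
lesson1,0.465,0.498,0.473
lesson2,0.422,0.44,0.134
lesson3,0.45,0.35,0.543
lesson4,0.590,0.64,0.11
lesson5,0.543,0.5,0.32"

# Use the first column as row names so the remaining columns are all numeric.
scores <- read.csv(text = csv_text, row.names = 1)
score_matrix <- as.matrix(scores)

# One group of bars per student (matrix column), one bar per lesson (matrix row).
barplot(score_matrix, beside = TRUE,
        legend.text = rownames(score_matrix),
        xlab = "Student", ylab = "Score")
```

With a real file, read.csv("scores.csv", row.names = 1) would take the place of the text = argument; the file name here is just a placeholder.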
[R] EM algorithm for Missing Data.
Dear all,

I need to use the EM algorithm where data are missing. Example:

    x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)

Can anyone help me?

Thanks.
Marcus Vinicius
Re: [R] EM algorithm for Missing Data.
Sure! Read this:

MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM
Author(s): DEMPSTER AP, LAIRD NM, RUBIN DB
Source: JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL 39 (1): 1-38 1977

then read the posting guide.

Simon.

On Sun, 2007-07-08 at 23:20 -0300, Marcus Vinicius wrote:
> Dear all,
>
> I need to use the EM algorithm where data are missing. Example:
>
>     x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)
>
> Can anyone help me?
>
> Thanks.
> Marcus Vinicius

--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072 Australia
Room 320, Goddard Building (8)
T: +61 7 3365 2506
email: S.Blomberg1_at_uq.edu.au

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
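
For readers who want something concrete alongside the reference, here is a minimal sketch of EM iterations for the example vector. It is not taken from the thread: it assumes the values are draws from a single normal distribution and that the NAs are missing completely at random, and the function name em_normal is made up for the illustration. In this simple setting EM just converges to the mean and maximum-likelihood variance of the observed values, so the point is to show the E and M steps rather than to gain anything over mean(x, na.rm = TRUE).

```r
x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)

# Hypothetical helper: EM for the mean and variance of a univariate normal
# sample in which some values are missing completely at random (assumption).
em_normal <- function(x, tol = 1e-8, max_iter = 1000) {
  miss      <- is.na(x)
  n         <- length(x)
  n_miss    <- sum(miss)
  sum_obs   <- sum(x[!miss])
  sumsq_obs <- sum(x[!miss]^2)

  mu     <- 0  # deliberately crude starting values
  sigma2 <- 1

  for (iter in seq_len(max_iter)) {
    # E-step: expected sufficient statistics given the current parameters.
    # Each missing value contributes E[x] = mu and E[x^2] = mu^2 + sigma2.
    s1 <- sum_obs + n_miss * mu
    s2 <- sumsq_obs + n_miss * (mu^2 + sigma2)

    # M-step: maximum-likelihood estimates from the completed statistics.
    mu_new     <- s1 / n
    sigma2_new <- s2 / n - mu_new^2

    converged <- abs(mu_new - mu) + abs(sigma2_new - sigma2) < tol
    mu     <- mu_new
    sigma2 <- sigma2_new
    if (converged) break
  }

  list(mu = mu, sigma2 = sigma2, iterations = iter)
}

fit <- em_normal(x)
print(fit)
```

EM becomes genuinely useful when the missing entries sit inside a multivariate or otherwise structured model, which is the setting the Dempster, Laird and Rubin paper addresses.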
[R] Help in installing rggobi in ubuntu linux
Hi R users.

I am experimenting with Ubuntu 7.04 Feisty. I installed the ggobi package with apt-get. I got almost all the packages, but when I try to install rggobi, I get this message:

---
install.packages("rggobi")
Aviso en install.packages("rggobi") :
  argument 'lib' is missing: using '/usr/local/lib/R/site-library'
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
probando la URL 'http://cran.at.r-project.org/src/contrib/rggobi_2.1.4-4.tar.gz'
Content type 'application/x-gzip' length 401451 bytes
URL abierta
==
downloaded 392Kb

* Installing *source* package 'rggobi' ...
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for GGOBI... configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/share/R/include -I/usr/share/R/include -g -DUSE_EXT_PTR=1 -D_R_=1 -fpic -g -O2 -c brush.c -o brush.o
En el fichero incluído de brush.c:1:
RSGGobi.h:5:22: error: GGobiAPI.h: No existe el fichero ó directorio
In file included from RSGGobi.h:6,
                 from brush.c:1:
conversion.h:174: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘asCLogical’
conversion.h:176: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘asCRaw’
--- snip ---
brush.c:124: error: ‘t’ no se declaró aquí (primer uso en esta función)
brush.c:124: error: ‘s’ no se declaró aquí (primer uso en esta función)
brush.c:124: error: el objeto ‘GGOBI(erroneous-expression)’ llamado no es una función
brush.c: En el nivel principal:
brush.c:135: error: expected ‘)’ before ‘cid’
make: *** [brush.o] Error 1
chmod: no se puede acceder a `/usr/local/lib/R/site-library/rggobi/libs/*': No existe el fichero ó directorio
ERROR: compilation failed for package 'rggobi'
** Removing '/usr/local/lib/R/site-library/rggobi'

The downloaded packages are in
        /tmp/RtmpVCacJd/downloaded_packages
Warning message:
installation of package 'rggobi' had non-zero exit status in: install.packages("rggobi")
---

What am I doing wrong?

Thank you for your help.

--
Kenneth Roy Cabrera Torres
Cel 315 504 9339
[R] adding latex, html docs. to new package
Hi again!

How do I create the LaTeX and HTML files for documentation for a new package, please? Is there something in the R CMD stuff that would do it, or do I need to produce them by hand, please?

thanks,
eb