[R] CART and CHAID
Can I say that RPART is a modified algo of CART and PARTY a modified of CHAID? Thanks.

Chua Siang Li
Consultant - Operations Research
Acceval Pte Ltd
Tel: 6297 8740 Email: [EMAIL PROTECTED] Website: www.acceval-intl.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RODBC - problems using odbcDriverConnect without DSN
As this message is from your (unstated, but probably unixODBC) ODBC driver manager, it has nothing to do with RODBC or R. Most likely the syntax is wrong.

As for 'without a DSN needing to be set on every computer that runs it': you need the driver and driver manager installed on every such computer, and a proper installation will have a symbolic declaration of the driver in odbcinst.ini, in which case you can use DRIVER=MySQL or some such. E.g.

    library(RODBC)
    con <- odbcDriverConnect("SERVER=localhost;DRIVER=MySQL;DATABASE=testdb")

works for me.

On Mon, 21 Jul 2008, Josiah Walker wrote:

> Hi, I'm trying to use RODBC without having to set up a DSN, using the direct connection string in odbcDriverConnect. My connection attempt looks something like:
>
>     odbcDriverConnect(connection = "SERVER=localhost;DRIVER={/usr/lib/odbc/libmyodbc.so};DATABASE=myDB;UID=reader;PASSWORD=insecure;")
>
> And this returns the message:
>
>     Warning messages:
>     1: In odbcDriverConnect(connection = conn) :
>       [RODBC] ERROR: state IM002, code 0, message [unixODBC][Driver Manager]Data source name not found, and no default driver specified
>     2: In odbcDriverConnect(connection = conn) :
>       ODBC connection failed
>
> I know this means it can't find x connection in the dsn... which means my connection string isn't recognised as valid. I've used ODBC and ODBC connection strings before, but I can't work out why this wouldn't work here. I can successfully create the same connection using a user DSN with the same settings as this, but this connection string won't work.
>
> It's fairly important for this project that the code can connect without a DSN needing to be set on every computer that runs it. Does anyone know if I'm missing something in my connection string, or can this not be done using RODBC?
>
> Thanks, Josiah.
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
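For readers following the DSN-less approach described in this reply, here is a minimal sketch of assembling the connection string in R before handing it to odbcDriverConnect(). The helper name build_conn_string and the try_connect flag are hypothetical, and the server, driver, and database values are simply the ones from the example in the message; whether DRIVER=MySQL resolves depends on the odbcinst.ini on the machine in question.

```r
# Hypothetical helper: join KEY=value pairs into an ODBC connection string.
build_conn_string <- function(...) {
  parts <- list(...)
  paste0(names(parts), "=", unlist(parts), collapse = ";")
}

# Values taken from the example in the message; adjust for your own setup.
conn_str <- build_conn_string(SERVER = "localhost",
                              DRIVER = "MySQL",
                              DATABASE = "testdb")
print(conn_str)

# Only attempt a real connection when explicitly asked to, since it needs
# an installed driver, a driver manager, and a reachable database.
try_connect <- FALSE
if (try_connect && requireNamespace("RODBC", quietly = TRUE)) {
  con <- RODBC::odbcDriverConnect(conn_str)
  # On failure odbcDriverConnect() returns -1 rather than a connection object.
  if (inherits(con, "RODBC")) RODBC::odbcClose(con)
}
```

Building the string separately makes it easy to print and check it before any driver manager gets involved, which is where the IM002 error in the question originates.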
Re: [R] CART and CHAID
On Mon, 21 Jul 2008, Chua Siang Li wrote:

> Can I say that RPART is a modified algo of CART and PARTY a modified of CHAID?

Not truthfully. CART is a trademark of commercial software. rpart (sic) is similar but not 'modified' from anything -- it is an independent implementation of the ideas in Breiman, Friedman, Olshen and Stone (with some extra ideas by Terry Therneau and others).

party (sic) is very much more general than CHAID.

> Thanks. Chua Siang Li

-- Brian D. Ripley
[R] SVM: Graphical representation
Hi,

We are working on binary classification using kernlab for SVM, based on more than 30 variables, and now we want to provide a graphical representation of our results in 2D or 3D. We have checked the graphical functionality of kernlab, but it seems that it only works with 2 principal components, and we usually work with more than 8 PCs due to the variability of our data. We are thinking of some kind of projection approach. Are there any functions/packages providing this functionality?

Thank you,
Manuel
[R] CART Analysis
Good evening,

Does R have an extension/add-on package that assists in Classification and Regression Tree analysis?

Thanks for your time,
Darin Brooks
Geomatics/GIS/Remote Sensing Coordinator
Kim Forest Management Ltd. Cranbrook Office, Cranbrook, BC
Re: [R] CART Analysis
Darin Brooks wrote:

> Good evening. Does R have an extension/add-on package that assists in Classification and Regression Tree analysis?

Yes. Abundantly. Have a look under `Recursive Partitioning' in the following Task View: http://cran.r-project.org/web/views/MachineLearning.html

HTH,
Tobias
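As a starting point for the recursive partitioning packages mentioned above, here is a minimal sketch using rpart, which ships with standard R distributions as a recommended package. The choice of the kyphosis and mtcars data sets and of the predictor variables is only an illustration, not something taken from the thread.

```r
library(rpart)

# Classification tree: kyphosis is an example data set shipped with rpart.
cls_fit <- rpart(Kyphosis ~ Age + Number + Start,
                 data = kyphosis, method = "class")
print(cls_fit)
cls_pred <- predict(cls_fit, type = "class")  # fitted class per observation

# Regression tree: method = "anova" on the built-in mtcars data.
reg_fit <- rpart(mpg ~ wt + hp + cyl, data = mtcars, method = "anova")
print(reg_fit)
reg_pred <- predict(reg_fit)                  # fitted mean per observation
```

The same formula interface covers both cases; only the `method` argument (and the type of the response) changes.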
Re: [R] Erro: cannot allocate vector of size 216.0 Mb
Several questions:
- Before we go ahead: Are you sure 3 Gb are sufficient for your problem?
- Which OS (I guess Windows)?
- Which version of R (let's assume R-2.7.1)?
- Is your Windows 3GB enabled in the boot flags, or is it a 64-bit version of Windows?

Best wishes,
Uwe Ligges

José Augusto Jr. wrote:

> Please, I have a 2GB computer and a huge time-series to embed, and I tried increasing memory.limit() and memory.size(max=TRUE), but nothing.
>
> Just before the command:
>
>     memory.size(max=TRUE)
>     [1] 13.4375
>     memory.limit()
>     [1] 1535.875
>     gc()
>              used (Mb) gc trigger (Mb) max used (Mb)
>     Ncells 209552  5.6     407500 10.9       35  9.4
>     Vcells 125966  1.0     786432  6.0   496686  3.8
>
> I increased the memory limit:
>
>     memory.limit(3000)
>     NULL
>     memory.limit()
>     [1] 3000
>     memory.size()
>     [1] 11.33070
>     memory.size(max=TRUE)
>     [1] 13.4375
>     gc()
>              used (Mb) gc trigger (Mb) max used (Mb)
>     Ncells 209552  5.6     407500 10.9       35  9.4
>     Vcells 125964  1.0     786432  6.0   496686  3.8
>
> And even trying to increase the memory limits, I still get an error. Any suggestions?
>
> Thanks in advance.
> jama
Re: [R] Erro: cannot allocate vector of size 216.0 Mb
On Mon, 21 Jul 2008, Uwe Ligges wrote:

> Several questions:
> - Before we go ahead: Are you sure 3 Gb are sufficient for your problem?
> - Which OS (I guess Windows)?

(The only platform on which these functions are supported.)

> - Which version of R (let's assume R-2.7.1)?
> - Is your Windows 3GB enabled in the boot flags, or is it a 64-bit version of Windows?

(No, or the default memory limit would be higher than 1.5Gb. R by default uses as high a memory limit as is sensible if (as here) the address space is the limiting factor.)

-- Brian D. Ripley
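To make the first question in this thread ("are you sure 3 Gb are sufficient?") concrete, here is a rough sketch of the arithmetic. It assumes the series is stored as doubles (8 bytes each) and embedded with stats::embed(); neither is stated in the original message, and the helper name embed_size_mb is hypothetical.

```r
bytes_per_double <- 8

# The failed allocation was for a single vector of 216 Mb.
failed_bytes <- 216 * 2^20
n_doubles <- failed_bytes / bytes_per_double
print(n_doubles)

# Hypothetical helper: approximate size in Mb of the matrix that
# embed(x, dimension) would return for a series of length n.
# embed() returns (n - dimension + 1) rows and `dimension` columns.
embed_size_mb <- function(n, dimension) {
  rows <- n - dimension + 1
  rows * dimension * bytes_per_double / 2^20
}

# Small sanity check of the shape on a toy series.
toy <- embed(1:10, 3)
print(dim(toy))

# Estimated size for a series of one million points in 30 dimensions.
print(embed_size_mb(1e6, 30))
```

An estimate like this only covers the final object; intermediate copies made while computing it can need additional contiguous memory, which is what matters on a 32-bit address space.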
Re: [R] asp and ylim
Maybe what I am missing is how to set the device region mentioned in Brian's email. I have tried various searches, but I haven't had any luck in finding a reference to device region. However, I'm not sure that changing the device region will help if one stays with plot(), because the default seems to be that the physical plot region is approximately square, and I haven't found a way to control the size of the physical plot region.

The help files for eqscplot() and xyplot() indicate that they are meant for scatter plots. But lots of plots are not scatter plots. I was only using a scatter plot because Rolf's code did so, and he thought I could do what I wanted inside plot(), which I can't at the moment.

Perhaps the best solution is to live with plot() as it is. If I need the picture for a paper, I will export data to Matlab or Mathematica or Illustrator, where I can get the control I want.

Thanks for all your help.
David

On 20 Jul, 2008, at 23:14, Prof Brian Ripley wrote:

> Take a look at eqscplot() in package MASS for a different approach.
>
> Your last para forgets that once you have set the device region and the margins the physical plot region and hence its aspect ratio is determined -- see the figures in 'An Introduction to R'.
>
> On Sun, 20 Jul 2008, David Epstein wrote:
>
>>     # See David Williams' book "Weighing the odds", p286
>>     y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
>>     ybar <- mean(y)
>>     ylength <- length(y)
>>     ybarv <- rep(ybar, ylength)
>>     x <- 1:ylength
>>     plot(x, y, asp=1, xlab="position", ylab="ybar", type="n", ylim=c(-1,1))
>>     segments(x[1], ybar, x[ylength], ybar)
>>     segments(x, ybarv, x, y)
>>     points(x, ybarv, pch=21, bg="white")
>>     points(x, y, pch=19, col="black")
>>
>> With asp=1, the value of ylim seems to be totally ignored, as in the above code. With asp not set, R pays close attention to the value of ylim. This is not intuitive behaviour, or is it?
>>
>> How can I set the aspect ratio, and simultaneously set the plot region? The aspect ratio is one number and the plot region is given by four numbers (xleft, xright, yleft, yright). Logically, these 5 numbers are independent of each other and arbitrary, provided xleft < xright and yleft < yright. This should give a one-to-one bijection between 5-tuples and plots, determined up to a change of scale that is uniform in the x- and y-directions. My code above shows the (to me) obvious attempt, which fails.
>>
>> Thanks
>> David
>
> -- Brian D. Ripley
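The point quoted above, that the device region plus the margins determine the physical plot region, can be checked directly with par(). The following is a small sketch under assumed values: the 10 x 3.3 inch device and the margin sizes are arbitrary choices for illustration, and pdf(file = NULL) is used so that no file is written.

```r
# Open a device of known size without writing a file.
pdf(file = NULL, width = 10, height = 3.3)

# Margins in inches: bottom, left, top, right (arbitrary example values).
par(mai = c(1, 1, 0.5, 0.5))
plot.new()

din <- par("din")   # device size in inches
fin <- par("fin")   # figure region size in inches
pin <- par("pin")   # plot region size in inches
mai <- par("mai")

# With a single figure and no outer margins, the plot region is the
# figure region minus the margins on each side.
expected_pin <- c(fin[1] - mai[2] - mai[4],
                  fin[2] - mai[1] - mai[3])
print(rbind(din = din, fin = fin, pin = pin, expected_pin = expected_pin))

invisible(dev.off())
```

Once pin is fixed this way, asp = 1 forces one data unit to the same physical length on both axes, so xlim and ylim cannot both be honoured exactly unless the shape of pin matches their ratio.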
Re: [R] Calculating Betweenness - Efficiency problem
Senthil,

you can try the 'igraph' package. Export your two-column Excel file as a .csv, use 'read.csv' to read that into R, then 'graph.data.frame' to create an igraph graph from it. Finally, call 'betweenness' on the graph. It is really just three/four lines, something like this:

    tab <- read.csv(...)
    g <- graph.data.frame(tab)
    bet <- betweenness(g)
    bet <- data.frame(city=V(g)$name, betweenness=bet)

The last line creates a two-column data frame with the betweenness score of each city.

Best,
Gabor

On Sat, Jul 19, 2008 at 02:59:07PM -0700, Senthil Purushothaman wrote:

> Hi Jim,
>
> Thank you for the response. Your suggestion will help me avoid the whole text-to-number conversion process that I perform using LookUp in Excel. I will definitely give it a shot. But it still doesn't address the vector conversion, since a graph file is drawn only using the vectors. Assuming that I use 'factor' to convert the characters to numbers, how do I convert these numbers into vectors?
>
> Thanks,
> Senthil
>
> On Sat 7/19/2008 4:49 AM, jim holtman wrote:
>
>> It would seem that you can output the initial file from Excel, read it into R with 'read.csv' and then use 'factor' to convert the characters for City1 and City2 to the numbers that you want to use. Have you tried this approach?
>>
>> On Fri, Jul 18, 2008 at 3:51 PM, Senthil Purushothaman [EMAIL PROTECTED] wrote:
>>
>>> Hello,
>>>
>>> I am calculating 'Betweenness' of a large network using R. Currently, I have the node-node information (City1-City2) in an Excel file, present in two columns, where column A has City1 and column B has the City2 that City1 is connected to. These are the steps that I go through to calculate betweenness of my network.
>>>
>>> a) Convert the City1-City2 (text) into Number1-Number2 in the Excel file, where every unique city has a unique number.
>>>
>>> b) Paste all the city-city information separated by commas into c(...) in the R GUI to obtain the corresponding vectors. As you can imagine, this copy-paste operation takes a long time. Example: c(1,3,1,5,2,4,2,5). Just fyi, I have a text file that contains all nodes separated by commas based on the appropriate link information.
>>>
>>> c) Then, I create a graph file with the above vector.
>>>
>>> d) I use the graph file to calculate betweenness of my network.
>>>
>>> I am sure there must be a better, more efficient way to calculate betweenness. Ideally, I would like to just have the City1 - City2 (link) information in two columns in an Excel file and calculate the betweenness from that file directly. Please provide an optimal solution for this problem. I appreciate your time and help.
>>>
>>> Thanks,
>>> Senthil
>>
>> -- Jim Holtman, Cincinnati, OH, +1 513 646 9390
>> What is the problem you are trying to solve?

-- Csardi Gabor [EMAIL PROTECTED] UNIL DGM
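The workflow suggested in this reply can be sketched end to end as follows. The city names and the in-line CSV text are made up for illustration, the graph is treated as undirected (an assumption, since the thread does not say), and graph_from_data_frame() is the name current igraph releases use for what the message calls graph.data.frame(). The igraph part only runs if the package is installed.

```r
# Made-up two-column edge list standing in for the exported Excel sheet.
csv_text <- "City1,City2
A,B
B,C
C,D"
edges <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(edges)

bet_table <- NULL
if (requireNamespace("igraph", quietly = TRUE)) {
  # City names are used directly as vertex names; no manual numbering needed.
  g <- igraph::graph_from_data_frame(edges, directed = FALSE)
  bet <- igraph::betweenness(g)
  bet_table <- data.frame(city = igraph::V(g)$name,
                          betweenness = as.numeric(bet))
  print(bet_table)
}
```

Because the vertices are created from the names in the two columns, steps a) and b) of the original procedure (numbering the cities and pasting them into c(...)) are not needed.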
Re: [R] drawing segments through points with pch=1
On Sun, 2008-07-20 at 15:40 +0100, David Epstein wrote:

> What I don't like about type="b", also suggested by Paul Smith, is that the segments do not go right up to the little circles---a gap is left, which I don't like.
>
> So far, Uwe's solution is what suits me best. However, I understand Brian's objection, though it doesn't apply in my case. The discussion makes me fear that it's a very long road ahead before I can get fine control of R graphics.

Hi David,

If you want to get transparency in the middle of the points and lines that connect them, try this (x, ll and ybar are as defined in your code):

    pointgap <- strwidth("o")/2
    segments(x[1:(ll-1)] + pointgap, ybar, x[2:ll] - pointgap, ybar)

Jim
Re: [R] asp and ylim
Since posting my previous message, I took Brian's reference to 'An Introduction to R' more seriously, and read through the section on graphics. There I found par(fig=c(xleft,xright,ybottom,ytop)). This seems to be setting the device region which Brian pointed out is a fundamental part of the process, but I haven't tried it yet. I expect this is what I need in order to make plot() do what I want.

Thanks again, especially to Brian.
David
Re: [R] CART and CHAID
Thanks. Yes, I understand that PARTY offers a lot more than CHAID. Allow me to rephrase. If I am looking for something similar to CART (grow the tree and prune it back) and to CHAID (use a significance test to stop growing the tree), can I use RPART and PARTY respectively? And are there any other R packages that offer improved CART/CHAID techniques?

Thanks.
Chua Siang Li
Consultant - Operations Research
Acceval Pte Ltd
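The two strategies contrasted in this question can be sketched as follows. This is an illustration, not a statement about how either package relates to CART or CHAID: the kyphosis data, the control values, and the rule of picking the cp with the lowest cross-validated error are all assumptions made for the example. The party part runs only if that package is installed.

```r
library(rpart)

# Grow-then-prune: grow a deliberately large tree (cp = 0), then cut it back
# using the cross-validated error stored in the cp table.
set.seed(1)  # the cross-validation folds are random
full <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
              method = "class",
              control = rpart.control(cp = 0, minsplit = 5))
cp_tab <- full$cptable
best_cp <- cp_tab[which.min(cp_tab[, "xerror"]), "CP"]
pruned <- prune(full, cp = best_cp)

# Count terminal nodes to compare the two trees.
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")
cat("leaves before pruning:", n_leaves(full),
    " after pruning:", n_leaves(pruned), "\n")

# Test-based stopping: ctree() stops splitting when no predictor passes
# the significance threshold (here 1 - mincriterion = 0.05).
if (requireNamespace("party", quietly = TRUE)) {
  ct <- party::ctree(Kyphosis ~ Age + Number + Start, data = kyphosis,
                     controls = party::ctree_control(mincriterion = 0.95))
  print(ct)
}
```

With rpart the tree size is chosen after growing, by cost-complexity pruning; with ctree the size is a by-product of the stopping criterion, so no separate pruning step is applied.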
Re: [R] sampling from a list of values while excluding one
Here is one way of doing it, by removing the element from the vector:

    # exclude the 3rd element of the vector
    sample((1:10)[-3], 10, TRUE)
    [1] 4 5 7 10 2 10 10 7 7 1

On Mon, Jul 21, 2008 at 5:10 AM, Juliane Struve [EMAIL PROTECTED] wrote:

> Dear list,
>
> I am trying to sample from a list of integers 1:10, but need to exclude one of them. The one to be excluded is a variable called number and can take values 1:10. The line below does not work, but shows what I am trying to do. Would somebody be able to help me with the syntax?
>
>     anglenumber=sample(1:10, exclude = number,size=1,replace=TRUE)
>
> Thank you very much for a hint.
>
> Regards,
> Juliane
>
> Dr. Juliane Struve, Environmental Scientist
> 10, Lynwood Crescent, Sunningdale SL5 0BL, 01344 620811

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390
What is the problem you are trying to solve?
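The answer above hard-codes the excluded position; the question asks for the excluded value to come from a variable. A minimal sketch of that variant follows, using the variable names number and anglenumber from the question. The value 3 and the seed are arbitrary choices for the example.

```r
number <- 3                            # the value to exclude (example value)
candidates <- setdiff(1:10, number)    # 1:10 without `number`

set.seed(42)                           # only to make the example repeatable
anglenumber <- sample(candidates, size = 1)
print(anglenumber)

# Many draws, to confirm the excluded value never appears.
draws <- sample(candidates, size = 1000, replace = TRUE)
print(table(draws))
```

For 1:10, (1:10)[-number] gives the same candidates because each value equals its position; setdiff() also works when the values are not 1..n. Note that sample(x, ...) treats a single number x >= 1 as 1:x, so this pattern needs care if only one candidate could ever remain.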
[R] fama-macbeth
Hi all,

I was wondering whether there is a standard method to carry out Fama-MacBeth regressions in R. I have spent the last few hours looking around the help pages, but nothing seems to be written about this.

Thanks a lot!
Re: [R] sampling from a list of values while excluding one
This works great, thank you for your help!

Dr. Juliane Struve
Environmental Scientist
Re: [R] asp and ylim
"DE" == David Epstein [EMAIL PROTECTED] on Mon, 21 Jul 2008 09:42:35 +0100 writes:

DE> Maybe what I am missing is how to set the device region mentioned in Brian's email.

Play around resizing your graphics window. This is very instructive, with an 'asp = .' using traditional graphics plot().

I'd say the behavior is quite intuitive for some value of "intuitive", but I agree with you that there's also a very valid different value of "intuitive". The two are very incompatible, and traditional graphics in R is the way it is, inherited from a long history of S (and pre-S GRZ).

You should really get the nice book by Paul Murrell on R Graphics (Chapman & Hall/CRC), which explains the traditional graphics vs the modern Grid-based graphics (on which 'lattice' and 'ggplot2' are built).

DE> I have tried various searches, but I haven't had any luck in finding a reference to device region. However, I'm not sure that changing the device region will help if one stays with plot(), because the default seems to be that the physical plot region is approximately square, and I haven't found a way to control the size of the physical plot region. The help files for eqscplot() and xyplot() indicate that they are meant for scatter plots. But lots of plots are not scatter plots. I was only using a scatter plot because Rolf's code did so, and he thought I could do what I wanted inside plot(), which I can't at the moment.

DE> Perhaps the best solution is to live with plot() as it is. If I need the picture for a paper, I will export data to Matlab or Mathematica or Illustrator, where I can get the control I want.

Hah, you must be kidding! The control is in R too, of course; you just haven't seen it yet, since it seems you haven't yet understood the different graphics models in R sufficiently.
The traditional graphics you'd be using by plot() [or MASS::eqscplot()] does indeed define the plot region as a function of the graphics device region (plus various margin setting parameters), and so Brian Ripley's answer (of course) was very accurate. Note that Gabor mentioned a lattice solution, lattice behaving quite differently here {not setting a plot region from the device region}.

For a paper plot, e.g., pdf() as I'd recommend nowadays, you can set the device region by 'width' and 'height'; and if you really want to use traditional graphics here, do something like

    ## modified by MM from David Epstein's original example
    myplot <- function(y, yb = mean(y), ylim = c(-1,1)) {
        ybarv <- rep.int(yb, length(y))
        x <- seq_along(y)
        plot(x, y, asp=1, xlab="position", ylab="ybar", type="n", ylim = ylim)
        abline(h = yb)  ## instead of segments(x[1], ybar, x[ylength], ybar)
        segments(x, ybarv, x, y)
        points(x, ybarv, pch=21, bg="white")
        points(x, y, pch=19, col="black")
        invisible()
    }

    y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
    myplot(y)

    ## MM: setting device region so that ylim = c(-1,1) about fits
    pdf.do("asp-ex.pdf", height= 3.3, width=10)
    myplot(y)
    pdf.end()

DE> Thanks for all your help. David
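The idea in this reply, choosing the device size so that asp = 1 and the requested ylim agree, can be turned into a small calculation. The sketch below uses base pdf() and dev.off() rather than the pdf.do()/pdf.end() helpers in the message, which are not base R functions. The margin sizes and the 10 inch width are assumptions for the example, and file = NULL means nothing is written to disk.

```r
y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
xlim <- c(1, length(y))
ylim <- c(-1, 1)

# Assumed layout: margins in inches (bottom, left, top, right), fixed width.
mai <- c(0.9, 0.9, 0.3, 0.3)
dev_width <- 10

# With asp = 1 one x unit and one y unit have the same physical length,
# so the plot region must have the same shape as the requested data ranges.
pin_w <- dev_width - mai[2] - mai[4]
pin_h <- pin_w * diff(ylim) / diff(xlim)
dev_height <- pin_h + mai[1] + mai[3]

pdf(file = NULL, width = dev_width, height = dev_height)
par(mai = mai)
plot(seq_along(y), y, asp = 1, xlim = xlim, ylim = ylim, type = "n",
     xlab = "position", ylab = "y")
abline(h = mean(y))
segments(seq_along(y), mean(y), seq_along(y), y)
points(seq_along(y), y, pch = 19)
usr <- par("usr")   # axis limits actually used: x1, x2, y1, y2
invisible(dev.off())

print(dev_height)
print(usr)
```

If the shapes match, the y limits reported by par("usr") should be the requested ylim plus the usual 4% padding at each end, instead of being stretched to fill a taller plot region.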
Re: [R] drawing segments through points with pch=1
>>>>> "DE" == David Epstein [EMAIL PROTECTED]
>>>>>     on Sun, 20 Jul 2008 15:40:34 +0100 writes:

DE> What I don't like about type="b", also suggested by Paul Smith, is that the segments do not go right up to the little circles---a gap is left, which I don't like.

The gap is a feature; if you don't want it, use "o" ([o]verplotting lines and points) instead of "b" ([b]oth lines and points):

    plot(x, ybarv, type="o", pch=21, bg="white")

does in one line what Uwe's solution did in three.

DE> So far, Uwe's solution is what suits me best. However, I understand Brian's objection, though it doesn't apply in my case. The discussion makes me fear that it's a very long road ahead before I can get fine control of R graphics.

I'd recommend you should really get and read Paul Murrell's book I just mentioned {in the other thread on R-help}, or read through the basic help pages points, lines, plot.default, ... quite carefully, and try and start to understand the many examples / demos etc.

DE> Thanks
DE> David

DE> On 20 Jul, 2008, at 14:54, Prof Brian Ripley wrote:

> On Sun, 20 Jul 2008, Uwe Ligges wrote:
>
>> You probably want to make your code readable, read ?points and go ahead by making the plot without points (plot(., type="n")), drawing segments and at the end paint points with white background colour in order to overwrite the segments:
>
> Except that the background is not necessarily white (and you may want it to be transparent or translucent). It looks to me like lines(type="b") might be what was wanted.
    y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
    ybar <- mean(y)
    ll <- length(y)
    ybarv <- rep(ybar, ll)
    x <- 1:ll
    plot(x, ybarv, type="n")
    segments(x[1], ybar, x[ll], ybar)
    points(x, ybarv, pch=21, bg="white")

Uwe Ligges

David Epstein wrote:

Please excuse me for asking such basic questions. Here is my code:

    y=c(1.21,0.51,0.14,1.62,-0.8,0.72,-1.71,0.84,0.02,-0.12)
    ybar=mean(y)
    ll=length(y); ybarv=rep(ybar,ll)
    x=1:ll
    plot(x,ybarv,pch=1)
    segments(x[1],ybar,x[ll],ybar)

What I get is a collection of small circles, with a segment on top of the circles, which is almost what I want. But I don't want the segment to be visible inside any small circle. Is there an easy way to arrange for the segment to lie behind the pch=1 markers, as in hidden line removal, so that the circles remain with nothing inside them?

I tried putting the segments command first, but then no segment appeared at all.

In general, is there a method of laying a drawing on top of another? I tried inserting add=T as an argument to plot, and R objected strongly.

Thanks for any help
David Epstein
Re: [R] CART and CHAID
On Mon, 21 Jul 2008, Chua Siang Li wrote:

> Thanks, yes, I understand that PARTY offers a lot more than CHAID. Allow me to rephrase. If I am looking for something similar to CART (grow a tree and prune it back) and to CHAID (use a significance test to stop the tree), can I use RPART and PARTY respectively? And are there any other R packages that offer improved CART/CHAID techniques?

See the Recursive Partitioning section in the CRAN task view http://CRAN.R-project.org/view=MachineLearning

Z

> Thanks.
> Chua Siang Li
> Consultant - Operations Research, Acceval Pte Ltd

- Original Message
From: Prof Brian Ripley [EMAIL PROTECTED]
To: Chua Siang Li [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] CART and CHAID
Date: 07/21/08 15:17

On Mon, 21 Jul 2008, Chua Siang Li wrote:

> Can I say that RPART is a modified algo of CART and PARTY a modified of CHAID?

Not truthfully. CART is a trademark of commercial software. rpart (sic) is similar but not 'modified' from anything -- it is an independent implementation of the ideas in Breiman, Friedman, Olshen and Stone (with some extra ideas by Terry Therneau and others). party (sic) is very much more general than CHAID.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] par(din) vs dev.size()
I don't see why you think it is 'odd'. par() is working with the current working copy of the internal pars, and that is only updated when you plot. It refers to the current state of the device. At least if the display list is turned on, a screen device will replot when it is resized, obviously only if there is a plot present.

If you are trying to do computations for a plot, call plot.new() first.

On Fri, 18 Jul 2008, Sarah Goslee wrote:

> Hello,
>
> I was messing around with graphics, and noted an odd behavior of par("din"). If the x11 device is empty, par("din") does not return the correct size if the device has been resized manually. dev.size() works correctly.
>
> R version 2.7.1; Fedora 8
>
>   # case 1 - empty device
>   > x11()
>   > dev.size()
>   [1] 6.995263 6.994187
>   > par("din")
>   [1] 6.995263 6.994187
>   # resize device
>   > dev.size()
>   [1] 6.995263 3.401667
>   > par("din")
>   [1] 6.995263 6.994187
>   > dev.off()
>
>   # case 2, device containing a plot
>   > x11()
>   > plot(1,1)
>   > dev.size()
>   [1] 6.995263 6.994187
>   > par("din")
>   [1] 6.995263 6.994187
>   # resize device
>   > dev.size()
>   [1] 6.995263 2.772976
>   > par("din")
>   [1] 6.995263 2.772976
>   > dev.off()
>
> I found some discussion of this from 2000 and 2001, but no explanation or resolution, and I'm curious. Is there a reason for this behavior?
>
> Thanks,
> Sarah
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
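The advice to call plot.new() before doing computations can be sketched as follows. This is only an illustration under assumptions: it uses an off-screen pdf(NULL) device of a fixed size rather than a resizable x11() window, and the names din and dsz are ours. Once a plot has been started, par("din") and dev.size() describe the same device region:

```r
pdf(NULL, width = 6, height = 4)  # off-screen stand-in for a screen device

plot.new()                        # start a plot so par() reflects the device

din <- par("din")                 # device size in inches, as seen by par()
dsz <- dev.size("in")             # device size in inches, asked of the device

invisible(dev.off())

din
dsz
```

On an interactive device the same two calls can be repeated after resizing the window; per the explanation above, par("din") should only catch up once the device has replotted.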
Re: [R] Erro: cannot allocate vector of size 216.0 Mb
Dear all,

Thank you for your attention.

1) I'm using a Core 2 Duo CPU with 2GB of physical memory and Windows Vista.
2) The main function that is causing the error is embedd(x=data,d,t).
3) The time series that I'm using has 1,000,000 observations of real numbers.
4) Sometimes the function works, sometimes not.

Some things I'm doing:

1) I have put x <- NULL and gc() at the end of each memory-intensive routine (embedding), to release memory.
2) I installed Ubuntu Linux on the same machine and will try the same routine. Effective men use Unix :)

And my plans for the future:

3) If this does not work, I will try to rewrite the code to reduce memory requirements.
4) If this does not work, I will try to parallelize the code, using snow or Rmpi, something like this.
5) If this does not work, I will try to use a cluster, with sufficient memory to let this work.

Any suggestions? Many thanks.

Regards,
jamaj

2008/7/21, Prof Brian Ripley [EMAIL PROTECTED]:

On Mon, 21 Jul 2008, Uwe Ligges wrote:

Several questions:

- Before we go ahead: Are you sure 3 Gb are sufficient for your problem?
- Which OS (I guess Windows)?
  (The only platform on which these functions are supported.)
- Which version of R (let's assume R-2.7.1)?
- Is your Windows 3GB enabled in the boot flags, or is it a 64-bit version of Windows?
  (No, or the default memory limit would be higher than 1.5Gb. R by default uses as high a memory limit as is sensible if (as here) the address space is the limiting factor.)

Best wishes,
Uwe Ligges

José Augusto Jr. wrote:

Please, I have a 2GB computer and a huge time-series to embed, and I tried increasing memory.limit() and memory.size(max=TRUE), but nothing.
Just before the command:

    > memory.size(max=TRUE)
    [1] 13.4375
    > memory.limit()
    [1] 1535.875
    > gc()
             used (Mb) gc trigger (Mb) max used (Mb)
    Ncells 209552  5.6     407500 10.9       35  9.4
    Vcells 125966  1.0     786432  6.0   496686  3.8

I increased the memory limit:

    > memory.limit(3000)
    NULL
    > memory.limit()
    [1] 3000
    > memory.size()
    [1] 11.33070
    > memory.size(max=TRUE)
    [1] 13.4375
    > gc()
             used (Mb) gc trigger (Mb) max used (Mb)
    Ncells 209552  5.6     407500 10.9       35  9.4
    Vcells 125964  1.0     786432  6.0   496686  3.8

And even after trying to increase the memory limit, I still get an error. Any suggestions?

Thanks in advance.
jama
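A rough way to see whether such an embedding can fit at all is to estimate the size of the result before asking for it. The sketch below is an assumption-laden illustration, not anything posted in the thread: embed_mb() is a hypothetical helper, and it assumes the embedding is stored as a dense matrix of doubles (8 bytes each) with n - (m - 1) * d rows and m columns, where m is the embedding dimension and d the delay. The actual m and d used were not stated.

```r
## Hypothetical helper: approximate size in Mb of a delay-embedding matrix
## built from a series of length n, with m columns and delay d.
embed_mb <- function(n, m, d = 1) {
  rows <- n - (m - 1) * d   # number of complete delay vectors
  rows * m * 8 / 2^20       # doubles are 8 bytes; 2^20 bytes per Mb
}

n <- 1e6                    # series length reported in the thread
sizes <- sapply(c(2, 5, 10, 28), function(m) embed_mb(n, m))
names(sizes) <- paste0("m=", c(2, 5, 10, 28))
sizes
```

A single result matrix is only a lower bound: R may need further temporary copies while building it, and on a 32-bit Windows build each one has to fit in a contiguous block of address space, which is one plausible reason the same call sometimes succeeds and sometimes fails.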
[R] Time Series - Long Memory Estimation
Dear R-Users,

I am doing research on time series, especially on the estimation of the fractional exponent in long-memory time series (for those who know). Although there are three estimators already built into the fracdiff package (GPH, Sperio, MLE), I was wondering whether anyone has used the estimator introduced by P.M. Robinson (related paper: Log-Periodogram regression of time series with long range dependence, 1995, The Annals of Statistics, Vol. 23, p. 1048 - 1072). Like GPH and Sperio, the estimator is based on the periodogram.

Thank you in advance,
Fotis
[R] Getting plot axes where they should be!
Hi Folks,

I've been digging for the solution to this for several hours now. If there is a solution, it must be one of the worst needle-in-a-haystack examples in R documentation!

Essentially, I want to make an x-y plot in which the X-axis really is the X-axis (i.e. its vertical position is at y=0), and the Y-axis really is the Y-axis (i.e. its horizontal position is at x=0). Discussion, with toy examples, below.

I have sort-of solved this (as stated) for one special case, after a depth-4 search through

    ?plot --> ?plot.default --> ?par --> ?axis

which finally led me to the parameter "pos" to axis():

    ?axis
    pos: the coordinate at which the axis line is to be drawn: if
         not 'NA' this overrides the values of both 'line' and
         'mgp[3]'.

Hence, instead of

    plot(c(0.5,2.5),c(0.5,2.5),xlim=c(0,3),ylim=c(0,3), frame.plot=FALSE)

(where the axes do not meet at the origin (0,0)), I can do

    plot(c(0.5,2.5),c(0.5,2.5),xlim=c(0,3),ylim=c(0,3), frame.plot=FALSE,pos=0)

which is *exactly* what I want in this case.

But now I want to do the same, where instead of plotting the two points (0.5,0.5), (2.5,2.5) I want to plot (0.5,2.5), (2.5,4.5). Provided I keep the xlim and ylim to both have lower value 0, a similar solution again works fine:

    plot(c(0.5,2.5),c(2.5,4.5),xlim=c(0,3),ylim=c(0,5), frame.plot=FALSE,pos=0)

But, in this case, what I *really* want is to limit the Y range to the relevant bit: ylim=c(2,5) -- I don't want to have a lot of empty space below the points.

So I want a Y-axis running from y=2 to y=5, and X-axis as before from x=0 to x=3, and I want these two axes to meet at (x=0,y=2). But how?

By analogy to the above, I need to set a "pos=2" for the X-axis (so that it is drawn at y=2), and a "pos=0" for the Y-axis. And I have not been able to discover how to do this.

With thanks,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 21-Jul-08 Time: 13:13:45
-- XFMail --
Re: [R] Getting plot axes where they should be!
on 07/21/2008 07:13 AM (Ted Harding) wrote:

> [...] Hence, instead of
>
>   plot(c(0.5,2.5),c(0.5,2.5),xlim=c(0,3),ylim=c(0,3), frame.plot=FALSE)
>
> (where the axes do not meet at the origin (0,0)), I can do
>
>   plot(c(0.5,2.5),c(0.5,2.5),xlim=c(0,3),ylim=c(0,3), frame.plot=FALSE,pos=0)
>
> which is *exactly* what I want in this case.

Ted, try this:

    plot(c(0.5,2.5), c(0.5,2.5), xlim=c(0,3), ylim=c(0,3),
         xaxs = "i", yaxs = "i")

or perhaps this:

    plot(c(0.5,2.5), c(0.5,2.5), xlim=c(0,3), ylim=c(0,3),
         xaxs = "i", yaxs = "i", axes = FALSE, frame.plot = FALSE)
    axis(1)
    axis(2)

> But now I want to do the same, where instead of plotting the two points (0.5,0.5), (2.5,2.5) I want to plot (0.5,2.5), (2.5,4.5). Provided I keep the xlim and ylim to both have lower value 0, a similar solution again works fine:
>
>   plot(c(0.5,2.5),c(2.5,4.5),xlim=c(0,3),ylim=c(0,5), frame.plot=FALSE,pos=0)

Same thing here:

    plot(c(0.5,2.5), c(2.5,4.5), xlim=c(0,3), ylim=c(0,5),
         xaxs = "i", yaxs = "i")

> But, in this case, what I *really* want is to limit the Y range to the relevant bit: ylim=c(2,5) -- I don't want to have a lot of empty space below the points.
>
> So I want a Y-axis running from y=2 to y=5, and X-axis as before from x=0 to x=3, and I want these two axes to meet at (x=0,y=2). But how?

    plot(c(0.5,2.5), c(2.5,4.5), xlim=c(0,3), ylim=c(2,5),
         xaxs = "i", yaxs = "i")

> By analogy to the above, I need to set a "pos=0" for the X-axis, and a "pos=2" for the y-axis. And I have not been able to discover how to do this.
>
> With thanks,
> Ted.

See ?par and take note of 'xaxs' and 'yaxs', where it is noted that the default 'r' extends the axes by +/- 4% of the data range. Using 'i' gives you axes with the exact range of the data and/or the 'xlim' and 'ylim' settings.

HTH,
Marc Schwartz
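The difference between the two axis styles can be read directly off par("usr"), which holds the user coordinates of the plot region. A minimal sketch, assuming an off-screen pdf(NULL) device so nothing is written to disk; the names usr_r and usr_i are ours:

```r
pdf(NULL)   # off-screen device, no file is produced

## Default style "r": each limit is pushed outwards by 4% of the range,
## so the axes drawn at the edge of the region do not meet at (0, 2).
plot(c(0.5, 2.5), c(2.5, 4.5), xlim = c(0, 3), ylim = c(2, 5))
usr_r <- par("usr")

## Style "i": the region is exactly xlim by ylim, so the axes meet at (0, 2).
plot(c(0.5, 2.5), c(2.5, 4.5), xlim = c(0, 3), ylim = c(2, 5),
     xaxs = "i", yaxs = "i")
usr_i <- par("usr")

invisible(dev.off())

usr_r   # c(x1, x2, y1, y2) with padding
usr_i   # c(x1, x2, y1, y2) without padding
```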
Re: [R] Getting plot axes where they should be!
On 7/21/2008 8:13 AM, (Ted Harding) wrote:

> [...] But, in this case, what I *really* want is to limit the Y range to the relevant bit: ylim=c(2,5) -- I don't want to have a lot of empty space below the points.
>
> So I want a Y-axis running from y=2 to y=5, and X-axis as before from x=0 to x=3, and I want these two axes to meet at (x=0,y=2). But how?
>
> By analogy to the above, I need to set a "pos=0" for the X-axis, and a "pos=2" for the y-axis. And I have not been able to discover how to do this.

It may or may not be possible in a single call to plot(), but it is certainly straightforward if you use separate calls to plot() and axis():

    plot(c(0.5,2.5),c(2.5,4.5),xlim=c(0,3),ylim=c(2,5), axes=FALSE)
    axis(1, pos=2)
    axis(2, pos=0)

Generally speaking I find it is usually easier not to try to convince plot() to do strange things: I tell it to do nothing, and do the strange things myself.

Duncan Murdoch
Re: [R] Getting plot axes where they should be!
On 21-Jul-08 12:25:32, Marc Schwartz wrote:

> on 07/21/2008 07:13 AM (Ted Harding) wrote:
>> [...] So I want a Y-axis running from y=2 to y=5, and X-axis as before from x=0 to x=3, and I want these two axes to meet at (x=0,y=2). But how?
>
>   plot(c(0.5,2.5), c(2.5,4.5), xlim=c(0,3), ylim=c(2,5),
>        xaxs = "i", yaxs = "i")
>
> [...]
>
> See ?par and take note of 'xaxs' and 'yaxs', where it is noted that the default 'r' extends the axes by +/- 4% of the data range. Using 'i' gives you axes with the exact range of the data and/or the 'xlim' and 'ylim' settings.
>
> HTH,
> Marc Schwartz

Thanks, Marc! Those hints solved it for me in the end. In fact, a variant on your suggestions is exactly what I want (in the third example):

    plot(c(0.5,2.5), c(2.5,4.5), xlim=c(0,3), ylim=c(2,5),
         xaxs = "i", yaxs = "i", frame.plot=FALSE)

I have to say (admit? confess?) that I had read through the help on xaxs and yaxs in ?par, without interpreting it in terms of how the axis itself is positioned -- as written, it rather seems to describe how the annotations are computed and (in the case where xlim/ylim is not given) how long the axis should be. In other words, it describes the properties of the axis itself (including its labels) rather than how it is offset perpendicular to itself (which is the issue I was trying to resolve).

Re-reading it now that I have your solution, I still find that this interpretation is not explicit, and needs to be guessed.

Thanks again,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 21-Jul-08 Time: 13:52:53
-- XFMail --
Re: [R] Getting plot axes where they should be!
On 21-Jul-08 12:43:47, Duncan Murdoch wrote:

> On 7/21/2008 8:13 AM, (Ted Harding) wrote:
>> [...] So I want a Y-axis running from y=2 to y=5, and X-axis as before from x=0 to x=3, and I want these two axes to meet at (x=0,y=2). But how?
>>
>> By analogy to the above, I need to set a "pos=0" for the X-axis, and a "pos=2" for the y-axis. And I have not been able to discover how to do this.
>
> It may or may not be possible in a single call to plot(), but it is certainly straightforward if you use separate calls to plot() and axis():
>
>   plot(c(0.5,2.5),c(2.5,4.5),xlim=c(0,3),ylim=c(2,5), axes=FALSE)
>   axis(1, pos=2)
>   axis(2, pos=0)

Thanks, Duncan! That nicely rounds off Marc's response, and makes sense (and also, in fact, exposes my blind spot when I was reading the documentation in the first place).

> Generally speaking I find it is usually easier not to try to convince plot() to do strange things: I tell it to do nothing, and do the strange things myself.

It depends what you mean by "strange", Duncan. To be frank, I find R's default offset axes to be strange, and what I've been trying to achieve to be normal. But that's a matter of taste, I suppose -- unless you need, for instance, to be able to lay a ruler over the plot and see where it meets the axis!

> Duncan Murdoch

Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 21-Jul-08 Time: 14:00:58
-- XFMail --
[R] Howto Restart A Function with Try-Error Catch
Hi all,

I have a function - let's call it "myfunction". This function is based on some random number generator. Now, once in a while the function will break/crash depending on the random number it generates inside the function.

To avoid the problem, what I intend to do is the following:

1. Catch the "try-error" using class.
2. Redo the function if it returns "try-error".
3. Otherwise keep the output of the function.

I'm not sure how to create the above construct. The code I have below doesn't work:

__BEGIN__
myfunction <- function(the_x) {
    # do something
    a = list(output1 = val1, output2 = val2)
    a
}

out <- try(suppressWarnings(myfunction(x)), silent=T)

if (class(out) == "try-error") {
    # this clause doesn't seem to redo
    out <- myfunction(X)
} else {
    ll <- out$output1
}
__END__

- Gundala Viswanath
Jakarta - Indonesia
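One possible way to express the three steps above is a bounded retry loop. This is a sketch under assumptions rather than an answer from the thread: retry() and flaky() are hypothetical names, flaky() merely stands in for the poster's "myfunction", and the cap on attempts is our addition so that a function that always fails cannot loop forever. inherits() is used instead of comparing class() directly, because an object can carry more than one class.

```r
## Hypothetical helper: call f(...) until it succeeds, at most max_tries times.
retry <- function(f, ..., max_tries = 50) {
  for (i in seq_len(max_tries)) {
    out <- try(f(...), silent = TRUE)        # step 1: catch the error
    if (!inherits(out, "try-error")) {
      return(out)                            # step 3: keep a good result
    }
    ## step 2: otherwise fall through and call f again with fresh randomness
  }
  stop("no successful call after ", max_tries, " attempts")
}

## Stand-in for the poster's function: fails for some random draws.
flaky <- function(the_x) {
  if (runif(1) < 0.5) stop("unlucky random draw")
  list(output1 = the_x, output2 = the_x^2)
}

set.seed(1)
res <- retry(flaky, 3)
ll <- res$output1
```

Note that in the original post the retry branch calls myfunction(X) with an upper-case X while the first call uses x; R is case-sensitive, so these are different objects.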
Re: [R] drawing segments through points with pch=1
Dear Mr. Epstein,

This is another solution; two commands have been added to your own code:

    y=c(1.21,0.51,0.14,1.62,-0.8,0.72,-1.71,0.84,0.02,-0.12)
    ybar=mean(y)
    ll=length(y); ybarv=rep(ybar,ll)
    x=1:ll
    plot(x,ybarv,pch=1)
    segments(x[1],ybar,x[ll],ybar)
    ## cover the points with other points completely white:
    points(x,ybarv,pch=16,col='white')
    ## just write a black border over the white points (with pch=1)
    points(x,ybarv,pch=1,col='black')

The two new commands are layered on top of the previous ones. To draw on top of what a plot command has produced, use the points command.

Regards,
Paulo Barata

Paulo Barata
Fundacao Oswaldo Cruz - Oswaldo Cruz Foundation
Rua Leopoldo Bulhoes 1480 - 8A
21041-210 Rio de Janeiro - RJ
Brazil
E-mail: [EMAIL PROTECTED]
Alternative e-mail: [EMAIL PROTECTED]

Message: 14
Date: Sun, 20 Jul 2008 13:44:27 +0100
From: David Epstein [EMAIL PROTECTED]
Subject: [R] drawing segments through points with pch=1
To: r-help@r-project.org

> Please excuse me for asking such basic questions. Here is my code:
>
>   y=c(1.21,0.51,0.14,1.62,-0.8,0.72,-1.71,0.84,0.02,-0.12)
>   ybar=mean(y)
>   ll=length(y); ybarv=rep(ybar,ll)
>   x=1:ll
>   plot(x,ybarv,pch=1)
>   segments(x[1],ybar,x[ll],ybar)
>
> What I get is a collection of small circles, with a segment on top of the circles, which is almost what I want. But I don't want the segment to be visible inside any small circle. Is there an easy way to arrange for the segment to lie behind the pch=1 markers, as in hidden line removal, so that the circles remain with nothing inside them?
>
> I tried putting the segments command first, but then no segment appeared at all.
>
> In general, is there a method of laying a drawing on top of another? I tried inserting add=T as an argument to plot, and R objected strongly.
>
> Thanks for any help
> David Epstein
[R] Cross correlation significance test
Dear All,

I am doing some cross-correlation analyses on environmental data and wonder if there is a way to get R to compute a test of significance for these?

Thanks for your help,
Nora

--
Nora Hanson
Gatty Marine Institute
Sea Mammal Research Unit
University of St. Andrews
St. Andrews Fife KY16 9AL Scotland
Mobile: 07846140350
[EMAIL PROTECTED]
[R] Subsetting data by date
Hi all,

Firstly, I apologise if this question has been answered previously; however, searching the archives and the internet generally has not yielded any results.

I am looking into the effects of summer weather conditions (temperature, humidity etc.) on the incidence of a breathing disorder brought on through smoking (COPD). I am fairly new to R and completely new to the idea of writing R scripts, subsetting data frames etc. I am working on a 12-week summer placement at the Met Office, UK, having just finished my second year of a mathematics course at university.

Basically I have data between January 1 1997 and December 31 2007. However, as I am only interested in the summer months (which I have defined to be between May 1 and September 30), I would like to extract the relevant data in R in a timely manner. Obviously I could go and open my csv files in Excel, cut and paste the relevant data, etc.; however, I would like to maximise R's potential as I feel it will stand me in better stead in the long run.

Currently the dates are in the form 1-Apr-1997, 3-Sept-2001, etc. I will create a data.frame with date as one of the variables, the others being (initially) temperature, humidity, and Admissions (the number of hospital admissions for COPD exacerbations).

Please could somebody tell me if there is a simple way to extract the data I want, and if so perhaps give a sample command to get me going? Do I first need to format the dates to some numeric-only format? As I say, I could use Excel to create the files in the right format, but I will be dealing with a lot more variables in the future (perhaps up to 8) and so this will become a painstaking process.

Please reply either on or off list. Many thanks for any help.

Robin Williams
Met Office summer intern - Health Forecasting
[EMAIL PROTECTED]
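One way this kind of subsetting could look in R is sketched below. Everything in it is illustrative: the data frame copd and all its values are invented, and the column names simply follow the description in the post. The month names are matched against R's built-in month.abb after truncating to three letters, so that both "Sep" and "Sept" are accepted and the result does not depend on the locale; the dates are then rebuilt as proper Date objects.

```r
## Invented example data in the date format described in the post.
copd <- data.frame(
  date        = c("1-Apr-1997", "15-May-1997", "3-Sept-2001",
                  "30-Sep-2007", "1-Oct-2007"),
  temperature = c(9.5, 14.2, 17.8, 13.1, 11.0),
  admissions  = c(12, 7, 5, 9, 11),
  stringsAsFactors = FALSE
)

## Split "day-month-year" and turn the month name into a number 1..12.
parts <- strsplit(copd$date, "-", fixed = TRUE)
day   <- as.integer(sapply(parts, `[`, 1))
month <- match(substr(sapply(parts, `[`, 2), 1, 3), month.abb)
year  <- as.integer(sapply(parts, `[`, 3))

## Store real dates, then keep May (5) to September (9) of every year.
copd$date <- as.Date(sprintf("%04d-%02d-%02d", year, month, day))
summer <- copd[month >= 5 & month <= 9, ]
summer
```

With real data the data frame would come from read.csv(); adding more weather variables does not change the subsetting line, since whole rows are selected.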
Re: [R] asp and ylim
A really great answer to my concerns! I'll get hold of the Paul Murrell book, and see how far I can get.

On 21 Jul, 2008, at 10:48, Martin Maechler wrote:

> Play around resizing your graphics window. This is very instructive, with an 'asp = .' using traditional graphics plot().

OK, but I don't know what you mean by asp=.. Does this mean setting asp equal to NULL, or to a default setting?

> DE> Perhaps the best solution is to live with plot() as it is. If I need the picture for a paper, I will export data to Matlab or Mathematica or Illustrator, where I can get the control I want.
>
> Hah, you must be kidding!

Change "kidding" to "frustrated"!

> For a paper plot, e.g., pdf() as I'd recommend nowadays, you can set the device region by 'width' and 'height'; and if you really want to use traditional graphics here, do something like
>
>   ## modified by MM from David Epstein's original example
>   myplot <- function(y, yb = mean(y), ylim = c(-1,1)) {
>       ybarv <- rep.int(yb, length(y))
>       x <- seq_along(y)
>       plot(x, y, asp=1, xlab="position", ylab="ybar", type="n", ylim = ylim)
>       abline(h = yb)  ## instead of segments(x[1], ybar, x[ylength], ybar)
>       segments(x, ybarv, x, y)
>       points(x, ybarv, pch=21, bg="white")
>       points(x, y, pch=19, col="black")
>       invisible()
>   }

I learned quite a few things from this code above.

>   y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
>   myplot(y)
>
>   ## MM: setting device region so that ylim = c(-1,1) about fits
>   pdf.do("asp-ex.pdf", height=3.3, width=10)
>   myplot(y)
>   pdf.end()

I cannot find the functions pdf.do and pdf.end. Are these part of some package that I need to load? Your package?

Thanks
David
Re: [R] asp and ylim
>>>>> "DE" == David Epstein on Mon, 21 Jul 2008 14:32:47 +0100 writes:

    DE> A really great answer to my concerns! I'll get hold of the Paul
    DE> Murrell book, and see how far I can get.

    DE> On 21 Jul, 2008, at 10:48, Martin Maechler wrote:

    >> Play around resizing your graphics window. This is very
    >> instructive, with an 'asp = .' using traditional graphics plot().

    DE> OK, but I don't know what you mean by asp=.. Does this mean setting
    DE> asp equal to NULL, or to a default setting?

I meant *any* value such as your example's asp = 1

    DE> Perhaps the best solution is to live with plot() as it
    DE> is. If I need the picture for a paper, I will export
    DE> data to Matlab or Mathematica or Illustrator, where I
    DE> can get the control I want.

    >> Hah, you must be kidding!

    DE> Change kidding to frustrated!

well... your choice :-)

    >> For a paper plot, e.g., pdf() as I'd recommend nowadays, you can set
    >> the device region by 'width' and 'height'; and if you really want to
    >> use traditional graphics here, do something like
    >>
    >>   ## modified by MM from David Epstein's original example
    >>   myplot <- function(y, yb = mean(y), ylim = c(-1,1)) {
    >>       ybarv <- rep.int(yb, length(y))
    >>       x <- seq_along(y)
    >>       plot(x, y, asp = 1, xlab = "position", ylab = "ybar", type = "n", ylim = ylim)
    >>       abline(h = yb)  ## instead of segments(x[1], ybar, x[ylength], ybar)
    >>       segments(x, ybarv, x, y)
    >>       points(x, ybarv, pch = 21, bg = "white")
    >>       points(x, y, pch = 19, col = "black")
    >>       invisible()
    >>   }

    DE> I learned quite a few things from this code above.

    >>   y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
    >>   myplot(y)
    >>   ## MM: setting device region so that ylim = c(-1,1) about fits
    >>   pdf.do("asp-ex.pdf", height = 3.3, width = 10)
    >>   myplot(y)
    >>   pdf.end()

    DE> I cannot find the functions pdf.do and pdf.end. Are these part of
    DE> some package that I need to load? Your package?

Oh, that was an accident: they are part of 'sfsmisc' (a small R-code-only package you can quickly install and load) and I use them all the time, but really, I meant to just use

  pdf("asp-ex.pdf", height = 3.3, width = 10)
  myplot(y)
  dev.off()
  ## and now view the pdf file in your favorite viewer

Martin

    DE> Thanks
    DE> David
Re: [R] Subsetting data by date
Try this:

  Lines <- "Date,Temp\n1-Apr-1997,50\n3-Sept-2001,60"

  library(zoo)

  # function to reduce 4 char mos to 3 char
  convert.date <- function(x, format) as.Date(sub("(-...).-", "\\1-", x), format)

  # z <- read.zoo("myfile.csv", header = TRUE, sep = ",", FUN = convert.date, format = "%d-%b-%Y")
  z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",", FUN = convert.date, format = "%d-%b-%Y")
  plot(z)

If the dates are actually three letters, i.e. Sep and not Sept, then you could eliminate convert.date and simplify the read.zoo line to:

  z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",", format = "%d-%b-%Y")

See the zoo package documentation and its three vignettes as well as ?read.zoo, ?strptime and ?plot.zoo, and also look at the dates article in R News 4/1.

On Mon, Jul 21, 2008 at 9:31 AM, Williams, Robin wrote:
> Hi all,
>
> Firstly I apologise if this question has been answered previously; however, searching the archives and the internet generally has not yielded any results. I am looking into the effects of summer weather conditions (temperature, humidity etc.) on the incidence of a breathing disorder brought on through smoking (COPD). I am fairly new to R and completely new to the idea of writing R scripts, subsetting data frames etc. I am working on a 12-week summer placement at the Met Office, UK, having just finished my second year of a mathematics course at university.
>
> Basically I have data between January 1 1997 and December 31 2007. However, as I am only interested in the summer months (which I have defined to be between May 1 and September 30), I would like to extract the relevant data in R in a timely manner. Obviously I could go and open my csv files in Excel, cut and paste the relevant data, etc.; however, I would like to maximise R's potential as I feel it will stand me in better stead in the long run.
>
> Currently the dates are in the form 1-Apr-1997, 3-Sept-2001, etc. I will create a data.frame with date as one of the variables, the others being (initially) temperature, humidity, and Admissions (the number of hospital admissions for COPD exacerbations). Please could somebody tell me if there is a simple way to extract the data I want, and if so perhaps a sample command to get me going? Do I first need to format the dates to some numeric-only format? As I say, I could use Excel to create the files in the right format, but I will be dealing with a lot more variables in the future (perhaps up to 8) and so this will become a painstaking process.
>
> Please reply either on or off list. Many thanks for any help.
>
> Robin Williams
> Met Office summer intern - Health Forecasting
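For readers who want to stay with a plain data.frame, as the original question describes, the same idea can be sketched in base R. This is only an illustration under stated assumptions: the `weather` data and the `parse_dmy()` helper are hypothetical and not part of zoo or of the thread. The helper matches on the first three letters of the month against R's built-in `month.abb`, so both "Sep" and "Sept" are accepted and the result does not depend on the locale.

```r
# Hypothetical sample in the date format described in the post.
weather <- data.frame(
  Date = c("1-Apr-1997", "15-May-1997", "3-Sept-2001", "31-Dec-2007"),
  Temp = c(50, 58, 60, 41),
  stringsAsFactors = FALSE
)

# parse_dmy() is an illustrative helper, not a function from any package.
# It splits "d-Mon-yyyy" strings and looks the month up in month.abb.
parse_dmy <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  day   <- as.integer(vapply(parts, `[`, character(1), 1))
  month <- match(substr(vapply(parts, `[`, character(1), 2), 1, 3), month.abb)
  year  <- as.integer(vapply(parts, `[`, character(1), 3))
  as.Date(sprintf("%04d-%02d-%02d", year, month, day))
}

weather$Date <- parse_dmy(weather$Date)

# Keep May (5) through September (9) of every year, all rows at once.
mo <- as.integer(format(weather$Date, "%m"))
summer <- weather[mo >= 5 & mo <= 9, ]
print(summer)
```

The month test is applied to the whole column in one step, so the same two lines work unchanged when more variables (humidity, admissions, and so on) are added as columns.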
[R] A question on the quandratic programming
Dear all,

I have an optimization problem as follows, and would appreciate it if someone could reply soon. I aim to optimize a portfolio while taking the transaction cost into account. Hence the objective function is:

  Min: 1/2 w^T*omega*w - mu^T*w - c^T*(w - w0)   when w[i] w0[i]
       1/2 w^T*omega*w - mu^T*w + c^T*(w0 - w)   when w[i] w0[i]

where
  w     is the updated weight vector of the portfolio
  omega is the variance-covariance matrix
  mu    is the vector of return rates
  w0    is the initial weight vector
  c     is the coefficient of transaction cost

This is rather urgent, so I would really appreciate a reply as soon as possible.

Many thanks
Yunlei
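As a reading aid, the objective can be written as an R function. This is a sketch under an assumption: the comparison operators in the two "when" conditions did not survive in the post, so the code below takes the transaction cost to be proportional to the absolute change in each weight, c[i] * |w[i] - w0[i]|, which is one common formulation and may not be exactly what the poster intended. The function name and the two-asset inputs are hypothetical.

```r
# Assumed objective: 1/2 w' omega w - mu' w + sum(c * |w - w0|).
portfolio_objective <- function(w, omega, mu, w0, cost) {
  risk  <- 0.5 * drop(t(w) %*% omega %*% w)  # quadratic risk term
  ret   <- sum(mu * w)                       # expected return term
  tcost <- sum(cost * abs(w - w0))           # assumed proportional cost
  risk - ret + tcost
}

# Hypothetical two-asset inputs, for illustration only.
omega <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)
mu    <- c(0.05, 0.08)
w0    <- c(0.5, 0.5)
cost  <- c(0.002, 0.002)

portfolio_objective(w0, omega, mu, w0, cost)           # no trading, so no cost
portfolio_objective(c(0.6, 0.4), omega, mu, w0, cost)  # after rebalancing
```

Under this assumption the absolute value makes the objective non-smooth. A common way to bring it back to a standard quadratic program is to split the change in each weight into a non-negative buy part and a non-negative sell part and charge the cost on their sum; that reformulation is not shown here.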
[R] On creating grouped data set.
Dear UseRs,

I would like to know how to create a grouped data set such as the Oats data.frame in the nlme package. Specifically, I need to create a grouped data set from the PBIB data.frame in the SASmixed package. Any help? Looking forward to hearing from you.

Best,
[R] Time Series - Long Memory Estimation
Dear R-Users,

I am doing research on time series, in particular on estimating the fractional exponent in long-memory time series (for those who know). Three estimators are already built into the fracdiff package (GPH, Sperio, MLE), but I was wondering whether anyone has used the estimator introduced by P.M. Robinson (related paper: Log-Periodogram regression of time series with long range dependence, 1995, The Annals of Statistics, Vol. 23, p. 1048 - 1072). Like GPH and Sperio, the estimator is based on the periodogram.

Thank you in advance,
Fotis
[R] fama-macbeth
Hi all,

I was wondering whether there is a standard method to carry out Fama-MacBeth regressions in R. I have spent the last few hours looking around the help pages but nothing seems to be written about this.

Thanks a lot!
[R] Cross correlation significance test
Dear All,

I am doing some cross-correlation analyses on environmental data and wonder if there is a way to get R to compute a test of significance for these?

Thanks for your help,
Nora

--
Nora Hanson
Gatty Marine Institute, Sea Mammal Research Unit
University of St. Andrews
St. Andrews, Fife KY16 9AL, Scotland
[R] help with integrate function
Hi,

I've used the integrate function to do numerical integration and was wondering exactly how the algorithm works. The documentation states that it is adaptive quadrature. Does anyone know how the sampling points and weights are chosen, and what transformation is used to convert infinite intervals into finite ones?

Thanks very much
Andrew Brown
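A partial pointer, offered with some caution: the help page for integrate() says it is based on the QUADPACK routines dqags and dqagi. As far as I understand the QUADPACK documentation, these use Gauss-Kronrod rules on adaptively bisected subintervals, and the infinite-range routine first maps the range onto (0, 1] with a substitution of the form x = a + (1 - t)/t. The sketch below does not reproduce the QUADPACK code; it only applies that kind of substitution by hand to a test integrand (a choice made for this example) and compares the result with a direct call.

```r
# Test integrand: the integral of exp(-x) over [0, Inf) is exactly 1.
f <- function(x) exp(-x)

# Direct call: integrate() accepts Inf as a limit.
direct <- integrate(f, lower = 0, upper = Inf)

# Manual change of variable x = (1 - t) / t, so dx = dt / t^2
# (after swapping the limits), which maps [0, Inf) onto (0, 1].
g <- function(t) f((1 - t) / t) / t^2
mapped <- integrate(g, lower = 0, upper = 1)

print(c(direct = direct$value, mapped = mapped$value))
```

Both calls should agree closely with the exact value 1, which illustrates that an infinite range can be handled by a finite-range rule once the variable is transformed. The exact nodes and weights are in the QUADPACK sources on Netlib.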
[R] alternate usage of soil.texture (plotrix)
Hi,

I have used the soil.texture() function from the plotrix package many times and am very pleased that such a function exists in R. I have a slightly different need this time, and need some pointers on how to accomplish it. Instead of plotting single symbols on the triangle, I would like to outline several textural classes and fill them with a transparent color. For example, if an area had both loam and sandy loam, I would like to draw a polygon around the entire loam and sandy loam classes.

Looking at the soil.texture() code, this snippet seems to draw the outlines of the texture classes. However, when I try plotting just a couple of the segments in these lists I get lines going off the original plot.

  h1 <- c(85, 70, 80, 52, 52, 50, 20, 8, 52, 45, 45, 65, 45, 20, 20)/100
  h3 <- c(0, 0, 20, 20, 7, 0, 0, 12, 20, 27, 27, 35, 40, 27, 40)/100
  t1 <- c(90, 85, 52, 52, 43, 23, 8, 0, 45, 0, 45, 45, 0, 20, 0)/100
  t3 <- c(10, 15, 20, 7, 7, 27, 12, 12, 27, 27, 55, 35, 40, 40, 60)/100
  triax.segments(h1, h3, t1, t3, col.lines)

Apart from a purely manual approach using locator(), is there any way to accomplish what I am trying to do with a slight modification to soil.texture()?

Thanks in advance.
Re: [R] TeachingDemos question: my.symbols() alignment problems in complicated layout
Hi Hadley and Paul,

thank you a lot for the suggestions. However, I had also emailed Greg Snow (author of the my.symbols() function) personally, and Greg has emailed me back with a fixed version of my.symbols(), to be included in the next release, that works without a hitch with layout(). I am not posting it here, as I don't know if I am allowed.

Cheers,
Maria

> Here are a couple of options:
>
> (i) use the 'gridBase' package and do these arrow annotations using the 'grid' package, which allows you to control coordinate systems in a more rational manner. There's an example (perhaps slightly more complicated than you need) in: http://cran.r-project.org/doc/Rnews/Rnews_2003-2.pdf
>
> (ii) draw your main plot using 'lattice' and the annotations using 'grid' or possibly 'grImport'. There's a hint of an example of the latter on slide 18 of: http://www.stat.auckland.ac.nz/~paul/Talks/import.pdf
>
> (iii) draw the whole thing using 'grid'. You can start to get acquainted with grid here: http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter5.pdf

> (iv) Use ggplot2 - particularly geom_segment (http://had.co.nz/ggplot2/geom_segment.html) and stat_spoke (http://had.co.nz/ggplot2/stat_spoke.html)
>
> Hadley
> --
> http://had.co.nz/
Re: [R] manipulate a matrix2
Thanks Jim, that was exactly what I was after. On a second note, do you have any insight into pulling out the duplicates in this type of matrix? I thought that was what upper=FALSE is for in:

  csv.dis <- vegdist(csv.m, method='jaccard', binary=FALSE, diag=FALSE, upper=FALSE)

I just need either the lower or upper portion, with the zeros (,3 ,3) being the dividing line.

        [,3] [,5] [,6] [,9] [,11]
  [3,]     0    2    3    4     5
  [5,]     2    0    8    9    10
  [6,]     3    8    0   14    15
  [9,]     4    9   14    0    20
  [11,]    5   10   15   20     0

Thanks again,
Jon

-----Original Message-----
From: jim holtman
Sent: Friday, July 18, 2008 9:56 AM
To: Jon Hak
Cc: r-help@r-project.org
Subject: Re: [R] manipulate a matrix2

> Is this what you want:
>
> > x
>         [,3] [,5] [,6] [,9] [,11]
>   [,3]     1    6   11   16    21
>   [,5]     2    7   12   17    22
>   [,6]     3    8   13   18    23
>   [,9]     4    9   14   19    24
>   [,11]    5   10   15   20    25
> > library(reshape)
> > melt(x)
>         X1    X2 value
>   1   [,3]  [,3]     1
>   2   [,5]  [,3]     2
>   3   [,6]  [,3]     3
>   4   [,9]  [,3]     4
>   5  [,11]  [,3]     5
>   6   [,3]  [,5]     6
>   7   [,5]  [,5]     7
>   8   [,6]  [,5]     8
>   9   [,9]  [,5]     9
>   10 [,11]  [,5]    10
>   11  [,3]  [,6]    11
>   12  [,5]  [,6]    12
>   13  [,6]  [,6]    13
>   14  [,9]  [,6]    14
>   15 [,11]  [,6]    15
>   16  [,3]  [,9]    16
>   17  [,5]  [,9]    17
>   18  [,6]  [,9]    18
>   19  [,9]  [,9]    19
>   20 [,11]  [,9]    20
>   21  [,3] [,11]    21
>   22  [,5] [,11]    22
>   23  [,6] [,11]    23
>   24  [,9] [,11]    24
>   25 [,11] [,11]    25
>
> On Fri, Jul 18, 2008 at 11:10 AM, Jon Hak wrote:
>> Building upon Jim's answer below (Thanks Jim, that helped a lot), I need to pick up where this thread left off. I'm using Vegan to calculate Jaccard's index, and the row names and column names are represented in my matrix as seen here.
>>
>>         [,3] [,5] [,6] [,9] [,11]
>>   [3,]     0    6   11   16    21
>>   [5,]     2    0   12   17    22
>>   [6,]     3    8    0   18    23
>>   [9,]     4    9   14    0    24
>>   [11,]    5   10   15   20     0
>>
>> When I use the command
>>
>>   xy <- cbind(row=as.vector(row.names(x)), col=as.vector(colnames(x)), value=as.vector(x))
>>
>> I get the list (the col column is the issue):
>>
>>          row col value
>>    [1,]    3   1     0
>>    [2,]    5   1     2
>>    [3,]    6   1     3
>>    [4,]    9   1     4
>>    [5,]   11   1     5
>>    [6,]    3   2     6
>>    [7,]    5   2     0
>>    [8,]    6   2     8
>>    [9,]    9   2     9
>>   [10,]   11   2    10
>>   [11,]    3   3    11
>>   [12,]    5   3     0
>>
>> I would really like the col value to equal the actual name, not the column number. What am I missing? The analysis is very large, a 6k x 6k matrix, so automating the process is a high priority.
>>
>> Thanks, Jon
>>
>> From: jim holtman
>> Date: Mon, 25 Jun 2007 12:39:46 -0400
>>
>> Is this what you want?
>>
>> > x
>>        [,1] [,2] [,3] [,4] [,5]
>>   [1,]    1    6   11   16   21
>>   [2,]    2    7   12   17   22
>>   [3,]    3    8   13   18   23
>>   [4,]    4    9   14   19   24
>>   [5,]    5   10   15   20   25
>> > cbind(row=as.vector(row(x)), col=as.vector(col(x)), value=as.vector(x))
>>         row col value
>>    [1,]   1   1     1
>>    [2,]   2   1     2
>>    [3,]   3   1     3
>>    [4,]   4   1     4
>>    [5,]   5   1     5
>>    [6,]   1   2     6
>>    [7,]   2   2     7
>>    [8,]   3   2     8
>>    [9,]   4   2     9
>>   [10,]   5   2    10
>>   [11,]   1   3    11
>>   [12,]   2   3    12
>>   [13,]   3   3    13
>>   [14,]   4   3    14
>>   [15,]   5   3    15
>>   [16,]   1   4    16
>>   [17,]   2   4    17
>>   [18,]   3   4    18
>>   [19,]   4   4    19
>>   [20,]   5   4    20
>>   [21,]   1   5    21
>>   [22,]   2   5    22
>>   [23,]   3   5    23
>>   [24,]   4   5    24
>>   [25,]   5   5    25
>>
>> On 6/25/07, Jon Hak wrote:
>> I have read everything I can find on how to manipulate a results matrix in R and I have to admit I'm stumped. I have set up a process to extract a dataset from ArcGIS to compute a similarity index (Jaccard's) in Vegan. The dataset is fairly simple, but large, and consists of rows = sample area, and columns = elements. I've been able to view the results in R, but I want to get the results out to a database, and a matrix that is 6000 rows x 6000 columns can be very difficult to manipulate in Windows XP. I would like to rotate the matrix so that the output would look like the old condensed format in programs like Canoco. Ideally, I would like the format to look something like this:
>>
>>   Site-row Site-col Jaccard
>>          1        1       1
>>          1        2      .9
>>          1        3      .6
>>          2        1      .9
>>          2        2       1
>>          2        3     .75
>>
>> Thanks for any help,
>>
>> John Hak
>> Senior GIS Analyst/Sr. Ecologist
>> NatureServe
>>
>> There is perhaps no better demonstration of the folly of human conceits than this distant image of our tiny world. To me, it underscores our responsibility to
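Both open points in this thread, keeping the actual column names and dropping the duplicated half of a symmetric matrix, can be sketched in base R without reshape. This is an illustration on a hypothetical 3 x 3 matrix, not the poster's data; the object names are made up for the example. It uses lower.tri() to pick the cells below the diagonal and which(..., arr.ind = TRUE) to turn them into (row, column) index pairs, which are then used to look up the labels.

```r
# Hypothetical 3 x 3 symmetric dissimilarity matrix with site labels.
sites <- c("3", "5", "6")
m <- matrix(c(0, 2, 3,
              2, 0, 8,
              3, 8, 0),
            nrow = 3, dimnames = list(sites, sites))

# Index pairs of the cells strictly below the diagonal.
idx <- which(lower.tri(m), arr.ind = TRUE)

# Long format: one line per pair, labelled by name rather than by position.
long <- data.frame(
  row   = rownames(m)[idx[, 1]],
  col   = colnames(m)[idx[, 2]],
  value = m[idx],
  stringsAsFactors = FALSE
)
print(long)
```

Each pair of sites appears once, and the diagonal zeros are left out. For a 6000 x 6000 matrix this gives 6000 * 5999 / 2 = 17,997,000 rows, which can be written out with write.csv() or a database interface. Note also that vegdist() returns a "dist" object, which already stores only one triangle; the full square matrix only appears after as.matrix().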
Re: [R] Subsetting data by date
Thanks very much for this. The dates are in fact in 3-character form (i.e. Sep and not Sept). Any suggestions as to where one can start to learn the R language? Up until now, I have only entered simple commands in the terminal.

Best wishes,
Robin Williams
Met Office summer intern - Health Forecasting

-----Original Message-----
From: Gabor Grothendieck
Sent: Monday, July 21, 2008 3:26 PM
To: Williams, Robin
Cc: R-help@r-project.org
Subject: Re: [R] Subsetting data by date

> Continuing on, to just get points from May to Sep:
>
>   mo <- as.numeric(format(time(z), "%m"))
>   z.summer <- z[mo >= 5 & mo <= 9]
>
> If in your case z is multivariate rather than univariate (as it is in our example) then it would be:
>
>   z.summer <- z[mo >= 5 & mo <= 9, ]
Re: [R] Subsetting data by date
And following on from my original question: in another file I have the dates in numeric format (ddmmyy). How could I extract the data between 01/05/97 and 30/09/97, etc.? Do I need to implement some sort of loop?

Robin Williams
Met Office summer intern - Health Forecasting
Re: [R] Subsetting data by date
Here are a few very simple notes I wrote up for someone a little while ago. They may be a bit of help. R documentation is fairly confusing. The Help is almost always very complete but not necessarily easy to understand for a neophyte. :) Probably the first thing to do is to look into getting a decent editor for R.

http://www.mediafire.com/?npzjnlgzg2y

--- On Mon, 7/21/08, Williams, Robin wrote:

> Any suggestions as to where one can start to learn the R language? Up until now, I have only entered simple commands in the terminal.
>
> Best wishes,
> Robin Williams
> Met Office summer intern - Health Forecasting
Re: [R] Subsetting data by date
?subset is one of several ways. You don't need a loop. Loops are BAD in R :) --- On Mon, 7/21/08, Williams, Robin [EMAIL PROTECTED] wrote: From: Williams, Robin [EMAIL PROTECTED] Subject: Re: [R] Subsetting data by date To: Gabor Grothendieck [EMAIL PROTECTED] Cc: R-help@r-project.org Received: Monday, July 21, 2008, 10:41 AM And following on from my original question: In another file I have the dates in numeric format, (ddmmyy). How could I extract the data between 01/05/97 and 30/09/97, etc? Do I need to implement some sort of loop? Robin Williams Met Office summer intern - Health Forecasting [EMAIL PROTECTED] -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, July 21, 2008 3:26 PM To: Williams, Robin Cc: R-help@r-project.org Subject: Re: [R] Subsetting data by date Continuing on, to just get points from May to Sep mo - as.numeric(format(time(z), %m)) z.summer - z[mo = 5 mo = 9] If in your case z is multivariate rather than univariate (as it is in our example) then it would be: z.summer - z[mo = 5 mo = 9, ] On Mon, Jul 21, 2008 at 9:55 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote: Try this: Lines - Date,Temp 1-Apr-1997,50 3-Sept-2001,60 library(zoo) # function to reduce 4 char mos to 3 char convert.date - function(x, format) as.Date(sub((-...).-, \\1-, x), format) # z - read.zoo(myfile.csv, header = TRUE, sep = ,, FUN = convert.date, format = %d-%b-%Y) z - read.zoo(textConnection(Lines), header = TRUE, sep = ,, FUN = convert.date, format = %d-%b-%Y) plot(z) If the dates are actually three letters, i.e. Sep and not Sept, then you could eliminate convert.date and simplify the read.zoo line to: z - read.zoo(textConnection(Lines), header = TRUE, sep = ,, format = %d-%b-%Y) See the zoo package documentation and its three vignettes as well as ?read.zoo ?strptime and ?plot.zoo and also look at the dates article in R News 4/1. 
On Mon, Jul 21, 2008 at 9:31 AM, Williams, Robin [EMAIL PROTECTED] wrote: Hi all, Firstly I appologise if this question has been answered previously, however searching of the archives and the internet generally has not yielded any results. I am looking in to the effects of summer weather conditions (temperature, humidity etc), on the incidences of a breathing disorder brought on through smoking (COPD). I am fairly new to R and completely new to the idea of writing R scripts, subsetting dataframes etc. I am working on a 12 week summer placement at the Met Office, UK, having just finished my second year of a mathematics course at university. Basically I have data between January 1 1997 and December 31 2007. However as I am only interest in the summer months (which I have defined to be between May 1 and September 30), I would like to extract the relevant data in R in a timely manner. Obviously I could go and open my csv files in excel, cut and paste the relevant data, etc, however I would like to maximise R's potential as I feel it will stand me in better stead in the long run. Currently the dates are in the form 1-Apr-1997, 3-Sept-2001, etc. I will create a data.frame with date as one of the variables, the others being (initially) temperature, humidity, and Admissions (the number of hospital admissions for COPD exaserbations). Please could somebody tell me if there is a simple way to extract the data I want, and if so perhaps a sample command to get me going? Do I first need to format the dates to some numeric-only format? As I say, I could use Excel to create the files in the right format, but I will be dealing with a lot more variables in the future (perhaps up to 8) and so this will become a pain-staking process. Please reply either on or off list. Many thanks for any help. 
Robin Williams Met Office summer intern - Health Forecasting [EMAIL PROTECTED]
Re: [R] Subsetting data by date
You need to review all the items I already mentioned. The % codes are in ?strptime so just use those appropriate to your format. The R manuals, the books listed here: http://www.r-project.org/doc/bib/R-publications.html and the Contributed Documentation: http://cran.r-project.org/other-docs.html are among the items you can use to learn R.

On Mon, Jul 21, 2008 at 10:41 AM, Williams, Robin [EMAIL PROTECTED] wrote:

And following on from my original question: In another file I have the dates in numeric format (ddmmyy). How could I extract the data between 01/05/97 and 30/09/97, etc? Do I need to implement some sort of loop?

Robin Williams
Met Office summer intern - Health Forecasting
[EMAIL PROTECTED]

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
Sent: Monday, July 21, 2008 3:26 PM
To: Williams, Robin
Cc: R-help@r-project.org
Subject: Re: [R] Subsetting data by date

Continuing on, to just get points from May to Sep:

    mo <- as.numeric(format(time(z), "%m"))
    z.summer <- z[mo >= 5 & mo <= 9]

If in your case z is multivariate rather than univariate (as it is in our example) then it would be:

    z.summer <- z[mo >= 5 & mo <= 9, ]

On Mon, Jul 21, 2008 at 9:55 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote:

Try this:

    Lines <- "Date,Temp
1-Apr-1997,50
3-Sept-2001,60"
    library(zoo)
    # function to reduce 4 char mos to 3 char
    convert.date <- function(x, format) as.Date(sub("(-...).-", "\\1-", x), format)
    # z <- read.zoo("myfile.csv", header = TRUE, sep = ",", FUN = convert.date, format = "%d-%b-%Y")
    z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",", FUN = convert.date, format = "%d-%b-%Y")
    plot(z)

If the dates are actually three letters, i.e. Sep and not Sept, then you could eliminate convert.date and simplify the read.zoo line to:

    z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",", format = "%d-%b-%Y")

See the zoo package documentation and its three vignettes as well as ?read.zoo ?strptime and ?plot.zoo and also look at the dates article in R News 4/1.
On Mon, Jul 21, 2008 at 9:31 AM, Williams, Robin [EMAIL PROTECTED] wrote: Hi all, Firstly I apologise if this question has been answered previously, however searching of the archives and the internet generally has not yielded any results. I am looking into the effects of summer weather conditions (temperature, humidity etc.) on the incidence of a breathing disorder brought on through smoking (COPD). I am fairly new to R and completely new to the idea of writing R scripts, subsetting dataframes etc. I am working on a 12 week summer placement at the Met Office, UK, having just finished my second year of a mathematics course at university. Basically I have data between January 1 1997 and December 31 2007. However, as I am only interested in the summer months (which I have defined to be between May 1 and September 30), I would like to extract the relevant data in R in a timely manner. Obviously I could go and open my csv files in Excel, cut and paste the relevant data, etc., however I would like to maximise R's potential as I feel it will stand me in better stead in the long run. Currently the dates are in the form 1-Apr-1997, 3-Sept-2001, etc. I will create a data.frame with date as one of the variables, the others being (initially) temperature, humidity, and Admissions (the number of hospital admissions for COPD exacerbations). Please could somebody tell me if there is a simple way to extract the data I want, and if so perhaps a sample command to get me going? Do I first need to format the dates to some numeric-only format? As I say, I could use Excel to create the files in the right format, but I will be dealing with a lot more variables in the future (perhaps up to 8) and so this will become a painstaking process. Please reply either on or off list. Many thanks for any help.
Robin Williams Met Office summer intern - Health Forecasting [EMAIL PROTECTED]
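For the follow-up question about numeric ddmmyy dates, the advice in this thread (use the ?strptime format codes, no loop needed) can be sketched in base R roughly as follows. The data frame, its column names and its values are made up for illustration; only as.Date, format, sprintf and subset are relied on.

```r
# Hypothetical data: dates stored as numbers in ddmmyy form, so a leading
# zero is lost when the file is read (010497 becomes 10497).
raw <- c(10497, 150597, 300997, 11097, 30901)
df <- data.frame(date = raw, temp = c(50, 55, 60, 52, 61))

# Restore the leading zero, then parse with the strptime-style codes.
df$date <- as.Date(sprintf("%06d", df$date), format = "%d%m%y")

# All May-to-September rows, in any year, via a logical index (no loop).
mo <- as.numeric(format(df$date, "%m"))
summer <- df[mo >= 5 & mo <= 9, ]

# One specific summer via subset() and a date range.
summer97 <- subset(df, date >= as.Date("1997-05-01") &
                       date <= as.Date("1997-09-30"))
```

Two-digit years are expanded by the %y rule documented in ?strptime, which is worth checking against the actual data range before relying on it.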
[R] Coefficients of Logistic Regression from bootstrap - how to get them?
Hello all, I am trying to optimize my logistic regression model by using bootstrap. I was previously using SAS for this kind of tasks, but I am now switching to R. My data frame consists of 5 columns and has 109 rows. Each row is a single record composed of the following values: Subject_name, numeric1, numeric2, numeric3 and outcome (yes or no). All three numerics are used to predict outcome using LR. In SAS I have written a macro, that was splitting the dataset, running LR on one half of data and making predictions on second half. Then it was collecting the equation coefficients from each iteration of bootstrap. Later I was just taking medians of these coefficients from all iterations, and used them as an optimal model - it really worked well! Now I want to do the same in R. I tried to use the 'validate' or 'calibrate' functions from package Design, and I also experimented with function 'sm.binomial.bootstrap' from package sm. I tried also the function 'boot' from package boot, though without success - in my case it randomly selected _columns_ from my data frame, while I wanted it to select _rows_. Though the main point here is the optimized LR equation. I would appreciate any help on how to extract the LR equation coefficients from any of these bootstrap functions, in the same form as given by 'glm' or 'lrm'. Many thanks in advance! -- Michal J. Figurski HUP, Pathology Laboratory Medicine Xenobiotics Toxicokinetics Research Laboratory 3400 Spruce St. 7 Maloney Philadelphia, PA 19104 tel. (215) 662-3413 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
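One way to collect glm coefficients over bootstrap resamples of rows is a plain loop-free sketch in base R, shown below. Everything in it is an assumption made for illustration: the simulated data, the column names borrowed from the description above, and the number of resamples. It reproduces the "median of bootstrap coefficients" idea as described by the poster; it is not a statement that this is a statistically preferable way to choose a model.

```r
set.seed(123)

# Simulated stand-in for the 109-row data set described above.
n <- 109
dat <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
dat$outcome <- rbinom(n, 1, plogis(0.5 + dat$numeric1 - 0.5 * dat$numeric2))

# Fit the logistic regression on one data set and return its coefficients.
fit_coefs <- function(d) {
  coef(glm(outcome ~ numeric1 + numeric2 + numeric3,
           family = binomial, data = d))
}

# Resample ROWS with replacement, refit, and keep one row of coefficients
# per resample.
B <- 200
boot_coefs <- t(replicate(B, fit_coefs(dat[sample(nrow(dat), replace = TRUE), ])))

# Median of each coefficient across resamples, named like coef(glm(...)).
median_coefs <- apply(boot_coefs, 2, median)
```

With the boot package, the same row resampling is usually expressed through a statistic function of the form function(data, indices) that refits on data[indices, ]; getting that indexing wrong is one plausible cause of columns being sampled instead of rows.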
Re: [R] How to solve systems of nonlinear equations in R?
Also take a look at the package BB for solving nonlinear systems. Ravi. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Berend Hasselman Sent: Saturday, July 19, 2008 7:45 AM To: r-help@r-project.org Subject: Re: [R] How to solve systems of nonlinear equations in R? François Aucoin wrote: Hey, I was wondering if there existed a R function similar to 'fsolve' or 'fzero' Matlab functions? For a single function of one variable you can use uniroot. Use ?uniroot in R to find out more. You can also try general purpose optimisation algorithms such as optim/nlm, but they don't always find a solution and they are not very efficient for systems of equations. I am working on a package for solving (dense) non linear systems of equations using Broyden/Newton and global search methods. But it isn't ready yet and it will take time ... Berend -- View this message in context: http://www.nabble.com/How-to-solve-systems-of-nonlinear-equations-in-R--tp18 539798p18543785.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
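The two base-R routes mentioned above can be sketched as follows: uniroot for one equation in one unknown, and a general-purpose optimiser applied to the sum of squared residuals of a small system. The equations are invented for illustration, and the optimiser route is only a fallback with the weaknesses already noted in the reply (it may stall in a local minimum that is not a root).

```r
# Single equation in one variable: x^3 - 2x - 5 = 0 on the bracket [2, 3].
f1 <- function(x) x^3 - 2 * x - 5
root <- uniroot(f1, interval = c(2, 3), tol = 1e-10)$root

# Small system (made up): x^2 + y^2 = 4 and x = y.
sys <- function(p) c(p[1]^2 + p[2]^2 - 4, p[1] - p[2])

# Turn root finding into minimisation of the squared residuals.
obj <- function(p) sum(sys(p)^2)
sol <- optim(c(1, 0.5), obj, method = "BFGS")

# Always check the residuals: a small objective value is what indicates a root.
residuals_at_solution <- sys(sol$par)
```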
[R] Avoid loop with three-dimensional array
Dear R users, I'm trying to find a way to optimize my code. I have to run a simulation of 50,000 iterations, so optimized code is essential. I have to compute

    sum_t ( t(X_t) %*% A %*% X_t )

where X_t is a (d*k) matrix which changes in time and A is a (d*d) matrix that is constant in time. I have put all my X_t in a three-dimensional array X of dimension (d,k,T). At the moment, to compute the sum over time, I run a for loop, save each resulting (k*k) matrix in a list, and at the end sum the T matrices in this list. I'm wondering if there is a better way to do this. Here is an example of what I have to do:

    d = 3
    k = 2
    T = 4
    X = array(rnorm(d*k*T), dim=c(d,k,T))
    A = matrix(rnorm(d*d), d, d)
    e1 = list()
    for (t in 1:T) {   # I would like to avoid this
      e1[[t]] = t(X[,,t]) %*% A %*% X[,,t]
    }
    # Function for doing the sum of matrices in a list
    sumMatrices <- function(matrices) {
      if (length(matrices) > 2) matrices[[1]] + Recall(matrices[-1])
      else matrices[[1]] + matrices[[2]]
    }
    result = sumMatrices(e1)

Thank you in advance for all your help, best Michela
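A possible loop-free version of the sum above is sketched here. It relies on the identity sum_t t(X_t) %*% A %*% X_t = crossprod(Xs, AXs), where Xs stacks the X_t on top of each other and AXs stacks the products A %*% X_t in the same order. The variable names (nT instead of T, Xs, AXs) are my own, and whether this is faster for a given d, k and T is something to measure rather than assume.

```r
set.seed(1)
d <- 3; k <- 2; nT <- 4            # nT is used instead of T to avoid masking TRUE's alias
X <- array(rnorm(d * k * nT), dim = c(d, k, nT))
A <- matrix(rnorm(d * d), d, d)

# Reference: accumulate in a loop without storing a list.
loop_sum <- matrix(0, k, k)
for (t in seq_len(nT)) loop_sum <- loop_sum + t(X[, , t]) %*% A %*% X[, , t]

# Same thing with Reduce(), which replaces the recursive sumMatrices helper.
reduce_sum <- Reduce(`+`, lapply(seq_len(nT), function(t) {
  crossprod(X[, , t], A %*% X[, , t])
}))

# Fully vectorised: one matrix product for all A %*% X_t, then one crossprod.
AX  <- array(A %*% matrix(X, d, k * nT), dim = c(d, k, nT))
Xs  <- matrix(aperm(X,  c(1, 3, 2)), d * nT, k)   # rows indexed by (row of X_t, t)
AXs <- matrix(aperm(AX, c(1, 3, 2)), d * nT, k)
vec_sum <- crossprod(Xs, AXs)
```

Note that X[, , t] drops to a vector when k is 1, so the loop and Reduce() versions would need drop = FALSE in that case; the vectorised version does not index slices and is unaffected.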
[R] Creation of png=problems
Hi everybody, I am currently working with R and I would like to create JPEG graphs with it. I work on both Windows and Unix, but I would like to be able to create graphs (jpeg, png, bitmap...) under Unix. I am working on Solaris version 8. The documentation for R states that the latest version of R known to compile on Solaris 8 is version 2.6.2. I have been able to compile and install R version 2.6.2 under Unix. I also installed the Hmisc package needed by my script. The problem is that when I launch my script to create graphs from text files I get this error:

    > Create_Graph(File)
    Error in X11(paste("png::", filename, sep = ""), width, height, pointsize, : unable to start device PNG
    In addition: Warning message:
    In png(paste(PATH, "/", filename, sep = ""), 800, 600) : no png support in this version of R

I have run the same script under Windows with the same version (2.6.2) and everything goes well: the graphs are created. Is it because (under Unix):

    > capabilities()
        jpeg      png    tcltk      X11 http/ftp  sockets   libxml     fifo
       FALSE    FALSE    FALSE     TRUE     TRUE     TRUE     TRUE     TRUE

If so, could someone explain to me how to change these capabilities so that they are set to TRUE? I have also tried other versions of R but the error is the same. If you have any other ideas about why it doesn't work I would be very grateful. I have created a program under Unix to automate the creation of the text files and I would like to create the corresponding graphs automatically. I hope someone will be able to help me. Thanks for your time. Have a good day. R.
Re: [R] Time Series - Long Memory Estimation
Thanks for the reply. Right, there are some functions in Rmetrics concerning long memory, however the estimator I am searching for is not included. {For anyone who's interested in long memory, the Rmetrics package collection offers pretty much the same as several other packages do separately. Actually, I think that the Rmetrics functions are a bit more efficient.} If anyone else knows anything about Robinson's estimator, please send me an e-mail. Thanks again,

On Mon, Jul 21, 2008 at 4:26 PM, [EMAIL PROTECTED] wrote: There may be something for that in one of the Rmetrics packages, possibly? I've not used them but I know that they have some long memory functions.

On Mon, Jul 21, 2008 at 7:35 AM, Fotis Papailias wrote: Dear R-Users, I am doing research on time series, especially on the estimation of the fractional exponent in long memory time series (for those who know). Three estimators are already built into the fracdiff package (GPH, Sperio, MLE), but I was wondering if anyone has used the estimator introduced by P.M. Robinson (related paper: Log-Periodogram regression of time series with long range dependence, 1995, The Annals of Statistics, Vol. 23, p. 1048 - 1072). The estimator is similar to GPH and Sperio, being based on the periodogram. Thank you in advance, Fotis

-- fp
Re: [R] manipulate a matrix2
I am not familiar with the vegdist function. What defines a duplicate in the matrix? There are ways if identifying if more than one row meets the criteria duplicates and then removing them. Can you give an illustration of what you mean with a before/after data representation. On Mon, Jul 21, 2008 at 10:22 AM, Jon Hak [EMAIL PROTECTED] wrote: Thanks Jim, that was exactly what I was after. On a second note, do you have any insight into pulling out the duplicates in this type of matrix? I thought that was what the upper=FALSE is in: csv.dis - vegdist(csv.m, method='jaccard', binary=FALSE, diag=FALSE, upper=FALSE). I just need either the lower or upper portion, with the zeros (,3 ,3) being the dividing line. [,3] [,5] [,6] [,9] [,11] [3,]02 3 4 5 [5,]20 8 9 10 [6,]38 0 14 15 [9,]49 14 020 [11,] 5 10 15 20 0 Thanks again, Jon -Original Message- From: jim holtman [mailto:[EMAIL PROTECTED] Sent: Friday, July 18, 2008 9:56 AM To: Jon Hak Cc: r-help@r-project.org Subject: Re: [R] manipulate a matrix2 Is this what you want: x [,3] [,5] [,6] [,9] [,11] [,3] 16 11 1621 [,5] 27 12 1722 [,6] 38 13 1823 [,9] 49 14 1924 [,11]5 10 15 2025 library(reshape) melt(x) X1X2 value 1 [,3] [,3] 1 2 [,5] [,3] 2 3 [,6] [,3] 3 4 [,9] [,3] 4 5 [,11] [,3] 5 6 [,3] [,5] 6 7 [,5] [,5] 7 8 [,6] [,5] 8 9 [,9] [,5] 9 10 [,11] [,5]10 11 [,3] [,6]11 12 [,5] [,6]12 13 [,6] [,6]13 14 [,9] [,6]14 15 [,11] [,6]15 16 [,3] [,9]16 17 [,5] [,9]17 18 [,6] [,9]18 19 [,9] [,9]19 20 [,11] [,9]20 21 [,3] [,11]21 22 [,5] [,11]22 23 [,6] [,11]23 24 [,9] [,11]24 25 [,11] [,11]25 On Fri, Jul 18, 2008 at 11:10 AM, Jon Hak [EMAIL PROTECTED] wrote: Building upon Jim's answer below (Thanks Jim, that helped a lot), I need to pickup where this thread left off. I'm using Vegan to calculate the Jaccard's Index and the Row.Names and column names are represented in my matrix as seen here. 
[,3] [,5] [,6] [,9] [,11] [3,]06 11 16 21 [5,]20 12 17 22 [6,]38 018 23 [9,]49 14 024 [11,] 5 10 15 20 0 When I use the command; xy - cbind(row=as.vector(row.names(x)), col=as.vector(colnames(x)), value=as.vector(x)) I get the list (the column value is the issue); rowcol value [1,] 3 1 0 [2,] 5 1 2 [3,] 6 1 3 [4,] 9 1 4 [5,] 11 1 5 [6,] 3 2 6 [7,] 5 2 0 [8,] 6 2 8 [9,] 9 2 9 [10,] 11 210 [11,] 3311 [12,] 530 I would really like the col value to equal the actual name, not the column number. What am I missing? The analysis is very large, 6k x6k matrix so automating the process is a high priority. Thanks, Jon From: jim holtman jholtman_at_gmail.com mailto:jholtman_at_gmail.com?Subject=Re:%20%5BR%5D%20manipulate%20a%20m atrix Date: Mon, 25 Jun 2007 12:39:46 -0400 Is this what you want? x [,1] [,2] [,3] [,4] [,5] [1,]16 11 16 21 [2,]27 12 17 22 [3,]38 13 18 23 [4,]49 14 19 24 [5,]5 10 15 20 25 cbind(row=as.vector(row(x)), col=as.vector(col(x)), value=as.vector(x)) row col value [1,] 1 1 1 [2,] 2 1 2 [3,] 3 1 3 [4,] 4 1 4 [5,] 5 1 5 [6,] 1 2 6 [7,] 2 2 7 [8,] 3 2 8 [9,] 4 2 9 [10,] 5 210 [11,] 1 311 [12,] 2 312 [13,] 3 313 [14,] 4 314 [15,] 5 315 [16,] 1 416 [17,] 2 417 [18,] 3 418 [19,] 4 419 [20,] 5 420 [21,] 1 521 [22,] 2 522 [23,] 3 523 [24,] 4 5 24 [25,] 5 5 25 On 6/25/07, Jon Hak Jon_Hak_at_natureserve.org wrote: I have read everything I can find on how to manipulate a results matrix in http://tolstoy.newcastle.edu.au/R/e2/help/07/06/19875.html#19887qlink1 R and I have to admit I'm stumped. I have set up a process to extract a dataset from ArcGIS to compute a similarity index (Jaccards) in Vegan. The dataset is fairly simple, but large, and consists of rows = sample area, and columns = elements. I've been able to view the results in R, but I want to get the results out to a database and a matrix that is 6000-rows x 6000-columns can be very difficult to manipulate in Windows XP. 
I would to rotate the matrix so that the output would look like the old condensed format in programs like Conoco. Ideally, I would like format to look something like this; Site-row Site-col Jaccard 1 1 1
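To get the site names rather than positions into the long (row, column, value) layout asked for in this thread, one base-R sketch is to index the dimnames by row() and col(). The 5 x 5 matrix and its site labels below are made up to mirror the example; nothing here depends on vegan or reshape.

```r
# Hypothetical square matrix whose row and column names are site IDs.
sites <- c("3", "5", "6", "9", "11")
x <- matrix(1:25, nrow = 5, ncol = 5, dimnames = list(sites, sites))

# row(x) and col(x) give the position of every cell; using them to index
# the dimnames turns positions into names, in the same column-major order
# that as.vector(x) uses.
long <- data.frame(
  row   = rownames(x)[row(x)],
  col   = colnames(x)[col(x)],
  value = as.vector(x),
  stringsAsFactors = FALSE
)

head(long)
```

For a 6000 x 6000 matrix this produces 36 million rows, so it may be worth keeping only one triangle before writing the result to a database.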
[R] xyplot: distance between axis and axis-label gets wrong
Hi! I just started reading the wonderful Lattice book and I finally found a quite elegant solution for nicer log-ticks. However, there are some problems with the spacing between the axis label and the axis tick marks. It seems that lattice estimates the space which gets used by the labels before it calls the yscale.components function. However, this function can change what is supposed to be plotted and therefore the width can change, making previous calculations void. Here is an example:

    ##
    ## functions for nice log-axis
    ##
    logTicks <- function (lim, loc = c(1, 5), base=10) {
      ii <- floor(log(range(lim), base)) + c(-1, 2)
      main <- base^(ii[1]:ii[2])
      r <- as.numeric(outer(loc, main, "*"))
      r[lim[1] <= r & r <= lim[2]]
    }

    xyscale.components.log <- function(lim, ..., side=c("bottom"), base=10, majorTickFac=1.5, loc=c(1,5)) {
      if(side %in% c("left", "right")) ans <- yscale.components.default(lim = lim, ...)
      if(side %in% c("bottom", "top")) ans <- xscale.components.default(lim = lim, ...)
      tick.at <- logTicks(base^lim, loc = loc, base)
      tick.at.major <- logTicks(base^lim, loc = 1, base)
      major.powers <- log(tick.at.major, base)
      major.labels <- parse(text=paste(base, "^", major.powers, sep=""))
      major <- tick.at %in% tick.at.major
      ans[[side]]$ticks$at <- log(tick.at, 10)
      ans[[side]]$ticks$tck <- ifelse(major, majorTickFac, 1.0)
      ans[[side]]$labels$at <- log(tick.at, 10)
      ans[[side]]$labels$labels[major] <- major.labels
      ans[[side]]$labels$labels[!major] <- ""
      ans[[side]]$labels$check.overlap <- FALSE
      ans
    }

    xyscale.components.log.custom <- function(...) {
      args <- list(...)
      function(...) {
        dots <- list(...)
        do.call(xyscale.components.log, modifyList(dots, args))
      }
    }

    x <- 1:100
    y <- x^3
    xyplot(y ~ x, scales=list(log=T),
           xscale.component=xyscale.components.log.custom(side="bottom", loc=c(1,3,8)),
           yscale.component=xyscale.components.log.custom(side="left", loc=c(1,3,8)),
           ylab=expression(rho) )
    xyplot(y ~ x, scales=list(log=T), ylab=expression(rho) )

In the first plot, the axis labels are too far away from the nicely formatted 10^x expression. What can I do about it? A solution would be to put in the labels via the scales-argument into the lattice-machinery, but this is not very nice as I would have no automatic calculation of the limits ...

Greetings, Sebastian Weber
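To make it easier to see what the tick helper in this post is meant to produce, here is a small stand-alone sketch that needs no lattice. It assumes the comparison operators and the "*" argument that the list archive appears to have stripped, and it renames the helper to log_ticks to mark it as a reconstruction rather than the poster's exact code.

```r
# Candidate tick positions loc * base^n that fall inside the data limits.
log_ticks <- function(lim, loc = c(1, 5), base = 10) {
  ii <- floor(log(range(lim), base)) + c(-1, 2)   # generous range of exponents
  main <- base^(ii[1]:ii[2])                      # the powers of the base
  r <- as.numeric(outer(loc, main, "*"))          # every loc times every power
  r[lim[1] <= r & r <= lim[2]]                    # keep those inside the limits
}

ticks <- log_ticks(c(1, 1000))                    # minor and major positions
major <- ticks %in% log_ticks(c(1, 1000), loc = 1)  # TRUE where tick is a pure power

data.frame(ticks, major)
```

Only the major positions receive a 10^n label in the post's scale-components function; the minor ones get an empty string, which is why the space reserved for labels and the space actually used can differ.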
Re: [R] On creating grouped data set.
Can you provide an example of what you are asking for. There are various ways of partitioning a dataframe based on some criteria. Is this what you are asking? A before/after example would be helpful. On Mon, Jul 21, 2008 at 6:32 AM, Dong-hyun Oh [EMAIL PROTECTED] wrote: Dear UseRs, I would like to know the way to create grouped data set such as Oats data.frame in nlme package. Specifically, I need to create a grouped data set with PBIB data.frame in SASmixed package. Any help? Looking forward to hearing from you. Best, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
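If the goal is an object of the same kind as Oats, a minimal sketch with nlme's groupedData() constructor might look like the following. The data frame is a made-up stand-in, and its column names (response, Treatment, Block) are assumptions about how a block-design data set could be laid out, so they should be checked against the real PBIB data before use.

```r
library(nlme)

# Hypothetical block-design data: 5 blocks, 4 treatments each.
set.seed(1)
dat <- data.frame(
  Block     = factor(rep(1:5, each = 4)),
  Treatment = factor(rep(1:4, times = 5)),
  response  = rnorm(20, mean = 3)
)

# response ~ primary covariate | grouping factor
gd <- groupedData(response ~ Treatment | Block, data = dat)

formula(gd)       # the display formula stored with the data
class(gd)         # includes "groupedData" and "data.frame"
```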
Re: [R] manipulate a matrix2
The vegan matrix produces values of similarity between sample sites. Because the matrix uses the same samples for the row names and the column header it has duplicates on either side of the base diagonal (below).

          3     7     8    11    12
     3    0   0.6     1   0.3  0.85
     7 0.66     0     1  0.65  0.95
     8    1     1     0     1     1
    11  0.3  0.65     1     0   0.9
    12 0.85  0.95     1   0.9     0

Ideally, the matrix should look like:

          3     7     8    11    12
     3    0
     7 0.66     0
     8    1     1     0
    11  0.3  0.65     1     0
    12 0.85  0.95     1   0.9     0

This is probably a question for the Vegan developers, but I really appreciate your (and the list's) insight.

-Original Message-
From: jim holtman [mailto:[EMAIL PROTECTED]
Sent: Monday, July 21, 2008 9:18 AM
To: Jon Hak
Cc: r-help@r-project.org
Subject: Re: [R] manipulate a matrix2

I am not familiar with the vegdist function. What defines a duplicate in the matrix? There are ways of identifying whether more than one row meets the criteria for duplicates and then removing them. Can you give an illustration of what you mean with a before/after data representation?

On Mon, Jul 21, 2008 at 10:22 AM, Jon Hak [EMAIL PROTECTED] wrote:

Thanks Jim, that was exactly what I was after. On a second note, do you have any insight into pulling out the duplicates in this type of matrix? I thought that was what the upper=FALSE is in:

    csv.dis <- vegdist(csv.m, method='jaccard', binary=FALSE, diag=FALSE, upper=FALSE)

I just need either the lower or upper portion, with the zeros (,3 ,3) being the dividing line.
[,3] [,5] [,6] [,9] [,11] [3,]02 3 4 5 [5,]20 8 9 10 [6,]38 0 14 15 [9,]49 14 020 [11,] 5 10 15 20 0 Thanks again, Jon -Original Message- From: jim holtman [mailto:[EMAIL PROTECTED] Sent: Friday, July 18, 2008 9:56 AM To: Jon Hak Cc: r-help@r-project.org Subject: Re: [R] manipulate a matrix2 Is this what you want: x [,3] [,5] [,6] [,9] [,11] [,3] 16 11 1621 [,5] 27 12 1722 [,6] 38 13 1823 [,9] 49 14 1924 [,11]5 10 15 2025 library(reshape) melt(x) X1X2 value 1 [,3] [,3] 1 2 [,5] [,3] 2 3 [,6] [,3] 3 4 [,9] [,3] 4 5 [,11] [,3] 5 6 [,3] [,5] 6 7 [,5] [,5] 7 8 [,6] [,5] 8 9 [,9] [,5] 9 10 [,11] [,5]10 11 [,3] [,6]11 12 [,5] [,6]12 13 [,6] [,6]13 14 [,9] [,6]14 15 [,11] [,6]15 16 [,3] [,9]16 17 [,5] [,9]17 18 [,6] [,9]18 19 [,9] [,9]19 20 [,11] [,9]20 21 [,3] [,11]21 22 [,5] [,11]22 23 [,6] [,11]23 24 [,9] [,11]24 25 [,11] [,11]25 On Fri, Jul 18, 2008 at 11:10 AM, Jon Hak [EMAIL PROTECTED] wrote: Building upon Jim's answer below (Thanks Jim, that helped a lot), I need to pickup where this thread left off. I'm using Vegan to calculate the Jaccard's Index and the Row.Names and column names are represented in my matrix as seen here. [,3] [,5] [,6] [,9] [,11] [3,]06 11 16 21 [5,]20 12 17 22 [6,]38 018 23 [9,]49 14 024 [11,] 5 10 15 20 0 When I use the command; xy - cbind(row=as.vector(row.names(x)), col=as.vector(colnames(x)), value=as.vector(x)) I get the list (the column value is the issue); rowcol value [1,] 3 1 0 [2,] 5 1 2 [3,] 6 1 3 [4,] 9 1 4 [5,] 11 1 5 [6,] 3 2 6 [7,] 5 2 0 [8,] 6 2 8 [9,] 9 2 9 [10,] 11 210 [11,] 3311 [12,] 530 I would really like the col value to equal the actual name, not the column number. What am I missing? The analysis is very large, 6k x6k matrix so automating the process is a high priority. Thanks, Jon From: jim holtman jholtman_at_gmail.com mailto:jholtman_at_gmail.com?Subject=Re:%20%5BR%5D%20manipulate%20a%20m atrix Date: Mon, 25 Jun 2007 12:39:46 -0400 Is this what you want? 
x [,1] [,2] [,3] [,4] [,5] [1,]16 11 16 21 [2,]27 12 17 22 [3,]38 13 18 23 [4,]49 14 19 24 [5,]5 10 15 20 25 cbind(row=as.vector(row(x)), col=as.vector(col(x)), value=as.vector(x)) row col value [1,] 1 1 1 [2,] 2 1 2 [3,] 3 1 3 [4,] 4 1 4 [5,] 5 1 5 [6,] 1 2 6 [7,] 2 2 7 [8,] 3 2 8 [9,] 4 2 9 [10,] 5 210 [11,] 1 311 [12,] 2 312 [13,] 3 313 [14,] 4 314 [15,] 5 315 [16,] 1 416 [17,] 2 417 [18,] 3
Re: [R] Creation of png=problems
Romain You might get a fuller answer from others, but one thing you could try is using bitmap() rather than png(). Cheers Richard. Romain wrote: Hi everybody, I am currently working with R and I would like to create jpeg graphs with it. I am working on Windows and Unix but I would like to be able to create graphs (jpeg, png, bitmap...) under Unix. I am working on Solaris version 8. The documentation for R states that the latest version of or R known to compile on Solaris 8 is version 2.6.2. I have been able to compile and install R version 2.6.2. under Unix. I also installed Hmisc package needed by my script. The problem is that when I launch my script to create graphs from text files I get this error : Create_Graph(File) Error in X11(paste(png::, filename, sep = ), width, height, pointsize, : unable to start device PNG In addition: Warning message: In png(paste(PATH, /, filename, sep = ), 800, 600) : no png support in this version of R The thing is that I have tried to run my script under Windows with the same version (2.6.2) and everything is going well and the graph are created. Is it because (under Unix): capabilities() jpeg pngtcltk X11 http/ftp sockets libxml fifo FALSEFALSEFALSE TRUE TRUE TRUE TRUE TRUE If so, could someone explain to me how to change these capabilities in order to ba able to set them TRUE. I have also tried with other version of R but the error is the same one. If you have any other ideas about why it doesn't work I would be very grateful. The fact is that I have created a programm under Unix to make automatic the creation of the text files and I would like to create automatically the correspondant graphs. I hope someone will be able to help me. Thanks for your time. Have a good day. R. La route des vacances en quelques clics grâce à Voila ! 
-- Richard D. Pearson [EMAIL PROTECTED] School of Computer Science, http://www.cs.man.ac.uk/~pearsonr University of Manchester, Tel: +44 161 275 6178 Oxford Road, Mob: +44 7971 221181 Manchester M13 9PL, UK. Fax: +44 161 275 6204
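The suggestion to fall back on bitmap() when png() is unavailable could be wrapped up along these lines. The function names and the pixel-to-inch conversion are my own choices for illustration, and bitmap() itself needs a working Ghostscript installation, so this is a sketch of the idea rather than a guaranteed fix for the Solaris build.

```r
# Decide which device to use from what this build of R reports.
choose_device <- function() {
  if (isTRUE(unname(capabilities("png")))) "png" else "bitmap"
}

# Open a PNG file for plotting with whichever device is available.
# width_px / height_px are in pixels; bitmap() takes inches plus a resolution.
open_plot_file <- function(file, width_px = 800, height_px = 600, res = 72) {
  if (choose_device() == "png") {
    png(file, width = width_px, height = height_px)
  } else {
    bitmap(file, type = "png256",
           width = width_px / res, height = height_px / res, res = res)
  }
}

device_in_use <- choose_device()
```

Typical use would be open_plot_file("graph.png"), then the plotting calls, then dev.off() to finish writing the file.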
[R] dev2bitmap error, 'gs' cannot be found
Dear List, I am using the Bioconductor package Category to do some gene enrichment analysis, and usually save my KEGGmnplot's using a dev2bitmap command. This has worked just fine, until suddenly earlier today I got this error message:

    > dev2bitmap("04610_080721.jpg", type="jpeg", height = 10, width = 10, res = 200)
    Error in dev2bitmap("04610_CSF080721.jpg", type = "jpeg", height = 10, : sorry, 'gs' cannot be found

I don't know what this means; it seems to be something about my environment. (From the dev2bitmap function:)

    gsexe <- Sys.getenv("R_GSCMD")
    if (is.null(gsexe) || !nzchar(gsexe)) {
      gsexe <- "gs"
      rc <- system(paste(shQuote(gsexe), "-help > /dev/null"))
      if (rc != 0) stop("sorry, 'gs' cannot be found")
    }

I can't figure out how to fix this. I am not an experienced programmer. Any help or tips would be greatly appreciated. Thank you, Boel

--~*~**~***~*~***~**~*~-- Boel Brynedal, MSc, PhD student Karolinska Institutet Department of Clinical neuroscience
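Going by the dev2bitmap excerpt quoted above, the error means that neither the R_GSCMD environment variable nor the search path leads to a Ghostscript executable. A small diagnostic sketch follows; the helper name find_gs is made up, and the path in the final comment is purely hypothetical.

```r
# Look for Ghostscript the same two ways the quoted code does:
# first the R_GSCMD environment variable, then "gs" on the PATH.
find_gs <- function() {
  cmd <- Sys.getenv("R_GSCMD", unset = "")
  if (!nzchar(cmd)) cmd <- unname(Sys.which("gs"))
  cmd
}

gs_cmd <- find_gs()
if (nzchar(gs_cmd)) {
  message("Ghostscript found: ", gs_cmd)
} else {
  message("Ghostscript not found: install it, fix PATH, or set R_GSCMD")
}

# If Ghostscript is installed somewhere unusual, point R at it before
# calling dev2bitmap(), for example (hypothetical location):
# Sys.setenv(R_GSCMD = "/usr/local/bin/gs")
```

If this worked earlier and stopped suddenly, a changed PATH (for instance when R is started from a different shell or launcher) is one thing worth checking.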
[R] Parameter names in nls
Dear R-help, Could you please examine the following code, and see if I have discovered a bug or not, or am just doing something silly. I am trying to create a package to do fish stock assessment using the nls() function to fit the modelled stock size to the various pieces of information that we have. The main problem with this sort of task is that the number and type of parameters that go into the model are highly variable between stocks, but the method needs to be intelligent enough to handle this. The way I have chosen to handle this is through the names in my parameter vector, and using code inside the objective function to figure out which parameter is which. The problem I have encountered is that I don't think nls() always passes a named vector - indeed, after the first set of function evaluations, it drops the names from the parameters vector altogether. I believe this to be a bug - it certainly plays havoc with my code! As a demonstration of this problem, consider the piece of code below. It is basically fitting a straight line to some synthetic data (with noise). I have set up the objective function so that it prints the names of the parameters every time that it is called. As you can see, the names are there to begin with, but rapidly disappear after the first step is made. Is this a bug? Or is it intended behaviour? Or is this a completely daft approach I am taking? I look forward to your comments.
cheers, Mark

    rm(list=ls())
    fitting.fn <- function(x, params) {
      # The model - so that it works
      y <- params[1] + x*params[2]
      # How I would prefer it to work
      # y <- params["a"] + x*params["b"]
      # Display information about function eval
      cat(paste("Evaluation # :", counter, "\t Names :"))
      print(names(params))
      counter <<- counter + 1   # update the global counter
      return(y)
    }
    counter <- 1
    data.x <- 1:50
    data.y <- pi*data.x + rnorm(50, sd=20)
    plot(data.x, data.y)
    ips <- c(a=0, b=0)
    nls(data.y ~ fitting.fn(data.x, params), data=data.frame(data.x, data.y),
        start=list(params=ips), trace=TRUE, control=nls.control(tol=1e-8))
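Whether or not the dropped names count as a bug, one defensive workaround is to stop depending on nls() to carry them and to re-attach a known name vector inside the model function. The sketch below assumes that approach; param.names is a helper variable of my own, and the data are simulated as in the post.

```r
param.names <- c("a", "b")          # single source of truth for parameter order

fitting.fn <- function(x, params) {
  names(params) <- param.names      # restore names on every evaluation
  params[["a"]] + x * params[["b"]]
}

set.seed(42)
data.x <- 1:50
data.y <- pi * data.x + rnorm(50, sd = 20)

fit <- nls(data.y ~ fitting.fn(data.x, params),
           start = list(params = c(a = 0, b = 0)))

# Compare with the ordinary least-squares line as a sanity check.
nls_est <- unname(coef(fit))
lm_est  <- unname(coef(lm(data.y ~ data.x)))
```

For models whose parameter set varies between stocks, param.names could be built once per stock and captured in a closure, so the lookup by name keeps working regardless of what the optimiser passes in.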
Re: [R] manipulate a matrix2
Will this do it for you: ?print.default

    > x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    6   11   16   21
    [2,]    2    7   12   17   22
    [3,]    3    8   13   18   23
    [4,]    4    9   14   19   24
    [5,]    5   10   15   20   25
    > x[upper.tri(x)] <- NA
    > x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1   NA   NA   NA   NA
    [2,]    2    7   NA   NA   NA
    [3,]    3    8   13   NA   NA
    [4,]    4    9   14   19   NA
    [5,]    5   10   15   20   25
    > print(x, na.print='.')
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    .    .    .    .
    [2,]    2    7    .    .    .
    [3,]    3    8   13    .    .
    [4,]    4    9   14   19    .
    [5,]    5   10   15   20   25

On Mon, Jul 21, 2008 at 11:47 AM, Jon Hak [EMAIL PROTECTED] wrote:

The vegan matrix produces values of similarity between sample sites. Because the matrix uses the same samples for the row names and the column header it has duplicates on either side of the base diagonal (below).

          3     7     8    11    12
     3    0   0.6     1   0.3  0.85
     7 0.66     0     1  0.65  0.95
     8    1     1     0     1     1
    11  0.3  0.65     1     0   0.9
    12 0.85  0.95     1   0.9     0

Ideally, the matrix should look like:

          3     7     8    11    12
     3    0
     7 0.66     0
     8    1     1     0
    11  0.3  0.65     1     0
    12 0.85  0.95     1   0.9     0

This is probably a question for the Vegan developers, but I really appreciate your (and the list's) insight.

-Original Message-
From: jim holtman [mailto:[EMAIL PROTECTED]
Sent: Monday, July 21, 2008 9:18 AM
To: Jon Hak
Cc: r-help@r-project.org
Subject: Re: [R] manipulate a matrix2

I am not familiar with the vegdist function. What defines a duplicate in the matrix? There are ways of identifying whether more than one row meets the criteria for duplicates and then removing them. Can you give an illustration of what you mean with a before/after data representation?

On Mon, Jul 21, 2008 at 10:22 AM, Jon Hak [EMAIL PROTECTED] wrote:

Thanks Jim, that was exactly what I was after. On a second note, do you have any insight into pulling out the duplicates in this type of matrix? I thought that was what the upper=FALSE is in:

    csv.dis <- vegdist(csv.m, method='jaccard', binary=FALSE, diag=FALSE, upper=FALSE)

I just need either the lower or upper portion, with the zeros (,3 ,3) being the dividing line.
[,3] [,5] [,6] [,9] [,11] [3,]02 3 4 5 [5,]20 8 9 10 [6,]38 0 14 15 [9,]49 14 020 [11,] 5 10 15 20 0 Thanks again, Jon -Original Message- From: jim holtman [mailto:[EMAIL PROTECTED] Sent: Friday, July 18, 2008 9:56 AM To: Jon Hak Cc: r-help@r-project.org Subject: Re: [R] manipulate a matrix2 Is this what you want: x [,3] [,5] [,6] [,9] [,11] [,3] 16 11 1621 [,5] 27 12 1722 [,6] 38 13 1823 [,9] 49 14 1924 [,11]5 10 15 2025 library(reshape) melt(x) X1X2 value 1 [,3] [,3] 1 2 [,5] [,3] 2 3 [,6] [,3] 3 4 [,9] [,3] 4 5 [,11] [,3] 5 6 [,3] [,5] 6 7 [,5] [,5] 7 8 [,6] [,5] 8 9 [,9] [,5] 9 10 [,11] [,5]10 11 [,3] [,6]11 12 [,5] [,6]12 13 [,6] [,6]13 14 [,9] [,6]14 15 [,11] [,6]15 16 [,3] [,9]16 17 [,5] [,9]17 18 [,6] [,9]18 19 [,9] [,9]19 20 [,11] [,9]20 21 [,3] [,11]21 22 [,5] [,11]22 23 [,6] [,11]23 24 [,9] [,11]24 25 [,11] [,11]25 On Fri, Jul 18, 2008 at 11:10 AM, Jon Hak [EMAIL PROTECTED] wrote: Building upon Jim's answer below (Thanks Jim, that helped a lot), I need to pickup where this thread left off. I'm using Vegan to calculate the Jaccard's Index and the Row.Names and column names are represented in my matrix as seen here. [,3] [,5] [,6] [,9] [,11] [3,]06 11 16 21 [5,]20 12 17 22 [6,]38 018 23 [9,]49 14 024 [11,] 5 10 15 20 0 When I use the command; xy - cbind(row=as.vector(row.names(x)), col=as.vector(colnames(x)), value=as.vector(x)) I get the list (the column value is the issue); rowcol value [1,] 3 1 0 [2,] 5 1 2 [3,] 6 1 3 [4,] 9 1 4 [5,] 11 1 5 [6,] 3 2 6 [7,] 5 2 0 [8,] 6 2 8 [9,] 9 2 9 [10,] 11 210 [11,] 3311 [12,] 530 I would really like the col value to equal the actual name, not the column number. What am I missing? The analysis is very large, 6k x6k matrix so automating the process is a high priority. Thanks, Jon From: jim holtman jholtman_at_gmail.com mailto:jholtman_at_gmail.com?Subject=Re:%20%5BR%5D%20manipulate%20a%20m atrix Date: Mon, 25
Re: [R] manipulate a matrix2
That's awesome!! I was trying to use the tri function, but not successfully.

-----Original Message-----
From: jim holtman [mailto:[EMAIL PROTECTED]
Sent: Monday, July 21, 2008 10:10 AM
To: Jon Hak
Cc: r-help@r-project.org
Subject: Re: [R] manipulate a matrix2

Will this do it for you:

?print.default

    x[upper.tri(x)] <- NA
    print(x, na.print='.')
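For the remaining question in this thread (getting one line per site pair with the actual row and column names rather than positions), a minimal base-R sketch follows. It rebuilds the 5 x 5 example matrix quoted in the thread; the object names (`sites`, `idx`, `pairs_long`) are made up for illustration, and this is one possible approach rather than anything vegan itself provides.

```r
# Rebuild the 5 x 5 dissimilarity matrix from the thread out of its lower triangle.
sites <- c("3", "7", "8", "11", "12")
x <- matrix(0, 5, 5, dimnames = list(sites, sites))
x[lower.tri(x)] <- c(0.66, 1, 0.3, 0.85, 1, 0.65, 0.95, 1, 1, 0.9)
x <- x + t(x)   # make it symmetric, zeros on the diagonal

# Positions of the below-diagonal cells, as (row, col) index pairs.
idx <- which(lower.tri(x), arr.ind = TRUE)

# One line per site pair, labelled with the actual names instead of positions.
pairs_long <- data.frame(
  row   = rownames(x)[idx[, "row"]],
  col   = colnames(x)[idx[, "col"]],
  value = x[idx],
  stringsAsFactors = FALSE
)
print(pairs_long)
```

Because only the index pairs of the lower triangle are materialised, each pair appears once, which should also matter for a 6k x 6k matrix.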
Re: [R] dev2bitmap error, 'gs' cannot be found
What OS is this (please do read the posting guide)? The posting guide also asks you to read the help: it says

    You will need 'ghostscript': the full path to the executable can be
    set by the environment variable 'R_GSCMD'. (If this is unset the
    command 'gs' is used, which will work if it is in your path.)

If you don't know what that means, please ask your computing support desk for help.

On Mon, 21 Jul 2008, Boel Brynedal wrote:

> Dear List,
>
> I am using the bioconductor package Category to do some gene enrichment
> analysis, and usually save my KEGGmnplot's using a dev2bitmap command.
> This has worked just fine, until suddenly earlier today I got this
> error message:
>
>     dev2bitmap("04610_080721.jpg", type="jpeg", height = 10, width = 10, res = 200)
>     Error in dev2bitmap("04610_CSF080721.jpg", type = "jpeg", height = 10, :
>       sorry, 'gs' cannot be found
>
> I don't know what this means; it seems to be something about my
> environment. (From the dev2bitmap function:)
>
>     gsexe <- Sys.getenv("R_GSCMD")
>     if (is.null(gsexe) || !nzchar(gsexe)) {
>       gsexe <- "gs"
>       rc <- system(paste(shQuote(gsexe), "-help > /dev/null"))
>       if (rc != 0) stop("sorry, 'gs' cannot be found")
>     }
>
> I can't figure out how to fix this. I am not an experienced programmer.
> Any help or tips would be greatly appreciated.
>
> Thank you,
> Boel
>
> Boel Brynedal, MSc, PhD student
> Karolinska Institutet
> Department of Clinical neuroscience

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
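The check that the quoted help text describes can be tried from the R prompt. The sketch below is only a diagnostic under the assumption of a Unix-like system where the executable is called `gs`; the helper name `find_gs` and the path in the message are hypothetical.

```r
# Return the ghostscript command R would end up using: R_GSCMD if it is set,
# otherwise whatever "gs" resolves to on the PATH ("" if nothing is found).
find_gs <- function() {
  gs <- Sys.getenv("R_GSCMD")
  if (!nzchar(gs)) gs <- unname(Sys.which("gs"))
  gs
}

gs <- find_gs()
if (nzchar(gs)) {
  message("ghostscript command: ", gs)
} else {
  message("ghostscript not found; install it, or point R at it, e.g.\n",
          "  Sys.setenv(R_GSCMD = \"/path/to/gs\")   # hypothetical path")
}
```

If the command was found yesterday and not today, a changed PATH (for example, R started from a different shell or launcher) is one thing worth checking.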
[R] avoid loop with three-dimensional array
Dear R users,

I'm trying to find a way to optimize my code. I have to run a 50,000-iteration simulation, so it is essential that the code is efficient. I have to compute

    sum_t ( t(X_t) %*% A %*% X_t )

where X_t is a (d x k) matrix which changes in time and A is a (d x d) matrix that is constant in time. I have put all my X_t in a three-dimensional array X of dimension (d, k, T). At the moment, to compute the sum over time, I run a for loop, save each resulting (k x k) matrix in a list, and at the end sum the T matrices in this list. I'm wondering if there is a better way to do this. Here is an example of what I have to do:

    d = 3
    k = 2
    T = 4
    X = array(rnorm(d*k*T), dim=c(d,k,T))
    A = matrix(rnorm(d*d), d, d)
    e1 = list()
    for (t in 1:T) {   # I would like to avoid this
      e1[[t]] = t(X[,,t]) %*% A %*% X[,,t]
    }

    # Function for doing the sum of matrices in a list
    sumMatrices <- function(matrices) {
      if (length(matrices) > 2) matrices[[1]] + Recall(matrices[-1])
      else matrices[[1]] + matrices[[2]]
    }

    result = sumMatrices(e1)

Thank you in advance for all your help,
best
Michela
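One way the sum over time can be written without an explicit loop is sketched below. It is not taken from the thread: it relies on the fact that A %*% X_t can be computed for all t in a single matrix product, after which crossprod() sums over both the row index and the time index. The name `n_t` stands in for T (which is also shorthand for TRUE in R), and `stack_slices` is a made-up helper.

```r
set.seed(1)
d <- 3; k <- 2; n_t <- 4
X <- array(rnorm(d * k * n_t), dim = c(d, k, n_t))
A <- matrix(rnorm(d * d), d, d)

# Reference: the explicit loop, accumulating the k x k sum as it goes.
loop_sum <- matrix(0, k, k)
for (i in seq_len(n_t)) loop_sum <- loop_sum + t(X[, , i]) %*% A %*% X[, , i]

# Loop-free version.
# 1. A %*% X_t for every t at once: view X as a d x (k * n_t) matrix.
AX <- array(A %*% matrix(X, d, k * n_t), dim = c(d, k, n_t))
# 2. Stack the time slices on top of each other, giving (d * n_t) x k.
stack_slices <- function(Z) matrix(aperm(Z, c(1, 3, 2)), d * n_t, k)
# 3. crossprod(U, V) = t(U) %*% V, which sums over rows and time together.
vec_sum <- crossprod(stack_slices(X), stack_slices(AX))

all.equal(loop_sum, vec_sum)
```

Whether this is faster than the loop depends on d, k and the number of time points, so timing both on realistic sizes is advisable. Even with the loop, accumulating into one matrix avoids building the list and the recursive sum.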
[R] portfolio optimization problem - use R
How can I use R to solve the following optimisation problem?

Minimize:

    ½*w^T*omega*w + mu^T*w + c^T(w - w0)   for w > w0 (long position)
    ½*w^T*omega*w + mu^T*w - c^T(w - w0)   for w < w0 (short position)

where

    w      is the updated weight vector of the portfolio
    w0     is the initial weight vector of the portfolio
    omega  is the variance-covariance matrix
    mu     is the vector of return rates of the stocks in the portfolio
    c      is the vector of transaction cost coefficients

Is it a quadratic programming problem? If so, how do I write the objective function? Or is there any other method to solve this?
[R] Output Nicely formatted tables from R
Hi there,

I've spent a while searching for ways of outputting table data from R in presentable formats, such as colored backgrounds for column headings, bold fonts etc. It appears that this is not possible, but I would be interested to learn if in fact there was a way of achieving this.

Many thanks!
[R] Mclust - which cluster is each observation in?
I'm trying to test a method of identifying individuals (birds) based on measured data (their calls). I have test data from known individual birds, and I am using the Mclust package to see if the program can correctly identify which calls come from different birds. So far, mclust has correctly ID'd the number of birds in the test data set (i.e., the correct # of clusters). However I also need to correctly assign each call to the right bird (i.e., data rows (calls) 1 - 10 are in cluster (bird) 1; rows 2 - 20 are in cluster 2, etc.). Is there a way to get mclust to show the cluster assignments of each observation?

Thank you
Re: [R] portfolio optimization problem - use R
You could try the fPortfolio package.

Hope this helps.

jamaj

2008/7/21, fzp2008 [EMAIL PROTECTED]:

> How can I use R to solve the following optimisation problem?
>
> Minimize:
>
>     ½*w^T*omega*w + mu^T*w + c^T(w - w0)   for w > w0 (long position)
>     ½*w^T*omega*w + mu^T*w - c^T(w - w0)   for w < w0 (short position)
>
> where
>
>     w      is the updated weight vector of the portfolio
>     w0     is the initial weight vector of the portfolio
>     omega  is the variance-covariance matrix
>     mu     is the vector of return rates of the stocks in the portfolio
>     c      is the vector of transaction cost coefficients
>
> Is it a quadratic programming problem? If so, how do I write the
> objective function? Or is there any other method to solve this?
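On the narrower question of how to write the objective function, here is a minimal base-R sketch. The two branches in the question differ only in the sign of the cost term, so element by element they amount to c' |w - w0|. All numbers below are hypothetical, no constraints (budget, bounds) are imposed because the question states none, and a general-purpose optimiser stands in for a proper quadratic programming solver.

```r
# Hypothetical two-asset inputs, for illustration only.
omega <- matrix(c(0.04, 0.01,
                  0.01, 0.09), 2, 2)   # variance-covariance matrix
mu    <- c(-0.05, -0.08)               # linear term, with the sign used in the question
cost  <- c(0.002, 0.002)               # transaction cost coefficients
w0    <- c(0.5, 0.5)                   # initial weights

# +c'(w - w0) where w > w0 and -c'(w - w0) where w < w0, i.e. c' |w - w0|.
objective <- function(w) {
  drop(0.5 * t(w) %*% omega %*% w) + sum(mu * w) + sum(cost * abs(w - w0))
}

fit <- optim(w0, objective)   # Nelder-Mead, which tolerates the kink at w = w0
fit$par
fit$value
```

With constraints added, the usual route is to split w - w0 into non-negative buy and sell parts so that the problem becomes a standard quadratic program; that reformulation is not shown here.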
Re: [R] TeachingDemos question: my.symbols() alignment problems in complicated layout
The original poster also contacted me offline and now has a copy of my.symbols that works with layout (instead of resetting all the graphical parameters, it only resets the ones it changes). The fixed version will be in the next release of the package, or anyone who would like a copy before then can e-mail me.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Murrell
Sent: Sunday, July 20, 2008 6:06 PM
To: [EMAIL PROTECTED]
Cc: r-help@r-project.org
Subject: Re: [R] TeachingDemos question: my.symbols() alignment problems in complicated layout

Hi

[EMAIL PROTECTED] wrote:

> Hello,
>
> After useful suggestions by Paul Murrell, I have been trying to use
> my.symbols to plot arrows of varying angles on my plot in order to
> create a time series of wind direction. However, I have been unable to
> figure out how the alignment of symbols works. Below I have included a
> simplified code that illustrates my problem (I am creating a .ps file,
> so I've included this in my mock code, in case it might be part of the
> problem):
>
>     # TEST CODE #
>     postscript("test.ps", horizontal=FALSE, width=7.5, height=11.5,
>                pointsize=10, paper="special")
>     library(TeachingDemos)
>     opar <- par(omi=c(0,0.1,0.7,0.1), xpd=T, mar=par()$mar+c(0,-1.5,-1,5))
>     layout_mat = matrix(c(1,2,3,4), nrow=4, ncol=1, byrow=TRUE)
>     my_layout <- layout(layout_mat, widths=c(1,1,1,1),
>                         heights=c(1.0,0.45,1.0,1.2), respect=FALSE)
>     plot(1,1)
>     plot(2,2)
>     plot(3,3)
>     plot(1:100, rep(c(9,1.5,2,8),25))
>     points(40,4,col="red")
>     points(50,8,col="red")
>     my.symbols(40,4,ms.polygon,n=3,inches=0.2,add=TRUE)
>     my.symbols(40,4,ms.arrows,angle=pi/2,inches=0.2,add=TRUE)
>     my.symbols(50,8,ms.arrows,angle=pi/4,inches=0.2,add=TRUE)
>     dev.off()
>     # END of TEST CODE #
>
> If I look at the output test.ps, the first symbol is exactly where I
> want it to be, i.e. at x=40, y=4, while the position of the two other
> ones is offset by some mysterious(!!) value. Then if I comment out the
> first my.symbols(...) command, the second symbol is at the right place,
> while the third is offset, etc.
>
> However if I simply do:
>
>     plot(1:100, rep(c(9,1.5,2,8),25))
>     points(40,4,col="red")
>     points(50,8,col="red")
>     my.symbols(40,4,ms.polygon,n=3,inches=0.2,add=TRUE)
>     my.symbols(40,4,ms.arrows,angle=pi/2,inches=0.2,add=TRUE)
>     my.symbols(50,8,ms.arrows,angle=pi/4,inches=0.2,add=TRUE)
>
> then everything is exactly where I want it, suggesting there is a
> problem with the layout???

I think the issue is that my.symbols() does a lot of this ...

    op <- par(no.readonly = TRUE)
    on.exit(par(op))

... which is not absolutely guaranteed to get you back to where you started (see the comment in the 'Value' section of the help page ?par). Fixing that will need the cooperation of the author of TeachingDemos.

Here are a couple of options:

(i) use the 'gridBase' package and do these arrow annotations using the 'grid' package, which allows you to control coordinate systems in a more rational manner. There's an example (perhaps slightly more complicated than you need) in: http://cran.r-project.org/doc/Rnews/Rnews_2003-2.pdf

(ii) draw your main plot using 'lattice' and the annotations using 'grid' or possibly 'grImport'. There's a hint of an example of the latter on slide 18 of: http://www.stat.auckland.ac.nz/~paul/Talks/import.pdf

(iii) draw the whole thing using 'grid'. You can start to get acquainted with grid here: http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter5.pdf

Unfortunately, all of these require a reasonable amount of extra learning on your part.

Paul

> Any help would be really appreciated,
> Thank you a lot,
> maria

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/
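The pattern described at the top of this thread (restoring only the graphical parameters a function changes, rather than everything from par(no.readonly = TRUE)) can be sketched in a few lines. This is an illustration of the idiom, not the actual TeachingDemos code; `restore_demo` is a made-up function.

```r
restore_demo <- function() {
  # par() invisibly returns the previous values of just the parameters it sets,
  # so 'op' holds the old 'mar' and 'xpd' and nothing else.
  op <- par(mar = c(2, 2, 1, 1), xpd = TRUE)
  on.exit(par(op))   # put back only those two on exit
  plot(1:10)
  names(op)
}

before  <- par(c("mar", "xpd"))
changed <- restore_demo()
after   <- par(c("mar", "xpd"))

changed                    # names of the parameters that were saved and restored
identical(before, after)   # the caller's settings are back
```

Because nothing else is touched, state owned by layout() or par(mfrow = ...) in the calling code is left alone.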
Re: [R] Output Nicely formatted tables from R
Please look at http://biostat.mc.vanderbilt.edu/twiki/pub/Main/StatReport/latexFineControl.pdf for ways to do fine control of tabular data formatting via Sweave.

Bill Cunliffe wrote:

> Hi there,
>
> I've spent a while searching for ways of outputting table data from R
> in presentable formats, such as colored backgrounds for column
> headings, bold fonts etc. It appears that this is not possible, but I
> would be interested to learn if in fact there was a way of achieving
> this.
>
> Many thanks!
Re: [R] asp and ylim
Look at the squishplot function in the TeachingDemos package (probably not the best named function around, but somewhat descriptive, and no one has suggested a better one):

    x <- runif(100)
    y <- runif(100)
    squishplot(xlim=c(0,1), ylim=c(0.5, 0.7), asp=1)
    plot(x, y, xlim=c(0,1), ylim=c(0.5, 0.7))

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of David Epstein
Sent: Sunday, July 20, 2008 11:27 AM
To: r-help@r-project.org
Subject: [R] asp and ylim

>     # See David Williams' book "Weighing the odds", p286
>     y <- c(1.21, 0.51, 0.14, 1.62, -0.8, 0.72, -1.71, 0.84, 0.02, -0.12)
>     ybar <- mean(y)
>     ylength <- length(y)
>     ybarv <- rep(ybar, ylength)
>     x <- 1:ylength
>     plot(x, y, asp=1, xlab="position", ylab="ybar", type="n", ylim=c(-1,1))
>     segments(x[1], ybar, x[ylength], ybar)
>     segments(x, ybarv, x, y)
>     points(x, ybarv, pch=21, bg="white")
>     points(x, y, pch=19, col="black")
>
> With asp=1, the value of ylim seems to be totally ignored, as in the
> above code. With asp not set, R pays close attention to the value of
> ylim. This is not intuitive behaviour, or is it?
>
> How can I set the aspect ratio, and simultaneously set the plot region?
> The aspect ratio is one number and the plot region is given by four
> numbers (xleft, xright, yleft, yright). Logically, these 5 numbers are
> independent of each other and arbitrary, provided xleft < xright and
> yleft < yright. This should give a one-to-one bijection between
> 5-tuples and plots, determined up to a change of scale that is uniform
> in the x- and y-directions.
>
> My code above shows the (to me) obvious attempt, which fails.
>
> Thanks
> David
[R] Editor fpr Mac OS
Hi,

is there a good editor for Mac OS?

Thanks

Angelo Scozzarella
Re: [R] Mclust - which cluster is each observation in?
Well, you are dealing with probability-based clustering, so for each bird you will get a probability of belonging to each cluster. If your clusters are well defined, then each bird should have a very high probability of belonging to one of the clusters. You can get this probability matrix from your mclust object. For the iris dataset example,

    my.clusters = Mclust(iris[,-5])

This will give you the probability matrix:

    my.clusters$z

You can assign membership based on these probabilities (i.e. each bird belongs to the cluster with the highest probability). You can obtain this membership by doing

    my.clusters$classification

Hope this helps,

Julian

cnagy wrote:

> I'm trying to test a method of identifying individuals (birds) based on
> measured data (their calls). I have test data from known individual
> birds, and I am using the Mclust package to see if the program can
> correctly identify which calls come from different birds. So far,
> mclust has correctly ID'd the number of birds in the test data set
> (i.e., the correct # of clusters). However I also need to correctly
> assign each call to the right bird (i.e., data rows (calls) 1 - 10 are
> in cluster (bird) 1; rows 2 - 20 are in cluster 2, etc.). Is there a
> way to get mclust to show the cluster assignments of each observation?
>
> Thank you
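The step from the probability matrix to a hard assignment can be shown in base R without fitting a model. In this sketch `z` is a small made-up matrix standing in for the `z` component of a fitted mclust object (rows are calls, columns are clusters); the numbers are purely illustrative.

```r
# Hypothetical membership probabilities: 4 calls (rows) x 2 birds (columns).
z <- rbind(c(0.98, 0.02),
           c(0.91, 0.09),
           c(0.05, 0.95),
           c(0.40, 0.60))

cluster     <- apply(z, 1, which.max)   # most probable cluster for each call
uncertainty <- 1 - apply(z, 1, max)     # how doubtful each assignment is

data.frame(call = seq_len(nrow(z)), cluster, uncertainty)
```

Cross-tabulating `cluster` against the known bird identities, for example with table(), would then show how many calls were assigned to the right bird.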
[R] Control parameter of the optim( ): parscale
Hi everybody,

I am using the L-BFGS-B method of the mle2() function to estimate the values of 6 parameters. mle2 uses the methods implemented in optim. As I got it from the descriptions available online, one can use the parscale parameter to tell R somehow what the values of the estimated parameters should be . . . Could somebody please help me understand what one has to do actually with the parscale parameter so that it works right?

I am very grateful for an answer - R leaves sometimes some of the parameters unchanged (it is OK since it is a feature of the L-BFGS-B algorithm), but often these are the parameters that should have greater values!

Thank you in advance!

Zoe
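As the optim() help page describes it, parscale is a vector of scaling values and the optimisation is carried out on par/parscale, so it conveys the typical magnitude of each parameter rather than the value it should take. A minimal sketch with optim() directly follows; the objective function and all numbers are hypothetical, and passing the same control list through mle2() is an assumption not checked here.

```r
# Hypothetical objective whose two parameters live on very different scales;
# its minimum is at p = c(1, 1000).
f <- function(p) (p[1] - 1)^2 + ((p[2] - 1000) / 1000)^2

fit <- optim(par = c(0.5, 500), fn = f, method = "L-BFGS-B",
             control = list(parscale = c(1, 1000)))   # rough magnitudes of p
fit$par
fit$convergence   # 0 indicates successful convergence
```

With parscale set to the rough size of each parameter, a unit step in the scaled problem means about 1 for the first parameter and about 1000 for the second, which is the kind of comparability the help page asks for.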
[R] lattice with bwplot
Dear All,

How come par(mfrow=c(1,2)) works with boxplot, but not with bwplot?

This works:

    par(mfrow=c(1,2))
    boxplot(dv~index, depend)
    boxplot(dv~index, depend2)

This does not work:

    par(mfrow=c(1,2))
    bwplot(index~dv, depend)
    bwplot(index~dv, depend2)

Thanks,
Kyle

Dr. J. Kyle Roberts
Department of Literacy, Language and Learning
School of Education and Human Development
Southern Methodist University
P.O. Box 750381
Dallas, TX 75275
214-768-4494
http://www.hlm-online.com/
Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
Michal Figurski wrote:

> Hello all,
>
> I am trying to optimize my logistic regression model by using
> bootstrap. I was previously using SAS for this kind of tasks, but I am
> now switching to R.
>
> My data frame consists of 5 columns and has 109 rows. Each row is a
> single record composed of the following values: Subject_name, numeric1,
> numeric2, numeric3 and outcome (yes or no). All three numerics are used
> to predict outcome using LR.
>
> In SAS I have written a macro, that was splitting the dataset, running
> LR on one half of data and making predictions on second half. Then it
> was collecting the equation coefficients from each iteration of
> bootstrap. Later I was just taking medians of these coefficients from
> all iterations, and used them as an optimal model - it really worked
> well!

Why not use maximum likelihood estimation, i.e., the coefficients from the original fit? How does the bootstrap improve on that?

> Now I want to do the same in R. I tried to use the 'validate' or
> 'calibrate' functions from package Design, and I also experimented
> with function 'sm.binomial.bootstrap' from package sm. I tried also
> the function 'boot' from package boot, though without success - in my
> case it randomly selected _columns_ from my data frame, while I wanted
> it to select _rows_.

validate and calibrate in Design do resampling on the rows.

Resampling is mainly used to get a nearly unbiased estimate of the model performance, i.e., to correct for overfitting.

Frank Harrell

> Though the main point here is the optimized LR equation. I would
> appreciate any help on how to extract the LR equation coefficients
> from any of these bootstrap functions, in the same form as given by
> 'glm' or 'lrm'.
>
> Many thanks in advance!

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
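On the mechanical part of the question (resampling rows and collecting the fitted coefficients in the same form glm reports them), a base-R sketch is given below. The data are simulated stand-ins with hypothetical column names, and the sketch takes no position on whether median bootstrap coefficients improve on the maximum likelihood fit, which is the point disputed above.

```r
set.seed(42)
n <- 109
dat <- data.frame(num1 = rnorm(n), num2 = rnorm(n), num3 = rnorm(n))
dat$outcome <- rbinom(n, 1, plogis(0.5 * dat$num1 - 0.8 * dat$num2 + 0.3 * dat$num3))

n_boot <- 200
boot_coefs <- t(replicate(n_boot, {
  rows <- sample(nrow(dat), replace = TRUE)   # resample rows, not columns
  coef(glm(outcome ~ num1 + num2 + num3, family = binomial, data = dat[rows, ]))
}))

dim(boot_coefs)                # one line per resample, one column per coefficient
apply(boot_coefs, 2, median)   # median of each coefficient over the resamples
coef(glm(outcome ~ num1 + num2 + num3, family = binomial, data = dat))   # ML fit
```

Printing the last two results side by side makes it easy to see how far the bootstrap medians actually move from the original fit.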
[R] vector help
hi

I have a vector test. It has 3 elements. I want to join the three into one vector:

    Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157- 20

How can I do it?

    > class(test)
    [1] "character"
    > test
    [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY" "157"
    [3] "20"

Ramya
Re: [R] Editor fpr Mac OS
Angelo Scozzarella angeloscozzarella at tiscali.it writes:

> Hi,
>
> is there a good editor for Mac Os?

We had the same discussion in December 2007 and again in June 2008. Please consider the recommendations given in these threads.

My personal favorite: TextMate http://macromates.com/, though commercial, is excellent -- unless you are an experienced Emacs user, in which case also take Aquamacs http://aquamacs.org/ into account.

> Thank
>
> Angelo Scozzarella

Hans Werner Borchers
ABB Corporate Research
Re: [R] vector help
Try:

    paste(test, collapse='-')

On Mon, Jul 21, 2008 at 3:49 PM, Rajasekaramya [EMAIL PROTECTED] wrote:

> hi
>
> I have a vector test. It has 3 elements. I want to join the three into
> one vector:
>
>     Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157- 20
>
> How can I do it?
>
>     > class(test)
>     [1] "character"
>     > test
>     [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY" "157"
>     [3] "20"
>
> Ramya

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
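For reference, a self-contained version of the suggestion, using the vector from the question. The distinction that matters is between `collapse`, which joins the elements of one vector into a single string, and `sep`, which separates the different arguments passed to paste().

```r
test <- c("Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY", "157", "20")

joined <- paste(test, collapse = "-")   # one string built from all elements
joined
# [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157-20"
length(joined)
# [1] 1
```

Since `test` is the only argument here, `sep` has nothing to separate, so adding it would not change the result.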
Re: [R] Editor fpr Mac OS
Check the archives of the r-sig-mac mailing list. People have made suggestions there.

-Don

At 7:39 PM +0200 7/21/08, Angelo Scozzarella wrote:

> Hi,
>
> is there a good editor for Mac Os?
>
> Thank
>
> Angelo Scozzarella

--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
Re: [R] vector help
Try this:

    test = c("Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY", "157", "20")
    paste(test, collapse="-", sep="")

HTH,

Jorge

On Mon, Jul 21, 2008 at 2:49 PM, Rajasekaramya [EMAIL PROTECTED] wrote:

> hi
>
> I have a vector test. It has 3 elements. I want to join the three into
> one vector:
>
>     Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157- 20
>
> How can I do it?
>
>     > class(test)
>     [1] "character"
>     > test
>     [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY" "157"
>     [3] "20"
>
> Ramya
Re: [R] lattice with bwplot
Maybe you can use the split argument of print.trellis (plot.trellis). See ?print.trellis.

On Mon, Jul 21, 2008 at 2:28 PM, Roberts, Kyle [EMAIL PROTECTED] wrote:

> Dear All,
>
> How come par(mfrow=c(1,2)) works with boxplot, but not with bwplot?
>
> This works:
>
>     par(mfrow=c(1,2))
>     boxplot(dv~index, depend)
>     boxplot(dv~index, depend2)
>
> This does not work:
>
>     par(mfrow=c(1,2))
>     bwplot(index~dv, depend)
>     bwplot(index~dv, depend2)
>
> Thanks,
> Kyle
>
> Dr. J. Kyle Roberts
> Department of Literacy, Language and Learning
> School of Education and Human Development
> Southern Methodist University
> P.O. Box 750381
> Dallas, TX 75275
> 214-768-4494
> http://www.hlm-online.com/

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
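A sketch of the split approach follows. lattice draws with grid rather than base graphics, which is why par(mfrow = ...) has no effect on bwplot(); instead the trellis objects are printed into cells of the page. The data frames below are made-up stand-ins for `depend` and `depend2`, whose contents are not shown in the thread.

```r
library(lattice)

# Hypothetical stand-ins for the 'depend' and 'depend2' data frames.
set.seed(1)
depend  <- data.frame(index = gl(3, 20), dv = rnorm(60))
depend2 <- data.frame(index = gl(3, 20), dv = rnorm(60, mean = 1))

p1 <- bwplot(index ~ dv, data = depend)
p2 <- bwplot(index ~ dv, data = depend2)

# split = c(column, row, number of columns, number of rows);
# more = TRUE keeps the page open for the next plot.
print(p1, split = c(1, 1, 2, 1), more = TRUE)
print(p2, split = c(2, 1, 2, 1))
```

If both data sets share the same variables, stacking them with an extra grouping column and using a conditioning formula such as index ~ dv | source is another common lattice idiom for side-by-side panels.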
Re: [R] vector help
try this:

  paste(test, collapse = "-")

Best, Dimitris

Dimitris Rizopoulos
Biostatistical Centre, School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899 Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/ http://perswww.kuleuven.be/dimitris_rizopoulos/

Quoting Rajasekaramya [EMAIL PROTECTED]:

> hi, I have a vector test. It has 3 elements. I want to join the three into one string:
> Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157-20. How can I do it?
>
>   class(test)
>   [1] "character"
>   test
>   [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY" "157"
>   [3] "20"
>
> Ramya
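For completeness, a self-contained version of the suggestion above. The vector is copied from Ramya's question; the only point is that collapse joins the elements of a single vector, whereas sep separates the different arguments of paste().

```r
test <- c("Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY", "157", "20")

# collapse glues the elements of one vector into a single string
joined <- paste(test, collapse = "-")
joined
# [1] "Geneset=HSA04910_INSULIN_SIGNALING_PATHWAY-157-20"

length(joined)  # one string, not three
```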
Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
Frank,

"How does bootstrap improve on that?"

I don't know, but I have an idea. Since the data in my set are just a small sample from a big population, if I use my whole dataset to obtain maximum likelihood estimates, these estimates may be the best for this dataset but far from ideal for the whole population. I used the bootstrap to virtually increase the size of my dataset, which should result in estimates closer to those from the population. Isn't that the purpose of the bootstrap?

When I use such median coefficients on another dataset (another sample from the population), the predictions are better than those using the maximum likelihood estimates. I have already tested that and it worked! I am not a statistician and I don't have a feel for what overfitting is, but it may be just another word for the same idea.

Nevertheless, I would still like to know how I can get the coefficients for the model that gives the nearly unbiased estimates. I greatly appreciate your help.

-- Michal J. Figurski
HUP, Pathology Laboratory Medicine
Xenobiotics Toxicokinetics Research Laboratory
3400 Spruce St. 7 Maloney, Philadelphia, PA 19104
tel. (215) 662-3413

Frank E Harrell Jr wrote:

> Michal Figurski wrote:
>> Hello all,
>>
>> I am trying to optimize my logistic regression model by using the bootstrap. I was previously using SAS for this kind of task, but I am now switching to R.
>>
>> My data frame consists of 5 columns and has 109 rows. Each row is a single record composed of the following values: Subject_name, numeric1, numeric2, numeric3 and outcome (yes or no). All three numerics are used to predict outcome using LR.
>>
>> In SAS I wrote a macro that split the dataset, ran LR on one half of the data and made predictions on the second half. It then collected the equation coefficients from each iteration of the bootstrap. Later I just took the medians of these coefficients over all iterations and used them as an optimal model. It really worked well!
>
> Why not use maximum likelihood estimation, i.e., the coefficients from the original fit? How does the bootstrap improve on that?
>
>> Now I want to do the same in R. I tried to use the 'validate' or 'calibrate' functions from package Design, and I also experimented with function 'sm.binomial.bootstrap' from package sm. I also tried the function 'boot' from package boot, though without success: in my case it randomly selected _columns_ from my data frame, while I wanted it to select _rows_.
>
> validate and calibrate in Design do resampling on the rows.
>
> Resampling is mainly used to get a nearly unbiased estimate of the model performance, i.e., to correct for overfitting.
>
> Frank Harrell
>
>> Though the main point here is the optimized LR equation. I would appreciate any help on how to extract the LR equation coefficients from any of these bootstrap functions, in the same form as given by 'glm' or 'lrm'.
>>
>> Many thanks in advance!
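On the narrow technical question of getting boot() to resample rows and hand back coefficients, here is a minimal sketch. The data are simulated and the column names (x1, x2, x3, outcome) are placeholders, not Michal's variables. The usual cause of "it selected columns" is indexing with data[idx] instead of data[idx, ]. This only shows the mechanics; whether medians of bootstrap coefficients improve on the original fit is the statistical point debated in this thread, and the code takes no side on it.

```r
library(boot)

# Simulated stand-in for the 109-row data set (assumption: names are hypothetical)
set.seed(1)
n <- 109
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$outcome <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.3 * dat$x2))

# The statistic receives the data and a vector of row indices.
# Note the comma: data[idx, ] selects rows, data[idx] would select columns.
coef_fun <- function(data, idx) {
  coef(glm(outcome ~ x1 + x2 + x3, family = binomial, data = data[idx, ]))
}

b <- boot(dat, coef_fun, R = 200)

b$t0                 # coefficients from the original (full-sample) fit
apply(b$t, 2, sd)    # bootstrap standard errors, one per coefficient
apply(b$t, 2, median)  # per-coefficient medians over the resamples
```

Each row of b$t holds the coefficient vector from one resample, in the same order as coef(glm(...)).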
Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
> I used bootstrap to virtually increase the size of my dataset, it should result in estimates more close to that from the population - isn't it the purpose of bootstrap?

No, not really. The bootstrap is a resampling method for variance estimation. It is often used when there is not an easy way, or a closed form expression, for estimating the sampling variance of a statistic.
Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
Hi Doran,

Maybe I am wrong, but I think the bootstrap is a general resampling method which can be used for different purposes. Usually it works well when you do not have a representative sample set (maybe with a limited number of samples). Therefore, I agree with Michal.

P.S. Overfitting, in my opinion, describes the situation where you have a model which is quite specific to the training dataset but cannot be generalized to new samples.

Thanks, --Jerry

2008/7/21 Doran, Harold [EMAIL PROTECTED]:

>> I used bootstrap to virtually increase the size of my dataset, it should result in estimates more close to that from the population - isn't it the purpose of bootstrap?
>
> No, not really. The bootstrap is a resampling method for variance estimation. It is often used when there is not an easy way, or a closed form expression, for estimating the sampling variance of a statistic.
Re: [R] Coefficients of Logistic Regression from bootstrap - how
There is one aspect for which the bootstrap or re-sampling is useful, which is not provided by maximum likelihood estimation (and the usual MLE estimates of the SEs of the coefficients). That is, the SEs of the coefficients are conditional on the values of the covariates in the sample. The only random variation that is considered in producing the SEs in standard regression is that of the response variable, as implied by the model being fitted. Hence the MLE will tell you about the uncertainty in the coefficients due to random response, but with only the exact covariate values which are present in the sample.

In practice, as has been indicated by other responses, the data are from a population in which the covariates vary and not all have been observed, and there is interest in assessing the uncertainty about the population coefficients due to this. An indication of this (with somewhat uncertain reliability) can be obtained by a bootstrap procedure, on the basis that sampling from the sample will have some resemblance to sampling from the population.

Ted.

On 21-Jul-08 19:56:16, 刘杰 wrote:

> Hi Doran,
>
> Maybe I am wrong, but I think the bootstrap is a general resampling method which can be used for different purposes. Usually it works well when you do not have a representative sample set (maybe with a limited number of samples). Therefore, I agree with Michal.
>
> P.S. Overfitting, in my opinion, describes the situation where you have a model which is quite specific to the training dataset but cannot be generalized to new samples.
>
> Thanks, --Jerry
>
> 2008/7/21 Doran, Harold [EMAIL PROTECTED]:
>
>>> I used bootstrap to virtually increase the size of my dataset, it should result in estimates more close to that from the population - isn't it the purpose of bootstrap?
>>
>> No, not really. The bootstrap is a resampling method for variance estimation. It is often used when there is not an easy way, or a closed form expression, for estimating the sampling variance of a statistic.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 21-Jul-08 Time: 21:11:10
Re: [R] Editor for Mac OS
On 22/07/2008, at 5:39 AM, Angelo Scozzarella wrote:

> Hi, is there a good editor for Mac OS?

Yes; vi[m].

cheers, Rolf Turner
Re: [R] Parameter names in nls
On 22/07/2008, at 3:49 AM, [EMAIL PROTECTED] wrote:

> Dear R-help,
>
> Could you please examine the following code, and see whether I have discovered a bug or am just doing something silly.
>
> I am trying to create a package to do fish stock assessment using the nls() function to fit the modelled stock size to the various pieces of information that we have. The main problem with this sort of task is that the number and type of parameters that go into the model are highly variable between stocks, but the method needs to be intelligent enough to handle this. The way I have chosen to handle this is through the names in my parameter vector, using code inside the objective function to figure out which parameter is which.
>
> The problem I have encountered is that I don't think nls() always passes a named vector. Indeed, after the first set of function evaluations, it drops the names from the parameter vector altogether. I believe this to be a bug; it certainly plays havoc with my code!
>
> As a demonstration of this problem, consider the piece of code below. It is basically fitting a straight line to some synthetic data (with noise). I have set up the objective function so that it prints the names of the parameters every time that it is called. As you can see, the names are there to begin with, but rapidly disappear after the first step is made.
>
> Is this a bug? Or is it intended behaviour? Or is this a completely daft approach I am taking?

I think the latter. You are simply not using nls correctly. Try

  fit <- nls(data.y ~ a + b*data.x, start=ips)

(and compare with the result of lm(data.y ~ data.x)).

cheers, Rolf Turner

> I look forward to your comments.
>
> cheers, Mark
>
>   rm(list=ls())
>   fitting.fn <- function(x, params) {
>     # The model - so that it works
>     y <- params[1] + x*params[2]
>     # How I would prefer it to work
>     # y <- params["a"] + x*params["b"]
>     # Display information about function eval
>     cat(paste("Evaluation # :", counter, "\t Names :"))
>     print(names(params))
>     # <<- so that the global counter is the one incremented
>     counter <<- counter + 1
>     return(y)
>   }
>   counter <- 1
>   data.x <- 1:50
>   data.y <- pi*data.x + rnorm(50, sd=20)
>   plot(data.x, data.y)
>   ips <- c(a=0, b=0)
>   nls(data.y ~ fitting.fn(data.x, params), data=data.frame(data.x, data.y),
>       start=list(params=ips), trace=TRUE, control=nls.control(tol=1e-8))
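A runnable version of the suggestion in this message, for anyone finding the thread later. The synthetic data follow Mark's example (a seed is added here so the result is repeatable); the idea is that each parameter appears by name in the formula and in start, so nothing depends on a names attribute surviving inside a single vector parameter.

```r
set.seed(42)  # added for repeatability; not in the original post
data.x <- 1:50
data.y <- pi * data.x + rnorm(50, sd = 20)

# Named starting values: one entry per parameter in the formula
ips <- c(a = 0, b = 0)

fit <- nls(data.y ~ a + b * data.x, start = ips)
coef(fit)                    # estimates labelled a and b

# The model is linear, so lm() should agree
coef(lm(data.y ~ data.x))
```

For a model whose parameter set varies between stocks, one option (an assumption about the use case, not something stated in the thread) is to build the formula and the named start list programmatically and pass both to nls().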
Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
Well, here is a good source: Wikipedia. http://en.wikipedia.org/wiki/Bootstrapping_(statistics)

From: 刘杰 [mailto:[EMAIL PROTECTED]
Sent: Monday, July 21, 2008 3:56 PM
To: Doran, Harold
Cc: Michal Figurski; Frank E Harrell Jr; r-help@r-project.org
Subject: Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?

> Hi Doran,
>
> Maybe I am wrong, but I think the bootstrap is a general resampling method which can be used for different purposes. Usually it works well when you do not have a representative sample set (maybe with a limited number of samples). Therefore, I agree with Michal.
>
> P.S. Overfitting, in my opinion, describes the situation where you have a model which is quite specific to the training dataset but cannot be generalized to new samples.
>
> Thanks, --Jerry
>
> 2008/7/21 Doran, Harold [EMAIL PROTECTED]:
>
>>> I used bootstrap to virtually increase the size of my dataset, it should result in estimates more close to that from the population - isn't it the purpose of bootstrap?
>>
>> No, not really. The bootstrap is a resampling method for variance estimation. It is often used when there is not an easy way, or a closed form expression, for estimating the sampling variance of a statistic.
Re: [R] Howto Restart A Function with Try-Error Catch
On Mon, 21-Jul-2008 at 10:12PM +0900, Gundala Viswanath wrote:

| Hi all,
|
| I have a function - let's call it myfunction. This function is based
| on some random number generator. Now, once in a while the function
| will break/crash depending on the random number it generates inside
| the function.
|
| To avoid the problem, what I intend to do is the following:
|
| 1. Catch the try-error using class.
| 2. Redo the function if it returns try-error.
| 3. Otherwise keep the output of the function.
|
| I'm not sure how to create the above construct.
| The code I have below doesn't work:
|
| __BEGIN__
|
| myfunction <- function(the_x) {
|   # do something
|   a = list(output1=val1, output2 = val2)
|   a
| }
|
| out <- try(suppressWarnings(myfunction(x)), silent=T)
|
| if (class(out) == "try-error") {
|   # this clause doesn't seem to redo

If it were to redo, it would get the same result (depending on what your function does). Assuming your function is using a random number, you could use repeat or while to continue trying according to your if test.

HTH

-- Patrick Connolly
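One way to write the retry construct described above, as a sketch. The helper name retry, the cap max_tries, and the toy function flaky are all invented for illustration; the only base R pieces relied on are try() and inherits(). A cap on the number of attempts is included so that a function that always fails cannot loop forever.

```r
# Hypothetical helper: call f(...) until it succeeds or max_tries is reached
retry <- function(f, ..., max_tries = 100) {
  for (attempt in seq_len(max_tries)) {
    out <- try(f(...), silent = TRUE)
    # inherits() is safer than class(out) == "try-error",
    # because class() can return more than one string
    if (!inherits(out, "try-error")) return(out)
  }
  stop("all ", max_tries, " attempts failed")
}

# Toy stand-in for myfunction: fails whenever the random draw is below 0.5
flaky <- function(the_x) {
  u <- runif(1)
  if (u < 0.5) stop("bad draw")
  list(output1 = the_x, output2 = u)
}

set.seed(123)
res <- retry(flaky, 10)
str(res)
```

Each attempt draws fresh random numbers, which is why repeating the call can succeed where a single call failed.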
Re: [R] avoid loop with three-dimensional array
On Mon, 21 Jul 2008, Michela Cameletti wrote:

> Dear R user,
>
> I'm trying to find a solution for optimizing my code. I have to run a 50.000 iteration long simulation and it is absolutely necessary to have optimized code.

What you have is just a quadratic form, so t(y) %*% A %*% y will do it. You need to rearrange your subscripts; this is one approach:

  X2 <- matrix( aperm(X, c(1,3,2)), nr=d )  # X is as you defined it.
  nother.result <- crossprod( matrix(X2, nc=k), matrix(A %*% X2, nc=k) )

Using dim( X2 ) <- ... rather than matrix() will probably speed this up further.

HTH, Chuck

p.s. This is the road to perdition. You can optimize the heck out of some piece of linear algebra in R only to find that it needs to be written in C, but the R code that you have written is so hard to read that you have to start from scratch. IIRC, writing block-trace algorithms for mixed models takes one down this path.

> I have to do this operation
>
>   sum_t ( t(X_t) %*% A %*% X_t )
>
> where X_t is a (d*k) matrix which changes in time and A is a (d*d) matrix that is constant in time. I have put all my X_t in a three-dimensional array X of dimension (d,k,T). At the moment, to compute the sum over time, I'm doing a for loop and saving each resulting (k*k) matrix in a list, and at the end I sum the T matrices in this list. I'm wondering if there is a better way to do this.
>
> Here is an example of what I have to do:
>
>   d=3
>   k=2
>   T=4
>   X = array(rnorm(d*k*T), dim=c(d,k,T))
>   A = matrix(rnorm(d*d), d, d)
>   e1 = list()
>   for (t in 1:T) {  # I would like to avoid this
>     e1[[t]] = t(X[,,t]) %*% A %*% X[,,t]
>   }
>   # Function for doing the sum of matrices in a list
>   sumMatrices <- function(matrices) {
>     if (length(matrices) > 2) matrices[[1]] + Recall(matrices[-1])
>     else matrices[[1]] + matrices[[2]]
>   }
>   result = sumMatrices(e1)
>
> Thank you in advance for all your help,
> best, Michela

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
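A small check that the reshaping trick reproduces the loop, written as a sketch with nT in place of T (so that TRUE's shorthand is not masked) and Reduce() in place of the recursive sumMatrices. One detail worth spelling out: the order of the two crossprod() arguments matters when A is not symmetric. crossprod(XS, AXS) gives the sum of t(X_t) %*% A %*% X_t, while crossprod(AXS, XS) gives the sum with t(A) in place of A.

```r
set.seed(1)
d <- 3; k <- 2; nT <- 4
X <- array(rnorm(d * k * nT), dim = c(d, k, nT))
A <- matrix(rnorm(d * d), d, d)   # deliberately not symmetric

# Reference answer: explicit sum over the time index
loop_result <- Reduce(`+`, lapply(seq_len(nT), function(t) {
  t(X[, , t]) %*% A %*% X[, , t]
}))

# Vectorised: put time next to the row index, then stack the time slices
X2  <- matrix(aperm(X, c(1, 3, 2)), nrow = d)  # d x (nT*k)
XS  <- matrix(X2, ncol = k)                    # (d*nT) x k, X_t stacked over t
AXS <- matrix(A %*% X2, ncol = k)              # (d*nT) x k, A %*% X_t stacked over t

fast_result <- crossprod(XS, AXS)              # sum_t t(X_t) %*% A %*% X_t

all.equal(loop_result, fast_result)
```

Entry (i, j) of crossprod(XS, AXS) is the sum over t of X[, i, t]' A X[, j, t], which is exactly entry (i, j) of the looped sum.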
[R] Table of tables?
I have occasion to want to produce a ``table of tables''. I.e. I have a number of subjects; to each subject there corresponds a 4 x 5 table of numbers. I would like to arrange these tables into a (say) 4 x 4 array. I would also like (said he, wistfully) to have each of these 16 tables in the array distinguished by a header identifying the ``subject''; a string of the form ``Subject xxx'', ``Subject yyy'' etc.

I could bind together my 16 tables into a 16 x 20 numerical matrix, output this to a file, and then manually mark it up with the appropriate LaTeX commands to get the effect I want. But that would be a most tedious experience and since I have to do something like 12 or 15 of these, it could take a while.

So I thought I'd ask if any of those clever people out there had written functions that might produce such a table of tables. Preferably with a LaTeX filter at the end. Are there such functions anywhere? I have of course looked at the xtable package and I can't see a facility for doing what I want.

cheers, Rolf Turner
[R] Using integrate
I have a function, say:

  f <- function(x) exp(x)

and I would like to obtain the integral of the function while adding a few operations within the integration (I need point-by-point integration), say:

  t <- seq(0, 40, by=1)
  z <- array(0, length(t))
  for (i in 1:40) {
    z[i] <- integrate( f * (t[i+1] - t), (i-1), i )
  }

In R, I can only pass the function itself to 'integrate', i.e.

  z[i] <- integrate( f, (i-1), i )

but I want to add additional operations within the 'integrate' command, which I cannot just add to the function f. Any input is appreciated.

Ayman
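A possible way to do this, sketched under one assumption: that the intended integrand is f(x) * (t[i+1] - x), with x the variable of integration (the post writes t there, which is the whole grid vector, so this reading is a guess). integrate() accepts any function of one vectorised argument, so the extra operations can be wrapped in an anonymous function around f. Note also that integrate() returns a list; the number itself is in its value component.

```r
f <- function(x) exp(x)
t <- seq(0, 40, by = 1)
z <- numeric(40)

for (i in 1:40) {
  t_next <- t[i + 1]
  # Wrap f in a new function that applies the extra factor point by point
  integrand <- function(x) f(x) * (t_next - x)
  z[i] <- integrate(integrand, lower = i - 1, upper = i)$value
}

head(z)
```

Under the stated assumption each piece has the closed form exp(i - 1) * (exp(1) - 2), which gives a convenient check on the numerical result.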
Re: [R] Control parameter of the optim( ): parscale
Zornitsa Luleva zornitsa.luleva at gmail.com writes:

> I am using the L-BFGS-B method of the mle2() function to estimate the values of 6 parameters. mle2 uses the methods implemented in optim. As I got it from the descriptions available online, one can use the parscale parameter to tell R somehow what the values of the estimated parameters should be . . .
>
> Could somebody please help me understand what one has to do actually with the parscale parameter so that it works right? I am very grateful for an answer - R leaves sometimes some of the parameters unchanged (it is OK since it is a feature of the L-BFGS-B algorithm), but often these are the parameters that should have greater values!

I'm the author of mle2 (in the bbmle package), but as you correctly infer your problem is with optim and not with mle2 per se.

As far as I can tell, you're a little bit confused about the purpose of the parscale parameter -- and I'm a little confused about what you want to do. The parscale parameter is a way of telling R what the expected sensitivity/magnitude of different parameters is likely to be. For example, if you have two parameters that have expected values of 1e6 and 1e-6, the optimization is likely to work much better if you give control=list(parscale=c(1e6,1e-6)) as one of the arguments to mle2.

It sounds like you instead want to force some of the parameters to have particular values. I don't know exactly *why* you want to do this, but you can use e.g. fixed=list(fixedpar=27) to force one of the parameters of the function to be set rather than optimized.

Ben Bolker
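To make the parscale point concrete, here is a toy illustration using optim() directly rather than mle2 (so it runs without bbmle). The objective function is invented for the example: its minimum is at (1e6, 1e-6), mirroring the magnitudes mentioned above. With parscale set, optim works internally on par/parscale, so both parameters are of order 1 from the optimizer's point of view.

```r
# Invented objective with optimum at p = c(1e6, 1e-6)
fn <- function(p) ((p[1] - 1e6) / 1e6)^2 + ((p[2] - 1e-6) / 1e-6)^2

start <- c(5e5, 5e-7)

# parscale gives the typical magnitude of each parameter, not its target value
fit <- optim(start, fn, method = "L-BFGS-B",
             control = list(parscale = c(1e6, 1e-6)))

fit$par          # reported on the original scale
fit$convergence  # 0 indicates successful convergence
```

The values in parscale are only rough magnitudes; they do not constrain or fix the estimates.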
[R] y-axis number format on plot, barplot etc.
I am trying to change the number format shown on the y-axis from scientific (5e+05) to 500,000 etc. Does anyone know how to do this? Is there something I can add as an argument to barplot, or would it be through par?

  barplot(data$Value, names.arg = as.vector(data$Field), main=strTitle)
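One approach that may help, offered as a sketch rather than the only answer: suppress the default axis and draw it by hand with labels built by format(). The data frame and title below are made up to stand in for the poster's data and strTitle.

```r
# Made-up stand-in for the poster's data (assumption)
data <- data.frame(Field = c("A", "B", "C"),
                   Value = c(150000, 320000, 500000))
strTitle <- "Example"

# Draw the bars without axes, then add a custom y-axis
barplot(data$Value, names.arg = as.vector(data$Field),
        main = strTitle, axes = FALSE)

ticks <- pretty(c(0, max(data$Value)))
tick_labels <- format(ticks, big.mark = ",", scientific = FALSE, trim = TRUE)
axis(2, at = ticks, labels = tick_labels, las = 1)
```

Setting options(scipen = ...) to a large positive value is another way to discourage scientific notation globally, though it does not add the thousands separator.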
[R] how to draw multiple curves (for given formulae) in lattice
Dear R Users,

Could you please write a piece of code to draw Figure 5.9 of Mixed-Effects Models in S and S-Plus?

Thanks
Re: [R] name returned by lapply
If you return the value as a named list, you get your answer using unlist(res, recursive=F):

  res <- lapply(1:2, function(i) {val <- list(i); names(val) <- paste("Hugo", i, sep="_"); return(val)})
  unlist(res, rec=F)
  $Hugo_1
  [1] 1
  $Hugo_2
  [1] 2

Antje wrote:

> Oh true, this would solve the problem too :-) Thanks a lot for the suggestions!
>
> Antje
>
> Martin Morgan wrote:
>> Antje [EMAIL PROTECTED] writes:
>>
>>> Thanks a lot for your help!
>>>
>>> I know that I cannot directly access the list created, I just was not sure if there is any format of the return value which could additionally provide a name for the returned list. I tried to return the values as a list with the appropriate name, but then I end up with a list entry as list entry...
>>>
>>> Okay, then I'll solve it with a loop, and thanks for the hint with the article.
>>
>> maybe this:
>>
>>   res <- lapply(1:5, function(i) list(key=paste("Hugo", i, sep="_"), val=i))
>>   val <- lapply(res, "[[", "val")
>>   names(val) <- lapply(res, "[[", "key")
>>   val
>>   $Hugo_1
>>   [1] 1
>>   $Hugo_2
>>   [1] 2
>>   $Hugo_3
>>   [1] 3
>>   $Hugo_4
>>   [1] 4
>>   $Hugo_5
>>   [1] 5
>>
>> Martin
>>
>>> Ciao, Antje
>>>
>>> Gavin Simpson wrote:
>>>> On Fri, 2008-07-18 at 14:19 +0200, Antje wrote:
>>>>> Hi Gavin,
>>>>>
>>>>> thanks a lot for your answer. Maybe I did not explain very well what I want to do and probably chose a bad example. I don't mind spaces or names starting with a number. I could even name it: Hugo1, Hugo2, ...
>>>>>
>>>>> My biggest problem is that not only the values are calculated/estimated within my function but also the names (yes, in reality my function is more complicated). Maybe it's easier to explain like this: the parameter x can be a coordinate position of mountains on earth. Within the function the height of the mountain is estimated, and its name. In the end, I'd like to get a list where each entry is named like the mountain and contains its height (or other measurements...)
>>>>>
>>>>>>   ## now that we have a list, we change the names to what you want
>>>>>>   names(ret) <- paste(1:10, info_within_function)
>>>>>
>>>>> so this would not work, because I don't have the information about the naming anymore...
>>>>
>>>> OK, so you can't do what you want to do in the manner you tried, via lapply, as you don't have control of how the list is produced once the loop over 1:10 has been performed. At the stage that 'test' is being applied, all it knows about is 'x' and it doesn't have access to the list being built up by lapply().
>>>>
>>>> The *apply family of functions help us to *not* write out formal loops in R, but here this is causing you a problem. So we can specify an explicit loop and fill in information as and when we want from within the loop:
>>>>
>>>>   ## create list to hold results
>>>>   n <- 10
>>>>   ret <- vector(mode = "list", length = n)
>>>>   ## initialise loop
>>>>   for(i in seq_len(n)) {
>>>>     ## do whatever you need to do here, but this line just
>>>>     ## replicates what 'test' did earlier
>>>>     ret[[i]] <- c(1,2,3,4,5)
>>>>     ## now add the name in
>>>>     names(ret)[i] <- paste("Mountain", i, sep = "")
>>>>   }
>>>>   ret
>>>>
>>>> Alternatively, collect a vector of names during the loop and then, once the loop is finished, do a single call to names(ret) to replace all the names at once:
>>>>
>>>>   n <- 10
>>>>   ret <- vector(mode = "list", length = n)
>>>>   ## new vector to hold vector of names
>>>>   name.vec <- character(n)
>>>>   for(i in seq_len(n)) {
>>>>     ret[[i]] <- c(1,2,3,4,5)
>>>>     ## now we just fill in this vector as we go
>>>>     name.vec[i] <- paste("Mountain", i, sep = "")
>>>>   }
>>>>   ## now replace all the names at once
>>>>   names(ret) <- name.vec
>>>>   ret
>>>>
>>>> This latter version is likely to be more efficient if n is big, so you don't incur the overhead of the repeated calls to names().
>>>>
>>>> The moral of the story is to not jump to using *apply all the time to avoid loops. Loops in R are just fine, so use the tool that helps you do the job most efficiently *and* most transparently.
>>>>
>>>> Take a look at the R Help Desk article by Uwe Ligges and John Fox in the current issue of RNews: http://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf which goes into this in much more detail.
>>>>
>>>> HTH
>>>>
>>>> G
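For the archive, a compact variant of the key/value idea from this thread, using setNames() from base R. The function estimate_mountain and its outputs are invented to mimic the "name and height are both computed inside the function" situation Antje describes.

```r
# Hypothetical worker: returns both a computed name and a computed value
estimate_mountain <- function(i) {
  list(key = paste("Hugo", i, sep = "_"), val = i)
}

res <- lapply(1:5, estimate_mountain)

# Pull out the values, then attach the names computed inside the function
vals <- lapply(res, `[[`, "val")
keys <- vapply(res, `[[`, character(1), "key")
named <- setNames(vals, keys)

named$Hugo_3
# [1] 3
```

vapply() with character(1) guarantees that the names come back as a plain character vector, one string per element.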
[R] Large number of dummy variables
Hello,

I'm trying to run a regression predicting trade flows between importers and exporters. I wish to include both year-importer dummies and year-exporter dummies. The former includes 1378 levels, and the latter includes 1390 levels. I have roughly 100,000 total observations.

When I use lm() to run a simple regression, it gives me a "cannot allocate ___" error. I've been able to get around this by time-demeaning over one large group, but since I have two, it doesn't work in the correct way. Is there a more efficient way of handling a model matrix this large in R?

Thanks for your help.

Alan Spearot

-- Alan Spearot
Assistant Professor - International Economics
University of California - Santa Cruz
1156 High Street, 453 Engineering 2, Santa Cruz, CA 95064
Office: (831) 459-1530
[EMAIL PROTECTED] http://people.ucsc.edu/~aspearot
Re: [R] Large number of dummy variables
Well, at the risk of entering a debate I really don't have time for (I'm doing it anyway): why not consider a random coefficient model? If your response is anything like "well, random effects and fixed effects are correlated and so the estimates are biased, but OLS is consistent and unbiased via an appeal to Gauss-Markov", then I will probably make time for this discussion :)

I have experienced this problem, though. In what you're doing, you are first creating the model matrix and then doing the demeaning, correct? I do recall Doug Bates was, at one point, doing some work where the model matrix for the fixed effects was immediately created as a sparse matrix for OLS models. I think doing the work on the sparse matrix is a better analytical method than time-demeaning. I don't remember where that work is, though.

There is a package called SparseM which had functions for doing OLS with sparse matrices. I don't know its status, but I vaguely recall the author of SparseM at one point noting that the work of Bates and Maechler would be the go-to package for work with large, sparse model matrices.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alan Spearot
Sent: Monday, July 21, 2008 5:59 PM
To: r-help@r-project.org
Subject: [R] Large number of dummy variables

> Hello,
>
> I'm trying to run a regression predicting trade flows between importers and exporters. I wish to include both year-importer dummies and year-exporter dummies. The former includes 1378 levels, and the latter includes 1390 levels. I have roughly 100,000 total observations.
>
> When I use lm() to run a simple regression, it gives me a "cannot allocate ___" error. I've been able to get around this by time-demeaning over one large group, but since I have two, it doesn't work in the correct way. Is there a more efficient way of handling a model matrix this large in R?
>
> Thanks for your help.
>
> Alan Spearot
>
> -- Alan Spearot
> Assistant Professor - International Economics
> University of California - Santa Cruz
> 1156 High Street, 453 Engineering 2, Santa Cruz, CA 95064
> Office: (831) 459-1530
> [EMAIL PROTECTED] http://people.ucsc.edu/~aspearot
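A rough sketch of the sparse model matrix idea, using the Matrix package. Whether this is the specific work the reply has in mind is an assumption on my part. The data are simulated and much smaller than Alan's, and the variable names (imp, ex, x, y) are placeholders. The dummy columns are stored sparsely and the normal equations are solved directly, which gives coefficients but not the full lm() summary.

```r
library(Matrix)

# Simulated stand-in: two factors with many levels plus one regressor
set.seed(1)
n   <- 2000
imp <- factor(sample(50, n, replace = TRUE))
ex  <- factor(sample(60, n, replace = TRUE))
x   <- rnorm(n)
y   <- 2 * x + rnorm(n)

# Sparse design matrix: mostly zeros, so only non-zero entries are stored
Xs <- sparse.model.matrix(~ x + imp + ex)

# Solve the normal equations (X'X) beta = X'y with sparse algebra
beta <- solve(crossprod(Xs), crossprod(Xs, y))
beta_hat <- drop(as.matrix(beta))

beta_hat[2]   # coefficient on x, second column after the intercept
```

On a problem this small the result can be compared with coef(lm(y ~ x + imp + ex)); on the full-size problem only the sparse route is likely to fit in memory. Solving the normal equations is less numerically stable than the QR decomposition lm() uses, so results on ill-conditioned designs deserve a second look.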