[R] Interpreting Rprof output
Hello! I have run Rprof on a function of mine and the results look very strange, to say the least. At the end of this email is the output of summaryRprof. Can someone help me interpret this output? I have read the appropriate section in the manual Writing R Extensions and the help pages. If I understand this output correctly, it is saying that unlist has been active in every interval, and that all functions (or the functions they have called) have been active in every interval. Is that correct? It seems strange that the largest time is 0.02, since the function that was Rprof-ed ran for at least an hour, if not more. I am also surprised that some functions are not listed here, especially for, if, as.character, and some others. Could it be that this is the result of the type of my function? The function that was originally called is opt.random.par. This function in turn called function opt.par.check.to.skip.iter via do.call 100 times in a for loop. This latter function calls function crit.fun from about 70 to 300 times with a double for loop in each iteration, and usually does 1 to 10 iterations. crit.fun itself is quite quick; on these data it takes about 0.04s. I know it would be helpful to provide the functions and the data that produced these results; however, I cannot disclose them at this point. Thank you in advance for any suggestions.

Ales Ziberna

Output of summaryRprof:

$by.self
                           self.time self.pct total.time total.pct
unlist                          0.02      100       0.02       100
any                             0.00        0       0.02       100
crit.fun                        0.00        0       0.02       100
diag                            0.00        0       0.02       100
do.call                         0.00        0       0.02       100
opt.par.check.to.skip.iter      0.00        0       0.02       100
opt.random.par                  0.00        0       0.02       100
sapply                          0.00        0       0.02       100
sum                             0.00        0       0.02       100

$by.total
                           total.time total.pct self.time self.pct
unlist                           0.02       100      0.02      100
any                              0.02       100      0.00        0
crit.fun                         0.02       100      0.00        0
diag                             0.02       100      0.00        0
do.call                          0.02       100      0.00        0
opt.par.check.to.skip.iter       0.02       100      0.00        0
opt.random.par                   0.02       100      0.00        0
sapply                           0.02       100      0.00        0
sum                              0.02       100      0.00        0

$sampling.time
[1] 0.02
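For comparison, a complete profiling run on a toy computation looks like this (the file name, interval, and workload below are arbitrary choices, not taken from the post). A sampling.time of 0.02 for an hour-long run suggests it is worth checking that Rprof() was active, and the output file not overwritten, for the whole computation:

## Illustrative only: profile a short toy computation.
Rprof("toy.out", interval = 0.02)           # start sampling
x <- replicate(20, sum(sort(runif(1e5))))   # some work to observe
Rprof(NULL)                                 # stop sampling
summaryRprof("toy.out")                     # sampling.time should match the elapsed work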
[R] Creating a custom connection to read from multiple files
Hello, is it possible to create my own connection which I could use with read.table or scan? I would like to create a connection that would read from multiple files in sequence (as if they were concatenated), possibly with an option to skip the first n lines of each file. I would like to avoid using platform-specific scripts for that... (currently I invoke /bin/cat from R to create a concatenation of all those files). Thanks, Tomas
Re: [R] Creating a custom connection to read from multiple files
On Thu, 20 Jan 2005, Tomas Kalibera wrote:

> is it possible to create my own connection which I could use with

Yes. In a sense, all the connections are custom connections written by someone.

> read.table or scan? I would like to create a connection that would read
> from multiple files in sequence (like if they were concatenated),
> possibly with an option to skip first n lines of each file. I would like
> to avoid using platform specific scripts for that... (currently I invoke
> /bin/cat from R to create a concatenation of all those files).

I would use pipes, but a pure R solution is to process the files to an anonymous file() connection and then read that. However, what is wrong with reading a file at a time and combining the results in R using rbind?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
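A minimal sketch of the anonymous-connection approach described above (the file names and the number of skipped lines are invented for illustration):

files <- c("part1.txt", "part2.txt", "part3.txt")  # hypothetical inputs
skip  <- 1                                         # lines to drop per file

con <- file()                           # anonymous temporary file, mode "w+"
for (f in files)
    writeLines(readLines(f)[-(1:skip)], con)
dat <- read.table(con)                  # reads the combined stream back from the start
close(con)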
Re: [R] Creating a custom connection to read from multiple files
Dear Prof Ripley, thanks for your suggestions; it's very nice that one can create custom connections directly in R, and I think it is what I need just now.

> However, what is wrong with reading a file at a time and combining the
> results in R using rbind?

Well, the problem is performance. If I concatenate all those files, they have around 8MB, and can grow to tens of MBs in the near future. Both concatenating and reading from a single file by scan takes 5 seconds (which is almost OK). However, reading individual files by read.table and rbinding them one by one (samples <- rbind(samples, newSamples)) takes minutes. The same happens when I concatenate lists manually; scan does not help significantly. I guess there is some overhead in detecting dimensions of objects in rbind (?), or in re-allocation or copying of data? Best regards, Tomas Kalibera
Re: [R] Creating a custom connection to read from multiple files
On Thu, 20 Jan 2005, Tomas Kalibera wrote:

> [...] reading individual files by read.table and rbinding one by one
> (samples <- rbind(samples, newSamples)) takes minutes. The same is when
> I concatenate lists manually. Scan does not help significantly. I guess
> there is some overhead in detecting dimensions of objects in rbind (?)
> or re-allocation or copying data?

rbind is vectorized so you are using it (way) suboptimally.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
[R] Problem loading a library
Hi. I have R (Ver 2.0) correctly running on a SuSE 9.0 Linux machine. I correctly installed the Logic Regression library LogicReg (by the command: R CMD INSTALL LogicReg), developed by Ingo Ruczinski and Charles Kooperberg: http://bear.fhcrc.org/~ingor/logic/html/program.html When I try to load the library in R by the command library(LogicReg) I get the following error:

Error in dyn.load(x, as.logical(local), as.logical(now)) :
    unable to load shared library /usr/lib/R/library/LogicReg/libs/LogicReg.so:
    /usr/lib/R/library/LogicReg/libs/LogicReg.so: cannot map zero-fill pages:
    Cannot allocate memory
Error in library(LogicReg) : .First.lib failed

How could I solve the problem? Thanks in advance for your kind help. Marco
RE: [R] Creating a custom connection to read from multiple files
From: Prof Brian Ripley
> On Thu, 20 Jan 2005, Tomas Kalibera wrote:
> [quote trimmed; see above]
>
> rbind is vectorized so you are using it (way) suboptimally.

Here's an example:

## Create a 500 x 100 data matrix.
x <- matrix(rnorm(5e4), 500, 100)

## Generate 50 filenames.
fname <- paste("f", formatC(1:50, width=2, flag="0"), ".txt", sep="")

## Write the data to each of the 50 files.
for (f in fname) write(t(x), file=f, ncol=ncol(x))

## Read the files into a list of data frames.
system.time(datList <- lapply(fname, read.table, header=FALSE), gcFirst=TRUE)
[1] 11.91  0.05 12.33    NA    NA

## Specify colClasses to speed up.
system.time(datList <- lapply(fname, read.table,
                              colClasses=rep("numeric", 100)), gcFirst=TRUE)
[1] 10.69  0.07 10.79    NA    NA

## Stack them together.
system.time(dat <- do.call("rbind", datList), gcFirst=TRUE)
[1] 5.34 0.09 5.45   NA   NA

## Use matrices instead of data frames.
system.time(datList <- lapply(fname,
                function(f) matrix(scan(f), ncol=100, byrow=TRUE)), gcFirst=TRUE)
Read 50000 items
...
Read 50000 items
[1]  9.49  0.08 15.06    NA    NA
system.time(dat <- do.call("rbind", datList), gcFirst=TRUE)
[1] 0.09 0.03 0.12   NA   NA

## Clean up the files.
unlink(fname)

A couple of points:

- Usually specifying colClasses will make read.table() quite a bit faster, even though it's only marginally faster here. Look back in the list archive to see examples.

- If your data files are all numeric (as in this example), storing them in matrices will be much more efficient. Note the difference in rbind()ing the 50 data frames and the 50 matrices (5.34 seconds vs. 0.09!). rbind.data.frame() needs to ensure that the resulting data frame has unique rownames (a requirement for a legit data frame), and that's probably taking a big chunk of the time.

Andy
[R] Constructing Matrices
Dear List: I am working to construct a matrix of a particular form. For the most part, developing the matrix is simple and is built as follows:

vl.mat <- matrix(c(0,0,0,0, 0,64,0,0, 0,0,64,0, 0,0,0,64), nc=4)

Now to expand this matrix to be block-diagonal, I do the following:

sample.size <- 100   # number of individual students
I <- diag(sample.size)
bd.mat <- kronecker(I, vl.mat)

This creates a block-diagonal matrix with variances along the diagonal and within-student covariances of zero (I am working with longitudinal student achievement data). However, across students, I want the correlation to equal 1 for each variance term. To illustrate, here is the matrix for 2 students. The goal is for the second variance term for student 1 to be perfectly correlated with the variance term for student 2. In other words, I need to plug in 64 at positions (6,2) and (2,6), another 64 at positions (7,3) and (3,7), and another 64 at positions (8,4) and (4,8). I'm having some difficulty conceptualizing how to construct this part of the matrix and would appreciate any thoughts. Thank you, Harold

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    0    0    0    0    0    0    0
[2,]    0   64    0    0    0    0    0    0
[3,]    0    0   64    0    0    0    0    0
[4,]    0    0    0   64    0    0    0    0
[5,]    0    0    0    0    0    0    0    0
[6,]    0    0    0    0    0   64    0    0
[7,]    0    0    0    0    0    0   64    0
[8,]    0    0    0    0    0    0    0   64
Re: [R] Problem loading a library
On Thu, 20 Jan 2005, Marco Sandri wrote:

> Hi. I have R (Ver 2.0) correctly running on a Suse 9.0 Linux machine.

32- or 64-bit?

> I correctly installed the Logic Regression LogicReg library (by the
> command: R CMD INSTALL LogicReg) developed by Ingo Ruczinski and Charles
> Kooperberg: http://bear.fhcrc.org/~ingor/logic/html/program.html
> When I try to load the library in R by the command library(LogicReg)
> I get the following error: [error trimmed; see above]
> How could I solve the problem?

Use a different machine? That package works on all of mine, 32- and 64-bit. BTW, the posting guide does suggest you contact the package authors first, so what do they say?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
RE: [R] Constructing Matrices
I'm still not clear on exactly what your question is. If you can plug in the numbers you want in, say, the lower triangular portion, you can copy those to the upper triangular part easily; something like:

m[upper.tri(m)] <- m[lower.tri(m)]

Is that what you're looking for?

Andy

From: Doran, Harold
> Dear List: I am working to construct a matrix of a particular form.
> [rest of the original post trimmed; see above]
[R] References
Where can I get the literature on Multiple Imputation using Additive Regression, Bootstrapping, and Predictive Mean Matching?
[R] Subsetting a data frame by a factor, using the level that occurs the most times
I think that title makes sense... I hope it does... I have a data frame, one of the columns of which is a factor. I want the rows of data that correspond to the level in that factor which occurs the most times. I can get a list by doing:

by(data, data$pattern, subset)

and go through each element of the list counting the rows to find the maximum, BUT I can't help thinking there's a more elegant way of doing this. The second part is figuring out the rows which have the maximum number of consecutive patterns which are the same... Now that I would love some help with... :-)

Thanks
Mick
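For the second part of the question, which the thread below does not return to, one possible approach is rle() on the factor; the data frame d and the column pattern here are invented stand-ins:

## Find the rows forming the longest run of identical consecutive patterns.
d <- data.frame(pattern = c("a", "b", "b", "b", "a", "c"), x = 1:6)
r <- rle(as.character(d$pattern))
i <- which.max(r$lengths)            # index of the longest run
end <- cumsum(r$lengths)[i]          # last row of that run
d[(end - r$lengths[i] + 1):end, ]    # rows 2:4 here, all pattern "b"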
[R] Reference Material for Multiple Imputation
Is there any particular site where I can get some examples and references on Multiple Imputation using Bootstrapping?
RE: [R] Constructing Matrices
I should probably have explained my data and model a little better. Assume I have student achievement scores across four time points. I estimate the model using gls() as follows:

fm1 <- gls(score ~ time, long, correlation=corAR1(form=~1|stuid), method='ML')

I can now extract the variance-covariance matrix for this model as follows:

var.mat <- getVarCov(fm1)

Assume for the sake of argument I have a sample size of 100 students. I can expand this to the full matrix as follows:

I <- diag(100)
V <- kronecker(I, var.mat)

For my particular model, the scores within each student are assumed correlated (AR1), but across students they are uncorrelated. Now, for a particular problem I am dealing with, I need to make some adjustments to this matrix V and re-estimate the gls(). The adjustments I need to make cannot be done using any of the existing varFunc classes, so I am having to do this manually. What I need to do is create a new matrix manually, add it to V, then re-estimate the gls. Creating this new matrix is the challenge I currently face; let's call it v.prime. The issue at hand is creating v.prime to have non-zero covariance terms across students in very specific places. The matrix I used below is only for two students, but assume I am doing this for thousands of students. My goal is to create a full block-diagonal covariance matrix where the correlation across students at time two is always perfect and the correlation at time three is always perfect. So, within each block of v.prime the variances are uncorrelated, but across the blocks the variances are correlated. I need to construct v.prime such that it is of the same order as V, to make them conformable for addition. More importantly, I need the off-diagonal elements across students to represent a perfect correlation in very specific places. In the example below, if there were a 64 at position (2,6), this would represent a perfect correlation between students 1 and 2 at this point in time, since the variance along the diagonal at time 2 is 64. Since I am doing this for many students, there would need to be a 64 between student 1 and all other students (not just student 2), and so on. From here I can use R's matrix facilities to re-estimate the gls. I hope this clarifies a bit. Harold

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 20, 2005 8:41 AM
To: Doran, Harold; r-help@stat.math.ethz.ch
Subject: RE: [R] Constructing Matrices

> I'm still not clear on exactly what your question is. If you can plug in
> the numbers you want in, say, the lower triangular portion, you can copy
> those to the upper triangular part easily; something like:
> m[upper.tri(m)] <- m[lower.tri(m)]
> Is that what you're looking for?
> Andy
> [quoted original post trimmed; see above]
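One way to build the cross-student blocks Harold describes, sketched here rather than taken from the thread (vl.mat is the 4 x 4 within-student block from the first post, reused as the pattern of shared variance terms):

vl.mat <- diag(c(0, 64, 64, 64))           # within-student block from the first post
n <- 100                                   # number of students
J <- matrix(1, n, n)                       # all-ones matrix
v.prime <- kronecker(J - diag(n), vl.mat)  # cross-student blocks only
V.new <- kronecker(diag(n), vl.mat) + v.prime

For n = 2 this places the 64s exactly at (2,6), (3,7), (4,8) and their transposes, as in the example matrix above.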
[R] Cauchy's theorem
In complex analysis, Cauchy's integral theorem states (loosely speaking) that the path integral of any entire differentiable function, around any closed curve, is zero. I would like to see this numerically, using R (and indeed I would like to use the residue theorem as well). Has anyone coded up path integration?

--
Robin Hankin
Uncertainty Analyst
Southampton Oceanography Centre
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743
[R] (no subject)
Hello, I would like to compare the results obtained with a classical non-parametric proportional hazards model to those of a parametric proportional hazards model using a Weibull. How can we obtain the equivalence of the parameters using coxph (non-parametric model) and survreg (parametric model)? Thanks, Virginie
Re: [R] (no subject)
On Thu, Jan 20, 2005 at 03:18:53PM +0100, Virginie Rondeau wrote:

> Hello, I would like to compare the results obtained with a classical
> non-parametric proportional hazards model with a parametric proportional
> hazards model using a Weibull. How can we obtain the equivalence of the
> parameters using coxph (non-parametric model) and survreg (parametric
> model)?

One way of avoiding this problem is to fit the Weibull model with 'weibreg' in the package eha.

--
Göran Broström                  tel: +46 90 786 5223
Department of Statistics        fax: +46 90 786 6614
Umeå University                 http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden           e-mail: [EMAIL PROTECTED]
[R] Johnson transformation
Hello, I'm Carla, an Italian student. I'm looking for a package to transform non-normal data to normality. I tried to use Box-Cox, but it's not OK. Is there a package to use the Johnson families' transformation? Can you give me any suggestions for free software such as R that uses this transform? Thank you very much, Carla
Re: [R] (no subject)
Virginie Rondeau wrote:

> I would like to compare the results obtained with a classical
> non-parametric proportional hazards model with a parametric proportional
> hazards model using a Weibull. How can we obtain the equivalence of the
> parameters using coxph (non-parametric model) and survreg (parametric
> model)?

In the Design package, look at the pphsm function, which converts a survreg Weibull fit (fitted by the psm function, an adaptation of survreg) to PH form.

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
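A minimal sketch of that route; the data here are simulated purely for illustration, and the variable names are invented:

library(Design)                     # also pulls in Hmisc and survival
set.seed(1)
d <- data.frame(age = rnorm(100, 50, 10))
d$futime <- rexp(100, rate = 0.01 * exp(0.02 * d$age))
d$status <- rep(1, 100)             # no censoring, to keep the sketch short
f <- psm(Surv(futime, status) ~ age, data = d, dist = "weibull")
pphsm(f)                            # coefficients re-expressed in PH form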
RE: [R] Johnson transformation
Greetings, Carla:

While it is possible to map any proper density into a normal through their CDFs, that may not be useful in your case. I suggest that you first plot your data. ?qqnorm (Type ?qqnorm on the R command line and hit Enter.) Are your data continuous, or do they occur in groups? Do the data curve? Do they look like two (or more) distinct lines? If your data have only one mode and if they are smooth, then the Box-Cox transform should provide a symmetrical result. Not all symmetrical densities are normal, of course. And if your data are discrete, then using a continuous density like the normal (or Johnson family) is inappropriate. The purpose of fitting a distribution to data is usually to permit some probability statement, like Prob(x > X) = alpha. Why do you want to use the Johnson family? I am not aware of convenient methods for making such probability statements for them. Best wishes.

Charles Annis, P.E.
[EMAIL PROTECTED]
phone: 561-352-9699
eFax: 614-455-3265
http://www.StatisticalEngineering.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, January 20, 2005 10:16 AM
To: r-help
Subject: [R] Johnson transformation
> [original message trimmed; see above]
Re: [R] Johnson transformation
[EMAIL PROTECTED] wrote:

> I'm looking for a package to transform non normal data to normality. I
> tried to use Box Cox, but it's not ok. Is there a package to use the
> Johnson families' transformation?

The Johnson system is in the SuppDists package.

--
Bob Wheeler --- http://www.bobwheeler.com/
ECHIP, Inc. --- Randomness comes in bunches.
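A short sketch of what that could look like, assuming the JohnsonFit/pJohnson interface from the SuppDists documentation (the example data are simulated here): fit a Johnson curve, then push the data through its CDF onto the normal scale.

library(SuppDists)
set.seed(1)
x <- rexp(200)                   # skewed example data
parms <- JohnsonFit(x)           # choose and fit a Johnson family
z <- qnorm(pJohnson(x, parms))   # transform to approximate standard normal
qqnorm(z)                        # check the result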
RE: [R] Cauchy's theorem
I don't know about the 'in R' bit, but ISTR that Monte-Carlo (or pseudo Monte-Carlo) integration is a way of doing this 'numerically'. I know that Mathematica implements the (pseudo Monte-Carlo) Halton-Hammersley-Wozniakowski algorithm as NIntegrate. Perhaps something equivalent has been coded by someone for WinBUGS (OpenBUGS) (accessible from R via the BRugs package).

HTH
Mike

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Robin Hankin
Sent: 20 January 2005 14:14
To: R-help@stat.math.ethz.ch
Subject: [R] Cauchy's theorem
> [original message trimmed; see above]
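For the original question, a deterministic sketch is also straightforward: parameterize the curve and apply the trapezoid rule to f(z(t)) z'(t). The curve, test functions, and step count below are choices made for illustration, not an existing R function:

## Path integral of f around the unit circle z(t) = exp(1i*t), 0 <= t <= 2*pi.
path.integral <- function(f, n = 1000) {
    th <- seq(0, 2*pi, length = n)
    z  <- exp(1i*th)                   # the closed curve
    dz <- 1i*exp(1i*th)                # its derivative z'(t)
    g  <- f(z)*dz
    sum((g[-1] + g[-n])/2 * diff(th))  # trapezoid rule
}

path.integral(exp)                 # entire, so ~ 0+0i (Cauchy's theorem)
path.integral(function(z) 1/z)     # simple pole at 0, so ~ 2*pi*i (residue theorem)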
[R] glm and percentage data with many zero values
Dear all, I am interested in correctly testing effects of continuous environmental variables and ordered factors on bacterial abundance. Bacterial abundance is derived from counts and expressed as a percentage. My problem is that the abundance data contain many zero values:

Bacteria <- c(2.23,0,0.03,0.71,2.34,0,0.2,0.2,0.02,2.07,0.85,0.12,0,0.59,0.02,2.3,
              0.29,0.39,1.32,0.07,0.52,1.2,0,0.85,1.09,0,0.5,1.4,0.08,0.11,0.05,
              0.17,0.31,0,0.12,0,0.99,1.11,1.78,0,0,0,2.33,0.07,0.66,1.03,0.15,
              0.15,0.59,0,0.03,0.16,2.86,0.2,1.66,0.12,0.09,0.01,0,0.82,0.31,0.2,
              0.48,0.15)

First I tried transforming the data (e.g., logit), but because of the zeros I was not satisfied. Next I converted the percentages into integer values by round(Bacteria*10) or ceiling(Bacteria*10) and calculated a glm with a Poisson error structure; however, I am not very happy with this approach because it changes the original percentage data substantially (e.g., 0.03 becomes either 0 or 1). The same is true for converting the percentages into factors and calculating a multinomial or proportional-odds model (anyway, I do not know if this would be a meaningful approach). I was searching the web, and the best answer I could find was http://www.biostat.wustl.edu/archives/html/s-news/1998-12/msg00010.html in which several persons suggested quasi-likelihood. Would it be reasonable to use a glm with quasipoisson? If yes, how can I find the appropriate variance function? Any other suggestions? Many thanks in advance, Christian

Christian Kamenik
Institute of Plant Sciences
University of Bern
Altenbergrain 21
3013 Bern
Switzerland
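A minimal sketch of the quasi-Poisson fit being asked about; the covariate is invented here, since the post does not include the environmental variables:

## Hypothetical continuous covariate, same length as Bacteria.
Temperature <- seq(4, 20, length = length(Bacteria))
fit <- glm(Bacteria ~ Temperature, family = quasipoisson(link = "log"))
summary(fit)   # the dispersion parameter is estimated, not fixed at 1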
RE: [R] Subsetting a data frame by a factor, using the level that occurs the most times
From: Douglas Bates
> michael watson (IAH-C) wrote:
> > I have a data frame, one of the columns of which is a factor. I want
> > the rows of data that correspond to the level in that factor which
> > occurs the most times.
>
> So first you want to determine the mode (in the sense of the most
> frequently occurring value) of the factor. One way to do this is
>
> names(which.max(table(fac)))
>
> Use this comparison for the subset as
>
> subset(data, pattern == names(which.max(table(pattern))))

Just be careful that if there are ties (i.e., more than one level having the max) which.max() will randomly pick one of them. That may or may not be what's desired. If that is a possibility, Mick will need to think about what he wants in such cases.

Andy
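A quick made-up demonstration of the idiom:

d <- data.frame(pattern = factor(c("a", "b", "b", "c", "b")), x = 1:5)
subset(d, pattern == names(which.max(table(pattern))))
##   pattern x
## 2       b 2
## 3       b 3
## 5       b 5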
[R] ROracle error
I am running R 2.0.0 on a SunOS 5.9 machine and using Oracle 8i (8.1.7.0.0, enterprise edition), and when I try to load ROracle I receive the following error:

require(ROracle)
Loading required package: ROracle
Loading required package: DBI
Error in dyn.load(x, as.logical(local), as.logical(now)) :
    unable to load shared library /bioinfo/local/R/lib/R/library/ROracle/libs/ROracle.so:
    ld.so.1: /bioinfo/local/R/lib/R/bin/exec/R: fatal: relocation error:
    file /bioinfo/local/R/lib/R/library/ROracle/libs/ROracle.so:
    symbol ncrov: referenced symbol not found
[1] FALSE

Any help is appreciated. Regards, Sabrina

Sabrina Carpentier
Service Bioinformatique
Institut Curie - Bat. Trouillet Rossignol (4e étage)
26 rue d'Ulm - 75248 Paris Cedex 5 - FRANCE
[EMAIL PROTECTED]
Tel : +33 1 42 34 65 21
Re: [R] easing out of Excel
Paul Sorenson [EMAIL PROTECTED] 01/19/05 03:18PM wrote:

> I know enough about R to be dangerous and our marketing people have asked
> me to automate some reporting. Data comes from an SQL source and graphs
> and various summaries are currently created manually in Excel. The raw
> information is invoicing records and the reporting is basically summaries
> by customer, region, product line etc. With functions such as aggregate(),
> hist() and pareto() (which someone on this list kindly pointed me at) I
> can produce something roughly equivalent to the current reports. My
> question is, are there any neat R features people here like to use on
> this kind of info, particularly when the output is very visual (report is
> intended for marketing people). Another way of looking at this is: what
> kind of hidden information can I extract with R that the Excel solution
> hasn't touched?

Since you are looking for summaries within groups, you should look at the lattice package and some of the plots that you can produce with it (maybe for each product line you can produce a lattice/trellis graph with each panel representing a region and different colors/symbols within panels to represent different customers). If we had more of an idea of what you are looking for, we could give better suggestions. For example, even the pareto plot mentioned earlier is something the Excel guys haven't thought of or can't easily produce.

> BTW the tool chain I am using goes something like:
> Production (run daily): DB -> SQL/python -> CSV -> R/python -> images -> network
> Presentation: network -> CGI/python -> browser

It looks like you want the reports fully automated and the final result as HTML (to be viewed with a browser). I suggest you look at the R2HTML package and the Sweave function (this lets you write a report in HTML with R code in place of graphs and output; then a quick run through Sweave and you have a final report in HTML ready to be viewed). There are also several tools available for running R through CGI; go to http://www.r-project.org/ and click on R web servers under the Related Projects heading in the left column to get details.

Hope this helps,

Greg Snow, Ph.D.
Statistical Data Center
[EMAIL PROTECTED]
(801) 408-8111
Re: [R] ROracle error
On Thu, 20 Jan 2005, Sabrina Carpentier wrote:

> [quote trimmed: ROracle fails to load with a relocation error on symbol
> ncrov; see above]

It's not an R issue, so please ask your sysadmins for help. But

ldd /bioinfo/local/R/lib/R/library/ROracle/libs/ROracle.so

would be a good start, as I suspect your Oracle client libraries are not being found.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
Re: [R] Subsetting a data frame by a factor, using the level that occurs the most times
Liaw, Andy wrote:

> Just be careful that if there are ties (i.e., more than one level having
> the max) which.max() will randomly pick one of them. That may or may not
> be what's desired.

According to the documentation it picks the first one. Also, that's what Martin Maechler told me, and he wrote the code, so I trust him on that. I figure that if you have to trust someone to be meticulous and precise then a German-speaking Swiss is a good choice.
[R] Valasz: patched
Dear Customer! Our system has received the message you sent to [EMAIL PROTECTED], and one of our colleagues will reply to it shortly. If you wrote to us about an ordering, delivery, or other administrative problem, please resend your message to [EMAIL PROTECTED] so that the colleagues who process orders can help you as soon as possible. Regards: NetPiac Customer Service

--- NetPiac DVD and VHS Online Store http://www.netpiac.hu ---
[EMAIL PROTECTED] | Tel.: 239 4517
1135 Budapest, Szt. László út 60-64. Postal address: 1325 Budapest Pf. 222.
RE: [R] Subsetting a data frame by a factor, using the level that occurs the most times
From: Douglas Bates
> According to the documentation it picks the first one. Also, that's what
> Martin Maechler told me and he wrote the code so I trust him on that.

My apologies! I got it mixed up with max.col, which does the tie-breaking at random.

Andy
[R] Windows Front end-crash error
Dear List: First, many thanks to those who offered assistance while I constructed code for the simulation. I think I now have code that resolves most of the issues I encountered with memory. While the code works perfectly for smallish datasets with small sample sizes, it provokes a Windows-based error with samples of 5,000 and 250 datasets. The error is a dialogue box with the following:

"R for Windows terminal front-end has encountered a problem and needs to close. We are sorry for the inconvenience. If you were in the middle of something, the information you were working on might be lost."

The new code is below. Can anyone suggest whether this error is derived from inefficient code, or whether it derives from a Windows-specific issue that can somehow be resolved, and if so, how?

Thanks,
Harold

library(MASS)
library(nlme)
mu <- c(100,150,200,250)
Sigma <- matrix(c(400,80,80,80, 80,400,80,80, 80,80,400,80, 80,80,80,400), 4, 4)
mu2 <- c(0,0,0)
LE <- 8^2                     # Linking Error
Sigma2 <- diag(LE, 3)
sample.size <- 5000
N <- 100                      # Number of datasets

# Take a single draw from the VL distribution
vl.error <- mvrnorm(n=N, mu2, Sigma2)
intercept1 <- 0
slope1 <- 0
intercept2 <- 0
slope2 <- 0

for(i in 1:N){
  temp <- data.frame(ID=seq(1:sample.size), mvrnorm(n=sample.size, mu, Sigma))
  temp$X5 <- temp$X1
  temp$X6 <- temp$X2 + vl.error[i,1]
  temp$X7 <- temp$X3 + vl.error[i,2]
  temp$X8 <- temp$X4 + vl.error[i,3]
  long <- reshape(temp, idvar="ID",
                  varying=list(c("X1","X2","X3","X4"), c("X5","X6","X7","X8")),
                  v.names=c("score.1","score.2"), direction='long')
  glsrun1 <- gls(score.1~I(time-1), data=long,
                 correlation=corAR1(form=~1|ID), method='ML')
  glsrun2 <- gls(score.2~I(time-1), data=long,
                 correlation=corAR1(form=~1|ID), method='ML')
  intercept1[[i]] <- glsrun1$coefficient[1]
  slope1[[i]] <- glsrun1$coefficient[2]
  intercept2[[i]] <- glsrun2$coefficient[1]
  slope2[[i]] <- glsrun2$coefficient[2]
}

cat("Sample Size =", sample.size, "\n")
cat("Number of Datasets =", N, "\n")
cat("Vertical Linking Error =", LE, "\n")
cat("Original Standard Errors", "\n", "Intercept", "\t", sd(intercept1),
    "\n", "Slope", "\t", "\t", sd(slope1), "\n")
cat("Modified Standard Errors", "\n", "Intercept", "\t", sd(intercept2),
    "\n", "Slope", "\t", "\t", sd(slope2), "\n")
rm(list=ls())
gc()
[R] Cross-validation accuracy in SVM
Hi all - I am trying to tune an SVM model by optimizing the cross-validation accuracy. Maximizing this value doesn't necessarily seem to minimize the number of misclassifications. Can anyone tell me how the cross-validation accuracy is defined? In the output below, for example, cross-validation accuracy is 92.2%, while the number of correctly classified samples is (1476+170)/(1476+170+4) = 99.7% !? Thanks for any help. Regards - Ton

---
Parameters:
   SVM-Type:  C-classification
 SVM-Kernel:  radial
       cost:  8
      gamma:  0.007

Number of Support Vectors:  1015 ( 148 867 )

Number of Classes:  2
Levels: false true

5-fold cross-validation on training data:
Total Accuracy: 92.24242
Single Accuracies: 90 93.3 94.84848 92.72727 90.30303

Contingency Table
            predclasses
origclasses  false true
      false   1476    0
      true       4  170
RE: [R] Cross-validation accuracy in SVM
The 99.7% accuracy you quoted, I take it, is the accuracy on the training set. If so, that number hardly means anything (other than, perhaps, self-fulfilling prophecy). Usually what one would want is for the model to be able to predict data that weren't used to train the model with high accuracy. That's what cross-validation tries to emulate: it gives you an estimate of how well you can expect your model to do on data that the model has not seen.

Andy

From: Ton van Daelen
> [original message trimmed; see above]
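Assuming the model comes from e1071's svm (the thread does not say which package was used), the gap between the two numbers can be reproduced with made-up data: the cross argument produces the CV estimate, while re-predicting the training set gives the optimistic figure.

library(e1071)
set.seed(1)
x <- matrix(rnorm(400), 200, 2)
y <- factor(x[, 1] + x[, 2] + rnorm(200, sd = 0.8) > 0)
m <- svm(x, y, cost = 8, gamma = 0.007, cross = 5)
summary(m)              # prints Total/Single cross-validation accuracies
mean(fitted(m) == y)    # training-set accuracy, typically higher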
[R] Straight-line fitting with errors in both coordinates
Hi All, I want to fit a straight line to a group of two-dimensional data points with errors in both the x and y coordinates. I found there is an algorithm provided in Numerical Recipes in C: http://www.library.cornell.edu/nr/bookcpdf/c15-3.pdf I'm wondering if there is a similar function implemented in R. And how can I change the objective function, for example, from the sum of squared errors to the sum of absolute errors? Regards, Chongle
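Pending a pointer to a packaged routine, orthogonal (total least squares) regression for equal error variances in x and y can be sketched from first principles; this is one possible approach, not the Numerical Recipes algorithm:

## Minimize summed squared perpendicular distances via the
## first principal axis of the (x, y) cloud.
tls.line <- function(x, y) {
    v <- eigen(cov(cbind(x, y)))$vectors[, 1]   # major axis direction
    slope <- v[2] / v[1]
    c(intercept = mean(y) - slope * mean(x), slope = slope)
}

set.seed(1)
t <- 1:20
x <- t + rnorm(20, sd = 0.5)          # noise in x
y <- 2 + 3*t + rnorm(20, sd = 0.5)    # noise in y
tls.line(x, y)                        # close to intercept 2, slope 3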
RE: [R] easing out of Excel
Definitely check out the lattice package. One other option is to use Sweave/LaTeX mixed with RODBC. This can be used to produce PDFs for easy distribution as well. I would also consider operating this in batch mode; the R/Sweave/LaTeX combination works very well this way.

Shawn Way, PE
Engineering Manager

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Greg Snow
Sent: Thursday, January 20, 2005 10:52 AM
To: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
Subject: Re: [R] easing out of Excel
> [quoted message trimmed; see above]
Re: [R] Cross-validation accuracy in SVM
Ton van Daelen wrote:

> I am trying to tune an SVM model by optimizing the cross-validation
> accuracy. [...] In the output below, for example, cross-validation
> accuracy is 92.2%, while the number of correctly classified samples is
> (1476+170)/(1476+170+4) = 99.7% !?

Percent correctly classified is an improper scoring rule. The percent is maximized when the predicted values are bogus. In addition, one can add a very important predictor and have the % actually decrease.

Frank Harrell

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
[R] Successful installation of R 2.0.1 on SUSE 9.1
Hi, We managed to compile R 2.0.1 on 64-bit SuSE Linux 9.1 on an HP ProLiant setup fairly uneventfully by following the instructions in the R installation guide. We did encounter a minor hiccup in setting up X11, a problem which we note has been raised 4 or 5 times previously, but this was overcome thanks to a recent post by Peter Dalgaard on SUSE 9.1 and R (https://stat.ethz.ch/pipermail/r-help/2005-January/062397.html), as well as previous comments on the mailing list. One clarification on that post may be helpful: there are only 3 additional development packages required for successful X11 installation:

XFree86-devel-4.3.99.902-30
fontconfig-devel
freetype2-devel

These were not available in YaST (SuSE 9.1) but were located in http://ftp.suse.com/pub/suse/ Once again, thanks to the assorted R gurus and wizards for making this mailing list such a great resource. Regards, Min-Han Tan
Re: [R] how to call R in delphi?
Dieter Menne [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> To call R from Delphi, you may try
> http://www.menne-biomed.de/download/RDComDelphi.zip.

I downloaded this file and tried to compile the RDCom project using Delphi 5 and Delphi 7, but I get this message from both compilers:

[Fatal Error] STATCONNECTORCLNTLib_TLB.pas(406): Could not create output file 'c:\program files\borland\delphi7\Twain\d5\dcu\STATCONNECTORCLNTLib_TLB.dcu'

The \d5\dcu in the path in this error message was a bit curious, so I looked at Project | Options | Directories/Conditionals:

Unit output directory: $(DELPHI)\Twain\d5\dcu
Search path: $(DELPHI)\Compon;C:\D2Pr\CascCont\COMPON;$(DELPHI)\Source\Toolsapi

On my vanilla Delphi 5 and Delphi 7 installations, all of the directories for the Unit output directory and Search path are invalid for the RDCom.dpr project. If I delete the Unit output directory, I then get 17 compilation errors, all like this:

[Error] RCom.pas(115): Undeclared identifier: 'VarType'
[Error] RCom.pas(141): Undeclared identifier: 'VarArrayDimCount'
[Error] RCom.pas(123): Undeclared identifier: 'VarArrayHighBound'
. . .

All of the above seems to happen whether or not I install STATCONNECTORCLNTLib_TLB.pas and STATCONNECTORSRVLib_TLB.pas as components (i.e., Component | Install Component | browse to .pas file | Open | OK | Compile). Am I supposed to do this at some point? Can you give me any clues how to make this work? Something seems to be missing.

> Example program showing use of R from Delphi. Connecting to R via COM
> using Neuwirth's StatConnectorSrvLib. Uses RCom.pas, which is a simple
> Delphi wrapper for passing commands, integer and double arrays. See
> http://cran.r-project.org/contrib/extra/dcom
> By: [EMAIL PROTECTED]

I'm not sure I understand this either. I went to http://cran.r-project.org/contrib/extra/dcom and read this documentation: http://cran.r-project.org/contrib/extra/dcom/RSrv135.html I downloaded and installed the R (D)COM server (and rebooted): http://cran.r-project.org/contrib/extra/dcom/RSrv135.exe

So, how can I call R from Delphi using R (D)COM? Something seems to be missing. Duncan Murdoch's suggestion about direct calls to R.dll looks interesting, but a complete working example would be nice.

Thanks for any help with this.

efg

Earl F. Glynn
Scientific Programmer
Stowers Institute for Medical Research
[R] Barplot at the axes of another plot
Hi, I want to draw a barplot on the axes of another plot. I saw this done with two histograms and a scatterplot in an R graphics tutorial somewhere on the net; it seemed to be a 2d histogram. Can someone figure out what I mean and give me a hint on how to create such a graphic? Thank you very much, Robin
Re: [R] Barplot at the axes of another plot
On Thu, 2005-01-20 at 23:53 +0100, Robin Gruna wrote:

> Hi, I want to draw a barplot at the axes of another plot. [...]

See the examples in ?layout, which has the scatterplot with the marginal histograms.

HTH,
Marc Schwartz
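Condensed, the flavor of that ?layout example looks like this (a sketch, with data simulated for illustration):

set.seed(1)
x <- rnorm(100); y <- rnorm(100)
xhist <- hist(x, plot = FALSE)
yhist <- hist(y, plot = FALSE)
## Region 1 = scatterplot, 2 = top histogram, 3 = right histogram.
layout(matrix(c(2, 0, 1, 3), 2, 2, byrow = TRUE),
       widths = c(3, 1), heights = c(1, 3))
par(mar = c(4, 4, 1, 1)); plot(x, y)
par(mar = c(0, 4, 1, 1)); barplot(xhist$counts, space = 0)
par(mar = c(4, 0, 1, 1)); barplot(yhist$counts, space = 0, horiz = TRUE)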
Re: [R] Windows Front end-crash error
On Thu, 20 Jan 2005 13:16:13 -0500, Doran, Harold [EMAIL PROTECTED] wrote:

> [quote trimmed: the simulation code crashes the Windows front end with
> samples of 5,000; see above]

It looks to me like an nlme bug. I get the error in R-patched (built Jan 15). DrMingw shows this at the time of the crash:

Rgui.exe caused an Access Violation at location 01c8ae4b in module nlme.dll Reading from location 7f1e8f18.

Registers:
eax=7f210020 ebx= ecx=01368c50 edx=b1df esi=4e20 edi=01108918
eip=01c8ae4b esp=0022d1d0 ebp=0022d208 iopl=0 nv up ei ng nz ac po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs= efl=0296

Call stack:
01C8AE4B nlme.dll:01C8AE4B gls_loglik
004E5E77 R.dll:004E5E77 do_dotCode
...

I changed the loop to print some status lines, and it failed after the first time it printed "gls2...":

library(MASS)
library(nlme)
set.seed(123)
mu <- c(100,150,200,250)
Sigma <- matrix(c(400,80,80,80, 80,400,80,80, 80,80,400,80, 80,80,80,400), 4, 4)
mu2 <- c(0,0,0)
LE <- 8^2                     # Linking Error
Sigma2 <- diag(LE, 3)
sample.size <- 5000
N <- 100                      # Number of datasets

# Take a single draw from the VL distribution
vl.error <- mvrnorm(n=N, mu2, Sigma2)
intercept1 <- 0
slope1 <- 0
intercept2 <- 0
slope2 <- 0

for(i in 1:N){
  print(i); flush.console()
  temp <- data.frame(ID=seq(1:sample.size), mvrnorm(n=sample.size, mu, Sigma))
  temp$X5 <- temp$X1
  temp$X6 <- temp$X2 + vl.error[i,1]
  temp$X7 <- temp$X3 + vl.error[i,2]
  temp$X8 <- temp$X4 + vl.error[i,3]
  print("reshape..."); flush.console()
  long <- reshape(temp, idvar="ID",
                  varying=list(c("X1","X2","X3","X4"), c("X5","X6","X7","X8")),
                  v.names=c("score.1","score.2"), direction='long')
  print("gls1..."); flush.console()
  glsrun1 <- gls(score.1~I(time-1), data=long,
                 correlation=corAR1(form=~1|ID), method='ML')
  print("gls2..."); flush.console()
  glsrun2 <- gls(score.2~I(time-1), data=long,
                 correlation=corAR1(form=~1|ID), method='ML')
  intercept1[[i]] <- glsrun1$coefficient[1]
  slope1[[i]] <- glsrun1$coefficient[2]
  intercept2[[i]] <- glsrun2$coefficient[1]
  slope2[[i]] <- glsrun2$coefficient[2]
}

Hopefully this will let someone more familiar with nlme track it down.

Duncan Murdoch
[R] Plots with same x-axes
Hi, I want to plot two graphics on top of each other with layout(), a scatterplot and a barplot. The problem is the different x-axis ranges of the plots. How can I align the two x-axes? Thank you very much, Robin

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Plotting points from two vectors onto the same graph
Hello, I have three vectors defined as follows:

x <- c(10,20,30,40,50)
y1 <- c(154,143,147,140,148)
y2 <- c(178,178,171,188,180)

I would like to plot y1 vs x and y2 vs x on the same graph. How might I do this? I have looked through a help file on plots but could not find the answer to plotting multiple plots on the same graph. Thank you for your help, K

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Need help to transform data into co-occurrence matrix
Dear R experts, I have data in the following format (from some kind of card sorting process):

ID  Category  Card numbers
1   1         1,2,5
1   2         3,4
2   1         1,2
2   2         3
2   3         4,5

I want to transform this data into two co-occurrence matrices (one for each ID):

For ID 1:
  1 2 3 4 5
1 1 1 0 0 1
2 1 1 0 0 1
3 0 0 1 1 0
4 0 0 1 1 0
5 1 1 0 0 1

For ID 2:
  1 2 3 4 5
1 1 1 0 0 0
2 1 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 1
5 0 0 0 1 1

The columns and rows represent the card numbers. A 0 means the two card numbers are not in the same category, and a 1 means they are. Is there any way I can do this in R? I would really appreciate your help. Judie, Tie

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Plotting points from two vectors onto the same graph
?points has this example:

plot(-4:4, -4:4, type = "n")   # setting up coord. system
points(rnorm(200), rnorm(200), col = "red")
points(rnorm(100)/2, rnorm(100)/2, col = "blue", cex = 1.5)

In general you might want to check out the keyword section of the help, in particular the Graphics section, which has an entry called aplot for ways to add to existing plots. Tom

-Original Message-
From: K Fernandes [mailto:[EMAIL PROTECTED]
Sent: Friday, 21 January 2005 9:51 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Plotting points from two vectors onto the same graph

Hello, I have three vectors defined as follows:

x <- c(10,20,30,40,50)
y1 <- c(154,143,147,140,148)
y2 <- c(178,178,171,188,180)

I would like to plot y1 vs x and y2 vs x on the same graph. How might I do this? I have looked through a help file on plots but could not find the answer to plotting multiple plots on the same graph. Thank you for your help, K

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Plotting points from two vectors onto the same graph
On Thu, 2005-01-20 at 20:51 -0500, K Fernandes wrote: Hello, I have three vectors defined as follows: x <- c(10,20,30,40,50); y1 <- c(154,143,147,140,148); y2 <- c(178,178,171,188,180). I would like to plot y1 vs x and y2 vs x on the same graph. How might I do this? I have looked through a help file on plots but could not find the answer to plotting multiple plots on the same graph. Thank you for your help, K

First, when posting a new query, please do not do so by replying to an existing post. Your post is now listed in the archive linked to an entirely different thread.

The easiest way to do this is to use the matplot() function:

x <- c(10,20,30,40,50)
y1 <- c(154,143,147,140,148)
y2 <- c(178,178,171,188,180)

# now do the plot. cbind() the two sets of y values
# and the x values will be cycled for each
matplot(x, cbind(y1, y2), col = c("red", "blue"))

See ?matplot for more information. HTH, Marc Schwartz

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
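[An alternative to matplot(), for readers who prefer building the plot up piece by piece: draw the first series with plot() and add the second with points(). The one catch is that ylim must span both series, or points from the second one may fall outside the plotting region. A minimal sketch with the data from the question:]

x  <- c(10, 20, 30, 40, 50)
y1 <- c(154, 143, 147, 140, 148)
y2 <- c(178, 178, 171, 188, 180)

# ylim must cover both series
plot(x, y1, col = "red", ylim = range(y1, y2), ylab = "y")
points(x, y2, col = "blue")
legend("topleft", legend = c("y1", "y2"), col = c("red", "blue"), pch = 1)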
Re: [R] Plots with same x-axes
On Fri, 2005-01-21 at 01:48 +0100, Robin Gruna wrote: Hi, I want to plot two graphics on top of each other with layout(), a scatterplot and a barplot. The problem is the different x-axis ranges of the plots. How can I align the two x-axes? Thank you very much, Robin

Robin, Here is an example:

# Set the layout, smaller plot on top for the barplot region
nf <- layout(matrix(c(2, 1), ncol = 1), heights = c(1, 3))
layout.show(nf)

# Create the data
x <- rnorm(50)
y <- rnorm(50)

# Set the margins for the scatterplot so that they will match
# with the barplot settings
par(mar = c(3, 3, 0, 3))

# now do the scatterplot
plot(x, y)

# Get the hist data for x
xhist <- hist(x, plot = FALSE)

# Set the margins for the barplot to use more of the plot region
par(mar = c(0, 3, 1, 3))

# now plot that barplot on top
# Set the 'space' argument to 0 so that the bars are next to each other
barplot(xhist$counts, axes = FALSE, space = 0)

HTH, Marc Schwartz

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] easing out of Excel
Thanks for the responses to this question. I fully realise it is a rather open question, and the broad pointers are the kind of thing I am looking for. I will look into the lattice package and layout.

Regarding the HTML output, the current tool-chain assets that I have have been refactored over time and are almost totally driven by config files, so they suit my purposes very well. I will look into other possibilities at a later date.

For those looking for a more rigorous specification of the problem, you are well justified in this. I was deliberately fuzzy, since managers just want stuff and I thought casting a wide net would pay off. The problem is to summarise information which is nothing more than sales data. The kinds of columns I am dealing with look like: date, customer, invoice_no, product, amount, sales_region, etc.

Managers want to know things like:
- which products are doing well
- which regions are doing well
- who are good customers
- etc

To me these are simple aggregates and sorts, with visual presentations to match. I figure that with a bit of effort, R can extract considerably more useful information from the data. To be honest I am just evolving it as I go, using an existing spreadsheet as a basis. I try something and if it is useful then great; if not, I put it down to learning. cheers

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
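[For totals like "which products are doing well", base R's tapply() covers the simple aggregate-and-sort case. A minimal sketch under assumed names: the sales data frame and its columns are invented here to match the description above.]

# toy stand-in for the real sales table
sales <- data.frame(
  product      = c("A", "B", "A", "C", "B", "A"),
  sales_region = c("N", "N", "S", "S", "S", "N"),
  amount       = c(100, 250, 80, 40, 300, 120)
)

# total sales by product, best first
sort(tapply(sales$amount, sales$product, sum), decreasing = TRUE)

# total sales by region
tapply(sales$amount, sales$sales_region, sum)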
RE: [R] Need help to transform data into co-occurrence matrix
From: Judie Z

Dear R experts, I have data in the following format (from some kind of card sorting process):

ID  Category  Card numbers
1   1         1,2,5
1   2         3,4
2   1         1,2
2   2         3
2   3         4,5

I want to transform this data into two co-occurrence matrices (one for each ID):

For ID 1:
  1 2 3 4 5
1 1 1 0 0 1
2 1 1 0 0 1
3 0 0 1 1 0
4 0 0 1 1 0
5 1 1 0 0 1

For ID 2:
  1 2 3 4 5
1 1 1 0 0 0
2 1 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 1
5 0 0 0 1 1

The columns and rows represent the card numbers. A 0 means the two card numbers are not in the same category, and a 1 means they are. Is there any way I can do this in R? I would really appreciate your help.

It depends on how the data are structured in R. Here's an example (I'm sure others can come up with more clever/efficient ways):

cardlist <- list(c(1,2,5), c(3,4))
indicator <- function(i, n = max(i)) { x <- rep(0, n); x[i] <- 1; x }
matrix(rowSums(sapply(cardlist, function(i) crossprod(t(indicator(i, 5))))), nrow = 5)

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    0    0    1
[2,]    1    1    0    0    1
[3,]    0    0    1    1    0
[4,]    0    0    1    1    0
[5,]    1    1    0    0    1

which is the matrix for ID 1 in your example. HTH, Andy

Judie, Tie

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
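[Going one step further, the per-ID card lists need not be typed by hand. A sketch, assuming the raw data sit in a data frame called cards with an ID column and a comma-separated string column Cards; these names, and the helper cooc(), are illustrative only.]

cards <- data.frame(
  ID       = c(1, 1, 2, 2, 2),
  Category = c(1, 2, 1, 2, 3),
  Cards    = c("1,2,5", "3,4", "1,2", "3", "4,5"),
  stringsAsFactors = FALSE   # keep the card strings as character, not factor
)

cooc <- function(strings, n = 5) {
  m <- matrix(0, n, n)
  for (s in strings) {
    idx <- as.numeric(strsplit(s, ",")[[1]])
    m[idx, idx] <- 1   # mark every pair of cards in this category, incl. the diagonal
  }
  m
}

# one co-occurrence matrix per ID
lapply(split(cards$Cards, cards$ID), cooc)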
Re: [R] Need help to transform data into co-occurrence matrix
Judie, You may want to see if the MedlineR library, which has a program for constructing co-occurrence matrices, will work for you. The program can be found at: http://dbsr.duke.edu/pub/MedlineR/ Have fun with it, Tim Liao Professor of Sociology & Statistics University of Illinois Urbana, IL 61801

Original message
Date: Thu, 20 Jan 2005 18:19:19 -0800 (PST)
From: Judie Z [EMAIL PROTECTED]
Subject: [R] Need help to transform data into co-occurrence matrix
To: r-help@stat.math.ethz.ch

Dear R experts, I have data in the following format (from some kind of card sorting process):

ID  Category  Card numbers
1   1         1,2,5
1   2         3,4
2   1         1,2
2   2         3
2   3         4,5

I want to transform this data into two co-occurrence matrices (one for each ID):

For ID 1:
  1 2 3 4 5
1 1 1 0 0 1
2 1 1 0 0 1
3 0 0 1 1 0
4 0 0 1 1 0
5 1 1 0 0 1

For ID 2:
  1 2 3 4 5
1 1 1 0 0 0
2 1 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 1
5 0 0 0 1 1

The columns and rows represent the card numbers. A 0 means the two card numbers are not in the same category, and a 1 means they are. Is there any way I can do this in R? I would really appreciate your help. Judie, Tie

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] easing out of Excel
I hesitate to add this comment, since it either completely confuses people or they take to it very quickly. The data that you are using is mostly categorical. I expect that tables will have been used in the past, and that to a certain extent the graphics are supposed to help with getting a quick understanding of the data. There is a package called vcd (Visualizing Categorical Data) which is useful for analysing this type of data. I like the use of the mosaicplot, and in particular the shade parameter (which is based on standardized residuals). If set up properly it can be used to very quickly identify sales regions that are doing significantly better than they were last year, or customers who have significantly reduced purchases. Basically, if you can produce a table that would give this information, then a shaded mosaicplot can efficiently highlight the significant parts of the table. They take a little bit of getting used to at first, but if you need to analyse this type of data they take a lot of the guess work out of making commentary on the data. How useful they are depends upon the users, who as I have said seem to be polarised in their reactions to the output. Tom

-Original Message-
From: Paul Sorenson [mailto:[EMAIL PROTECTED]
Sent: Friday, 21 January 2005 11:33 AM
To: r-help@stat.math.ethz.ch
Subject: RE: [R] easing out of Excel

Thanks for the responses to this question. I fully realise it is a rather open question, and the broad pointers are the kind of thing I am looking for. I will look into the lattice package and layout.

Regarding the HTML output, the current tool-chain assets that I have have been refactored over time and are almost totally driven by config files, so they suit my purposes very well. I will look into other possibilities at a later date.

For those looking for a more rigorous specification of the problem, you are well justified in this. I was deliberately fuzzy, since managers just want stuff and I thought casting a wide net would pay off. The problem is to summarise information which is nothing more than sales data. The kinds of columns I am dealing with look like: date, customer, invoice_no, product, amount, sales_region, etc.

Managers want to know things like:
- which products are doing well
- which regions are doing well
- who are good customers
- etc

To me these are simple aggregates and sorts, with visual presentations to match. I figure that with a bit of effort, R can extract considerably more useful information from the data. To be honest I am just evolving it as I go, using an existing spreadsheet as a basis. I try something and if it is useful then great; if not, I put it down to learning. cheers

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
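[A concrete illustration of the shading idea: base R's mosaicplot() also accepts a shade argument, colouring cells by standardized Pearson residuals; vcd's mosaic() offers finer control. The sales table below is invented purely for illustration.]

# toy contingency table: units sold by region and year (invented numbers)
sales.tab <- matrix(c(120, 95, 60, 140, 70, 130), nrow = 3,
                    dimnames = list(region = c("North", "South", "West"),
                                    year   = c("2003", "2004")))

# shade = TRUE colours cells by standardized residuals, flagging
# regions doing significantly better or worse than expected
mosaicplot(sales.tab, shade = TRUE, main = "Sales by region and year")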
[R] dim vs length for vectors
Hi all, I'm not sure if this is a feature or a bug (and I did read the FAQ and the posting guide, but am still not sure). Some of my students have been complaining and I thought I just might ask: Let K be a vector of length k. If one types dim(K), you get NULL rather than [1] k. Is this logical? Here's the way I explain it (and maybe someone can provide a more accurate explanation of what's going on): R has several types of scalar (atomic) values, the most common of which are numeric, integer, logical, and character values. Arrays are data structures which hold only one type of atomic value. Arrays can be one-dimensional (vectors), two-dimensional (matrices), or n-dimensional. (We generally use arrays of n-1 dimensions to populate n-dimensional arrays -- thus, we generally use vectors to populate matrices, and matrices to populate 3-dimensional arrays, but could use any array of dimension n-1 to populate an n-dimensional array.) It logically follows that when one does dim() on a vector, one should *not* get NULL, but should get the length of the vector (which one *could* obtain by doing length(), but I think this is less logical). I think that R should save length() for lists that have objects of different dimension and type. Does this make sense? Or is there a better explanation? Thanks in advance! Yours, Olivia Lau __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dim vs length for vectors
Olivia Lau olau at fas.harvard.edu writes:
:
: Hi all,
:
: I'm not sure if this is a feature or a bug (and I did read the FAQ and the posting guide, but am still not sure). Some of my students have been complaining and I thought I just might ask: Let K be a vector of length k. If one types dim(K), you get NULL rather than [1] k. Is this logical?
:
: Here's the way I explain it (and maybe someone can provide a more accurate explanation of what's going on): R has several types of scalar (atomic) values, the most common of which are numeric, integer, logical, and character values. Arrays are data structures which hold only one type of atomic value. Arrays can be one-dimensional (vectors), two-dimensional (matrices), or n-dimensional.
:
: (We generally use arrays of n-1 dimensions to populate n-dimensional arrays -- thus, we generally use vectors to populate matrices, and matrices to populate 3-dimensional arrays, but could use any array of dimension n-1 to populate an n-dimensional array.)
:
: It logically follows that when one does dim() on a vector, one should *not* get NULL, but should get the length of the vector (which one *could* obtain by doing length(), but I think this is less logical). I think that R should save length() for lists that have objects of different dimension and type.
:

In R, vectors are not arrays:

R> v <- 1:4
R> dim(v)
NULL
R> is.array(v)
[1] FALSE

R> a <- array(1:4)
R> dim(a)
[1] 4
R> is.array(a)
[1] TRUE

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dim vs length for vectors
I think the more intuitive way to think of it is that dim works only for matrices (an array being a 1-column matrix), and vectors are not matrices.

x <- 1:5
class(x)           # integer
dim(x) <- 5
class(x)           # array
dim(x) <- c(5, 1)
class(x)           # matrix
dim(x) <- c(1, 5)
class(x)           # matrix

On Fri, 21 Jan 2005 05:35:11 +0000 (UTC), Gabor Grothendieck [EMAIL PROTECTED] wrote:

Olivia Lau olau at fas.harvard.edu writes:
:
: Hi all,
:
: I'm not sure if this is a feature or a bug (and I did read the FAQ and the posting guide, but am still not sure). Some of my students have been complaining and I thought I just might ask: Let K be a vector of length k. If one types dim(K), you get NULL rather than [1] k. Is this logical?
:
: Here's the way I explain it (and maybe someone can provide a more accurate explanation of what's going on): R has several types of scalar (atomic) values, the most common of which are numeric, integer, logical, and character values. Arrays are data structures which hold only one type of atomic value. Arrays can be one-dimensional (vectors), two-dimensional (matrices), or n-dimensional.
:
: (We generally use arrays of n-1 dimensions to populate n-dimensional arrays -- thus, we generally use vectors to populate matrices, and matrices to populate 3-dimensional arrays, but could use any array of dimension n-1 to populate an n-dimensional array.)
:
: It logically follows that when one does dim() on a vector, one should *not* get NULL, but should get the length of the vector (which one *could* obtain by doing length(), but I think this is less logical). I think that R should save length() for lists that have objects of different dimension and type.
:

In R, vectors are not arrays:

R> v <- 1:4
R> dim(v)
NULL
R> is.array(v)
[1] FALSE

R> a <- array(1:4)
R> dim(a)
[1] 4
R> is.array(a)
[1] TRUE

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Cholesky Decomposition
Can we do Cholesky decomposition in R for any matrix?

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
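[For reference: base R's chol() performs the decomposition, but only for symmetric positive-definite matrices, not for arbitrary ones. A minimal sketch:]

A <- matrix(c(4, 2, 2, 3), nrow = 2)   # symmetric positive definite
U <- chol(A)                           # upper-triangular factor
all.equal(crossprod(U), A)             # TRUE, since crossprod(U) is t(U) %*% U

# chol() stops with an error for matrices that are not positive definite:
# chol(matrix(c(1, 2, 2, 1), 2))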
Re: [R] dim vs length for vectors
More generally, anything that has a dim attribute is an array, including 1d, 2d, and 3d structures with dim attributes. Matrices have a dim attribute, so matrices are arrays, and is.array(m) will be TRUE if m is a matrix.

miguel manese jjonphl at gmail.com writes:
:
: I think the more intuitive way to think of it is that dim works only for matrices (an array being a 1-column matrix), and vectors are not matrices.
:
: x <- 1:5
: class(x)           # integer
: dim(x) <- 5
: class(x)           # array
: dim(x) <- c(5, 1)
: class(x)           # matrix
: dim(x) <- c(1, 5)
: class(x)           # matrix
:
: On Fri, 21 Jan 2005 05:35:11 +0000 (UTC), Gabor Grothendieck
: ggrothendieck at myway.com wrote:
: Olivia Lau olau at fas.harvard.edu writes:
: :
: : Hi all,
: :
: : I'm not sure if this is a feature or a bug (and I did read the FAQ and the posting guide, but am still not sure). Some of my students have been complaining and I thought I just might ask: Let K be a vector of length k. If one types dim(K), you get NULL rather than [1] k. Is this logical?
: :
: : Here's the way I explain it (and maybe someone can provide a more accurate explanation of what's going on): R has several types of scalar (atomic) values, the most common of which are numeric, integer, logical, and character values. Arrays are data structures which hold only one type of atomic value. Arrays can be one-dimensional (vectors), two-dimensional (matrices), or n-dimensional.
: :
: : (We generally use arrays of n-1 dimensions to populate n-dimensional arrays -- thus, we generally use vectors to populate matrices, and matrices to populate 3-dimensional arrays, but could use any array of dimension n-1 to populate an n-dimensional array.)
: :
: : It logically follows that when one does dim() on a vector, one should *not* get NULL, but should get the length of the vector (which one *could* obtain by doing length(), but I think this is less logical). I think that R should save length() for lists that have objects of different dimension and type.
: :
: In R, vectors are not arrays:
:
: R> v <- 1:4
: R> dim(v)
: NULL
: R> is.array(v)
: [1] FALSE
:
: R> a <- array(1:4)
: R> dim(a)
: [1] 4
: R> is.array(a)
: [1] TRUE

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
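[To make the dim-attribute point concrete, a small sketch one can paste into R: an object gains or loses array status purely through its dim attribute.]

m <- matrix(1:6, nrow = 2)
is.array(m)        # TRUE: matrices carry a dim attribute, so they are arrays
dim(m)             # [1] 2 3

v <- 1:6
is.array(v)        # FALSE: plain vectors have no dim attribute
dim(v) <- c(2, 3)  # adding one turns v into a 2 x 3 matrix, hence an array
is.array(v)        # TRUE
dim(v) <- NULL     # removing it makes v a plain vector again
is.array(v)        # FALSE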