[R] Importing data
Hi, I'm trying to import categorical data from SPSS to R using the script:

xxx <- spss.get("xxx.por", use.value.labels=TRUE)

but unfortunately I am getting the error message 'error reading portable-file dictionary'. I have successfully imported data in the past. What could be the problem with this data? Thanks, Simo

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] run setwd at the launch of R
Dear all, my R files (and the .csv files as well) are saved somewhere pretty deep down my hard disk. I therefore have to change the working directory every time I run R (I run it on a PowerPC Mac), which is disgusting. Using the setwd command at the beginning of an R script doesn't really help, because I have to find that file by hand first. I am looking for a possibility to run setwd during the launch process of R, or straight after it. Any suggestions? I would be very glad about good ideas or help! Thanks in advance, Matthias
Re: [R] Error .. missing value where TRUE/FALSE needed
When the error occurs, valueDiff is NA:

Error in if ((seedCount <= seedNumber) && (valueDiff < sup)) { :
  missing value where TRUE/FALSE needed
valueDiff
[1] NA

Look at your loop: you are going through 100 iterations, so on the last one you try to access fcsPar[k+1], which is the 101st entry, and that is NA. Your program has a bug in it.

On Jan 6, 2008 1:22 AM, Nicholas Crosbie [EMAIL PROTECTED] wrote:

Can anyone explain the following error:

Error in if ((seedCount <= seedNumber) && (valueDiff < sup)) { :
  missing value where TRUE/FALSE needed

which I get upon running this script:

seedNumber <- 10
seeds <- array(dim = seedNumber)
seedCount <- 1
maxValue <- 100
sup <- maxValue / 2
fcsPar <- array(as.integer(rnorm(100, 50, 10)))
while (seedCount <= seedNumber) {
  for (k in 1:100) {
    valueDiff <- abs(fcsPar[k] - fcsPar[k+1])
    if ((seedCount <= seedNumber) && (valueDiff < sup)) {  # error here
      seeds[seedCount] <- fcsPar[k]
      seedCount <- seedCount + 1
    }
  }
  sup <- sup / 2
}

many thanks.

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem you are trying to solve?
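A minimal corrected sketch of the poster's loop. The comparison operators were stripped in transit, so `<=` and `<` below are reconstructions of the assumed intent; the substantive fix is stopping the inner loop one short of the vector's end so `fcsPar[k + 1]` never indexes past it:

```r
set.seed(1)
seedNumber <- 10
seeds <- array(dim = seedNumber)
seedCount <- 1
maxValue <- 100
sup <- maxValue / 2
fcsPar <- array(as.integer(rnorm(100, 50, 10)))

while (seedCount <= seedNumber) {
  for (k in 1:(length(fcsPar) - 1)) {   # was 1:100; fcsPar[k + 1] overran the vector
    valueDiff <- abs(fcsPar[k] - fcsPar[k + 1])
    if ((seedCount <= seedNumber) && (valueDiff < sup)) {
      seeds[seedCount] <- fcsPar[k]
      seedCount <- seedCount + 1
    }
  }
  sup <- sup / 2
}
seeds
```

With the index bounded, valueDiff is never NA and the if() condition always evaluates to TRUE or FALSE.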
Re: [R] run setwd at the launch of R
It may be disgusting, but I'm not sure how you expect R to know where to start up. On my Mac, I keep all my scripts in a per-project working directory, and therefore type:

cd ~/Documents/ataxia

If you have multiple nested directories, why not create a directory alias (soft link) so it is easy to cd to? Or move the relevant folders to a better place? Alternatively, use the Mac OS X GUI, which has a preference setting for the initial working directory.

Mark

On 06/01/2008, bunny, lautloscrew.com [EMAIL PROTECTED] wrote: snip

-- Dr. Mark Wardle, Specialist registrar, Neurology, Cardiff, UK
[R] run setwd at the launch of R
Thanks folks for all the help. I just missed the part about where to set the initial starting directory. I'll try Rprofile. Thanks so much.
[R] how to use R for Beta Negative Binomial
I think I should have posted this question here as well; I am posting it here since it is R related. Please see below. I originally posted this to sci.stat.math.

Nasser Abbasi [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

I think the R documentation is a bit hard for me to sort out at this time. I was wondering if someone who knows R better than I do could please tell me the command syntax to find the mean of the Beta Negative Binomial distribution for the following parameters: n=3, alpha=0.5, beta=3. Here is the documentation page for R which mentions this distribution: http://rweb.stat.umn.edu/R/library/SuppDists/html/ghyper.html

Using Mathematica, I get (-18) for the mean and -150 for the variance, and wanted to verify this with R, since there is a negative sign which is confusing me. Mathematica says the formula for the mean is n*beta/(alpha-1), and that is why the negative sign comes up. alpha, beta and n can be any positive real numbers. If someone can just show me the R command for this, that will help. I have the R package SuppDists installed; I am just not sure how to use it for this distribution.

thanks, Nasser

I thought I should show what I did. This is R 2.6.1:

tghyper(a=-1, k=-1, N=5)  # I think this makes it the Beta Negative Binomial

and now I used the summary command, right?

sghyper(3, .5, 3)

But I do not think this is correct. I tried a few other permutations; it is hard for me to see how to set the parameters correctly for this distribution.

thanks, Nasser
Re: [R] Behavior of ordered factors in glm
David Winsemius wrote:

I have a variable which is roughly age categories in decades. In the original data, it came in coded:

str(xxx)
'data.frame': 58271 obs. of 29 variables:
 $ issuecat : Factor w/ 5 levels "0 - 39","40 - 49",..: 1 1 1 1...

snip

I then defined issuecat as ordered:

xxx$issuecat <- as.ordered(xxx$issuecat)

When I include issuecat in a glm model, the result makes me think I have asked R for a linear+quadratic+cubic+quartic polynomial fit. The results are not terribly surprising under that interpretation, but I was hoping for only a linear term (which I was taught to call a test of trend), at least as a starting point.

age.mdl <- glm(actual ~ issuecat, data = xxx, family = poisson)
summary(age.mdl)

Call: glm(formula = actual ~ issuecat, family = poisson, data = xxx)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-0.3190 -0.2262 -0.1649 -0.1221  5.4776

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.31321    0.04865 -88.665  < 2e-16 ***
issuecat.L   2.12717    0.13328  15.960  < 2e-16 ***
issuecat.Q  -0.06568    0.11842  -0.555    0.579
issuecat.C   0.08838    0.09737   0.908    0.364
issuecat^4  -0.02701    0.07786  -0.347    0.729

This also means my advice to another poster this morning may have been misleading. I have tried puzzling out what I don't understand by looking at indices or searching in MASSv2, the Blue Book, Thompson's application of R to Agresti's text, and the FAQ, so far without success. What I would like to achieve is having the lowest age category be a reference category (with the intercept being the log-rate) and each succeeding age category be incremented by 1. The linear estimate would be the log(risk-ratio) for increasing ages. I don't want the higher-order polynomial estimates. Am I hoping for too much?

David,

What you are seeing is the impact of using ordered factors versus unordered factors. Reading ?options, you will note:

contrasts: the default contrasts used in model fitting such as with aov or lm. A character vector of length two, the first giving the function to be used with unordered factors and the second the function to be used with ordered factors. By default the elements are named c("unordered", "ordered"), but the names are unused.

The default in R (which is not the same as in S-PLUS) is:

options("contrasts")
$contrasts
        unordered           ordered
"contr.treatment"      "contr.poly"

Thus, when using ordered factors, the default handling is contr.poly. Reading ?contrasts, you will note: contr.poly returns contrasts based on orthogonal polynomials. To show a quick and dirty example from ?glm:

counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)

# First, the default with outcome as an unordered factor:
summary(glm(counts ~ outcome, family = poisson()))

Call: glm(formula = counts ~ outcome, family = poisson())

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-0.9666 -0.6712 -0.1696  0.8472  1.0494

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   3.0445     0.1260  24.165  < 2e-16 ***
outcome2     -0.4543     0.2022  -2.247   0.0246 *
outcome3     -0.2930     0.1927  -1.520   0.1285
...

# Now using outcome as an ordered factor:
summary(glm(counts ~ as.ordered(outcome), family = poisson()))

Call: glm(formula = counts ~ as.ordered(outcome), family = poisson())

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-0.9666 -0.6712 -0.1696  0.8472  1.0494

Coefficients:
                      Estimate Std. Error z value Pr(>|z|)
(Intercept)             2.7954     0.0831  33.640  < 2e-16 ***
as.ordered(outcome).L  -0.2072     0.1363  -1.520   0.1285
as.ordered(outcome).Q   0.2513     0.1512   1.662   0.0965 .
...

Unfortunately, MASSv2 is the only one of the four editions that I do not have for some reason. In MASSv4 this is covered starting on page 146. It is also covered in An Introduction to R, in section 11.1.1 on contrasts. For typical clinical applications the default treatment contrasts are sufficient, whereby the first level of the factor is considered the reference level and all others are compared against it. Thus, using unordered factors is more common, at least in my experience, and that is likely the etiology of the difference between S-PLUS and R in this regard.

HTH, Marc Schwartz
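A minimal sketch of the point above, reusing the ?glm data: fitting the same ordered factor with its default polynomial contrasts and then with treatment contrasts (via C()) gives two parameterizations of the same model, so the fitted values agree even though the coefficient tables look completely different:

```r
counts  <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)                  # unordered factor, levels 1..3
ord     <- as.ordered(outcome)          # ordered copy of the same factor

getOption("contrasts")                  # contr.treatment / contr.poly defaults

fit.poly  <- glm(counts ~ ord, family = poisson())                    # contr.poly
fit.treat <- glm(counts ~ C(ord, contr.treatment), family = poisson())# reference coding

# Same model, different coding: identical fitted values.
all.equal(fitted(fit.poly), fitted(fit.treat))
```

Only the interpretation of the coefficients changes: .L/.Q terms versus comparisons against the first level.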
[R] Cumulative sum of vector
Hi, Maybe I have not been looking in the right spot, but I have not been able to find a command to automatically calculate the running cumulative sum of a vector. Is there such a command? Example of my current code:

eig$values
[1] 678.365651   6.769697   2.853783
prop <- eig$values / sum(eig$values)
prop
[1] 0.986012163 0.009839832 0.004148005
cum <- c(prop[1], sum(prop[1:2]), sum(prop[1:3]))
cum
[1] 0.9860122 0.9958520 1.0000000

This works, but if the length of the vector changes I have to change the code by hand. Thanks, Keith Jones
Re: [R] Importing data
Simo Vundla wrote: snip

First of all, follow the posting guide. Second, state which package you are using (in this case Hmisc). spss.get in Hmisc uses read.spss in the foreign package; see the documentation of read.spss for more details. You will find there: 'read.spss' reads a file stored by the SPSS 'save' and 'export' commands and returns a list. read.spss does not claim to be able to read SPSS .por files.

Frank

-- Frank E Harrell Jr, Professor and Chair, Department of Biostatistics, School of Medicine, Vanderbilt University
Re: [R] Cumulative sum of vector
Keith, are you looking for 'cumsum'?

Gabor

On Sat, Jan 05, 2008 at 08:32:41AM -0600, Keith Jones wrote: snip

-- Csardi Gabor [EMAIL PROTECTED], MTA RMKI, ELTE TTK
Re: [R] Cumulative sum of vector
On Jan 5, 2008 8:32 AM, Keith Jones [EMAIL PROTECTED] wrote: snip

Try

help.search("cumulative sum")

Hadley

-- http://had.co.nz/
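The answers above point at cumsum(), which replaces the hand-written c(prop[1], sum(prop[1:2]), ...) from the question and works for a vector of any length. A sketch with the eigenvalues from the question:

```r
eig.values <- c(678.365651, 6.769697, 2.853783)   # values from the original post
prop <- eig.values / sum(eig.values)

# Running cumulative sum, independent of the vector's length:
cum <- cumsum(prop)
cum
```

The last element of cumsum(prop) is always 1 here, since prop sums to 1 by construction.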
Re: [R] how to use R for Beta Negative Binomial
On 06/01/2008 9:36 AM, Nasser Abbasi wrote: snip

Using Mathematica, I get (-18) for the mean and -150 for the variance, and wanted to verify this with R, since there is a negative sign which is confusing me.

A variance cannot be negative, so clearly Mathematica has it wrong.

I thought I should show what I did, this is R 2.6.1:
tghyper(a=-1, k=-1, N=5)  # I think this makes it the Beta Negative Binomial

It reports itself as

tghyper(a=-1, k=-1, N=5)
[1] "type = IV -- x = 0,1,2,..."

which I believe indicates Beta-negative-binomial.

and now I used the summary command, right?
sghyper(3, .5, 3)

Why did you change the parameters? If you used the same ones as above, you get

sghyper(a=-1, k=-1, N=5)
$title
[1] "Generalized Hypergeometric"
$a
[1] -1
$k
[1] -1
$N
[1] 5
$Mean
[1] 0.2
$Median
[1] 0
$Mode
[1] 0
$Variance
[1] 0.36
$SD
[1] 0.6
$ThirdCentralMoment
[1] 1.176
$FourthCentralMoment
[1] 8.9712
$PearsonsSkewness...mean.minus.mode.div.SD
[1] 0.333
$Skewness...sqrtB1
[1] 5.44
$Kurtosis...B2.minus.3
[1] 66.2

I don't know if those values are correct, but at least they aren't nonsensical like the ones you report from Mathematica.

Duncan Murdoch
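One way to see where Mathematica's -18 comes from: the mean formula quoted in the thread, n*beta/(alpha-1), is only valid for alpha > 1; for alpha <= 1 the distribution's mean does not exist (it is infinite), and substituting alpha = 0.5 mechanically yields the spurious negative value. A sketch with a hypothetical helper (bnb.mean is not a SuppDists function, just an illustration of the formula):

```r
# Textbook mean of the Beta Negative Binomial, guarding the alpha > 1 condition:
bnb.mean <- function(n, alpha, beta) {
  if (alpha <= 1) return(NA_real_)   # mean does not exist for alpha <= 1
  n * beta / (alpha - 1)
}

bnb.mean(3, 0.5, 3)   # NA: applying the formula blindly would give -18
bnb.mean(3, 2, 3)     # a case where the mean is finite
```

The same caveat applies to the variance, which requires alpha > 2.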
Re: [R] Behavior of ordered factors in glm
Thank you, Dr Ripley. After some false starts and consulting MASS2, Chambers & Hastie and the help files, this worked acceptably:

xxx$issuecat2 <- C(xxx$issuecat2, poly, 1)
attr(xxx$issuecat2, "contrasts")
                 .L
0-39  -6.324555e-01
40-49 -3.162278e-01
50-59 -3.287978e-17
60-69  3.162278e-01
70+    6.324555e-01

exp.mdl <- glm(actual ~ gendercat + issuecat2 + smokecat,
               data = xxx, family = poisson, offset = expected)
summary(exp.mdl)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-0.5596 -0.2327 -0.1671 -0.1199  5.2386

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)     -4.57125    0.06650 -68.743  < 2e-16 ***
gendercatMale    0.29660    0.06426   4.615 3.92e-06 ***
issuecat2.L      2.09161    0.09354  22.360  < 2e-16 ***
smokecatSmoker   0.22178    0.07870   2.818  0.00483 **
smokecatUnknown  0.02378    0.08607   0.276  0.78233

The reference category is different, but the effect of a one-category increase in age-decade on the log(rate) is (2.09 * 0.316) = 0.6604, which seems acceptable agreement with my earlier as.numeric(factor) estimate of 0.6614.

-- David Winsemius

Prof Brian Ripley [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:

Further to Duncan's comments, you can control factor codings via options(contrasts=), by setting contrasts() on the factor, and via C(). This does enable you to code an ordered factor as a linear term, for example. The only place I know that this is discussed in any detail is in Bill Venables' account in MASS chapter 6.

On Sat, 5 Jan 2008, Duncan Murdoch wrote:

On 05/01/2008 7:16 PM, David Winsemius wrote: snip

I achieved what I needed by:

xxx$agecat <- as.numeric(xxx$issuecat)
xxx$agecat <- xxx$agecat - 1

The results look quite sensible:

exp.mdl <- glm(actual ~ gendercat + agecat + smokecat,
               data = xxx, family = poisson, offset = expected)
summary(exp.mdl)

Call: glm(formula = actual ~ gendercat + agecat + smokecat, family = poisson, data = xxx, offset = expected)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-0.5596 -0.2327 -0.1671 -0.1199  5.2386

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)     -5.89410    0.11009 -53.539  < 2e-16 ***
gendercatMale    0.29660    0.06426   4.615 3.92e-06 ***
agecat           0.66143    0.02958  22.360  < 2e-16 ***
smokecatSmoker   0.22178    0.07870   2.818  0.00483 **
smokecatUnknown  0.02378    0.08607   0.276  0.78233

I remain curious about how to correctly control ordered factors, or whether I should simply avoid them.

If you're using a factor, R generally assumes you mean each level is a different category, so you get levels-1 parameters. If you don't want this, you shouldn't use a factor: convert to a numeric scale, just as you did.

Duncan Murdoch
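The two approaches in this thread, restricting the ordered factor to its linear polynomial contrast with C(factor, poly, 1) and recoding it as a numeric variable, fit the same one-degree-of-freedom trend; the slopes differ only by the contrast's scaling, which for five equally spaced levels is 1/sqrt(10) (the 0.316 spacing seen in the contrast matrix above). A sketch on simulated data (variable names invented):

```r
set.seed(42)
f <- ordered(sample(1:5, 500, replace = TRUE))          # 5 ordered levels
y <- rpois(500, lambda = exp(-1 + 0.3 * as.numeric(f))) # true linear trend

fit.lin <- glm(y ~ C(f, poly, 1), family = poisson)     # linear contrast only
fit.num <- glm(y ~ as.numeric(f), family = poisson)     # numeric recoding

# contr.poly(5)'s linear column has spacing 1/sqrt(10), so the slopes
# agree once rescaled (up to glm's convergence tolerance):
coef(fit.lin)[2] / sqrt(10)
coef(fit.num)[2]
```

Both models are reparameterizations of the same fit, so the rescaled slope matches exactly, not just approximately.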
[R] Can a dynamic graphic produced by rgl be saved?
Dear r-helpers, Can one save a dynamic graphic produced by rgl, e.g.:

open3d()
x <- sort(rnorm(1000))
y <- rnorm(1000)
z <- rnorm(1000) + atan2(x, y)
plot3d(x, y, z, col = rainbow(1000), size = 2)

as a dynamic figure that can be embedded in a PDF?

_
Professor Michael Kubovy
University of Virginia, Department of Psychology
USPS: P.O. Box 400400, Charlottesville, VA 22904-4400
Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
Office: B011, +1-434-982-4729
Lab: B019, +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/
[R] CSVSource in tm Package
Hello, I tried to use CSVSource with the TextDocCol function in the tm package, but (a) data from several columns is concatenated into one entry, and (b) data in a large text column is broken into several entries. I had hoped it would be possible to assign columns as metadata for an entry, with one specific column being the original text to analyze. Here is an example from the vignette (the backslashes in the output are not in the original data):

cars <- system.file("texts", "cars.csv", package = "tm")
tdc <- TextDocCol(CSVSource(cars))
Read 5 items
inspect(tdc)
A text document collection with 5 text documents

The metadata consists of 2 tag-value pairs and a data frame. Available tags are: create_date creator. Available variables in the data frame are: MetaID

[[1]]
[1] 1997,\"Ford\",\"Mustang\",\"3000.00\"
[[2]]
[1] 1999,\"Chevy\",\"Venture\",4900.00
[[3]]
[1] 1996,\"Chrylser\",\"Cherokee\",\"4799.00\"
[[4]]
[1] 2005,\"Ferrari\",\"Modena\",\"80999.00\"
[[5]]
[1] 1973,\"Tank\",\"\",\"9900.00\"

Also, I have a question about the best workflow for text mining/analysis: my original data is in a MySQL table. Is it possible to import the data directly into TextDocCol without creating an intermediate csv file?

I am using:

R.Version()
$platform
[1] "powerpc-apple-darwin8.10.1"
$arch
[1] "powerpc"
$os
[1] "darwin8.10.1"
$system
[1] "powerpc, darwin8.10.1"
$status
[1] ""
$major
[1] "2"
$minor
[1] "6.1"
$year
[1] "2007"
$month
[1] "11"
$day
[1] "26"
$`svn rev`
[1] "43537"
$language
[1] "R"
$version.string
[1] "R version 2.6.1 (2007-11-26)"

-- Armin Goralczyk, M.D. -- Universitätsmedizin Göttingen, Abteilung Allgemein- und Viszeralchirurgie, Rudolf-Koch-Str. 40, 39099 Göttingen -- Dept. of General Surgery, University of Göttingen, Göttingen, Germany -- http://www.gwdg.de/~agoralc
Re: [R] Importing data
Hi, you might try the foreign package, which contains the function read.spss. This works fine most of the time. For a description of its usage, see the help files or my own website: http://www.rensenieuwenhuis.nl/r-project/manual/basics/getting-data-into-r-2/ Remember, you'll need to install the foreign package first.

Hope this helps, Rense Nieuwenhuis

On Jan 6, 2008, at 12:46, Simo Vundla wrote: snip
[R] how to get residuals in factanal
In the R factanal output, I can't find a function to give me the residuals e. I computed them manually as e = x - lambda1*f1 - lambda2*f2 - ... - lambdan*fn, but the e I got are not uncorrelated with all the f's. What did I do wrong? Please help. Yijun
Re: [R] Data frame manipulation - newbie question
Hi, you may want to use the apply / tapply functions. Some find them a bit hard to grasp at first, but they will help you many times in many situations once you get the hang of them. Maybe you can find some information on my site: http://www.rensenieuwenhuis.nl/r-project/manual/basics/tables/

Hope this helps, Rense Nieuwenhuis

On Jan 3, 2008, at 11:53, José Augusto M. de Andrade Junior wrote:

Hi all, Could someone please explain how I can efficiently query a data frame with several factors, as shown below?

Data frame: pt.knn

row  k.idx  step.forwd  pt.num  model  prev  value  abs.error
  1    200           0       1  lm      9    10.5   1.5
  2    200           0       2  lm     11    10.5   1.5
  3    201           1       1  lm     10    12     2.0
  4    201           1       2  lm     12    12     2.0
  5    202           2       1  lm     12    12.1   0.1
  6    202           2       2  lm     12    12.1   0.1
  7    200           0       1  rlm    10.1  10.5   0.4
  8    200           0       2  rlm    10.3  10.5   0.2
  9    201           1       1  rlm    11.6  12     0.4
 10    201           1       2  rlm    11.4  12     0.6
 11    202           2       1  rlm    11.8  12.1   0.1
 12    202           2       2  rlm    11.9  12.1   0.2

The k.idx, step.forwd, pt.num and model columns are FACTORS; prev, value and abs.error are numeric.

I need to take the mean value of the numeric columns (prev, value and abs.error) for each k.idx, step.forwd and model. So rows 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12 must be grouped together. Next, I need to plot a boxplot of the mean(abs.error) of each model for each k.idx. I need to compare the abs.error of the two models for each step and the mean overall abs.error of each model. And so on.

I read the manuals, but the examples there are too simple. I know how to do this manipulation in a brute-force manner, but I wish to learn how to work the right way with R. Could someone help me? Thanks in advance.

José Augusto, Undergraduate student, University of São Paulo, Business Administration Faculty
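Beyond tapply, the grouped means asked for above can be computed in one call with aggregate(). A sketch with a toy version of the pt.knn data (column values invented for illustration):

```r
# Toy stand-in for the poster's pt.knn data frame:
pt.knn <- data.frame(
  k.idx      = factor(rep(c(200, 201), each = 4)),
  step.forwd = factor(rep(c(0, 1), each = 4)),
  model      = factor(rep(c("lm", "rlm"), each = 2, times = 2)),
  prev       = c(9, 11, 10.1, 10.3, 10, 12, 11.6, 11.4),
  value      = c(10.5, 10.5, 10.5, 10.5, 12, 12, 12, 12),
  abs.error  = c(1.5, 0.5, 0.4, 0.2, 2.0, 0.0, 0.4, 0.6)
)

# Mean of every numeric column, one row per k.idx / step.forwd / model group:
means <- aggregate(cbind(prev, value, abs.error) ~ k.idx + step.forwd + model,
                   data = pt.knn, FUN = mean)
means

# Boxplot of abs.error by model, one box per model:
boxplot(abs.error ~ model, data = pt.knn)
```

tapply(pt.knn$abs.error, pt.knn$model, mean) gives the same per-model means as a named vector rather than a data frame.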
Re: [R] Importing data
On Sun, 6 Jan 2008, Rense Nieuwenhuis wrote: snip

Remember, you'll need to install the foreign-package first.

You shouldn't have to: it is supposed to come with every installation of R, and be installed unless you specifically opt out. Perhaps you meant 'first load the foreign package via library(foreign)'?

[Re: Frank Harrell's comment, many people use .por for SPSS export files; that is the extension used in package foreign's tests directory. But the issue may well be that xxx.por is not an SPSS export file.]

-- Brian D. Ripley, [EMAIL PROTECTED], Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/, University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), Fax: +44 1865 272595, 1 South Parks Road, Oxford OX1 3TG, UK
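Putting the advice in this thread together, a minimal sketch of reading an SPSS portable file directly with foreign's read.spss (the file name below is a placeholder, not a real file; to.data.frame and use.value.labels are documented arguments):

```r
library(foreign)

porfile <- "xxx.por"   # placeholder path to the SPSS portable file
if (file.exists(porfile)) {
  # Returns a data frame with value labels converted to factor levels:
  dat <- read.spss(porfile, to.data.frame = TRUE, use.value.labels = TRUE)
  str(dat)
}
```

If read.spss still reports 'error reading portable-file dictionary', the file itself is likely not a valid SPSS export file, as noted above.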
Re: [R] Can a dynamic graphic produced by rgl be saved?
?rgl.snapshot

Michael Kubovy wrote: snip
Re: [R] how to get residuals in factanal
On Sun, 6 Jan 2008, Yijun Zhao wrote: snip

What did you use for 'f'? The factors ('scores') are latent quantities in factor analysis, and there is more than one way to predict them. Most likely your assumption of uncorrelatedness is not correct for the residuals and scores as you computed them.

-- Brian D. Ripley, University of Oxford
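A sketch of the computation the questioner describes, on invented data: residuals of the standardized variables given factanal's regression scores. As the reply notes, the scores are only predictions of the latent factors, so the residual-score correlations are small but generally not exactly zero:

```r
set.seed(7)
# Toy data with a 2-factor structure (purely illustrative):
n  <- 300
f1 <- rnorm(n); f2 <- rnorm(n)
X  <- cbind(f1 + rnorm(n, sd = 0.5), f1 + rnorm(n, sd = 0.5),
            f2 + rnorm(n, sd = 0.5), f2 + rnorm(n, sd = 0.5),
            f1 + f2 + rnorm(n, sd = 0.5))
colnames(X) <- paste0("x", 1:5)

fa <- factanal(X, factors = 2, scores = "regression")

# e = z - Lambda %*% f, applied row-wise to the standardized data:
Z <- scale(X)
E <- Z - fa$scores %*% t(fa$loadings)

# Small, but not exactly zero:
round(cor(E, fa$scores), 3)
```

Using Bartlett scores (scores = "Bartlett") changes the prediction rule and hence these correlations as well.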
[R] Is there a R function for seasonal adjustment
Hi, I just discovered decompose() and stl(); both are very nice! I am wondering if R also has a function that calculates the seasonal index, or makes the seasonal adjustment directly using the results generated from either decompose() or stl(). It seems that there should be one, but I couldn't find it. Does anyone know? Thanks, -- Tom
Re: [R] Cubic splines in package mgcv
On Wednesday 26 December 2007 04:14, Kunio takezawa wrote: R-users E-mail: r-help@r-project.org My understanding is that package mgcv is based on Generalized Additive Models: An Introduction with R (by Simon N. Wood). On page 126 of this book, eq. (3.4) looks like a quartic equation with respect to x, not a cubic equation. I am wondering if all routines which use cubic splines in mgcv are based on this quartic equation. --- No, `mgcv' does not use the basis given on page 126. See sections 4.1.2-4.1.8 of the same book for the bases used. In my humble opinion, the '^4' in the first term of the second line of this equation should be '^3'. --- Perhaps take a look at section 2.3.3 of Gu (2002) Smoothing Spline ANOVA for a bit more detail on this. K. Takezawa -- Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK +44 1225 386603 www.maths.bath.ac.uk/~sw283
Re: [R] run setwd at the launch of R
I have not tried this, but once you know where the (relatively permanent) working directory is, then putting setwd(my.directory) in your .Rprofile should work. --- bunny, lautloscrew.com [EMAIL PROTECTED] wrote: Dear all, my R files (and the .csv files as well) are saved somewhere pretty deep down my hard disk. I have to change the working directory therefore every time I run R (I run it on a PowerPC Mac), which is disgusting. Using the setwd command at the beginning of an R script doesn't really help because I have to find this file first by hand. I am looking for a possibility to run setwd during the launch process of R or straight after it ... any suggestions? I would be very glad about good ideas or help! Thanks in advance, matthias
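A minimal sketch of what that startup file could contain (the directory below is a made-up placeholder; substitute your own). R sources ~/.Rprofile at launch, so either a bare setwd() call or a .First() function works:

```r
## In ~/.Rprofile -- run automatically when R starts.
## The path below is a hypothetical example.
.First <- function() {
  setwd("~/Documents/deep/down/my/disk/csv-files")
  cat("Working directory set to", getwd(), "\n")
}
```

See ?Startup for the full search order of R's startup files.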
Re: [R] GLM results different from GAM results without smoothing terms
On Thursday 03 January 2008 13:54, Prof Brian Ripley wrote:

fit1 <- glm(factor(x1)~factor(Round)+x2,family=binomial(link=probit))
fit2 <- gam(factor(x1)~factor(Round)+x2,family=binomial(link=probit))
all.equal(fitted(fit1), fitted(fit2))
[1] TRUE

so the fits to the data are the same: your error was in over-interpreting the parameters in the presence of non-identifiability. -- so coming back to the original question, mgcv::gam is using an SVD approach to rank deficiency in this case (so the minimum norm parameter vector is chosen amongst all those corresponding to the best fit), while glm is using a pivoted QR approach to rank deficiency, and effectively constraining redundant parameters to zero.

On Thu, 3 Jan 2008, Daniel Malter wrote: Thanks much for your response. My apologies for not putting sample code in the first place. Here it comes:

Round=rep(1:10,each=10)
x1=rbinom(100,1,0.3)
x2=rep(rnorm(10,0,1),each=10)
summary(glm(factor(x1)~factor(Round)+x2,family=binomial(link=probit)))
library(mgcv)
summary(gam(factor(x1)~factor(Round)+x2,family=binomial(link=probit)))

Cheers, Daniel - cuncta stricte discussurus - -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Thursday, January 03, 2008 2:13 AM To: Daniel Malter Cc: [EMAIL PROTECTED] Subject: Re: [R] GLM results different from GAM results without smoothing terms On Wed, 2 Jan 2008, Daniel Malter wrote: Hi, I am fitting two models, a generalized linear model and a generalized additive model, to the same data. The R help tells that A generalized additive model (GAM) is a generalized linear model (GLM) in which the linear predictor is given by a user specified sum of smooth functions of the covariates plus a conventional parametric component of the linear predictor. I am fitting the GAM without smooth functions and would have expected the parameter estimates to be equal to the GLM.
I am fitting the following model:

reg.glm=glm(YES~factor(RoundStart)+DEP+SPD+S.S+factor(LOST),family=binomial(link=probit))
reg.gam=gam(YES~factor(RoundStart)+DEP+SPD+S.S+factor(LOST),family=binomial(link=probit))

DEP, SPD, S.S, and LOST are invariant across the observations within the same RoundStart. Therefore, I would expect to get NAs for these parameter estimates. So your design matrix is rank-deficient and there is an identifiability problem. I get NAs in GLM, but I get estimates in GAM. Can anyone explain why that is? Because there is more than one way to handle rank deficiency. There are two different 'gam' functions in contributed packages for R (and none in R itself), so we need more details: see the footer of this message. In glm() the NA estimates are treated as zero for computing predictions. Thanks much, Daniel -- Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK +44 1225 386603 www.maths.bath.ac.uk/~sw283
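The rank deficiency being discussed is easy to reproduce (a hypothetical sketch echoing Daniel's simulated example; the variable names are made up for illustration): a covariate that is constant within each level of a factor is a linear combination of that factor's dummy columns, so one column of the design matrix is redundant.

```r
## A covariate constant within each Round level is collinear with factor(Round).
set.seed(1)
Round <- rep(1:10, each = 10)
x2    <- rep(rnorm(10), each = 10)  # invariant within Round
x1    <- rbinom(100, 1, 0.3)
fit   <- glm(x1 ~ factor(Round) + x2, family = binomial(link = "probit"))
coef(fit)["x2"]  # NA: glm's pivoted QR drops the redundant column
```

With mgcv::gam the same formula instead yields a numeric estimate for every coefficient, because its SVD-based handling picks the minimum-norm parameter vector among the equally good fits, as described above.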
Re: [R] how to get residuals in factanal
The factanal was called with 'varimax' rotation. The factor scores are uncorrelated. But the residuals I got by computing X - sum(loadings * factor-scores) are not uncorrelated with the factor scores. I thought the residuals should be independent of the factor scores, as ?factanal says: == The factor analysis model is x = Lambda f + e for a p-element row-vector x, a p x k matrix of loadings, a k-element vector of scores and a p-element vector of errors. None of the components other than x is observed, but the major restriction is that the scores be uncorrelated and of unit variance, and that the errors be independent with variances Phi, the uniquenesses. === Thank you. Yijun --- Prof Brian Ripley [EMAIL PROTECTED] wrote: On Sun, 6 Jan 2008, Yijun Zhao wrote: In R factanal output, I can't find a function to give me the residuals e. I manually got it by using x - lambda1*f1 - lambda2*f2 - ... - lambdan*fn, but the e I got are not uncorrelated with all the f's. What did I do wrong? Please help. What did you use for 'f'? The factors ('scores') are latent quantities in factor analysis, and there is more than one way to predict them. Most likely your assumption of uncorrelatedness is not correct for the residuals and scores as you computed them. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA) 1 South Parks Road, Oxford OX1 3TG, UK  Fax: +44 1865 272595
Re: [R] Can a dynamic graphic produced by rgl be saved?
On 06/01/2008 10:46 AM, Michael Kubovy wrote: Dear r-helpers, Can one save a dynamic graphic produced by rgl, e.g.:

open3d()
x <- sort(rnorm(1000)); y <- rnorm(1000); z <- rnorm(1000) + atan2(x,y)
plot3d(x, y, z, col=rainbow(1000), size=2)

as a dynamic figure that can be embedded in a pdf?

rgl doesn't produce any format that remains dynamic. You can produce bitmap or (with some limitations) vector format snapshots, and you can put multiple bitmaps together into a movie (see movie3d(), for example). I don't know how to embed a movie into a pdf, but I assume it's possible. Duncan Murdoch
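Putting Duncan's pointers together, a sketch of the snapshot and movie routes (untested here; exact arguments may differ across rgl versions, and movie3d relies on external tools to assemble some output formats):

```r
library(rgl)
open3d()
x <- sort(rnorm(1000)); y <- rnorm(1000); z <- rnorm(1000) + atan2(x, y)
plot3d(x, y, z, col = rainbow(1000), size = 2)

rgl.snapshot("scene.png")   # static bitmap snapshot of the current scene

## Frames of a slowly rotating view, assembled into a movie file:
movie3d(spin3d(axis = c(0, 0, 1), rpm = 6),
        duration = 10, dir = tempdir())
```

The movie file can then be embedded in a PDF with external tools (e.g. the LaTeX movie15/media9 route), though, as noted above, the result is no longer an interactive rgl scene.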
Re: [R] GLMMs fitted with lmer (R) glimmix (SAS)
On Jan 4, 2008 6:21 PM, Andrea Previtali [EMAIL PROTECTED] wrote: Sorry, I realized that somehow the message got truncated. Here is the remaining part of the SAS output:

Solutions for Fixed Effects:

Effect     DIST DW ELI SEX SEAS  Estimate  Std. Error    DF  t Value  Pr > |t|
Intercept                         -4.6540      0.6878    17    -6.77    <.0001
DIST*DW    0    0                  1.4641      0.4115  3077     3.56    0.0004
DIST*DW    0    1                  1.1333      0.4028  3077     2.81    0.0049
DIST*DW    1    0                  1.3456      0.3745  3077     3.59    0.0003
DIST*DW    1    1                  0           .           .     .       .
SEX*ELI            0   0           1.2633      0.4155  3077     3.04    0.0024
SEX*ELI            0   1           0.6569      0.4140  3077     1.59    0.1126
SEX*ELI            1   0           1.0728      0.4364  3077     2.46    0.0140
SEX*ELI            1   1           0           .           .     .       .
WT                                 0.00758     0.01912 3077     0.40    0.6918
SEAS                       0       0.7839      0.1588  3077     4.94    <.0001
SEAS                       1       0           .           .     .       .
DEN                               -0.01343     0.002588 3077   -5.19    <.0001

Type III Tests of Fixed Effects

Effect    NUM DF  DEN DF  F Value  Pr > F
DIST*DW        3    3077     6.06  0.0004
SEX*ELI        3    3077     6.30  0.0003
WT             1    3077     0.16  0.6918
SEAS           1    3077    24.37  <.0001
DEN            1    3077    26.94  <.0001

At least on my mail reader the copies of the output ended up with wrapped lines and, apparently, some changes in the spacing. I enclose two text files, glimmix.txt and glmer.txt, that are my reconstructions of the originals. Please let me know if I have not reconstructed them correctly. In particular, I don't think I got the first table of Solutions for Fixed Effects: in the glimmix.txt file correct. It seems to mix t statistics and F statistics in ways that I don't understand. Another thing I don't understand is what the Pseudo-Likelihood is. Perhaps it is what I would call the penalized weighted residual sum of squares. The likelihood reported by lmer and based on the binomial distribution is very different. If you want to compare coefficients I suggest using options(contrasts = c("contr.SAS", "contr.poly")) and assure that SEX, DIST, DW and ELI are factors, then call lmer.
This will ensure that the SEX, DIST, DW and ELI terms and their interactions are represented by contrasts in which the last level is the reference level (the SAS convention) as opposed to the first level (the R convention). Also, you may be confusing the S language formula terms with the SAS formula terms. In R the asterisk denotes crossing of terms and the : is used for an interaction. Thus SEX*ELI is equivalent to SEX + ELI + SEX:ELI in R. In SAS, it is the interaction that is written as SEX*ELI. I suggest that you change your SAS formula to include main effects for SEX, ELI, DIST and DW.

Generalized linear mixed model fit using PQL
Formula: SURV ~ SEX * ELI + DW * DIST + SEAS + DEN + WT + (1 | SITE)
 Family: binomial(logit link)
  AIC  BIC logLik deviance
 1539 1606 -758.7     1517
Random effects:
 Groups Name        Variance Std.Dev.
 SITE   (Intercept) 0.27816  0.52741
number of obs: 3104, groups: SITE, 19

Estimated scale (compare to 1) 0.9458749

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.144259   0.458672  -2.495 0.012606
SEX         -0.606026   0.167289  -3.623 0.000292 ***
ELI         -0.190757   0.219599  -0.869 0.385034
DW          -0.328796   0.175882  -1.869 0.061565 .
DIST        -0.117745   0.374148  -0.315 0.752989
SEAS        -0.784971   0.158748  -4.945 7.62e-07 ***
DEN         -0.013381   0.002585  -5.176 2.27e-07 ***
WT           0.007735   0.019115   0.405 0.685732
SEX:ELI     -0.466425   0.461596  -1.010 0.312274
DW:DIST     -1.015454   0.404683  -2.509 0.012099 *

Model Information
Variance Matrix Blocked By  Site
Estimation Technique:  Residual PL
Degrees
[R] aggregate.ts help
Hi, I have a ts object with a frequency of 4, i.e., quarterly data, and I would like to calculate the mean for each quarter. So for example:

ts.data=ts(1:20,start=c(1984,2),frequency=4)
ts.data
     Qtr1 Qtr2 Qtr3 Qtr4
1984         1    2    3
1985    4    5    6    7
1986    8    9   10   11
1987   12   13   14   15
1988   16   17   18   19
1989   20

If I do this manually, the mean for the 1st quarter would be mean(c(4,8,12,16,20)), which is 12. But I am wondering if there is an R function that could do this faster. I tried aggregate.ts but it didn't work:

aggregate(ts.data,nfrequency=4,mean)
     Qtr1 Qtr2 Qtr3 Qtr4
1984         1    2    3
1985    4    5    6    7
1986    8    9   10   11
1987   12   13   14   15
1988   16   17   18   19
1989   20

Does anyone know what am I doing wrong? -- Tom
Re: [R] aggregate.ts help
On Jan 6, 2008 5:17 PM, tom soyer [EMAIL PROTECTED] wrote: Hi, I have a ts object with a frequency of 4, i.e., quarterly data, and I would like to calculate the mean for each quarter. So for example:

ts.data=ts(1:20,start=c(1984,2),frequency=4)
ts.data
     Qtr1 Qtr2 Qtr3 Qtr4
1984         1    2    3
1985    4    5    6    7
1986    8    9   10   11
1987   12   13   14   15
1988   16   17   18   19
1989   20

If I do this manually, the mean for the 1st quarter would be mean(c(4,8,12,16,20)), which is 12. But I am wondering if there is a R function that could do this faster. I tried aggregate.ts but it didn't work: aggregate(ts.data,nfrequency=4,mean) [returns the series unchanged] Does anyone know what am I doing wrong?

aggregate.ts aggregates to produce series of coarser granularity which is not what you want. You want the ordinary aggregate:

aggregate(c(ts.data), list(qtr = cycle(ts.data)), mean)
# or tapply:
tapply(ts.data, cycle(ts.data), mean)

See ?aggregate
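For the series above, either call collapses the 20 values by quarter; a quick sketch of what the tapply route returns (Qtr1 is the mean of 4, 8, 12, 16, 20):

```r
ts.data <- ts(1:20, start = c(1984, 2), frequency = 4)
tapply(ts.data, cycle(ts.data), mean)
#  1  2  3  4
# 12  9 10 11
```

cycle() labels each observation with its position within the period (here 1-4), so grouping on it pools all years for each quarter.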
Re: [R] how to get residuals in factanal
p.s. I tried to use both the regression and the Bartlett way to get the scores. In both cases, the scores are uncorrelated, but the errors are NOT uncorrelated with the scores, and are also NOT uncorrelated among themselves. What am I missing? factanal() is supposed to give independent error vectors, so at least they should be uncorrelated among themselves. Thank you in advance for the help. Yijun --- Yijun Zhao [EMAIL PROTECTED] wrote: The factanal was called with 'varimax' rotation. The factor scores are uncorrelated. But the residuals I got by computing X - sum(loadings * factor-scores) are not uncorrelated with the factor scores. I thought the residuals should be independent of the factor scores, as ?factanal says: == The factor analysis model is x = Lambda f + e for a p-element row-vector x, a p x k matrix of loadings, a k-element vector of scores and a p-element vector of errors. None of the components other than x is observed, but the major restriction is that the scores be uncorrelated and of unit variance, and that the errors be independent with variances Phi, the uniquenesses. === Thank you. Yijun --- Prof Brian Ripley [EMAIL PROTECTED] wrote: On Sun, 6 Jan 2008, Yijun Zhao wrote: In R factanal output, I can't find a function to give me the residuals e. I manually got it by using x - lambda1*f1 - lambda2*f2 - ... - lambdan*fn, but the e I got are not uncorrelated with all the f's. What did I do wrong? Please help. What did you use for 'f'? The factors ('scores') are latent quantities in factor analysis, and there is more than one way to predict them. Most likely your assumption of uncorrelatedness is not correct for the residuals and scores as you computed them. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA) 1 South Parks Road, Oxford OX1 3TG, UK  Fax: +44 1865 272595
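A small sketch of the kind of check being discussed (data set and factor count chosen arbitrarily for illustration). Ripley's point is that the model's independence assumption concerns the latent factors, while fa$scores are only predictions of them, so the sample correlations of the reconstructed errors with the scores need not vanish:

```r
## Illustrative only: any multivariate data set would do.
vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")
X    <- scale(mtcars[, vars])
fa   <- factanal(X, factors = 2, scores = "regression")
E    <- X - fa$scores %*% t(fa$loadings)  # reconstructed "errors"
round(cor(E, fa$scores), 2)               # generally not all zero
```

The same happens with scores = "Bartlett"; the zero-correlation property holds for the unobservable f, not for either set of predicted scores.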
Re: [R] aggregate.ts help
Thanks Gabor!! On 1/6/08, Gabor Grothendieck [EMAIL PROTECTED] wrote: On Jan 6, 2008 5:17 PM, tom soyer [EMAIL PROTECTED] wrote: Hi, I have a ts object with a frequency of 4, i.e., quarterly data, and I would like to calculate the mean for each quarter. [example snipped] aggregate.ts aggregates to produce series of coarser granularity which is not what you want. You want the ordinary aggregate:

aggregate(c(ts.data), list(qtr = cycle(ts.data)), mean)
# or tapply:
tapply(ts.data, cycle(ts.data), mean)

See ?aggregate -- Tom
[R] need help
Hi, I'm Roslina, a PhD student at the University of South Australia, in the School of Maths and Stats. I used S-Plus before and have now started using R. I used to analyse rainfall data using julian dates. Is there any similar function that you can suggest to me to be used in R? Thank you so much for your attention and help
[R] testing fixed effects in lmer
Dear all, I am performing a binomial glmm analysis using the lmer function in the lme4 package (last release, just downloaded). I am using the Laplace method. However, I am not sure about what I should do to test for the significance of fixed effects in the binomial case: Is it correct to test a full model against a model from which I remove the fixed effect I want to test, using the anova(mod1.lmer, mod2.lmer) method, and then rely on the model with the lower AIC (or on the log-likelihood test)? I thank you in advance for your help! best regards, Achaz von Hardenberg, Centro Studi Fauna Alpina - Alpine Wildlife Research Centre, Servizio Sanitario e della Ricerca Scientifica, Parco Nazionale Gran Paradiso, Degioz, 11, 11010-Valsavarenche (Ao), Italy. E-mail: [EMAIL PROTECTED] [EMAIL PROTECTED] Skype: achazhardenberg Tel.: +39.0165.905783 Fax: +39.0165.905506 Mobile: +39.328.8736291
[R] Can R solve this optimization problem?
Dear All, I am trying to solve the following maximization problem with R: find x(t) (continuous) that maximizes the integral of x(t) with t from 0 to 1, subject to the constraints dx/dt = u, |u| <= 1, x(0) = x(1) = 0. The analytical solution can be obtained easily, but I am trying to understand whether R is able to solve numerically problems like this one. I have tried to find an approximate solution through discretization of the objective function but with no success so far. Thanks in advance, Paul
Re: [R] need help
Roslina, You should start by reading the Posting Guide - it has helpful advice on how to solve a problem yourself and how to craft postings to get good answers. The Posting Guide says: [some basics deleted] Do your homework before posting: If it is clear that you have done basic background research, you are far more likely to get an informative response. See also Further Resources further down this page. * Do help.search("keyword") and apropos("keyword") with different keywords (type this at the R prompt). [other helpful suggestions deleted] - and doing exactly that on my system yields:

Help files with alias or concept or title matching 'julian' using fuzzy matching:

weekdays(base)               Extract Parts of a POSIXt or Date Object
day.of.week(chron)           Convert between Julian and Calendar Dates
TimeDateCoercion(fCalendar)  timeDate Class, Coercion and Transformation
date.ddmmmyy(survival)       Format a Julian date
date.mdy(survival)           Convert from Julian Dates to Month, Day, and Year
date.mmddyy(survival)        Format a Julian date
date.mmdd(survival)          Format a Julian date
mdy.date(survival)           Convert to Julian Dates

Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.

- which should be enough to get you going. Also, you will want to consult R News, which had an informative article on handling dates a few years back. HTH, Chuck

On Mon, 7 Jan 2008, Zakaria, Roslinazairimah - zakry001 wrote: Hi, I'm Roslina, PhD student of University of South Australia, Australia from school Maths and Stats. I use S-Plus before and now has started using R-package. I used to analyse rainfall data using julian date. Is there any similar function that you can suggest to me to be used in R-package? Thank you so much for your attention and help
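As a concrete starting point, a minimal sketch using base R's Date class (the chron and survival functions listed above offer richer Julian-date options):

```r
d <- as.Date("2008-01-07")
julian(d)                    # days since the default origin, 1970-01-01
as.numeric(format(d, "%j"))  # day of the year: 7
weekdays(d)                  # name of the weekday
```

julian() and format(..., "%j") cover the two common senses of "julian date" (days since an origin, and day-of-year); see ?julian and ?strptime.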
Charles C. Berry  (858) 534-2098
Dept of Family/Preventive Medicine
E-mail: [EMAIL PROTECTED]
UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
[R] Avoiding FOR loops
useR's, I would like to know if there is a way to avoid using FOR loops to perform the below calculation. Consider the following data:

x
     [,1] [,2] [,3]
[1,]    4   11    1
[2,]    1    9    2
[3,]    7    3    3
[4,]    3    6    4
[5,]    6    8    5

xk
    Var1 Var2 Var3
1  -0.25 1.75  0.5
2   0.75 1.75  0.5
3   1.75 1.75  0.5
4   2.75 1.75  0.5
5   3.75 1.75  0.5
6   4.75 1.75  0.5
7   5.75 1.75  0.5
8   6.75 1.75  0.5
9   7.75 1.75  0.5
10 -0.25 2.75  0.5

Here, X is a matrix of 3 variables in which each is of size 5 and XK are some values that correspond to each variable. For each variable, I want to do: |Xi - xkj| where i = 1 to 3 and j = 1 to 10. It looks as if a double FOR loop would work, but can the apply function work? Or some other function that is shorter than a FOR loop? Thank you, I hope this makes sense. Derek
Re: [R] Can R solve this optimization problem?
On 06/01/2008 7:55 PM, Paul Smith wrote: On Jan 7, 2008 12:18 AM, Duncan Murdoch [EMAIL PROTECTED] wrote: I am trying to solve the following maximization problem with R: find x(t) (continuous) that maximizes the integral of x(t) with t from 0 to 1, subject to the constraints dx/dt = u, |u| <= 1, x(0) = x(1) = 0. The analytical solution can be obtained easily, but I am trying to understand whether R is able to solve numerically problems like this one. I have tried to find an approximate solution through discretization of the objective function but with no success so far. R doesn't provide any way to do this directly. If you really wanted to do it in R, you'd need to choose some finite dimensional parametrization of u (e.g. as a polynomial or spline, but the constraint on it would make the choice tricky: maybe a linear spline?), then either evaluate the integral analytically or numerically to give your objective function. Then there are some optimizers available, but in my experience they aren't very good on high dimensional problems: so your solution would likely be quite crude. I'd guess you'd be better off in Matlab, Octave, Maple or Mathematica with a problem like this. Thanks, Duncan. I have placed a similar post in the Maxima list and another one in the Octave list. (I have never used splines; so I did not quite understand the method that you suggested to me.) Linear splines are just piecewise linear functions. An easy way to parametrize them is by their value at a sequence of locations; they interpolate linearly between there. x would be piecewise quadratic, so its integral would be a sum of cubic terms. Duncan Murdoch
Re: [R] Avoiding FOR loops
On Jan 6, 2008, at 7:55 PM, dxc13 wrote: useR's, I would like to know if there is a way to avoid using FOR loops to perform the below calculation. Consider the following data: snip Here, X is a matrix of 3 variables in which each is of size 5 and XK are some values that correspond to each variable. For each variable, I want to do: |Xi - xkj| where i = 1 to 3 and j = 1 to 10

That should be i = 1 to 5, I take it? If I understand what you want to do, then the outer function is the key:

lapply(1:3, function(i) { outer(x[,i], xk[,i], "-") })

This should land you with a list of three 5x10 tables. It looks as if a double FOR loop would work, but can the apply function work? Or some other function that is shorter than a FOR loop? Thank you, I hope this makes sense. Derek

Haris Skiadas Department of Mathematics and Computer Science Hanover College
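To get the absolute differences the original poster asked for, one can wrap abs() around the outer() call — a small sketch, assuming x (5x3) and xk (10x3) as in the original post:

```r
## res[[i]][j, k] is |x[j, i] - xk[k, i]|: three 5x10 matrices, no loops
res <- lapply(1:3, function(i) abs(outer(x[, i], xk[, i], "-")))
```

outer() builds the full 5x10 grid of pairwise differences per variable in one vectorized call, which is exactly what the double FOR loop would have produced element by element.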
Re: [R] Can R solve this optimization problem?
On Jan 7, 2008 1:04 AM, Duncan Murdoch [EMAIL PROTECTED] wrote: I am trying to solve the following maximization problem with R: find x(t) (continuous) that maximizes the integral of x(t) with t from 0 to 1, subject to the constraints dx/dt = u, |u| <= 1, x(0) = x(1) = 0. The analytical solution can be obtained easily, but I am trying to understand whether R is able to solve numerically problems like this one. I have tried to find an approximate solution through discretization of the objective function but with no success so far. R doesn't provide any way to do this directly. If you really wanted to do it in R, you'd need to choose some finite dimensional parametrization of u (e.g. as a polynomial or spline, but the constraint on it would make the choice tricky: maybe a linear spline?), then either evaluate the integral analytically or numerically to give your objective function. Then there are some optimizers available, but in my experience they aren't very good on high dimensional problems: so your solution would likely be quite crude. I'd guess you'd be better off in Matlab, Octave, Maple or Mathematica with a problem like this. Thanks, Duncan. I have placed a similar post in the Maxima list and another one in the Octave list. (I have never used splines; so I did not quite understand the method that you suggested to me.) Linear splines are just piecewise linear functions. An easy way to parametrize them is by their value at a sequence of locations; they interpolate linearly between there. x would be piecewise quadratic, so its integral would be a sum of cubic terms. Thanks, Duncan, for your explanation. Paul
Re: [R] Can R solve this optimization problem?
This can be discretized to a linear programming problem so you can solve it with the lpSolve package. Suppose we have x0, x1, x2, ..., xn. Our objective (up to a multiple which does not matter) is:

Maximize: x1 + ... + xn

which is subject to the constraints:

-1/n <= x1 - x0 <= 1/n
-1/n <= x2 - x1 <= 1/n
...
-1/n <= xn - x[n-1] <= 1/n

and x0 = xn = 0

On Jan 6, 2008 7:05 PM, Paul Smith [EMAIL PROTECTED] wrote: Dear All, I am trying to solve the following maximization problem with R: find x(t) (continuous) that maximizes the integral of x(t) with t from 0 to 1, subject to the constraints dx/dt = u, |u| <= 1, x(0) = x(1) = 0. The analytical solution can be obtained easily, but I am trying to understand whether R is able to solve numerically problems like this one. I have tried to find an approximate solution through discretization of the objective function but with no success so far. Thanks in advance, Paul
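A rough sketch (untested, and purely illustrative) of this discretization with lpSolve. Note that lp() constrains all decision variables to be nonnegative, which is harmless here because the analytical maximizer, the tent function x(t) = min(t, 1-t), is nonnegative:

```r
library(lpSolve)

n <- 100
m <- n - 1                         # free variables x1..x[n-1]; x0 = xn = 0 eliminated

## A[i, ] encodes the difference x_i - x_{i-1}
A <- matrix(0, n, m)
for (i in 1:n) {
  if (i <= m) A[i, i]     <-  1    # +x_i
  if (i >= 2) A[i, i - 1] <- -1    # -x_{i-1}
}

## |x_i - x_{i-1}| <= 1/n as two one-sided inequalities per difference
sol <- lp("max", rep(1, m),
          rbind(A, -A), rep("<=", 2 * n), rep(1/n, 2 * n))

sum(sol$solution) / n              # approximates the true maximum, 1/4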
Re: [R] Data frame manipulation - newbie question
There are a number of different ways that you would have to manipulate your data to do what you want. It is useful to learn some of these techniques. Here, I think, are the set of actions that you want to do.

x <- read.table(textConnection("row k.idx step.forwd pt.num model prev value abs.error
1  200 0 1 lm    9    10.5 1.5
2  200 0 2 lm   11    10.5 1.5
3  201 1 1 lm   10    12   2.0
4  201 1 2 lm   12    12   2.0
5  202 2 1 lm   12    12.1 0.1
6  202 2 2 lm   12    12.1 0.1
7  200 0 1 rlm  10.1  10.5 0.4
8  200 0 2 rlm  10.3  10.5 0.2
9  201 1 1 rlm  11.6  12   0.4
10 201 1 2 rlm  11.4  12   0.6
11 202 2 1 rlm  11.8  12.1 0.1
12 202 2 2 rlm  11.9  12.1 0.2"), header=TRUE)
closeAllConnections()

# split the data by the grouping factors
x.split <- split(x, list(x$k.idx, x$step.forwd, x$model), drop=TRUE)
x.split
$`200.0.lm`
  row k.idx step.forwd pt.num model prev value abs.error
1   1   200          0      1    lm    9  10.5       1.5
2   2   200          0      2    lm   11  10.5       1.5

$`201.1.lm`
  row k.idx step.forwd pt.num model prev value abs.error
3   3   201          1      1    lm   10    12         2
4   4   201          1      2    lm   12    12         2

$`202.2.lm`
  row k.idx step.forwd pt.num model prev value abs.error
5   5   202          2      1    lm   12  12.1       0.1
6   6   202          2      2    lm   12  12.1       0.1

$`200.0.rlm`
  row k.idx step.forwd pt.num model prev value abs.error
7   7   200          0      1   rlm 10.1  10.5       0.4
8   8   200          0      2   rlm 10.3  10.5       0.2

$`201.1.rlm`
   row k.idx step.forwd pt.num model prev value abs.error
9    9   201          1      1   rlm 11.6    12       0.4
10  10   201          1      2   rlm 11.4    12       0.6

$`202.2.rlm`
   row k.idx step.forwd pt.num model prev value abs.error
11  11   202          2      1   rlm 11.8  12.1       0.1
12  12   202          2      2   rlm 11.9  12.1       0.2

# now take the means of given columns
x.mean <- lapply(x.split, function(.grp) colMeans(.grp[, c('prev', 'value', 'abs.error')]))
# put back into a matrix
(x.mean <- do.call(rbind, x.mean))
           prev value abs.error
200.0.lm  10.00  10.5      1.50
201.1.lm  11.00  12.0      2.00
202.2.lm  12.00  12.1      0.10
200.0.rlm 10.20  10.5      0.30
201.1.rlm 11.50  12.0      0.50
202.2.rlm 11.85  12.1      0.15

# boxplot
boxplot(abs.error ~ k.idx, data=x)
# create a table with average of the abs.error for each 'model'
cbind(x,
abs.error.mean=ave(x$abs.error, x$model)) row k.idx step.forwd pt.num model prev value abs.error abs.error.mean 11 200 0 1lm 9.0 10.5 1.5 1.200 22 200 0 2lm 11.0 10.5 1.5 1.200 33 201 1 1lm 10.0 12.0 2.0 1.200 44 201 1 2lm 12.0 12.0 2.0 1.200 55 202 2 1lm 12.0 12.1 0.1 1.200 66 202 2 2lm 12.0 12.1 0.1 1.200 77 200 0 1 rlm 10.1 10.5 0.4 0.317 88 200 0 2 rlm 10.3 10.5 0.2 0.317 99 201 1 1 rlm 11.6 12.0 0.4 0.317 10 10 201 1 2 rlm 11.4 12.0 0.6 0.317 11 11 202 2 1 rlm 11.8 12.1 0.1 0.317 12 12 202 2 2 rlm 11.9 12.1 0.2 0.317 On Jan 6, 2008 10:50 AM, Rense Nieuwenhuis [EMAIL PROTECTED] wrote: Hi, you may want to use that apply / tapply function. Some find it a bit hard to grasp at first, but it will help you many times in many situations when you get the hang of it. Maybe you can get some information on my site: http:// www.rensenieuwenhuis.nl/r-project/manual/basics/tables/ Hope this helps, Rense Nieuwenhuis On Jan 3, 2008, at 11:53 , José Augusto M. de Andrade Junior wrote: Hi all, Could someone please explain how can i efficientily query a data frame with several factors, as shown below: -- --- Data frame: pt.knn -- --- row | k.idx
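As an aside (my own illustration, not part of the original reply), the same per-group means can be computed in a single call with base R's aggregate(); a minimal sketch on a toy data frame reusing the thread's column names:

```r
# toy data frame mirroring the thread's columns (values are illustrative)
x <- data.frame(model = rep(c("lm", "rlm"), each = 2),
                abs.error = c(1.5, 2.0, 0.4, 0.6))

# per-model mean of abs.error, analogous to the split()/colMeans() pipeline
res <- aggregate(abs.error ~ model, data = x, FUN = mean)
res
#   model abs.error
# 1    lm      1.75
# 2   rlm      0.50
```

The formula interface does the grouping and the column selection in one step, at the cost of computing one summary column per call rather than several at once.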
Re: [R] Can R solve this optimization problem?
On Jan 7, 2008 1:32 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote: This can be discretized to a linear programming problem, so you can solve it with the lpSolve package. Suppose we have x0, x1, x2, ..., xn. Our objective (up to a multiple, which does not matter) is:

Maximize: x1 + ... + xn

subject to the constraints:

-1/n <= x1 - x0 <= 1/n
-1/n <= x2 - x1 <= 1/n
...
-1/n <= xn - x[n-1] <= 1/n

and x0 = xn = 0.

On Jan 6, 2008 7:05 PM, Paul Smith [EMAIL PROTECTED] wrote: Dear All, I am trying to solve the following maximization problem with R: find x(t) (continuous) that maximizes the integral of x(t) for t from 0 to 1, subject to the constraints dx/dt = u, |u| <= 1, x(0) = x(1) = 0. The analytical solution can be obtained easily, but I am trying to understand whether R is able to solve problems like this one numerically. I have tried to find an approximate solution through discretization of the objective function, but with no success so far.

That is clever, Gabor! But suppose that the objective function is the integral of sin(x(t)) for t from 0 to 1, under the same constraints. Can your method be adapted to get the solution? Paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
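Gabor's discretization can be written down directly with lpSolve. The following sketch is my own illustration of his scheme (not code from the thread); it solves the n = 20 version, for which the exact optimum of the continuous problem, the area 1/4 of the tent function x(t) = min(t, 1-t), is attained exactly:

```r
library(lpSolve)  # assumed installed from CRAN

n <- 20   # number of grid intervals; x0 and xn are fixed at 0
m <- n - 1  # free variables x1 .. x_{n-1} (lpSolve assumes them >= 0)

# D %*% x gives the successive differences x_i - x_{i-1}, with x0 = xn = 0
D <- matrix(0, n, m)
for (i in 1:n) {
  if (i <= m) D[i, i]     <-  1
  if (i >= 2) D[i, i - 1] <- -1
}

# |x_i - x_{i-1}| <= 1/n, written as two one-sided constraints per difference
sol <- lp(direction = "max",
          objective.in = rep(1, m),
          const.mat = rbind(D, -D),
          const.dir = rep("<=", 2 * n),
          const.rhs = rep(1 / n, 2 * n))

(1 / n) * sum(sol$solution)  # approximate integral; 0.25 here
```

The implicit nonnegativity of lpSolve variables is harmless here because the optimal x(t) is nonnegative anyway; for Paul's sin(x(t)) variant the objective is no longer linear in the x_i, so an LP solver alone does not suffice.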
[R] Installing R on ubuntu dapper
I followed the instructions at http://cran.r-project.org/bin/linux/ubuntu/README.html, but I'm getting the following error:

~: sudo apt-get install r-base
Reading package lists... Done
Building dependency tree... Done
Some packages could not be installed. This may mean that you have requested an impossible situation or, if you are using the unstable distribution, that some required packages have not yet been created or been moved out of Incoming. Since you only requested a single operation, it is extremely likely that the package is simply not installable and a bug report against that package should be filed. The following information may help to resolve the situation:

The following packages have unmet dependencies:
  r-base: Depends: r-base-core (>= 2.6.1-1dapper0) but it is not installable
          Depends: r-recommended (= 2.6.1-1dapper0) but it is not going to be installed
E: Broken packages

Any help would be much appreciated. Thanks, Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
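For context: "not installable" usually means apt cannot see the CRAN repository at all, i.e. the sources.list line from the README is missing or mistyped. A sketch of the README's setup steps (the master-site URL is shown; in practice you would substitute a nearby mirror):

```shell
# add the CRAN repository line for dapper, as described in the README
echo "deb http://cran.r-project.org/bin/linux/ubuntu dapper/" | sudo tee -a /etc/apt/sources.list
sudo apt-get update          # refresh the package index
sudo apt-get install r-base  # r-base-core 2.6.1-1dapper0 should now resolve
```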
[R] need help
Hi, I'm Roslina, a PhD student at the University of South Australia, in the School of Mathematics and Statistics. I used S-PLUS before and have now started using R. I used to analyse rainfall data using Julian dates. Is there a similar function you can suggest for me to use in R? Thank you so much for your attention and help. Here are some of my codes:

# dt1 - data
# mt1 - begin month, mt2 - end month, nn - year, n - no. of years
# da - days in the following month
# yr1 - year begin

define.date1 <- function(dt1, mt1, mt2, nn, da) {
  mt2 <- mt2 + 1
  start <- julian(mt1, 1, nn, origin = c(month = 1, day = 1, year = 1971)) + 1
  end <- julian(mt2, 1, nn, origin = c(month = 1, day = 1, year = 1971)) + da
  a <- dt1[start:end, ]
  # am <- as.matrix(a[, 5])
}

seq.date1 <- function(dt1, mt1, mt2, n, yr1, da) {
  yr1 <- yr1 - 1
  for (i in 1:n) {
    kp1 <- define.date1(dt1, mt1, mt2, yr1 + i, da)
    if (i == 1) kp2 <- kp1 else kp2 <- rbind(kp2, kp1)
  }
  kp2
}

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
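A note on the translation: julian(month, day, year, origin=) is the S-PLUS signature; base R's julian() instead takes a Date object and an origin Date. A minimal sketch of the same day-offset computation, keeping the 1971-01-01 origin from the code above:

```r
# days elapsed from the 1 January 1971 origin to 1 March 1971
start <- julian(as.Date("1971-03-01"), origin = as.Date("1971-01-01"))
as.numeric(start)  # 59 (31 days of January + 28 of February 1971)
```

An R version of define.date1() would build its start/end row indices from offsets computed this way, with as.Date() assembling the month/day/year arguments into a date first.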
[R] rainbow function
Hello, I'm using the rainbow function to generate 10 colors for a plot, and it is difficult to tell neighboring colors from each other. How can I make the colors more distinct? Thanks, Zhaoming [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rainbow function
Specify them exactly if there are only 10. On Jan 6, 2008 10:55 PM, Wang, Zhaoming (NIH/NCI) [C] [EMAIL PROTECTED] wrote: Hello, I'm using the rainbow function to generate 10 colors for a plot, and it is difficult to tell neighboring colors from each other. How can I make the colors more distinct? Thanks, Zhaoming [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
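One illustrative way to build such a hand-specified palette (my own sketch, not from the thread) is to vary brightness as well as hue with hsv(), so that adjacent colors differ in two channels instead of one:

```r
# ten hues spread around the color wheel, alternating full and reduced
# brightness so that neighboring colors are easier to tell apart
cols <- hsv(h = seq(0, 0.9, length.out = 10),
            s = 1,
            v = rep(c(1, 0.55), 5))
length(unique(cols))  # 10 distinct hex color strings
# plot(1:10, col = cols, pch = 19, cex = 3)  # quick visual check
```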
[R] compiling with mpicc for R
Dear R People: Hao Yu has a very nice package for MPI in R. I'm trying to experiment on my own and am looking at building a shared library with objects from mpicc. I tried to compile a .o object and then use R CMD SHLIB to build the shared library, but I'm getting errors with the MPI_Init function, which is the first MPI function in the subroutine. Any suggestions, please? (Or maybe I should just leave well enough alone.) Thanks, Erin Hodgess mailto: [EMAIL PROTECTED] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] numerical data frame
Dear All, I've successfully imported my synteny data into R using the scan command; the results are shown below. My major problem is how to combine the column names with the data (splt): I have tried cbind, but a warning message occurs, and I have realized that the splt data have only 5 columns instead of 6. Please help me with this! I want my data to be numerical, with proper columns and column names, with CS replaced by 1 and CSO by 0, and with all the punctuation and characters removed. Attached herewith is my original data. Your kind help is highly appreciated; thanks in advance. Cheers, Anisah

1) for col names

nms <- scan("C:/Users/user/Documents/cfa-1.txt", sep = "\t", nlines = 1, skip = 10, what = character(0))
Read 6 items
nms
[1] CS(O) id (number of marker/anchor)
[2] Location(s) on reference
[3] CS(O) size
[4] CS(O) density on reference chromosome
[5] Location(s) on tested
[6] Breakpoints CS(O) locations (denstiy of marker/anchor)

2) my data

x <- scan("C:/Users/user/Documents/cfa-1.txt", sep = "\n", skip = 12, what = character(0))
Read 21 items
splt <- strsplit(x, "\t")
splt
[[1]]
[1] CS 1 (73): cfa1: [ 3251712 - 24126920 ]
[3] 20875208 3
[5] hsa18: [ 132170848 - 50139168 ] ] 24126920, 24153560 [(8 )

[[2]]
[1] CS 2 (3): cfa1: [ 24153560 - 24265894 ]
[3] 112334 27
[5] hsa18: [ 50105060 - 49934572 ] ] 24265894, 24823786 [(7 )

[[3]]
[1] CSO 3.1 (6):
[2] cfa1: [ 24823786 - 27113036 ]
[3] 2289250
[4] 3
[5] hsa18: [ 48121156 - 46579500 ]- Decreasing order - ] 27113036, 27418228 [ (13)

[[4]]
[1] CSO 3.2 (4):
[2] cfa1: [ 27418228 - 27578150 ]
[3] 159922
[4] 25
[5] hsa18: [ 13872043 - 13208795 ]- Decreasing order - ] 27578150, 28055666 [(9 )

[[5]]
[1] CS 4 (4): cfa1: [ 28055666 - 28835230 ]
[3] 779564 5
[5] hsa6: [ 132311008 - 133132200 ] ] 28835230, 29482792 [(7 )

[[6]]
[1] CS 5 (46): cfa1: [ 29482792 - 40120672 ]
[3] 10637880 4
[5] hsa6: [ 133604208 - 146227152 ] ] 40120672, 40539680 [(8 )

[[7]]
[1] CS 6 (9): cfa1: [ 40539680 - 43339444 ]
[3] 2799764
3
[5] hsa6: [ 146390608 - 149867328 ] ] 43339444, 43390788 [(13 )

[[8]]
[1] CSO 7.1 (74):
[2] cfa1: [ 43390788 - 59714992 ]
[3] 16324204
[4] 5
[5] hsa6: [ 149929104 - 169714432 ]- Increasing order -] 59714992, 59864308 [ (15)

[[9]]
[1] CSO 7.2 (52):
[2] cfa1: [ 59864308 - 72417520 ]
[3] 12553212
[4] 4
[5] hsa6: [ 116707976 - 131508152 ]- Increasing order -
[6] ] 72417520, 73256040 [(7 )

[[10]]
[1] CSO 8.1 (12):
[2] cfa1: [ 73256040 - 75192808 ]
[3] 1936768
[4] 6
[5] hsa9: [ 98441680 - 96360824 ]- Decreasing order -
[6] ] 75192808, 75272528 [
[7] (6 )

[[11]]
[1] CSO 8.2 (56):
[2] cfa1: [ 75272528 - 91881664 ]
[3] 16609136
[4] 3
[5] hsa9: [ 89530256 - 70341312 ]- Decreasing order -
[6] ] 91881664, 92281272 [
[7] (5 )

[[12]]
[1] CSO 8.3 (22):
[2] cfa1: [ 92281272 - 96913624 ]
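The cbind warning comes from the ragged rows: some elements of splt have 5 fields, others 6 or more. One way to handle this (a sketch with made-up rows, since the real cfa-1.txt is not available) is to pad every row with NA up to the widest row before binding, then recode the CS/CSO flag as 1/0:

```r
# two made-up rows of unequal length standing in for elements of splt
splt <- list(c("CS",  "1",   "cfa1: [ 3251712 - 24126920 ]",  "20875208", "3"),
             c("CSO", "3.1", "cfa1: [ 24823786 - 27113036 ]", "2289250",  "3", "hsa18"))

# pad every row to the maximum width with NA, then bind into a matrix
width  <- max(lengths(splt))
padded <- t(vapply(splt,
                   function(r) c(r, rep(NA_character_, width - length(r))),
                   character(width)))

# recode the CS/CSO flag in column 1 as 1/0
type <- ifelse(padded[, 1] == "CS", 1, 0)
```

From there, gsub() can strip the remaining punctuation from the coordinate columns before converting them with as.numeric().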