[R] - detecting outliers
Hello all,

I am estimating parameters for regression functions on experimental data (a Rogers type II functional response). I would like to know which points of my dataset are outliers. What is the best method to do this with R? I found a method via R help, but would like to know if there are better methods for my purpose.

Here is the script I use now:

    library(mvoutlier)
    dat <- read.delim("C:/data.txt")
    uni.plot(dat)

My data look like the following (copied into a txt file). N0 is the initial number of eggs fed to the predator; FR is the number of eggs eaten by the predator during 24 h.

    N0  FR
    37  30
    27  15
    36  14
    37  13
    45   8
    25   0
    47  20
    34   6
    25   8
    21   7
    24  24
    34  17
    23  10
    29   5
    38  38
    24  24
    20  17
    14   8
    18  15
    15  10
    26   5
    33   5
    22  21
    38   3
    22  20
    23  19
    20   6
    20   4
    21  18
    25   5
    13  13
     9   8
     8   4
     7   7
     8   5
    11   9

Kind regards,
Joachim

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Belgium
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating frequency table using conditions in a for-loop
Hi,

Where did you attach the file? You can also describe your problem here.

Ozgur
Re: [R] conditional statement to replace values in dataframe with NA
On Jun 7, 2012, at 07:28, Bert Gunter wrote:

> Actually, recycling makes the rep(NA,2) business unnecessary. Simply:
>
>     dat1[dat1$x==1 & dat1$y==1, 1:2] <- NA
>     ## or
>     with(dat1, {dat1[x==1 & y==1, 1:2] <- NA; dat1})
>
> will do it.

Or, use the assignment form of is.na:

    cond <- with(dat1, x==1 & y==1)
    is.na(dat1$x) <- cond
    is.na(dat1$y) <- cond

This is said to be somewhat safer if you are modifying factors (it avoids potential confusion if NA is a level).

-pd

> -- Bert
>
> On Wed, Jun 6, 2012 at 10:21 PM, Bert Gunter <bgun...@gene.com> wrote:
>> Have you read "An Intro to R"? If not, please do so before posting further. The way you are going about things makes me think you haven't, but ...
>>
>> This **is** a slightly tricky application of indexing, if I understand you correctly. Here are two essentially identical ways to do it, but the second is a little trickier.
>>
>>     ## First
>>     > dat1[dat1$x==1 & dat1$y==1, 1:2] <- rep(NA,2)
>>     > dat1
>>          x    y fac
>>     1 <NA> <NA>   A
>>     2    1    2   B
>>     3    1    3   A
>>     4 <NA> <NA>   C
>>     5    1    2   A
>>     6    1    3   C
>>
>>     ## Slightly trickier version using with() to avoid explicit
>>     ## extraction from the data frame
>>     ## Reconstitute dat1
>>     > dat1
>>       x y fac
>>     1 1 1   C
>>     2 1 2   C
>>     3 1 3   B
>>     4 1 1   B
>>     5 1 2   C
>>     6 1 3   B
>>     > dat1 <- with(dat1, {dat1[x==1 & y==1, 1:2] <- rep(NA,2); dat1})
>>     > dat1
>>          x    y fac
>>     1 <NA> <NA>   B
>>     2    1    2   A
>>     3    1    3   A
>>     4 <NA> <NA>   C
>>     5    1    2   A
>>     6    1    3   B
>>
>>     ## ?with for explanation
>>
>> -- Bert
>>
>> On Wed, Jun 6, 2012 at 8:58 PM, Daisy Englert Duursma <daisy.duur...@gmail.com> wrote:
>>> Hello and thanks for helping.
>>>
>>>     # some data
>>>     L3 <- LETTERS[1:3]
>>>     dat1 <- data.frame(cbind(x=1, y=rep(1:3,2), fac=sample(L3, 6, replace=TRUE)))
>>>
>>>     # When x==1 and y==1 I want to replace the 1 values with NA
>>>     # I can select the rows I want:
>>>     dat2 <- subset(dat1, x==1 & y==1)
>>>     # replace the 1 with NA
>>>     dat2$x <- rep(NA, nrow(dat2))
>>>     dat2$y <- rep(NA, nrow(dat2))
>>>
>>>     # select the other rows and rbind everything back together
>>>     # This is where I get stuck
>>>
>>>     # The end dataframe will look something like:
>>>      x  y fac
>>>     NA NA   B
>>>     NA NA   A
>>>      1  2   C
>>>      1  3   C
>>>      1  2   C
>>>      1  3   A
>>>
>>> Is there a better way to do this where I do not need to subset, perhaps using lapply?
>>>
>>> Thanks,
>>> Daisy
>>>
>>> --
>>> Daisy Englert Duursma
>>> Department of Biological Sciences
>>> Room E8C156
>>> Macquarie University, North Ryde, NSW 2109
>>> Australia
>>> Tel +61 2 9850 9256
>>
>> --
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>> Internal Contact Info: Phone: 467-7374
>> Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
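A minimal, self-contained sketch of the two approaches discussed in this thread: logical indexing with a recycled NA, and the assignment form of is.na(). Building the data frame directly with data.frame() instead of cbind(), the call to set.seed(), and the names dat_a, dat_b and cond are assumptions of this sketch, not part of the original posts.

```r
set.seed(1)  # only so that the 'fac' column is reproducible
L3 <- LETTERS[1:3]
dat1 <- data.frame(x = 1, y = rep(1:3, 2),
                   fac = sample(L3, 6, replace = TRUE))

# One logical index that marks the rows to blank out
cond <- dat1$x == 1 & dat1$y == 1

# Approach 1: a single NA is recycled over both selected columns
dat_a <- dat1
dat_a[cond, c("x", "y")] <- NA

# Approach 2: assignment form of is.na(), one column at a time
dat_b <- dat1
is.na(dat_b$x) <- cond
is.na(dat_b$y) <- cond

print(dat_a)
print(all.equal(dat_a, dat_b))
```

Both variants modify the data frame in place, so no subset() and rbind() round trip is needed, and the row order is preserved.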
Re: [R] How to find best parameter values using deSolve n optim() ?
On 6/6/2012 3:50 PM, mhimanshu wrote:
> Hello Thomas,
>
> This code seems to be fine and it is now working well. I read about the FME package, but I have one doubt. The data set given in the paper shows nice kinetics of viral growth, so my question is: what if there is a sudden increase in viral growth after some interval, say a bimodal growth curve? How does it fit a bimodal growth curve? I tried with FME but I am not getting the desired results. Maybe you can explain this to me a little; I would be really grateful. :)
>
> Thanks a lot,
> Himanshu

Hi, and thanks for the feedback.

Regarding your problem with a bimodal growth curve, I am not completely sure what you mean. However, I suspect that the failure to fit bimodal behaviour is not caused by the parameter fitting with FME; instead, the underlying ODE model would need to be extended.

Thomas
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
Thanks, ilai.

Sorry, I mixed things up a little: I was thinking of the medians of each panel, but instead I was trying to plot the medians for each variety (what an awful chart, indeed!).

Thanks for your solution (medians for each panel); it works perfectly, as usual.

Cheers,
Max
Re: [R] - detecting outliers
Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be> writes:

> Hello all, I am estimating parameters for regression functions on experimental data. Functional response of Rogers type II. I would like to know which points of my dataset are outliers. What is the best method to do this with R?

The best method for detecting outliers really depends on the motivation or purpose. Your data look noisy, but by eye nothing really jumps out. Looking at a histogram (hist()) and a Q-Q plot (qqnorm()) of the residuals of the fit, the distribution looks slightly skewed, but no points fall very far outside a normal distribution. (Normality is not a necessity for making inferences from an nls fit, but it helps a lot.)
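The residual check described above can be sketched as follows. This is only an illustration under stated assumptions: the reply does not show which model was fitted, so the sketch uses Holling's type II disc equation as a simple stand-in for the Rogers type II model, and the starting values (a = 1, h = 0.05) are guesses made for this sketch. The data are the ones posted at the start of the thread.

```r
# Data from the original post (N0 = eggs offered, FR = eggs eaten in 24 h)
dat <- data.frame(
  N0 = c(37, 27, 36, 37, 45, 25, 47, 34, 25, 21, 24, 34, 23, 29, 38, 24, 20, 14,
         18, 15, 26, 33, 22, 38, 22, 23, 20, 20, 21, 25, 13, 9, 8, 7, 8, 11),
  FR = c(30, 15, 14, 13, 8, 0, 20, 6, 8, 7, 24, 17, 10, 5, 38, 24, 17, 8,
         15, 10, 5, 5, 21, 3, 20, 19, 6, 4, 18, 5, 13, 8, 4, 7, 5, 9)
)

# ASSUMPTION: Holling type II disc equation as a stand-in model;
# a = attack rate, h = handling time, starting values are guesses.
fit <- nls(FR ~ a * N0 / (1 + a * h * N0), data = dat,
           start = list(a = 1, h = 0.05),
           control = nls.control(warnOnly = TRUE))

res <- residuals(fit)

# Visual checks mentioned in the reply
hist(res, main = "Residuals of the fit", xlab = "Residual")
qqnorm(res)
qqline(res)
```

Points that sit far from the reference line in the Q-Q plot are the candidates worth a closer look; whether they count as outliers still depends on the purpose of the analysis.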
Re: [R] factor coercion with read.csv or read.table
Hello,

Try the option stringsAsFactors; see ?read.csv or ?read.table.
As for the thousands separator, see ?format.

Hope this helps,
Rui Barradas

On 07-06-2012 03:09, eric wrote:
> How do I fix this error? I tried coercion to a vector but that didn't work.
>
>     > msci <- read.csv("..MSCIexUS.csv", header=TRUE)
>     > head(msci)
>               Date  index
>     1 Dec 31, 1969    100
>     2 Jan 30, 1970 97.655
>     3 Feb 27, 1970 96.154
>     4 Mar 31, 1970 95.857
>     5 Apr 30, 1970 85.564
>     6 May 29, 1970 79.005
>     > str(msci)
>     'data.frame': 510 obs. of 2 variables:
>      $ Date : Factor w/ 510 levels "Apr 28, 1972",..: 98 178 134 311 13 342 268 228 55 481 ...
>      $ index: Factor w/ 510 levels "100","1,000.302",..: 1 499 493 488 444 412 418 434 441 448 ...
>     > msci$Date <- as.Date(msci$Date, dateFormat='%b %d, %Y')
>     Error in charToDate(x) :
>       character string is not in a standard unambiguous format
Re: [R] factor coercion with read.csv or read.table
On Jun 7, 2012, at 09:25, Rui Barradas wrote:

> Hello,
>
> Try the option stringsAsFactors; see ?read.csv or ?read.table.
> As for the thousands separator, see ?format.

help(as.Date) should also help. (Hint: there's no dateFormat= argument.)

> Hope this helps,
> Rui Barradas

--
Peter Dalgaard
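Putting the hints from this thread together, a minimal sketch might look like the following. Several things here are assumptions of the sketch rather than part of the original posts: the inline CSV text stands in for the poster's file, the third row is made up to show a value with a thousands separator, gsub() is one possible way to strip that separator, and the LC_TIME locale is set to "C" so that the English month abbreviations match %b.

```r
# Month abbreviations (%b) are locale dependent; force an English-style locale
Sys.setlocale("LC_TIME", "C")

# Stand-in for the real file; the third row is hypothetical
csv_text <- 'Date,index
"Dec 31, 1969","100"
"Jan 30, 1970","97.655"
"Mar 31, 1998","1,000.302"'

# Keep strings as character vectors instead of factors
msci <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# The argument is called 'format', not 'dateFormat'
msci$Date <- as.Date(msci$Date, format = "%b %d, %Y")

# Remove the thousands separator before converting to numeric
msci$index <- as.numeric(gsub(",", "", msci$index, fixed = TRUE))

str(msci)
```

If the columns have already been read in as factors, converting with as.character() first avoids getting the internal level codes instead of the values.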
Re: [R] - detecting outliers
Hello,

Had you looked further, you would have seen R-help discussions on what an outlier is. Almost unanimously, it is regarded as an ill-defined concept.

In your data, the predators rarely eat all the eggs they are given; the most extreme case is the one where 38 eggs were given and all 38 were eaten. You can detect this in R with

    boxplot.stats(d$FR)

(where d is your data frame), or with the return value of boxplot. See ?boxplot.

Hope this helps,
Rui Barradas

On 07-06-2012 07:24, Joachim Audenaert wrote:
> Hello all, I am estimating parameters for regression functions on experimental data. Functional response of Rogers type II. I would like to know which points of my dataset are outliers. What is the best method to do this with R? I found a method via R help, but would like to know if there are better methods for my purpose.
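As a small illustration of the boxplot.stats() suggestion, here is a sketch that uses the data from the original post. The object names dat and out_values are choices made for this sketch.

```r
# Data from the original post (N0 = eggs offered, FR = eggs eaten in 24 h)
dat <- data.frame(
  N0 = c(37, 27, 36, 37, 45, 25, 47, 34, 25, 21, 24, 34, 23, 29, 38, 24, 20, 14,
         18, 15, 26, 33, 22, 38, 22, 23, 20, 20, 21, 25, 13, 9, 8, 7, 8, 11),
  FR = c(30, 15, 14, 13, 8, 0, 20, 6, 8, 7, 24, 17, 10, 5, 38, 24, 17, 8,
         15, 10, 5, 5, 21, 3, 20, 19, 6, 4, 18, 5, 13, 8, 4, 7, 5, 9)
)

# Values of FR beyond the whiskers (default: 1.5 times the hinge spread)
out_values <- boxplot.stats(dat$FR)$out
print(out_values)

# The corresponding rows of the data frame
print(dat[dat$FR %in% out_values, ])

# The same information is returned invisibly by boxplot()
b <- boxplot(dat$FR, plot = FALSE)
print(b$out)
```

With these data only the value 38 lies beyond the whiskers. Note that this rule looks at FR on its own and ignores N0, so it says nothing about how well a point agrees with a fitted functional response.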
[R] Bill Venables Workshop
Bill Venables talks R :: Augsburg University, Germany :: 2-3 July 2012

Bill Venables will give a two-day R workshop in Augsburg on 2 and 3 July 2012. It is an expanded version of the course that he has been invited to give at this year's useR! meeting in Nashville.

Details: www.math.uni-augsburg.de/termin/R-workshop.html

Organised by the Department of Computer-Oriented Statistics and Data Analysis, University of Augsburg

Antony Unwin
un...@math.uni-augsburg.de
Re: [R] Creating frequency table using conditions in a for-loop
On 06/07/2012 08:08 AM, Faz Jones wrote:
> Hi, I have attached a Word document to explain the problem I am having creating a for-loop in R with conditions to create a frequency table. I am new to R so any help would be greatly appreciated.

Hi Jones,

Unfortunately, you might as well have attached a popsicle to a camel and sent it across the Sahara. Just to show you that this help list really is helpful, I'm going to try to read your mind. You want to create a frequency table by counting the values in some set of observations. Maybe it's homework. Heck, anybody who tries to send a Word document to the R help list is more to be pitied than censured.

    # get a bunch of observations
    mydata <- sample(LETTERS[1:6], 50, TRUE)
    # let's pretend that we don't know how many different letters there are
    allletters <- sort(unique(mydata))
    # now create something to hold your answer
    myanswer <- rep(0, length(allletters))
    # give it some names, you'll need them later
    names(myanswer) <- allletters
    # okay, here we go round the loop
    for(i in 1:length(mydata)) {
      answerindex <- which(allletters %in% mydata[i])
      myanswer[answerindex] <- myanswer[answerindex] + 1
    }
    print(myanswer)
    # the built-in way to get the same counts
    table(mydata)

Feel better now?

Jim
[R] [R-pkgs] New version of the TraMineR package (1.8-2)
Hi all,

It is our pleasure to announce that the new version 1.8-2 of TraMineR has been released on CRAN.

Alongside fixes for a series of small bugs and some speed improvements, the main changes are:

- a new information display when creating a state sequence object, which permits better checking of the correspondence between alphabet, state names and labels;
- representative sequences can now also account for case weights;
- the group argument of seqplot() can now be a list of variables;
- new from.start and from.end values for the sortv argument of seqiplot;
- the support of event subsequences can now be determined by any of Joshi's 5 counting methods.

See http://cran.r-project.org/web/packages/TraMineR/NEWS for a complete list of changes.

Additional functions currently in test may be available in our development version and/or the TraMineRextras package on https://r-forge.r-project.org/R/?group_id=743.

Best,
Gilbert, Alexis, Matthias and Nicolas
Re: [R] conditional statement to replace values in dataframe with NA
Hi,

Try this:

    # L3 <- LETTERS[1:3], as in the original post
    > dat1 <- data.frame(x=rep(1,6), y=rep(1:3,2), fac=sample(L3, 6, replace=TRUE))
    > dat1
      x y fac
    1 1 1   C
    2 1 2   B
    3 1 3   B
    4 1 1   A
    5 1 2   B
    6 1 3   B
    > dat1[dat1$x==1 & dat1$y==1, 1:2] <- NA
    > dat1
       x  y fac
    1 NA NA   C
    2  1  2   B
    3  1  3   B
    4 NA NA   A
    5  1  2   B
    6  1  3   B

A.K.
Re: [R] - detecting outliers
Hi,

I believe a better way is to first learn the appropriate statistical methods for detecting outliers and then to search for the related functions in R.

Ozgur
[R] table function in a matrix
Hi,

I am trying to get a summary of the counts of different variables for each sample in a matrix of the form m below, to generate an output as shown. (Ultimately I want to generate a stacked barchart for each sample.) I am only able to get the table function to work on one sample (column) at a time. Any help appreciated.

Thank you,
Sarah

    > a <- c("A", "A", "B", "B", "C", "A", "C", "D", "A", "D", "C", "A", "D", "C", "A", "C")
    > m <- matrix(a, nrow=4)
    > m
         [,1] [,2] [,3] [,4]
    [1,] "A"  "C"  "A"  "D"
    [2,] "A"  "A"  "D"  "C"
    [3,] "B"  "C"  "C"  "A"
    [4,] "B"  "D"  "A"  "C"

Output needed (so that I can use the barplot(t(output)) function):

         A B C D
    [,1] 2 2 0 0
    [,2] 1 0 2 1
    [,3] 2 0 1 1
    [,4] 1 0 2 1
Re: [R] How to Read command line parameters in Sweave?
Hi,

I followed the link you provided, but I am getting an error.

    R -e "Sweave('MyReport.Rnw')" --args PatientId=1

I have commandArgs(TRUE) in my Rnw file, followed by print(PatientId), which gives:

    Error: chunk 2
    Error in print(PatientId) : object 'PatientId' not found
    Execution halted

Any working example would help me a lot.
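One possible explanation for the error above is that commandArgs(TRUE) only returns the trailing arguments as character strings (here "PatientId=1"); it does not create a variable called PatientId. A minimal sketch of parsing such name=value arguments follows. The vector args is a stand-in for the value of commandArgs(trailingOnly = TRUE), and the names parts and params are made up for this sketch.

```r
# Stand-in for commandArgs(trailingOnly = TRUE); with
#   R -e "Sweave('MyReport.Rnw')" --args PatientId=1
# that call would return the character vector below.
args <- c("PatientId=1")

# Split each "name=value" string into its two parts
parts <- strsplit(args, "=", fixed = TRUE)

# Build a named list: names from the left-hand side, values from the right
params <- setNames(lapply(parts, function(p) p[2]),
                   vapply(parts, function(p) p[1], character(1)))

# Values arrive as text, so convert explicitly
PatientId <- as.integer(params[["PatientId"]])
print(PatientId)
```

Inside the Rnw file, the first code chunk would use the real commandArgs(trailingOnly = TRUE) call in place of the hard-coded args vector.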
[R] How to build a large identity matrix faster?
Hello,

I am trying to build a large identity matrix using diag(). The size is around 23000, and I've tried diag(23000), which took a long time. Since I have to use this operation several times in my program, the running time is too long to be tolerable. Is there any alternative to diag(N)?

Thanks.

Cheers,
yct
[R] Use fitted Garch models in linear regression
Hi,

I am analysing a data set of daily S&P 500 Index returns, and my goal is to work out a relationship with a sentiment indicator (daily data). For this purpose I fitted a model to each variable. I found that a GARCH(1,1) suits the differenced closing price of the SPX best, and a GARCH(2,2) the SPX returns. The sentiment indicator follows an ARMA(2,2) process.

But now I am stuck. How do I use these fitted models to perform a linear regression on the variables? Without any correction, I have a model like

    model = lm(spxclose ~ spxsentiment)

in mind. But this simple method does not work with garch objects. The only two alternatives I tried were:

A.one: Find the relationships by evaluating the cross-correlograms:

    par(mfrow=c(2,2))
    both <- ts.union(garchdspxclose$resid, arimaspxpcr$resid)
    acf(both, na.action = na.pass)
    pacf(both, na.action = na.pass)

A.two: A paper mentions correcting for autocorrelation and heteroskedasticity with NeweyWest:

    result <- dynlm(spxclose ~ lag(spxclose,1) + lag(spxpcr,1) + lag(vixpcr,1))
    NeweyWest(result)
    coeftest(result, vcov = NeweyWest)

Does this method also correct for ARCH effects?

Are VAR models or the cointegration approach of Granger and Engle appropriate tools for analysing daily exchange data when comparing returns and sentiment indicators?

Thank you for your help,
marco
Re: [R] How do I obtain the current active path of a function that's being called?
On 12-06-05 4:58 PM, Michael wrote:
> Hi all,
>
> How do I obtain the current active path of a function that's being called? That's to say, I have several source files and they all contain a definition of function A. I would like to figure out which function A, and from which file, is the one that's being called and is currently active.

You've had lots of good suggestions so far. One more possibility: getSrcFilename and the related functions in the same help topic will usually tell you the filename and other location information for functions that you source().

Duncan Murdoch
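A minimal sketch of the getSrcFilename() suggestion. The temporary file, the environment env and the body of A are made up for this illustration. Passing keep.source = TRUE to source() is an assumption made so that the source reference is recorded even in non-interactive sessions, where the keep.source option is off by default.

```r
# Write a small script that defines A into a temporary file
tmp <- tempfile(fileext = ".R")
writeLines("A <- function() 'defined in the temp file'", tmp)

# Source it into its own environment, keeping source references
env <- new.env()
source(tmp, local = env, keep.source = TRUE)

# File name only, then file name with its path
print(getSrcFilename(env$A))
print(getSrcFilename(env$A, full.names = TRUE))

# Related helper from the same help topic: line where the definition starts
print(getSrcLocation(env$A, which = "line"))
```

If several files define A, the last one sourced into a given environment wins, and the calls above report which file that was.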
Re: [R] Rare event in logistic regression
Hi,

Please see the discussion at http://r.789695.n4.nabble.com/regression-methods-for-rare-events-td4632332.html

Ozgur
Re: [R] How to build a large identity matrix faster?
Hello,

To my great surprise, on my system (Windows 7, R 2.15.0, 32 bits), an R version is faster!

    Rdiag <- function(n){
      m <- matrix(0, nrow=n, ncol=n)
      # two-column index matrix (i, i) addresses the diagonal cells
      m[matrix(rep(seq_len(n), 2), ncol=2)] <- 1
      m
    }

    Rdiag(4)

    n <- 5e3
    t1 <- system.time(d1 <- diag(n))
    t2 <- system.time(d2 <- Rdiag(n))
    all.equal(d1, d2)
    rbind(diag=t1, Rdiag=t2, ratio=t1/t2)

Anyway, why don't you create it once, save a copy and use it many times?

Hope this helps,
Rui Barradas

On 07-06-2012 08:55, Ceci Tam wrote:
> Hello, I am trying to build a large size identity matrix using diag(). The size is around 23000 and I've tried diag(23000), that took a long time. Since I have to use this operation several times in my program, the running time is too long to be tolerable. Are there any alternative for diag(N)? Thanks
>
> Cheers,
> yct
[R] graphic problems with special characters
Hi,

I am currently working on an automated routine to import XML files, run some analyses on them and create graphs as JPEG files. The files are in different languages: French, English, Danish and even Chinese. At the moment I'm focusing on the European languages. I import the files using the XML package and specify encoding="UTF-8", which seems to work pretty well: when I write the text in the console, the Danish characters æ å ø are printed correctly.

The problem arises when I write these characters in graphics generated with jpeg(): then the names of the files and the text and titles of the graphics are not written correctly.

I know nothing about text encoding in R. I tried my best to find information on the internet that I could understand and reuse to fix my problem, but so far without success.

My configuration is the following:
R version 2.13.1
Microsoft Windows XP Professional version 2002 with Service Pack 3

Best regards,
Guillaume Le Ray
Re: [R] table function in a matrix
On Wed, Jun 06, 2012 at 11:02:46PM -0700, Sarah Auburn wrote:
> Hi, I am trying to get a summary of the counts of different variables for each sample in a matrix of the form m below to generate an output as shown. (Ultimately I want to generate a stacked barchart for each sample). I am only able to get the table function to work on one sample (column) at a time. Any help appreciated.

Hi.

Try the following.

    a <- c("A", "A", "B", "B", "C", "A", "C", "D", "A", "D", "C", "A", "D", "C", "A", "C")
    m <- matrix(a, nrow=4)
    tab <- function(x) { table(factor(x, levels=LETTERS[1:4])) }
    t(apply(m, 2, tab))

         A B C D
    [1,] 2 2 0 0
    [2,] 1 0 2 1
    [3,] 2 0 1 1
    [4,] 1 0 2 1

Factors are used to ensure that all the tables have the same length, even if some letters are missing.

Hope this helps.

Petr Savicky.
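To connect the counting step with the stacked barchart mentioned in the question, here is a short sketch. The object name counts, the row labels "sample1" to "sample4" and the axis labels are choices made for this illustration only.

```r
a <- c("A", "A", "B", "B", "C", "A", "C", "D",
       "A", "D", "C", "A", "D", "C", "A", "C")
m <- matrix(a, nrow = 4)

# Fixed factor levels guarantee one count per letter, including zeros
tab <- function(x) table(factor(x, levels = LETTERS[1:4]))

# One row per sample (column of m), one column per letter
counts <- t(apply(m, 2, tab))
rownames(counts) <- paste0("sample", 1:4)
print(counts)

# barplot() stacks the rows of its input, so transpose back:
# one bar per sample, one stacked segment per letter
barplot(t(counts), legend.text = TRUE,
        xlab = "Sample", ylab = "Count")
```

Without the fixed levels, table() would return vectors of different lengths for different columns, and apply() would return a list instead of a matrix.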
Re: [R] Error in installing packages
On 07.06.2012 00:09, Martin Morgan wrote:
> On 06/06/2012 01:41 PM, Andreia Leite wrote:
>> Yes, it's Windows (Vista). It's not a specific package. I've tried more than one CRAN mirror and the message is always the same (the list with the packages simply doesn't appear). Which proxy settings should I verify specifically (sorry, I don't know a lot about informatics)? I've installed a few packages before this and I never had such a message.
>
> With a brand-new 2.9.2 :-)

Thank you, Martin. We did not expect the web server change for CRAN extras, hence my suspicion was wrong. The changes to CRAN extras have now been reverted, thanks to Brian Ripley and his crew in Oxford.

This also shows that it makes sense to update R from time to time.

Best,
Uwe

> I did this
>
>     > utils:::menuInstallPkgs()
>     --- Please select a CRAN mirror for use in this session ---
>     Error in read.dcf(file = tmpf) :
>       Line starting '<!DOCTYPE html PUBLI ...' is malformed!
>     > traceback()
>     4: read.dcf(file = tmpf)
>     3: available.packages(contriburl = contriburl, method = method)
>     2: install.packages(NULL, .libPaths()[1L], dependencies = NA, type = type)
>     1: utils:::menuInstallPkgs()
>     > trace(available.packages, tracer=quote(print(contriburl)))
>     Tracing function "available.packages" in package "utils"
>     [1] "available.packages"
>     > utils:::menuInstallPkgs()
>     Tracing available.packages(contriburl = contriburl, method = method) on entry
>     [1] "http://cran.cs.wwu.edu/bin/windows/contrib/2.9"
>     [2] "http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.9"
>     Error in read.dcf(file = tmpf) :
>       Line starting '<!DOCTYPE html PUBLI ...' is malformed!
>
> and that second URL is no longer valid (it is redirected to the institution home page). Perhaps
>
>     > getOption("repos")
>                                     CRAN                            CRANextra
>                 "http://cran.cs.wwu.edu" "http://www.stats.ox.ac.uk/pub/RWin"
>     > options(repos=getOption("repos")["CRAN"])
>     > utils:::menuInstallPkgs()
>
> A similar problem came up on the Bioconductor mailing list, where our installer was pointing Mac users to the non-existent http://www.stats.ox.ac.uk/pub/RWin/bin/macosx/leopard/contrib/2.15
>
> Martin
>
>> Thanks
>>
>> On Wed, Jun 6, 2012 at 6:11 PM, Uwe Ligges <lig...@statistik.tu-dortmund.de> wrote:
>>> On 06.06.2012 17:14, Andreia Leite wrote:
>>>> Dear list,
>>>>
>>>> I'm trying to install a package, but every time I select the option from the menu this error message appears:
>>>>
>>>>     > utils:::menuInstallPkgs()
>>>>     Error in read.dcf(file = tmpf) :
>>>>       Line starting '<!DOCTYPE html PUBLI ...' is malformed!
>>>>
>>>> Do you have any clue why this is happening? I'm using an older version (2.9.2) but it always worked perfectly!
>>>>
>>>> Best regards,
>>>> Andreia Leite
>>>
>>> Have you checked proxy settings? Is this Windows? Have you checked if the (also unstated) CRAN mirror you are using works correctly and delivers that part of the repository?
>>>
>>> Uwe Ligges
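The workaround quoted above (dropping the CRANextra entry from the repos option) can be sketched without touching the network. The two URLs are the ones shown in the trace output of this thread; the object names repos_seen, repos_fixed and old are made up for this sketch, and the option is restored at the end so that the session is left unchanged.

```r
# Repository settings as reported in the thread
repos_seen <- c(CRAN      = "http://cran.cs.wwu.edu",
                CRANextra = "http://www.stats.ox.ac.uk/pub/RWin")

# Keep only the CRAN entry; indexing by name keeps the name attached
repos_fixed <- repos_seen["CRAN"]

# options() returns the previous values, which makes restoring easy
old <- options(repos = repos_fixed)
print(getOption("repos"))

# ... install.packages() or the menu would now only contact CRAN ...

# Restore the previous setting
options(old)
```

In a real session the first step would be getOption("repos") rather than a hard-coded vector, exactly as in the quoted message.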
Re: [R] How to build a large identity matrix faster?
On 07/06/2012 10:27, Rui Barradas wrote:
> Hello,
>
> To my great surprise, on my system, Windows 7, R 2.15.0, 32 bits, an R version is faster!

Faster than what? diag() is written entirely in R, just more general than yours, and so one would expect it to be slower.

I have to say that we don't see a fast identity as a priority, as it can almost always be eliminated from calculations, and for large matrices one would want to use a sparse representation such as package Matrix.

>     Rdiag <- function(n){
>         m <- matrix(0, nrow=n, ncol=n)
>         m[matrix(rep(seq_len(n), 2), ncol=2)] <- 1
>         m
>     }
>     Rdiag(4)
>     n <- 5e3
>     t1 <- system.time(d1 <- diag(n))
>     t2 <- system.time(d2 <- Rdiag(n))
>     all.equal(d1, d2)
>     rbind(diag=t1, Rdiag=t2, ratio=t1/t2)
>
> Anyway, why don't you create it once, save a copy and use it many times?
>
> Hope this helps,
> Rui Barradas
>
> On 07-06-2012 08:55, Ceci Tam wrote:
>> Hello, I am trying to build a large identity matrix using diag(). The size is around 23000 and I've tried diag(23000), which took a long time. Since I have to use this operation several times in my program, the running time is too long to be tolerable. Is there any alternative to diag(N)?
>>
>> Thanks
>> Cheers, yct

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
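The sparse route mentioned above can be sketched with Diagonal() from the Matrix package. This is only an illustration under the assumption that Matrix is installed (it ships with standard R distributions as a recommended package); the test matrix `x` is made up. For scale, a dense 23000 x 23000 double matrix needs 23000^2 * 8 bytes, roughly 4.2 GB, which may explain much of the slowness reported.

```r
library(Matrix)   # assumed to be installed

n <- 23000

# Diagonal(n) builds an n x n identity without storing the off-diagonal zeros.
I_sparse <- Diagonal(n)

# A small made-up matrix to multiply; the identity should leave it unchanged.
x <- matrix(as.numeric(seq_len(2 * n)), nrow = n)
y <- I_sparse %*% x

dim(I_sparse)
print(object.size(I_sparse), units = "Kb")
isTRUE(all.equal(as.matrix(y), x))
```

Whether this helps depends on how the identity is used later: code that coerces the result back to a dense matrix would lose the benefit.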
Re: [R] [r] par and complex graph
Oh, thank you Carlos! I wasted a lot of time formatting my xyplot in PowerPoint. Did you use a similar trick for ternaryplot (vcd)?

Many thanks.
Regards,
Francesco

Date: Wed, 6 Jun 2012 17:08:39 +0200
Subject: Re: [R] [r] par and complex graph
From: c...@qualityexcellence.es
To: nutini.france...@gmail.com

Hi,

Sorry, layout is a parameter you should use when plotting several charts of the same nature. If you want to combine different lattice charts you should use print(), which has a method for trellis objects. Check the help for print.trellis, or consider this example:

    p11 <- histogram( ~ height | voice.part, data = singer, xlab = "Height")
    p12 <- densityplot( ~ height | voice.part, data = singer, xlab = "Height")
    p2 <- histogram( ~ height, data = singer, xlab = "Height")
    ## simple positioning by split
    print(p11, split=c(1,1,1,2), more=TRUE)
    print(p2, split=c(1,2,1,2))
    ## Combining split and position:
    print(p11, position = c(0,0,.75,.75), split=c(1,1,1,2), more=TRUE)
    print(p12, position = c(0,0,.75,.75), split=c(1,2,1,2), more=TRUE)
    print(p2, position = c(.5,.75,1,1), more=FALSE)

Regards,
Carlos Ortega
www.qualityexcellence.es

2012/6/6 Carlos Ortega c...@qualityexcellence.es
> Hi Francesco,
>
> The parameter in the lattice package that you can use to arrange several plots on the same page is layout:
>
>     xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
>            data = iris, scales = "free", layout = c(2, 2),
>            auto.key = list(x = .6, y = .7, corner = c(0, 0)))
>
> Regards,
> Carlos Ortega
>
> 2012/6/6 Francesco Nutini nutini.france...@gmail.com
>> Thank you Brian! So that's why I sometimes can't use par(). Now I'm using ternaryplot from [vcd], so I have to read the vcd help to look for a function similar to par(). Many thanks.
>> Francesco
>>
>> Date: Tue, 5 Jun 2012 19:01:25 +0100
>> From: rip...@stats.ox.ac.uk
>> To: nutini.france...@gmail.com
>> CC: r-help@r-project.org
>> Subject: Re: [R] [r] par and complex graph
>>
>> On 05/06/2012 11:17, Francesco Nutini wrote:
>>> Dear R-Users, I'd like some tips about printing graphs. I use the command par to print several graphs in one window:
>>>
>>>     par(mfrow=c(6,1)); par(oma=c(2.5, 2.5, 2.5, 2.5)); par(mar=c(0.5,4, 0.5, 0.5))
>>>
>>> But this command doesn't work with complex graphics commands (e.g. xyplot, ternaryplot). How can I print more than one graph per page when I work with these elaborate graphs? Many thanks! Francesco
>>
>> xyplot does lattice (hence grid) plots: you need to read ?print.trellis to find out how to lay those out. par() applies only to base graphics. As for ternaryplot: it depends which package you got it from (and there is more than one on CRAN).
>>
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
>>
>> That does mean you, too.
>>
>> -- Brian D. Ripley
Re: [R] How to build a large identity matrix faster?
On 07-06-2012 11:26, Prof Brian Ripley wrote:
> On 07/06/2012 10:27, Rui Barradas wrote:
>> Hello,
>>
>> To my great surprise, on my system, Windows 7, R 2.15.0, 32 bits, an R version is faster!
>
> Faster than what? diag() is written entirely in R, just more general than yours, and so one would expect it to be slower.

I'm at my other's laptop, so I haven't checked the diag() source, but since vector and matrix creation functions are generally faster than R code, I expected the same to hold for diag(). I'll check it as soon as possible.

Rui Barradas

> I have to say that we don't see a fast identity as a priority, as it can almost always be eliminated from calculations, and for large matrices one would want to use a sparse representation such as package Matrix.
[R] Basic question about confidence intervals
Hi,

I am again asking a generic question, and the general response to such questions is cold. I am a beginner, but I use and write simple R scripts. I am looking for some ideas on how to calculate the confidence intervals based on this excerpt from the paper. It would also help if someone could point me to material on degrees of freedom and related concepts.

Thanks,
Mohan

Cutting Corners: Workbench Automation for Server Benchmarking

APPENDIX: Confidence Intervals

Given N observations of response time from N runs at a given arrival rate λ, the confidence interval for the response time at that λ with a desired confidence level, c%, is computed as follows:

• Compute the mean server response time: $\mu = \sum_{i=1}^{N} R_i / N$, where $R_i$ is the server response time for the $i$th run.
• Compute the standard deviation of the server response time: $\sigma = \sqrt{\sum_{i=1}^{N} (R_i - \mu)^2 / (N - 1)}$.
• The confidence interval for the response time at confidence 100c% is given as $[\mu - z_p \sigma / \sqrt{N},\ \mu + z_p \sigma / \sqrt{N}]$, where $p = (1 + c)/2$ and $z_p$ is the quantile of the unit normal distribution at $p$.

If N = 30, we replace $z_p$ by $t_{p;n-1}$, which is the p-quantile of a t-variate with n−1 degrees of freedom, assuming that the response time values from N runs come from a normal distribution. We verified that response times do come from a normal distribution using a normal probability plot.
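The three steps of the excerpt map fairly directly onto base R. The sketch below is one possible reading, not the paper's own code: `mean_ci`, its argument names, and the response times are made up for illustration, and the cutoff `small_n = 30` for switching from the normal to the t quantile is an assumption, since the comparison sign in the excerpt did not survive extraction.

```r
# Confidence interval for the mean response time, following the excerpt:
# mean, standard deviation with N - 1 in the denominator, then a quantile at
# p = (1 + c) / 2 scaled by sigma / sqrt(N).
mean_ci <- function(r, level = 0.95, small_n = 30) {
  n     <- length(r)
  mu    <- mean(r)
  sigma <- sd(r)                     # sd() already divides by N - 1
  p     <- (1 + level) / 2
  # Assumed rule: t quantile with n - 1 degrees of freedom for small samples.
  q     <- if (n <= small_n) qt(p, df = n - 1) else qnorm(p)
  half  <- q * sigma / sqrt(n)
  c(lower = mu - half, mean = mu, upper = mu + half)
}

# Made-up response times (ms) from 10 runs
r  <- c(102, 98, 110, 95, 105, 99, 101, 97, 108, 103)
ci <- mean_ci(r, level = 0.95)
ci
```

For a small sample like this one, the interval should agree with t.test(r)$conf.int, which uses the same t quantile with n - 1 degrees of freedom.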
Re: [R] Basic question about confidence intervals
Apologies. The formulas are munged. I am referring to 'APPENDIX: Confidence Intervals' in the paper at http://www.cse.iitb.ac.in/~puru/courses/spring12/cs695/downloads/cuttingcorners.pdf

Mohan

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mohan Radhakrishnan
Sent: Thursday, June 07, 2012 5:00 PM
To: r-help@r-project.org
Subject: [R] Basic question about confidence intervals

> Hi, I am again asking a generic question and the general response for such questions is cold. I am a beginner but use and write simple R scripts. I am looking for some ideas to calculate the confidence intervals based on this excerpt from the paper. Moreover it would help if someone points to material to read about degrees of freedom and any related concepts.
>
> Thanks, Mohan
>
> Cutting Corners: Workbench Automation for Server Benchmarking
> APPENDIX: Confidence Intervals
>
> Given N observations of response time from N runs at a given arrival rate λ, the confidence interval for the response time at that λ with a desired confidence level, c%, is computed as follows:
>
> • Compute the mean server response time: $\mu = \sum_{i=1}^{N} R_i / N$, where $R_i$ is the server response time for the $i$th run.
> • Compute the standard deviation of the server response time: $\sigma = \sqrt{\sum_{i=1}^{N} (R_i - \mu)^2 / (N - 1)}$.
> • The confidence interval for the response time at confidence 100c% is given as $[\mu - z_p \sigma / \sqrt{N},\ \mu + z_p \sigma / \sqrt{N}]$, where $p = (1 + c)/2$ and $z_p$ is the quantile of the unit normal distribution at $p$.
>
> If N = 30, we replace $z_p$ by $t_{p;n-1}$, which is the p-quantile of a t-variate with n−1 degrees of freedom, assuming that the response time values from N runs come from a normal distribution. We verified that response times do come from a normal distribution using a normal probability plot.
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
...and what if I also need to plot another vertical line showing the mean for each panel?

Simply adding another call to panel.abline() does not seem to produce a correct result for each panel.

    # medians and means for each panel:
    dotplot(variety ~ yield | site, data = barley,
            scales = list(x = list(log = TRUE)),
            layout = c(1, 6),
            panel = function(x, y, ...) {
                panel.dotplot(x, y, ...)
                median.values <- median(x)
                panel.abline(v = median.values, col.line = "red")
                mean.values <- mean(x)
                panel.abline(v = mean.values, col.line = "red")
            })

In the dataset I'm currently working on (which is not the example above), the means are plotted wrongly in each panel. What am I missing?

thanks
[R] Rare event in logistic regression
Hello,

I am working on a logistic regression analysis in which the event rate is 0.005%, with a large number of records. Is there any R package that handles rare events in logistic regression? Please let me know.

Thanks for your help.

Thanks,
Bharat

Bharat Warule
Cypress Analytica, Pune
[R] Abrupt closure of R when using .C function
Hi everyone,

This is my first message on this discussion list. I created an R function that calls a .C function. I didn't get any error from either the C side or the R side, and I tried to use the proper types in R. The problem is that R closes abruptly with the following message: "R for Windows GUI front-end encountered a problem and needs to close". Does anyone have an idea where this abrupt closure comes from?

Thank you.
Re: [R] How to build a large identity matrix faster?
On 6/7/2012 2:27 AM, Rui Barradas wrote:
> Hello,
>
> To my great surprise, on my system, Windows 7, R 2.15.0, 32 bits, an R version is faster!

I was also surprised: Windows 7, R 2.15.0, 64-bit.

    rbind(diag=t1, Rdiag=t2, ratio=t1/t2)
          user.self sys.self elapsed user.child sys.child
    diag       0.72     0.08    0.81         NA        NA
    Rdiag      0.09     0.03    0.12         NA        NA
    ratio      8.00     2.67    6.75         NA        NA

    sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-pc-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] splines stats graphics grDevices utils datasets methods
    [8] base

    other attached packages:
    [1] fda_2.2.9 Matrix_1.0-6 lattice_0.20-6 zoo_1.7-7

    loaded via a namespace (and not attached):
    [1] grid_2.15.0 tools_2.15.0

Spencer
[R] Relative frequencies in table
Hi,

I'm trying to create a stacked bar plot with the satisfaction scores from a customer satisfaction survey. I have results for three stores over several weeks and want to create a weekly graph with a stacked bar for each store. I can flatten the data frame into a table with absolute frequencies, but I can't find how to get relative frequencies. My dataset looks similar to the example below:

    Satisfaction <- c(1,1,2,3,4,5,2,2,2,3,1,1,4,5,4,2,3,2,2,2,3,1,3,2,4)
    Store <- c(1,1,2,3,3,2,2,1,2,3,1,2,3,2,1,3,2,1,2,1,2,3,2,1,3)
    Week <- c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4)
    csat <- data.frame(Satisfaction, Store, Week)
    csat[,1] <- factor(csat[,1], levels=c(1,2,3,4,5),
                       labels=c("Very satisfied", "Satisfied", "Neutral", "Dissatisfied", "Very dissatisfied"))
    csat[,2] <- factor(csat[,2], levels=c(1,2,3),
                       labels=c("New York", "Paris", "Johannesburg"))
    csat[,3] <- factor(csat[,3], levels=c(1,2,3,4),
                       labels=c("2012-01", "2012-02", "2012-03", "2012-04"))
    csat.counts <- table(csat)

How do I get the satisfaction scores as a percentage per store per week? It must be something simple, perhaps just because the indexing of a 3-dimensional matrix is not very intuitive to me.

Any help is highly appreciated!

Kind regards,
Patrick
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
Hi!

I recently posted a similar question (entitled "Adding mean line to a lattice density plot"). I did not get any usable solution, which forced me to fall back on the normal 'plot' function. The problem was the same as yours: using panel.abline simply did not work, and the position of the mean line was incorrect. I have posted a workaround (based on plot); please see my earlier posting.

HTH,
Kimmo

07.06.2012 15:37, maxbre wrote:
> ...and what if I need to plot another vertical line showing also the means for each panel? Simply adding another call to panel.abline() seems not to produce a correct result for each panel.
>
> In the dataset I'm currently working on (which is not the above mentioned example) I've got a wrong plotting of the means for each panel. What am I missing?
>
> thanks
Re: [R] Relative frequencies in table
Please look at the likert function in the HH package. It is designed for this type of study. ?likert has many examples similar to yours.

Rich

Sent from my iPhone

On Jun 7, 2012, at 8:42, Patrick Hubers stomper...@gmail.com wrote:
> Hi, I'm trying to create a stacked bar plot with the satisfaction scores from a customer satisfaction survey. I have results for three stores over several weeks and want to create a weekly graph with a stacked bar for each store. I can flatten the dataframe into a table with absolute frequencies, but I can't find how to get relative frequencies.
>
> How do I get the satisfaction scores as a percentage per store per week?
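Besides the HH route suggested above, one base-R possibility (not mentioned in the thread, so treat it as a sketch) is prop.table() with a two-element margin: the counts are then divided by the total of each Store x Week cell, so the satisfaction levels in every cell sum to 100%. The data are Patrick's example, repeated so that the block runs on its own.

```r
Satisfaction <- c(1,1,2,3,4,5,2,2,2,3,1,1,4,5,4,2,3,2,2,2,3,1,3,2,4)
Store <- c(1,1,2,3,3,2,2,1,2,3,1,2,3,2,1,3,2,1,2,1,2,3,2,1,3)
Week  <- c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4)

csat <- data.frame(
  Satisfaction = factor(Satisfaction, levels = 1:5,
                        labels = c("Very satisfied", "Satisfied", "Neutral",
                                   "Dissatisfied", "Very dissatisfied")),
  Store = factor(Store, levels = 1:3,
                 labels = c("New York", "Paris", "Johannesburg")),
  Week  = factor(Week, levels = 1:4,
                 labels = c("2012-01", "2012-02", "2012-03", "2012-04"))
)

csat.counts <- table(csat)

# Dimensions are (Satisfaction, Store, Week); margin = c(2, 3) normalises
# within each Store x Week combination.
csat.pct <- 100 * prop.table(csat.counts, margin = c(2, 3))

round(csat.pct[, , "2012-01"], 1)

# One stacked bar per store for a single week
barplot(csat.pct[, , "2012-01"], legend.text = TRUE, ylab = "% of responses")
```

A Store x Week combination with no responses would produce NaN (0/0); that does not occur in this example, but real survey data may need a check.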
Re: [R] Abrupt closure of R when using .C function
On Jun 7, 2012, at 12:52, Nouedoui Laetitia wrote:
> Hi Everyone, This is my first message on this discussion list. I create a R function which includes a .C function. I didn't get any error neither from C side, nor from R side. I tried to put proper type in R. But the problem is that I get an abrupt closure of R, with the following message: "R for Windows GUI front-end encountered a problem and needs to close". Does anyone have an idea about where this abrupt closure come from?

Usually, the C code did something disastrous, like writing to memory it doesn't own. The first step to find out what happened is to run the same code from Rterm; the second step is to learn how to use a debugger (which I have so far avoided having to do under Windows, so don't ask me about it).

> Thank you.

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] graphic problems with special characters
I think we need some data and code. Would you please provide some sample data (see ?dput for a handy way to provide data) and some working code that demonstrates the problem?

John Kane
Kingston ON Canada

-----Original Message-----
From: leray.guilla...@gmail.com
Sent: Thu, 7 Jun 2012 11:48:53 +0200
To: r-help@r-project.org
Subject: [R] graphic problems with special characters

> Hi,
>
> I am currently working on an automated routine to import XML files, run some analyses on them and create graphs as JPEG files. The files are in different languages: French, English, Danish, even Chinese. At the moment I'm focusing on the European languages. I import them using the XML package and specify encoding="UTF-8", which seems to work pretty well: when I write the text in the console, the Danish characters æ å ø are printed correctly. The problem arises when I write these characters in graphics generated with jpeg(): the file names and the text/titles of the graphics are not written correctly. I know nothing about text encoding in R, and I tried my best to find information on the internet that I could understand and reuse to fix my problem, but so far without success.
>
> My configuration is the following: R version 2.13.1, Microsoft Windows XP Professional version 2002 with Service Pack 3.
>
> Best regards,
> Guillaume Le Ray
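For readers unfamiliar with the ?dput suggestion above, here is a small sketch of what it does. The data frame `labels_df` is made up and merely stands in for whatever object the poster would share.

```r
# A made-up data frame standing in for the data to be shared
labels_df <- data.frame(id    = 1:3,
                        label = c("alpha", "beta", "gamma"),
                        stringsAsFactors = FALSE)

# dput() prints R code that recreates the object; that text can be pasted
# straight into an e-mail.
dput(labels_df)

# Round trip: capture the printed code, evaluate it, and compare.
txt     <- paste(capture.output(dput(labels_df)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
isTRUE(all.equal(rebuilt, labels_df))
```

Because the recipient evaluates code instead of reading a pasted printout, column types and factor levels survive the trip.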
Re: [R] Basic question about confidence intervals
On Jun 7, 2012, at 7:30 AM, Mohan Radhakrishnan wrote:
> Hi, I am again asking a generic question and the general response for such questions is cold. I am a beginner but use and write simple R scripts.

Have you read the Posting Guide?

    "If the question is well-asked and of interest to someone on the list, it may elicit an informative up-to-date answer. See also the Usenet groups sci.stat.consult (applied statistics and consulting) and sci.stat.math (mathematical stat and probability)."

    "Basic statistics and classroom homework: R-help is not intended for these."

There are other forums online for such questions: stats.stackexchange.com is one such. (And I would have to say that the Usenet group advice is seriously outdated.)

> I am looking for some ideas to calculate the confidence intervals based on this excerpt from the paper. Moreover it would help if someone points to material to read about degrees of freedom and any related concepts.

Now that last one is surely a sign of failure to google.

--
David.

> Thanks, Mohan

David Winsemius, MD
West Hartford, CT
Re: [R] table function in a matrix
Perfect, thank you!

From: Petr Savicky savi...@cs.cas.cz
To: r-help@r-project.org
Sent: Thursday, 7 June 2012, 19:42
Subject: Re: [R] table function in a matrix

On Wed, Jun 06, 2012 at 11:02:46PM -0700, Sarah Auburn wrote:
> Hi, I am trying to get a summary of the counts of different variables for each sample in a matrix of the form m below, to generate an output as shown. (Ultimately I want to generate a stacked barchart for each sample.) I am only able to get the table function to work on one sample (column) at a time. Any help appreciated.
> Thank you, Sarah
>
>     a <- c("A", "A", "B", "B", "C", "A", "C", "D", "A", "D", "C", "A", "D", "C", "A", "C")
>     m <- matrix(a, nrow=4)
>     m
>          [,1] [,2] [,3] [,4]
>     [1,] "A"  "C"  "A"  "D"
>     [2,] "A"  "A"  "D"  "C"
>     [3,] "B"  "C"  "C"  "A"
>     [4,] "B"  "D"  "A"  "C"
>
> output needed (so that I can use the barplot(t(output)) function):
>
>          A B C D
>     [,1] 2 2 0 0
>     [,2] 1 0 2 1
>     [,3] 2 0 1 1
>     [,4] 1 0 2 1

Hi. Try the following.

    a <- c("A", "A", "B", "B", "C", "A", "C", "D", "A", "D", "C", "A", "D", "C", "A", "C")
    m <- matrix(a, nrow=4)
    tab <- function(x) { table(factor(x, levels=LETTERS[1:4])) }
    t(apply(m, 2, tab))

         A B C D
    [1,] 2 2 0 0
    [2,] 1 0 2 1
    [3,] 2 0 1 1
    [4,] 1 0 2 1

Factors are used to ensure that all the tables have the same length, even if some letters are missing.

Hope this helps.
Petr Savicky.
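As a variation on Petr's apply() solution, the same counts can be obtained with a single two-way table() call that cross-tabulates each cell's column index against its value. This alternative is not from the thread; the dimension names `sample` and `value` are chosen here only for readability.

```r
a <- c("A", "A", "B", "B", "C", "A", "C", "D",
       "A", "D", "C", "A", "D", "C", "A", "C")
m <- matrix(a, nrow = 4)

# col(m) gives the column index of every cell; the factor fixes the set of
# letters so that a letter absent from the data still gets a column of zeros.
counts <- table(sample = as.vector(col(m)),
                value  = factor(as.vector(m), levels = LETTERS[1:4]))
counts

# Stacked bars, one per sample (column of m)
barplot(t(counts), legend.text = TRUE)
```

Both approaches depend on fixing the factor levels up front; without that, the columns of the result would vary with which letters happen to occur.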
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
Thanks Kimmo,

I managed to get the desired result by first plotting the medians and then adding the means through the user-defined function posted in the thread you mentioned (here it is: http://r.789695.n4.nabble.com/Adding-mean-line-to-a-lattice-density-plot-td4455770.html#a4456502)

    # start
    dotplot(variety ~ yield | site, data = barley,
            scales = list(x = list(log = TRUE)),
            layout = c(1, 6),
            panel = function(x, y, ...) {
                panel.dotplot(x, y, ...)
                median.values <- median(x)
                panel.abline(v = median.values, col.line = "red")
            })

    addLine <- function(a = NULL, b = NULL, v = NULL, h = NULL, ..., once = FALSE) {
        tcL <- trellis.currentLayout()
        k <- 0
        for (i in 1:nrow(tcL))
            for (j in 1:ncol(tcL))
                if (tcL[i, j] > 0) {
                    k <- k + 1
                    trellis.focus("panel", j, i, highlight = FALSE)
                    if (once) panel.abline(a = a[k], b = b[k], v = v[k], h = h[k], ...)
                    else panel.abline(a = a, b = b, v = v, h = h, ...)
                    trellis.unfocus()
                }
    }

    mean.values <- tapply(barley$yield, barley$site, mean)
    addLine(v = log10(mean.values), once = TRUE, col = "blue", lty = "dotted")
    # end

But back to my previous question: I still do not understand why plotting the medians works fine BUT plotting the means does NOT (apparently messing up panel positions and also values). I have no clue about this! I've also been trying layout() in latticeExtra, but without results...

Can anyone clarify these (to me, strange) issues?

max
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
On Jun 7, 2012, at 10:23 AM, maxbre wrote:

> but back to my previous question I still not understand why the plot of medians is working fine BUT NOT of the means (apparently messing up panel positions and also values): no clue for this!

Can you explain what you mean by "messing up panel positions and also values"? When I execute this code with and without the two code lines for mean vertical lines I get expected results:

dotplot(variety ~ yield | site, data = barley,
        scales=list(x=list(log=TRUE)), layout = c(1,6),
        panel = function(x,y,...) {
          panel.dotplot(x,y,...)
          mean.values <- mean(x)                          # omitted in second run
          panel.abline(v=mean.values, col.line="red")     # omitted in second run
          median.values <- median(x)
          panel.abline(v=median.values, col.line="blue")
        })

-- David.

sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel splines stats grDevices utils datasets graphics methods base

other attached packages:
 [1] lme4_0.999375-42 Matrix_1.0-5 ggplot2_0.9.0 forecast_3.19
 [5] RcppArmadillo_0.2.36 Rcpp_0.9.10 fracdiff_1.4-0 tseries_0.10-27
 [9] quadprog_1.5-4 zoo_1.7-6 MASS_7.3-17 circular_0.4-3
[13] boot_1.3-4 rms_3.5-0 Hmisc_3.9-2 survival_2.36-12
[17] sos_1.3-1 brew_1.0-6 lattice_0.20-6

loaded via a namespace (and not attached):
 [1] cluster_1.14.2 colorspace_1.1-0 dichromat_1.2-4 digest_0.5.1 fortunes_1.4-2
 [6] grid_2.14.2 memoise_0.1 munsell_0.3 nlme_3.1-103 plyr_1.7.1
[11] proto_0.3-9.2 RColorBrewer_1.0-5 reshape2_1.2.1 rgl_0.92.861 scales_0.2.0
[16] stats4_2.14.2 stringr_0.6 tools_2.14.2 vcd_1.2-12

David Winsemius, MD West Hartford, CT
[R] divide factor in n equal groups?
Could anyone please tell me the most elegant way to divide an ordinal variable into equal groups (as cut() does with continuous variables)? For example, I'd like to split the factor "educational level" into three groups: low, medium and high. Thank you! David
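One possible reading of the question, sketched below, is to group the levels themselves into three equally wide bands by applying cut() to the integer codes of the ordered factor. Everything here is an assumption for illustration: the six education levels, the sample values, and the names edu, codes and edu3. If "equal" means equal numbers of observations instead, the breaks would have to come from quantiles of the codes.

```r
# Hypothetical ordered factor with six education levels.
edu <- factor(
  c("primary", "phd", "secondary", "none", "master", "bachelor", "secondary", "primary"),
  levels = c("none", "primary", "secondary", "bachelor", "master", "phd"),
  ordered = TRUE
)

# Integer codes 1..6 follow the level order, so cut() can treat them as numbers.
codes <- as.integer(edu)

# Breaks are built from the number of levels (0, 2, 4, 6 here), not from the
# observed range, so the grouping does not shift when a level is missing.
edu3 <- cut(codes,
            breaks = seq(0, nlevels(edu), length.out = 4),
            labels = c("low", "medium", "high"),
            ordered_result = TRUE)

# Cross-tabulate to check which original levels ended up in which group.
table(edu, edu3)
```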
Re: [R] ggplot2: legend for geom_rug() ..?
Well, a brute force and stupidity approach with geom_text will work, but it's not aesthetically very nice. Note I did not play around with text size. Try:

ggplot(mdat, aes(position, value)) + geom_point(aes(colour = Treatment)) +
  geom_rug(subset = .(position < 14), aes(y=NULL), color="orange") +
  geom_rug(subset = .(position > 14), aes(y=NULL), color="black") +
  geom_text(data=NULL, x=11, y=0.75, colour="orange", label="London") +
  geom_text(data=NULL, x=16, y=0.75, colour="blue", label="NYC")

I thought that one should be able to generate a second legend with an aes(colour=ids), but it does not work. Someone much more knowledgeable than me hopefully will have a better idea.

John Kane Kingston ON Canada
[R] flagging values without a loop
For a given hour I want to be able to add a new column called flag. The flag column will flag the highest price in a given hour. Is there a way to do this without a loop?

matrix:
Unit  Day       Hour  Price  Flag
afd1  1/2/2003  1     1      N
afd1  1/2/2003  1     2      N
afd1  1/2/2003  1     3      N
afd1  1/2/2003  1     4      Y
dcf1  1/2/2003  2     2      N
dcf1  1/2/2003  2     3      Y
dcf1  1/2/2003  2     1      N
dcf1  1/2/2003  2     2      N
dcf1  1/2/2003  2     3      Y
ghg2  1/2/2003  3     1      N
afd1  1/2/2003  3     2      N
.

-- View this message in context: http://r.789695.n4.nabble.com/flagging-values-without-a-loop-tp4632689.html
Re: [R] Relative frequencies in table
Thanks a lot! Takes some fiddling, but it works great. Regards, Patrick

2012/6/7 Rmh r...@temple.edu

> Please look at the likert function in the HH package. It is designed for this type of study. ?likert has many examples similar to yours. Rich
[R] How to set cookies in RCurl
Hi, I am trying to access a website and read its content. The website is a restricted-access website that I access through a proxy server (which therefore requires me to enable cookies). I have problems getting RCurl to receive and send cookies. The following lines give me:

library(RCurl)
library(XML)
url <- "http://www.theurl.com"
content <- readHTMLTable(url)
content
$`NULL`
  V1
1
2 Cookies disabled
3
4 Your browser currently does not accept cookies.\rCookies need to be enabled for Scopus to function properly.\rPlease enable session cookies in your browser and try again.

$`NULL`
  V1 V2 V3
1

$`NULL`
  V1
1 Cookies disabled

$`NULL`
  V1
1
2
3

I have carefully read section 4.4 of this: http://www.omegahat.org/RCurl/RCurlJSS.pdf and tried the following without success:

curl <- getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', curl = curl)

Any suggestions on how to allow for cookies? Thanks. Math

-- View this message in context: http://r.789695.n4.nabble.com/How-to-set-cookies-in-RCurl-tp4632693.html
Re: [R] ggplot2: legend for geom_rug() ..?
Hi, Here is the corrected code:

library(ggplot2)
ids <- paste('id_',1:3,sep='')
before <- sample(9)
after <- sample(1:10,9)
dat <- as.matrix(cbind(before,after))
rownames(dat) <- rep(ids,3)
position <- c(rep(10,3),rep(13,3),rep(19,3))
mdat <- cbind(melt(dat),position)
colnames(mdat) <- c('ID','Treatment','value','position')
ggplot(mdat, aes(position, value)) + geom_point(aes(colour = Treatment)) +
  geom_rug(subset = .(position < 14), aes(y=NULL), color="orange") +
  geom_rug(subset = .(position > 14), aes(y=NULL), color="black")

Alternatively, how do I add a second legend in ggplot2? thanks!

From: John Kane jrkrid...@inbox.com
To: Brian Smith bsmith030...@gmail.com; r-help@r-project.org
Sent: Wednesday, June 6, 2012 3:06 PM
Subject: Re: [R] ggplot2: legend for geom_rug() ..?

What is X2? code not running at the moment

John Kane Kingston ON Canada

> -Original Message-
> From: bsmith030...@gmail.com
> Sent: Wed, 6 Jun 2012 11:52:25 -0400
> To: r-help@r-project.org
> Subject: [R] ggplot2: legend for geom_rug() ..?
>
> Hi, I was trying to make another legend for the rug plot. Sample code:
>
> library(ggplot2)
> ids <- paste('id_',1:3,sep='')
> before <- sample(9)
> after <- sample(1:10,9)
> dat <- as.matrix(cbind(before,after))
> rownames(dat) <- rep(ids,3)
> position <- c(rep(10,3),rep(13,3),rep(19,3))
> mdat <- cbind(melt(dat),position)
> ggplot(mdat, aes(position, value)) + geom_point(aes(colour = X2)) +
>   geom_rug(subset = .(position < 14), aes(y=NULL), color="orange") +
>   geom_rug(subset = .(position > 14), aes(y=NULL), color="black")
>
> This gives the plot correctly, but how can I add another legend that would give some more information on the rugplot (e.g. that 'orange' line = 'London', and 'black' line = 'NYC')? thanks!!
Re: [R] R2BayesX (command bayesx) doesn't work
On Wed, 6 Jun 2012, Prof Brian Ripley wrote:

> On 06/06/2012 16:13, niandra wrote:
>> Hi all, I have a problem with the library R2BayesX. When I try to use the command bayesx I get this error:
>>
>> dyld: Library not loaded: /usr/local/lib/libreadline.5.2.dylib
>>   Referenced from: /Library/Frameworks/R.framework/Versions/2.15/Resources/library/BayesXsrc/libs/i386/BayesX
>>   Reason: image not found
>
> So your R installation on your unstated OS is incomplete/corrupt. It looks like this is OS X, so:
> - wrong list (use R-sig-mac)
> - you need to tell the correct list a lot more, including the 'at a minimum' information asked for in the R posting guide (see below), and how you installed R.
> - a possible answer is to install http://r.research.att.com/src/readline-5.2.tar.gz : see the OS X documentation.

Thanks, Brian, I wasn't aware of that (because I am not an OS X user). I had recently seen this problem on another OS X machine. There, we solved it by

install.packages("BayesXsrc", type = "source")

so that the BayesX binary would be linked against the libreadline that was available on that machine.

thx, Z

>> I obtain this message also with the example in the bayesx help:
>>
>> ## generate some data
>> set.seed(111)
>> n <- 200
>> ## regressor
>> dat <- data.frame(x = runif(n, -3, 3))
>> ## response
>> dat$y <- with(dat, 1.5 + sin(x) + rnorm(n, sd = 0.6))
>> ## estimate models with
>> ## bayesx REML and MCMC
>> b1 <- bayesx(y ~ sx(x), method = "REML", data = dat)
>> dyld: Library not loaded: /usr/local/lib/libreadline.5.2.dylib
>>   Referenced from: /Library/Frameworks/R.framework/Versions/2.15/Resources/library/BayesXsrc/libs/i386/BayesX
>>   Reason: image not found
>> Total run time was: 0.69 sec
>>
>> -- View this message in context: http://r.789695.n4.nabble.com/R2BayesX-command-bayesx-doesn-t-work-tp4632541.html
>
> -- Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] error calling Winbugs using R2WinBugs to run a multi-level model
Thanks for the suggestions! Unfortunately I get the same trap whether I input the data as a named list, a list of the names, or a text file. I tried all three with and without transposing the matrices (I didn't change the model structure indexing, but that should appear as an indexing error later on).

The good news is that I experimented with downloading JAGS, and the model fits fine to the same data directly from R, calling the R2jags library with minimal changes to any of my scripts. I only had to define a couple more nodes in the model in order to monitor them.

I'll send you the txt files in a separate email. Thanks again for your interest!

Saana

On 6 June 2012 18:17, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:

> On 06.06.2012 16:51, ilai wrote:
>> Untested because I don't have (use) winbugs and you didn't provide dat*. But consider
>>
>> a <- 4 ; f <- 6
>> list('a','f')
>> list(a,f)
>> list(a=a,f=f)
>>
>> My guess is you wanted sp.data to be a named list, not a list of names...
>
> That's also OK, from ?bugs.data:
>
> data: either a named list (names corresponding to variable names in the 'model.file') of the data for the 'WinBUGS' model, _or_ (which is not recommended and unsafe) a vector or list of the names of the data objects used by the model
>
> To really know what is going on, I'd need the model file and the data. My suspicion is that the matrices have to be transposed.
>
> Best, Uwe Ligges
>
>> HTH
>>
>> On Wed, Jun 6, 2012 at 4:12 AM, Saana Isojunno saana.isoju...@googlemail.com wrote:
>>> Dear all, I'm calling WinBUGS (1.4.3) through R2WinBUGS (2.1-18, coda_0.14-7) to fit a switching random walk model, but come up with an instant trap, with the log only displaying 'check('. I will paste the trap with session info below; I'd be very grateful for any ideas. A couple of leads:
>>>
>>> 1. I presume the problem relates to the R package itself or the way I call bugs(), because I can use the same text files specifying the model and data directly in WinBUGS and it runs fine (i.e. syntax ok, compilation ok, updates slow but no traps).
>>>
>>> 2. The problem occurs in R only when I try to fit the model to multiple individuals, i.e. the data contains a matrix of step lengths (rows) and individuals (columns) instead of a vector for just one individual. I get the same error message regardless of the number of data rows in each column (I even tried just one).
>>>
>>> The model loops over the path of each animal, estimating a hidden movement state and their parameters. For 4 individuals with 100 data points each the data looks something like this:
>>>
>>> dat1 : num 100
>>> dat2 : int 4
>>> dat3 : num [1:4] 8 4 2 5
>>> dat4 : num [1:100, 1:4] 1 1 1 1 1 2 2 2 2 2 ...
>>> dat5 : num [1:100, 1:4] 2 2 2 2 2 1 2 2 2 2 ...
>>> dat6 : num [1:100, 1:4] 16 34.3 33.5 27.9 14.9 ...
>>> dat7 : num [1:100, 1:4] 0.357 0.474 0.487 0.495 0.524 ...
>>> dat8 : num [1:50, 1:4] 36.4 294.5 24.4 21.1 422.8 ...
>>>
>>> This is how I've called WinBUGS in R:
>>>
>>> # write data to text file
>>> sp.data = list("dat1","dat2","dat3","dat4","dat5","dat6","dat7","dat8")
>>> bugs.data(sp.data, digits=5, data.file="dir1\\data1.txt")
>>> # test the model runs
>>> fit = bugs(data=paste("C:\\Users\\User1\\Documents\\dir1\\data1.txt",dataFile,sep=""),
>>>            inits=NULL, parameters.to.save=list('par1','par2','par3'),
>>>            model.file=modelFile, debug=TRUE, n.chains=3, n.iter=20,
>>>            n.burnin=3, n.thin=1, digits=4)
>>>
>>> ## The trap
>>> incompatible copy
>>> BugsScript.Action.Do [0436H]
>>>   .a BugsScript.Action [025B6790H]
>>>   .argNum INTEGER 0
>>>   .bugsCommands ARRAY 240 OF CHAR 7877X, 75A5X, 0B17X, 3701X ...
>>>   .p ARRAY 3, 120 OF CHAR Elements
>>>   .s BugsScanners.Scanner Fields
>>>   .scriptCommand ARRAY 240 OF CHAR #Bugs:check ...
>>>   .vectorName BOOLEAN FALSE
>>> Services.Exec [0136H]
>>>   .a Services.Action [025B6790H]
>>>   .t POINTER [64E10170H]
>>> Services.IterateOverActions [02F4H]
>>>   .p Services.Action [025B6790H]
>>>   .t POINTER NIL
>>>   .time LONGINT 4375656
>>> Services.StdHook.Step [034DH]
>>>   .h Services.StdHook [0248E380H]
>>> HostWindows.Idle [4A86H]
>>>   .focus BOOLEAN FALSE
>>>   .tick Controllers.TickMsg Fields
>>>   .w HostWindows.Window NIL
>>> HostMenus.TimerTick [3422H]
>>>   .lParam INTEGER 0
>>>   .ops Controllers.PollOpsMsg Fields
>>>   .wParam INTEGER 1
>>>   .wnd INTEGER 1311298
>>> Kernel.Try [3A61H]
>>>   .a INTEGER 1311298
>>>   .b INTEGER 1
>>>   .c INTEGER 0
>>>   .h PROCEDURE HostMenus.TimerTick
>>> HostMenus.ApplWinHandler [3841H]
>>>   .Proc PROCEDURE NIL
>>>   .hit BOOLEAN FALSE
>>>   .lParam INTEGER 0
>>>   .message INTEGER 275
>>>   .res INTEGER 1664639202
>>>   .s ARRAY 256 OF SHORTCHAR ...
>>>   .w INTEGER 1970768325
>>>   .wParam INTEGER 1
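The distinction ilai draws between a list of names, an unnamed list of values and a named list is easy to check in plain R, and so is the effect of the transposition Uwe suspects. The sketch below uses only base R and does not call WinBUGS or R2WinBUGS; the object names by_name, unnamed, named, rebuilt and steps are made up for illustration.

```r
a <- 4
f <- 6

by_name <- list("a", "f")     # a list of object NAMES (character strings)
unnamed <- list(a, f)         # the values, but with no names attached
named   <- list(a = a, f = f) # the values, each labelled with its variable name

str(by_name)
str(unnamed)
str(named)

# A named list can also be rebuilt from values plus a vector of names.
rebuilt <- setNames(list(a, f), c("a", "f"))

# Transposing swaps the two dimensions of a data matrix, e.g. from
# steps-by-individuals to individuals-by-steps.
steps <- matrix(1:8, nrow = 4)   # 4 steps (rows) x 2 individuals (columns)
dim(steps)
dim(t(steps))
```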
Re: [R] flagging values without a loop
In two steps, you could use ave() to split by hour and find the maximum of price, and then use an ifelse clause on the resulting vector to see when that actually equals the given price and assign Y/N appropriately. I'll leave the implementation as an exercise to the reader :-)

Best, Michael

On Thu, Jun 7, 2012 at 9:17 AM, jcrosbie ja...@crosb.ie wrote:

> For a given hour I want to be able to add a new column called flag. The flag column will flag the highest price in a given hour. Is there a way to do this without a loop?
>
> matrix:
> Unit  Day       Hour  Price  Flag
> afd1  1/2/2003  1     1      N
> afd1  1/2/2003  1     2      N
> afd1  1/2/2003  1     3      N
> afd1  1/2/2003  1     4      Y
> dcf1  1/2/2003  2     2      N
> dcf1  1/2/2003  2     3      Y
> dcf1  1/2/2003  2     1      N
> dcf1  1/2/2003  2     2      N
> dcf1  1/2/2003  2     3      Y
> ghg2  1/2/2003  3     1      N
> afd1  1/2/2003  3     2      N
> .
Re: [R] How to set cookies in RCurl
To just enable cookies and their management, use the cookiefile option, e.g.

txt = getURLContent(url, cookiefile = "")

Then you can pass this to readHTMLTable(), best done as

content = readHTMLTable(htmlParse(txt, asText = TRUE))

The function readHTMLTable() doesn't use RCurl and doesn't handle cookies.

D.
Re: [R] flagging values without a loop
Hi. Yes it is possible. Here is one approach:

DF <- read.table(textConnection("
Unit Day Hour Price Flag
afd1 1/2/2003 1 1 N
afd1 1/2/2003 1 2 N
afd1 1/2/2003 1 3 N
afd1 1/2/2003 1 4 Y
dcf1 1/2/2003 2 2 N
dcf1 1/2/2003 2 3 Y
dcf1 1/2/2003 2 1 N
dcf1 1/2/2003 2 2 N
dcf1 1/2/2003 2 3 Y
ghg2 1/2/2003 3 1 N
afd1 1/2/2003 3 2 N
"), header=TRUE)

cbind(DF, flag = ave(DF$Price, DF$Hour, FUN=function(x) ifelse(x==max(x), 1, 0)))

   Unit      Day Hour Price Flag flag
1  afd1 1/2/2003    1     1    N    0
2  afd1 1/2/2003    1     2    N    0
3  afd1 1/2/2003    1     3    N    0
4  afd1 1/2/2003    1     4    Y    1
5  dcf1 1/2/2003    2     2    N    0
6  dcf1 1/2/2003    2     3    Y    1
7  dcf1 1/2/2003    2     1    N    0
8  dcf1 1/2/2003    2     2    N    0
9  dcf1 1/2/2003    2     3    Y    1
10 ghg2 1/2/2003    3     1    N    0
11 afd1 1/2/2003    3     2    N    1
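For anyone who wants the Y/N column the original post asked for, the two-step ave() plus ifelse() idea can be sketched as follows. The data frame is retyped from the thread; the helper name hourly_max and the decision to group by Hour alone are assumptions (with several days in the data, Day would presumably need to be added as a second grouping variable in ave()).

```r
# Data retyped from the thread; Day is constant, so it is recycled.
DF <- data.frame(
  Unit  = c("afd1", "afd1", "afd1", "afd1", "dcf1", "dcf1",
            "dcf1", "dcf1", "dcf1", "ghg2", "afd1"),
  Day   = "1/2/2003",
  Hour  = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
  Price = c(1, 2, 3, 4, 2, 3, 1, 2, 3, 1, 2),
  stringsAsFactors = FALSE
)

# Step 1: for every row, the maximum price within that row's hour.
# FUN must be passed by name because ave() takes grouping variables through '...'.
hourly_max <- ave(DF$Price, DF$Hour, FUN = max)

# Step 2: flag the rows whose price equals that maximum (ties are all flagged).
DF$Flag <- ifelse(DF$Price == hourly_max, "Y", "N")
DF
```

Note that under this rule the last row (hour 3, price 2) is flagged, which agrees with the 0/1 output above but not with the Flag column typed in the original post.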
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
a new session of R with the following sessionInfo()

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252
[3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
[5] LC_TIME=Italian_Italy.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] latticeExtra_0.6-19 RColorBrewer_1.0-5 lattice_0.20-6

loaded via a namespace (and not attached):
[1] grid_2.15.0 tools_2.15.0

this is the code I run

#start
library(lattice); library(latticeExtra)

#example with user function
addLine <- function(a=NULL, b=NULL, v = NULL, h = NULL, ..., once=F) {
  tcL <- trellis.currentLayout()
  k <- 0
  for(i in 1:nrow(tcL))
    for(j in 1:ncol(tcL))
      if (tcL[i,j] > 0) {
        k <- k+1
        trellis.focus("panel", j, i, highlight = FALSE)
        if (once) panel.abline(a=a[k], b=b[k], v=v[k], h=h[k], ...)
        else panel.abline(a=a, b=b, v=v, h=h, ...)
        trellis.unfocus()
      }
}

dotplot(variety ~ yield | site, data = barley,
        scales=list(x=list(log=TRUE)), layout = c(1,6),
        panel = function(x,y,...) {
          panel.dotplot(x,y,...)
          median.values <- median(x)
          panel.abline(v=median.values, col.line="red")
        })

mean.values <- tapply(barley$yield, barley$site, mean)
addLine(v=log10(mean.values), once=TRUE, col="blue")

# example with panel.abline
dotplot(variety ~ yield | site, data = barley,
        scales=list(x=list(log=TRUE)), layout = c(1,6),
        panel = function(x,y,...) {
          panel.dotplot(x,y,...)
          mean.values <- mean(x)                          #omitted in second run
          panel.abline(v=mean.values, col.line="red")     #omitted in second run
          median.values <- median(x)
          panel.abline(v=median.values, col.line="blue")
        })
#end

these are the two different results I’ve got:

example with user defined function
http://r.789695.n4.nabble.com/file/n4632706/example_with_user_function.png

example with panel.abline
http://r.789695.n4.nabble.com/file/n4632706/example_with_panel_abline.png

and now I’m really confused about what I’m doing and seeing…

-- View this message in context: http://r.789695.n4.nabble.com/how-to-add-a-vertical-line-for-each-panel-in-a-lattice-dotplot-with-log-scale-tp4632513p4632706.html
Re: [R] How to set cookies in RCurl
Apologies for following up on my own mail, but I forgot to explicitly mention that you will need to specify the appropriate proxy information in the call to getURLContent().

D.
Re: [R] how to add a vertical line for each panel in a lattice dotplot with log scale?
On Jun 7, 2012, at 11:34 AM, maxbre wrote: a new session of R with the following sessionInfo() Part of the confusion may be that you have reversed the colors for mean and median in two different examples. The other confusion may be that mean(log(.)) != log(mean(.)) this is the code I run #start library(lattice); library(latticeExtra) #example with user function addLine- function(a=NULL, b=NULL, v = NULL, h = NULL, ..., once=F) { tcL - trellis.currentLayout() k-0 for(i in 1:nrow(tcL)) for(j in 1:ncol(tcL)) if (tcL[i,j] 0) { k-k+1 trellis.focus(panel, j, i, highlight = FALSE) if (once) panel.abline(a=a[k], b=b[k], v=v[k], h=h[k], ...) else panel.abline(a=a, b=b, v=v, h=h, ...) trellis.unfocus() } } dotplot(variety ~ yield | site, data = barley, scales=list(x=list(log=TRUE)), layout = c(1,6), panel = function(x,y,...) { panel.dotplot(x,y,...) median.values - median(x) panel.abline(v=median.values, col.line=red) }) mean.values-tapply(barley$yield, barley$site, mean) addLine(v=log10(mean.values), once=TRUE, col=blue) # example with panel.abline dotplot(variety ~ yield | site, data = barley, scales=list(x=list(log=TRUE)), layout = c(1,6), panel = function(x,y,...) { panel.dotplot(x,y,...) mean.values - mean(x) #omitted in second run panel.abline(v=mean.values, col.line=red) #omitted in second run median.values - median(x) panel.abline(v=median.values, col.line=blue) }) #end this are the two different results I’ve got: example with user defined function http://r.789695.n4.nabble.com/file/n4632706/example_with_user_function.png example with panel.abline http://r.789695.n4.nabble.com/file/n4632706/example_with_panel_abline.png and now I’m really confused of what I’m doing and seeing… -- View this message in context: http://r.789695.n4.nabble.com/how-to-add-a-vertical-line-for-each-panel-in-a-lattice-dotplot-with-log-scale-tp4632513p4632706.html Sent from the R help mailing list archive at Nabble.com. 
David Winsemius, MD West Hartford, CT
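A small sketch of the mean(log(.)) != log(mean(.)) point made in this thread, using toy numbers and the barley data from the original post. It assumes (as documented for lattice) that log=TRUE means base 10 and that the panel function receives the already log10-transformed x values; the back-transform inside the panel function is my suggestion, not something posted in the thread.

```r
library(lattice)

# Toy numbers: the mean of the logs is not the log of the mean.
x <- c(1, 10, 100)
mean_of_logs <- mean(log10(x))   # log10 of the geometric mean
log_of_mean  <- log10(mean(x))   # log10 of the arithmetic mean

# Inside a panel function on a log-scaled axis, x is already log10(yield),
# so back-transform before taking the arithmetic mean.
p <- dotplot(variety ~ yield | site, data = barley,
             scales = list(x = list(log = TRUE)), layout = c(1, 6),
             panel = function(x, y, ...) {
               panel.dotplot(x, y, ...)
               # arithmetic mean of the raw yields, placed on the log axis
               panel.abline(v = log10(mean(10^x)), col = "blue")
               # median of the logged values (close to, but for even group
               # sizes not identical to, log10 of the raw median)
               panel.abline(v = median(x), col = "red")
             })
# print(p) draws the plot on the active device
```

Keeping both lines in one panel function also avoids mixing up which colour belongs to the mean and which to the median.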
[R] select subrows based on a specific column in a matrix
Hi all,

I have a matrix with 1 rows and 10 columns. The last column contains identifiers, but the values are not unique, so I want to generate another matrix that keeps only rows with unique values in the last column.

If I do

tmp <- unique(my_mat$col10)

this gives me 8560 unique entries, so the ideal matrix would be 8560 x 10.

I tried

sub_mat <- my_mat[tmp,]

but it generated weird results with many NA values, and the order was not changed. The original matrix was ranked from the top, so I don't want to lose the order either.

For a similar problem I have used the match function with some manipulation to identify the index of the first appearance of each value, but is there a better and neater way to achieve the same thing?

Thanks,
Seungyeul Yoo
Postdoc Fellow, Institute of Genomics and Multiscale Biology
Department of Genetics and Genomic Sciences
Mount Sinai School of Medicine
Re: [R] How to set cookies in RCurl
Thanks for the fast response. I am not sure how to enter the proxy info in the call. I am working via EZProxy (which, I think, rewrites a URL). According to their website it does this:

1. Within the config.txt/ezproxy.cfg file, various hosts are identified that require access from a local IP address.
2. A remote user makes a web connection to port 2048 of your EZproxy server.
3. When the user authenticates successfully, a cookie is sent to the user's browser.
4. The user's browser presents this during each access to EZproxy.

So, for example, if I enter URL 1, EZproxy dynamically changes it to URL 2:

1. http://www.scopus.com/results/...
2. http://www-scopus-com.ezproxy.cul.columbia.edu/results/...

What kind of proxy information should I look for and where do I enter it in the call? Your help is very much appreciated. Thanks.

Duncan Temple Lang wrote

Apologies for following up on my own mail, but I forgot to explicitly mention that you will need to specify the appropriate proxy information in the call to getURLContent().

D.

On 6/7/12 8:31 AM, Duncan Temple Lang wrote:

To just enable cookies and their management, use the cookiefile option, e.g.

txt = getURLContent(url, cookiefile = "")

Then you can pass this to readHTMLTable(), best done as

content = readHTMLTable(htmlParse(txt, asText = TRUE))

The function readHTMLTable() doesn't use RCurl and doesn't handle cookies.

D.

On 6/7/12 7:33 AM, mdvaan wrote:

Hi, I am trying to access a website and read its content. The website is a restricted access website that I access through a proxy server (which therefore requires me to enable cookies). I have problems getting RCurl to receive and send cookies. The following lines give me:

library(RCurl)
library(XML)
url <- "http://www.theurl.com";
content <- readHTMLTable(url)
content

$`NULL`
  V1
1
2 Cookies disabled
3
4 Your browser currently does not accept cookies.\rCookies need to be enabled for Scopus to function properly.\rPlease enable session cookies in your browser and try again.

$`NULL`
  V1 V2 V3
1

$`NULL`
  V1
1 Cookies disabled

$`NULL`
  V1
1
2
3

I have carefully read section 4.4 of http://www.omegahat.org/RCurl/RCurlJSS.pdf and tried the following without success:

curl <- getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', curl = curl)

Any suggestions on how to allow for cookies? Thanks.

Math
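Putting Duncan's two suggestions together, a minimal sketch might look like the following. The function name fetch_tables and its proxy argument are hypothetical, and whether EZproxy needs a proxy option at all (rather than the rewritten URL plus a login step) is not settled in this thread, so treat that part as an assumption. Nothing is fetched until the function is called.

```r
# Hypothetical helper: fetch a page with cookie handling enabled, then
# parse its HTML tables. Requires the RCurl and XML packages when called.
fetch_tables <- function(url, proxy = NULL) {
  stopifnot(requireNamespace("RCurl", quietly = TRUE),
            requireNamespace("XML", quietly = TRUE))
  # cookiefile = "" turns on libcurl's in-memory cookie engine;
  # followlocation lets redirects (e.g. after a login page) be followed
  h <- RCurl::getCurlHandle(cookiefile = "", followlocation = TRUE)
  # Assumption: only set a proxy if your site really uses an HTTP proxy
  if (!is.null(proxy)) RCurl::curlSetOpt(proxy = proxy, curl = h)
  txt <- RCurl::getURLContent(url, curl = h)
  # readHTMLTable() does not use RCurl, so hand it the text already fetched
  XML::readHTMLTable(XML::htmlParse(txt, asText = TRUE))
}
```

Reusing one handle for every request keeps the cookies received on the first request available to the later ones.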
Re: [R] Trouble with Functions
On 2012-06-06 12:45, dougmcintosh wrote:

Haha no, TextWrangler. And that was definitely it... I think what was happening is that when I opened the text version of the book it opened in Notepad, which probably opened the txt file as RTF. Then I copied and pasted the function code into TextWrangler and didn't even think about Smart Quotes. So I used the Straighten Quotes feature. It got all the way through to the last line, where I got an unexpected string error:

Error in source("/Documents/score.txt") :
  /Documents/score.txt:13:25: unexpected INCOMPLETE_STRING
32: return(scores.df)
33: }

Is there a debug version I could be running or something that lists more descriptive error explanations? That way I don't have to bother you guys and embarrass myself so. :-)

You don't need debug - you just need to get the original file in plain text mode. All this rich text crud may look pretty, but it's the text equivalent of chartjunk.

Anyway, the INCOMPLETE_STRING error is a pretty good hint that R is reading part of your input as a string and that it doesn't find a closing quote to match an opening quote. I'm pretty sure that TextWrangler will have replaced all single smart quotes appropriately, but it will have missed the double quotes in the three gsub() lines like this one:

sentence = gsub(‘[[:punct:]]’, ”, sentence)

where the double quote actually started life as a pair of single quotes. Thus you have an unequal number of double quotes still in your input to source(). Just replace them by hand.

Peter Ehlers

[snip]
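To illustrate Peter's diagnosis, here is a small base-R sketch. The helper straighten_quotes is hypothetical (it is not what TextWrangler does internally); it just replaces curly quotes with straight ones so that the leftover unbalanced double quote becomes visible.

```r
# Hypothetical helper: replace curly single and double quotes by straight ones.
straighten_quotes <- function(x) {
  x <- gsub("[\u2018\u2019]", "'", x)    # left/right single quotes
  x <- gsub("[\u201C\u201D]", "\"", x)   # left/right double quotes
  x
}

# The line from the post: two curly single quotes, then ONE curly double
# quote that was originally a pair of single quotes.
bad   <- "sentence = gsub(\u2018[[:punct:]]\u2019, \u201D, sentence)"
fixed <- straighten_quotes(bad)

# Count the straight double quotes: an odd count means an unterminated string.
n_dq <- lengths(regmatches(fixed, gregexpr("\"", fixed, fixed = TRUE)))

# parse() fails on the straightened line for exactly that reason.
res <- tryCatch(parse(text = fixed), error = function(e) conditionMessage(e))
```

Replacing the lone double quote by '' (two single quotes) by hand, as Peter suggests, makes the line parse again.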
Re: [R] select subrows based on a specific column in a matrix
Hello,

You should post a data example, like the posting guide says. If your dataset is large, use something like

dput(head(dat, 20)) # paste the output of this in your post.

where 'dat' is your dataset. Now, try

# make up some data
set.seed(12)
dat <- matrix(c(sort(rnorm(10)), sample(letters[1:4], 10, TRUE)), ncol=2)
colnames(dat) <- c("A", "col10")
dat

# this does it
ix <- as.logical(ave(seq_len(nrow(dat)), dat[, "col10"], FUN=function(x) ifelse(x == min(x), TRUE, FALSE)))
dat[ix, ] # rows 1, 2, 4, 6

Hope this helps,
Rui Barradas

On 07-06-2012 17:07, Seungyeul Yoo wrote:

[snip]
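For the archive, a shorter base-R alternative (not posted in the thread) is duplicated(), which flags every repeat of a value after its first appearance. The sketch below uses made-up data and assumes my_mat is a data frame with a column named col10, as the `$` in the original post suggests.

```r
# Made-up data standing in for the poster's ranked table
set.seed(12)
my_mat <- data.frame(score = sort(rnorm(10), decreasing = TRUE),
                     col10 = sample(letters[1:4], 10, TRUE))

# TRUE for the first appearance of each identifier, FALSE for repeats
keep <- !duplicated(my_mat$col10)

# Logical row index: keeps one row per identifier, in the original order
sub_mat <- my_mat[keep, ]

# Same rows as the match()-based approach mentioned in the question
first_idx <- match(unique(my_mat$col10), my_mat$col10)
```

For a true matrix the equivalent is my_mat[!duplicated(my_mat[, "col10"]), , drop = FALSE]. The NA rows in the original attempt are probably because my_mat[tmp, ] used the identifier values themselves as row indices or row names rather than as a filter.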
Re: [R] non ascill characters in plots. no alternative but plotmath?
I think the problem is with fonts and encodings. The pdf device is using a different font and/or encoding than the screen device, and so the non-ASCII characters look different. If you can convince the pdf driver to use the same font and encoding then the symbols/characters in the plot should look the same, but my personal experience (not knowing font and encoding details very well) is that using plotmath or David's suggestions would probably be easier (and more useful in the long run).

A couple of other options which are probably more work than plotmath in general (but may be better for some specific cases) are:

Use the tikz device and then process using LaTeX to get the pdf file (this way you have the full power of LaTeX and fonts match text).

Make bitmap images of the symbols you want to use, convert them to rasters and use rasterImage to add them to the plot.

Find points that when connected by lines will draw your image, then use the my.symbols function (TeachingDemos package) to add them to the plot.

On Wed, Jun 6, 2012 at 5:23 PM, Paul Johnson pauljoh...@gmail.com wrote:

A student entered some data with text characters like epsilon and alpha. On her Windows system, the Greek letters did not display properly in a plot. There were some ordinary ASCII characters instead. I asked her to send me the code so I could test. For me, the plot looks OK on the screen.

Format1 <- c(320,500,700,1000,500,320,700,500,320)
Format2 <- c(800,1000,1150,1400,1500,1650,1800,2300,2500)
Vowel <- c("u","o", "α", "a","ø", "y", "ε", "e","i")
V1 <- data.frame(Format1,Format2,Vowel)
plot(Format1 ~ Format2, data = V1, type="n")
text(V1$Format2, V1$Format1, labels=V1$Vowel)

On my Debian linux system, the plot shows the Greek letters just fine in the screen device. However, I turned on a pdf device to run the same code and see signs of trouble.

> text(V1$Format2, V1$Format1, labels=V1$Vowel)
Warning messages:
1: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'α' in 'mbcsToSbcs': dot substituted for <ce>
2: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'α' in 'mbcsToSbcs': dot substituted for <b1>
3: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  font metrics unknown for Unicode character U+03b1
4: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'α' in 'mbcsToSbcs': dot substituted for <ce>
5: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'α' in 'mbcsToSbcs': dot substituted for <b1>
6: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'ε' in 'mbcsToSbcs': dot substituted for <ce>
7: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'ε' in 'mbcsToSbcs': dot substituted for <b5>
8: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  font metrics unknown for Unicode character U+03b5
9: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'ε' in 'mbcsToSbcs': dot substituted for <ce>
10: In text.default(V1$Format2, V1$Format1, labels = V1$Vowel) :
  conversion failure on 'ε' in 'mbcsToSbcs': dot substituted for <b5>

The alpha and epsilon characters don't appear in the pdf. I don't know the proper terminology to describe the situation, thus I don't know where to start reading. Until very recently, I didn't even know it was possible to directly enter these characters in Emacs, but I've learned that part.

I understand you might answer "use plotmath", and if that's the only workable thing, I will teach her how. But that's a little bit of an uphill climb (from where we are now standing). It will be a lot more work for me to teach about expressions and whatnot, so if there is a direct route from a column of non-ASCII characters to a plot that has those characters in it, I'd be glad to know.

pj

--
Paul E. Johnson
Professor, Political Science
Assoc. Director, Center for Research Methods
1541 Lilac Lane, Room 504
University of Kansas
http://pj.freefaculty.org
http://quant.ku.edu

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
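Since plotmath keeps coming up as the easier route, here is a minimal sketch of what it could look like for the data in Paul's post. The lookup vectors greek_chars and greek_syms are my own illustration, not something from the thread; the idea is to swap only the two Greek letters for plotmath symbols and leave the other labels as plain strings.

```r
Format1 <- c(320, 500, 700, 1000, 500, 320, 700, 500, 320)
Format2 <- c(800, 1000, 1150, 1400, 1500, 1650, 1800, 2300, 2500)
# Same vowels as in the post, written with \u escapes (alpha, o-slash, epsilon)
Vowel <- c("u", "o", "\u03b1", "a", "\u00f8", "y", "\u03b5", "e", "i")

# Illustrative lookup: Greek characters and their plotmath symbols
greek_chars <- c("\u03b1", "\u03b5")
greek_syms  <- list(quote(alpha), quote(epsilon))

# Start from plain strings and swap in symbols where a Greek letter occurs
labs <- as.list(Vowel)
idx  <- match(Vowel, greek_chars)
labs[!is.na(idx)] <- greek_syms[idx[!is.na(idx)]]
labs <- as.expression(labs)

pdf(file = NULL)   # discards the output; give a file name to keep the PDF
plot(Format1 ~ Format2, type = "n")
text(Format2, Format1, labels = labs)
invisible(dev.off())
```

The student's data column can stay as typed; only the labels passed to text() change.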
[R] extracting values from txt with regular expression
Thanks for your suggestions. Bert, your response made me aware of regular expressions. Are regular expressions the same across various languages? Consider the following line of text:

txt_line <- "PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"

It seems Python uses the following line of code to extract the two values in txt_line and store them in a variable called v:

v = re.findall("[+-]? *(?:\d+(?:\.\d*)|\.\d+)(?:[eE][+-]?\d+)?", line)
#v[0] 0.01
#v[1] -0.05

I tried something similar in R by using the same regular expression, but got an error:

edm <- grep("[+-]? *(?:\d+(?:\.\d*)|\.\d+)(?:[eE][+-]?\d+)?", txt_line)
#Error: '\d' is an unrecognized escape in character string starting "[+-]? *(?:\d"

I'm not even sure which function in R most efficiently extracts the values from txt_line. Basically, I want to peel out the values and think I can use the decimal point to construct the regular expression, but don't know where to go from here.
Re: [R] extracting values from txt with regular expression
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of emorway
Sent: Thursday, June 07, 2012 10:41 AM
To: r-help@r-project.org
Subject: [R] extracting values from txt with regular expression

[snip]

I am a regular expression novice, but the error message you are receiving is the result of not doubling the backslashes in your regular expression pattern. The backslash needs to be escaped. So this will get you close to what you want (although not necessarily efficiently).

ndx <- gregexpr("[+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?", txt_line)
matched <- regmatches(txt_line, ndx)
matched

Hope this is helpful,
Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
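For completeness, a self-contained version of the approach above that also converts the matches to numbers. Using perl = TRUE is my choice here (so that the Perl-style (?:...) groups and \d are handled by the PCRE engine); the thread's code leaves the default engine in place.

```r
txt_line <- "PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"

# Same pattern as in the thread; every backslash is doubled because R first
# interprets escapes inside the string literal, then hands \d etc. to the
# regex engine. Note that this pattern only matches numbers containing a
# decimal point.
pat <- "[+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?"

ndx     <- gregexpr(pat, txt_line, perl = TRUE)
matched <- regmatches(txt_line, ndx)   # list with one element per input string
vals    <- as.numeric(unlist(matched))
vals
```

gregexpr() plus regmatches() is the closest base-R counterpart to Python's re.findall(); grep() only reports which elements match and does not return the matched text.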
Re: [R] select subrows based on a specific column in a matrix
Dear Rui,

Thank you so much. Yes, that function is what I wanted. I will make sure I post a data example next time. Thank you again for your help.

Best,
Seungyeul

On Jun 7, 2012, at 12:50 PM, Rui Barradas wrote:

[snip]
Re: [R] How do I obtain the current active path of a function that's being called?
Wow, even those of us who have been using S for more than 25 years (and R since well before version 1.0) still have things to learn since R keeps improving. So I stand corrected (well, sit actually) on the part about not keeping track of this sort of thing.

On Thu, Jun 7, 2012 at 3:04 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 12-06-05 4:58 PM, Michael wrote:

Hi all, How do I obtain the current active path of a function that's being called? That's to say, I have several source files and they all contain a definition of function A. I would like to figure out which function A, and from which file, is the one that's being called and is currently active?

You've had lots of good suggestions so far. One more possibility: getSrcFilename and the related functions in the same help topic will usually tell you the filename and other location information for functions that you source().

Duncan Murdoch

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
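A tiny sketch of Duncan's suggestion, using a throwaway file so it is self-contained. The function name A follows the question; the temp file is made up. keep.source = TRUE is passed explicitly because source references are not kept by default in non-interactive sessions.

```r
# Write a throwaway source file that defines A
f <- tempfile(fileext = ".R")
writeLines("A <- function() 'defined in the temp file'", f)

# Keep source references so the location can be looked up afterwards
source(f, keep.source = TRUE)

src_file <- utils::getSrcFilename(A)                     # file name only
src_path <- utils::getSrcFilename(A, full.names = TRUE)  # with its path
src_file
```

If several files define A, whichever was sourced last is the one in the workspace, and this reports that file.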
Re: [R] extracting values from txt file that follow user-supplied quote
Hello,

I've just read your follow-up question on regular expressions, and I believe this, your original problem, can be made much faster. Just use readLines() differently, reading a large number of text lines at a time. For this to work you will still need to know the total number of lines in the file.

fun <- function(con, pattern, nlines, n=5000L){
  if(is.character(con)){
    con <- file(con, open="rt")
    on.exit(close(con))
  }
  passes <- nlines %/% n
  remaining <- nlines %% n
  res <- NULL
  for(i in seq_len(passes)){
    txt <- readLines(con, n=n)
    res <- c(res, as.numeric(substr(txt[grepl(pattern, txt)], 70, 78)))
  }
  if(remaining){
    txt <- readLines(con, n=remaining)
    res <- c(res, as.numeric(substr(txt[grepl(pattern, txt)], 70, 78)))
  }
  res
}

url <- "http://r.789695.n4.nabble.com/file/n4632558/MCR.out"
pat <- "PERCENT DISCREPANCY ="
num_lines <- 14405247L

# your original
txt_con <- file(description=url, open="r")
pd <- NULL
t1 <- system.time(
  for(i in 1:num_lines){
    txt_line <- readLines(txt_con, n=1)
    if (length(grep(pat, txt_line))) {
      pd <- c(pd, as.numeric(substr(txt_line, 70, 78)))
    }
  }
)
close(txt_con)

# the function above, increased 'n'
t2 <- system.time(pd2 <- fun(url, pat, num_lines, 10L))

all.equal(pd, pd2)
[1] TRUE

rbind(original=t1, fun=t2, ratio=t1/t2)
         user.self sys.self  elapsed user.child sys.child
original    780.16   196.16 981.9100         NA        NA
fun           0.10     0.04   3.2000         NA        NA
ratio      7801.60  4904.00 306.8469         NA        NA

A factor of 300.

Hope this helps,
Rui Barradas

On 06-06-2012 17:54, emorway wrote:

useRs-

I'm attempting to scan a text file of more than 1 Gb and read and store the values that follow a specific key phrase that is repeated multiple times throughout the file. A snippet of the text file I'm trying to read is attached. The text file is a dumping ground for various aspects of the performance of the model that generates it. Thus, the information I want to extract is not in a fixed position (i.e. it does not always appear in a predictable location, like line 1000, or 2000, etc.). Rather, the desired values always follow a specific phrase: "PERCENT DISCREPANCY ="

One approach I took was the following:

library(R.utils)
txt_con <- file(description="D:/MCR_BeoPEST - Copy/MCR.out", open="r")
#The above will need to be altered if one desires to test code on the attached txt file, which will run much quicker
system.time(num_lines <- countLines("D:/MCR_BeoPEST - Copy/MCR.out"))
#elapsed time on full 1Gb file took about 55 seconds on a 3.6 GHz Xeon
num_lines
#14405247
system.time(
  for(i in 1:num_lines){
    txt_line <- readLines(txt_con, n=1)
    if (length(grep("PERCENT DISCREPANCY =", txt_line))) {
      pd <- c(pd, as.numeric(substr(txt_line, 70, 78)))
    }
  }
)
#Time took about 5 minutes

The inefficiencies in this approach arise from reading the file twice (first to get num_lines, then to step through each line looking for the desired text). Is there a way to speed this process up through the use of ?scan? I wasn't able to get anything working, but what I had in mind was to scan through the file and, when the key phrase (e.g. "PERCENT DISCREPANCY =") is encountered, read and store the next 13 characters (which will include some white space) as a numeric value, then resume the scan until the key phrase is encountered again, and repeat until the end-of-file marker is encountered. Is such an approach even possible or is line-by-line the best bet?

http://r.789695.n4.nabble.com/file/n4632558/MCR.out MCR.out
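A variation on the chunked reading above that does not need the line count up front: keep calling readLines() until it returns nothing. This is only a sketch; the function name extract_after and the chunk size are made up, the phrase is assumed to contain no regex metacharacters, and when a line holds the phrase more than once only the value after the last occurrence is taken (the thread's code uses fixed columns 70-78 instead).

```r
# Hypothetical helper: collect the number following `phrase` on each
# matching line, reading the file in chunks of `chunk` lines.
extract_after <- function(path, phrase, chunk = 10000L) {
  con <- file(path, open = "rt")
  on.exit(close(con))
  out <- numeric(0)
  repeat {
    txt <- readLines(con, n = chunk)
    if (length(txt) == 0L) break            # end of file reached
    hits <- txt[grepl(phrase, txt, fixed = TRUE)]
    if (length(hits)) {
      # keep only the numeric token that follows the phrase
      vals <- sub(paste0(".*", phrase, "\\s*([-+0-9.eE]+).*"), "\\1", hits)
      out <- c(out, as.numeric(vals))
    }
  }
  out
}

# Small made-up file to try it on
tmp <- tempfile(fileext = ".out")
writeLines(c("header line",
             "   PERCENT DISCREPANCY =   0.01",
             "some other output",
             "PERCENT DISCREPANCY = -0.05"), tmp)
pd <- extract_after(tmp, "PERCENT DISCREPANCY =", chunk = 2L)
pd
```

This reads the file once, so the separate countLines() pass is no longer needed.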
[R] Quantile regression: Discrepencies Between optimizer and rq()
Hello Everyone,

I'm currently learning about quantile regression. I've been using an optimizer to compare with the rq() command for quantile regression. When I run the code, the results show that my slope coefficients are consistent with rq(), but the intercept term can vary by a lot. I don't think my optimizer code is wrong and suspect it has something to do with the starting values. The results seem very sensitive to different starting values and I don't know how to make sense of it. Advice from the community would be greatly appreciated.

Sincerely,
Kevin Chang

## CODE Below ###
library(quantreg)
data(engel)
y <- cbind(engel[,2])
x <- cbind(rep(1,length(engel[,1])),engel[,1])
x1 <- cbind(engel[,1])
nn <- nrow(engel)
nn
bhat.ls <- solve(t(x)%*%x)%*%t(x)%*%y
#bhat.ls

# QUANTILES
quant=.25
fr.1=function(bhat.opt) {
  uu=y-x%*%bhat.opt
  sample.cond.quantile=quantile(uu,quant)
  w.less=rep(0,nn)
  for(ii in 1:nn){if(uu[ii]<sample.cond.quantile) w.less[ii]=1}
  sum((quant-1)*sum((y-x%*%bhat.opt)*w.less) #negative residuals
      +quant*sum((y-x%*%bhat.opt)*(1-w.less))) #positive residuals
}
start <- c(0,0)
result=optim(start,fr.1)
bhat.cond=result$par

#Quantile Command Results
fit.temp=rq(y~x1,tau=quant)
fit.temp

#OPTIMIZER Results
bhat.cond

#OLS Command Results
mean=lm(y~x1)
mean
Re: [R] Quantile regression: Discrepencies Between optimizer and rq()
optim() by default uses Nelder-Mead, which is an extremely poor way to do linear programming, despite the fact that ?optim says: "It will work reasonably well for non-differentiable functions."

I didn't check your coding of the objective function fully, but at the very least you should explicitly pass the arguments y, x, and quant, and you need to replace what you call sample.cond.quantile by 0 in the definition of w.less.

url: www.econ.uiuc.edu/~roger    Roger Koenker
email rkoen...@uiuc.edu          Department of Economics
vox: 217-333-4558                University of Illinois
fax: 217-244-6678                Urbana, IL 61801

On Jun 7, 2012, at 1:49 PM, Kevin Chang wrote:

[snip]
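A sketch of the objective with both of the corrections suggested above: residuals are split at 0 rather than at their own sample quantile, and y, X and tau are passed in explicitly. The function name check_loss and the intercept-only example are my own illustration; rq() itself solves the problem by linear programming rather than by a search like this.

```r
# Quantile-regression ("check") loss: sum of u * (tau - I(u < 0)),
# where u are the residuals for coefficient vector beta.
check_loss <- function(beta, y, X, tau) {
  u <- as.vector(y - X %*% beta)
  sum(u * (tau - (u < 0)))
}

# Intercept-only illustration: the minimiser is a sample quantile.
set.seed(1)
n   <- 101
tau <- 0.25
y   <- rnorm(n)
X   <- matrix(1, n, 1)

# The loss is piecewise linear, so a minimum occurs at a data point;
# evaluate it at every observation and keep the best one.
cand <- sort(y)
loss <- vapply(cand, function(b) check_loss(b, y, X, tau), numeric(1))
best <- cand[which.min(loss)]

best
unname(quantile(y, tau, type = 1))
```

With a regressor added, the same function can be handed to optim() as optim(start, check_loss, y = y, X = X, tau = tau), though Nelder-Mead remains a poor fit for this kind of problem, as noted above.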
Re: [R] Error in installing packages
Thank you for helping me to solve this question!
[R] R2wd error in wdGet
Dear list,

I'm trying to use the R2wd package. I've installed the package and tried wdGet(). However, an error message came up. I'm presently using R 2.15.0.

> wdGet()
Error in if (wdapp[["Documents"]][["Count"]] == 0) wdapp[["Documents"]]$Add() :
  argument is of length zero

Does anyone know what this means? Thanks a lot.

Andreia Leite
[R] how lm behaves
I was wondering if somebody could explain why I get different results here:

treats[,2] <- as.factor(treats[,2])
treats[,5] <- as.factor(treats[,5])
treats[,4] <- as.factor(treats[,4])
#there are 'c' on more days than I have 'h2o2', where treats[,4] is the day. I only want 'c' that correspond to the same days that I have a 'h2o2' also.
z <- treats[,3] == 'h2o2'
x <- treats[,4] %in% treats[z,4]
a <- treats[,3] == 'c'
aa <- which(a)
xx <- which(x)
zz <- which(z)
aa <- intersect(aa, xx)
aa <- c(aa, zz)
a <- count[aa]
x <- as.vector(treats[aa,2])
y <- as.vector(treats[aa,4])
b <- as.vector(treats[aa,5])
data1 <- cbind(a,x,y,b)
data1 <- as.data.frame(data1)
data1[,'a'] <- as.integer(levels(data1[,'a'])[data1[,'a']])
mo2 <- lm(count[aa]~treats[aa,2]*treats[aa,4]*treats[aa,5] - treats[aa,2]:treats[aa,4]:treats[aa,5])
summary(mo2)

Call:
lm(formula = count[aa] ~ treats[aa, 2] * treats[aa, 4] * treats[aa, 5] -
    treats[aa, 2]:treats[aa, 4]:treats[aa, 5])

Residuals:
    Min      1Q  Median      3Q     Max
-70.000 -22.244   0.422  17.292  70.000

Coefficients: (13 not defined because of singularities)
                                     Estimate Std. Error t value Pr(>|t|)
(Intercept)                         3.955e+02  4.038e+01   9.792 1.77e-09 ***
treats[aa, 2]11                     1.600e+00  4.860e+01   0.033 0.974034
treats[aa, 2]12                    -8.200e+01  4.860e+01  -1.687 0.105692
treats[aa, 4]5.15                  -2.279e+02  5.303e+01  -4.298 0.000292 ***
treats[aa, 4]5.2                   -5.033e+01  5.303e+01  -0.949 0.352838
treats[aa, 4]5.21                   2.111e+01  5.303e+01   0.398 0.694384
treats[aa, 4]5.29                  -4.922e+01  5.303e+01  -0.928 0.363360
treats[aa, 4]6.11                   1.016e+01  5.941e+01   0.171 0.865787
treats[aa, 4]6.17                  -9.518e+01  5.941e+01  -1.602 0.123390
treats[aa, 4]6.18                   5.566e+01  5.941e+01   0.937 0.358971
treats[aa, 4]6.5                    5.249e+01  5.941e+01   0.884 0.386458
treats[aa, 5]5.7                   -8.988e-14  4.860e+01   0.000 1.00
treats[aa, 5]38                    -2.554e+02  4.860e+01  -5.255 2.85e-05 ***
treats[aa, 5]570                   -4.009e+02  5.031e+01  -7.969 6.29e-08 ***
treats[aa, 2]11:treats[aa, 4]5.15  -4.100e+01  5.809e+01  -0.706 0.487713
treats[aa, 2]12:treats[aa, 4]5.15   1.297e+02  5.809e+01   2.232 0.036103 *
treats[aa, 2]11:treats[aa, 4]5.2   -6.300e+01  5.809e+01  -1.085 0.289869
treats[aa, 2]12:treats[aa, 4]5.2    2.740e+02  5.809e+01   4.717 0.000105 ***
treats[aa, 2]11:treats[aa, 4]5.21   5.667e+00  5.809e+01   0.098 0.923172
treats[aa, 2]12:treats[aa, 4]5.21   1.170e+02  5.809e+01   2.014 0.056382 .
treats[aa, 2]11:treats[aa, 4]5.29  -1.647e+02  5.809e+01  -2.835 0.009643 **
treats[aa, 2]12:treats[aa, 4]5.29  -7.667e+00  5.809e+01  -0.132 0.896199
treats[aa, 2]11:treats[aa, 4]6.11   6.577e+01  7.433e+01   0.885 0.385801
treats[aa, 2]12:treats[aa, 4]6.11   4.775e+01  7.433e+01   0.642 0.527269
treats[aa, 2]11:treats[aa, 4]6.17   3.627e+01  7.433e+01   0.488 0.630377
treats[aa, 2]12:treats[aa, 4]6.17   2.725e+01  7.433e+01   0.367 0.717427
treats[aa, 2]11:treats[aa, 4]6.18  -9.073e+01  7.433e+01  -1.221 0.235193
treats[aa, 2]12:treats[aa, 4]6.18  -1.553e+02  7.433e+01  -2.089 0.048534 *
treats[aa, 2]11:treats[aa, 4]6.5   -1.257e+02  7.433e+01  -1.691 0.104888
treats[aa, 2]12:treats[aa, 4]6.5   -1.507e+02  7.433e+01  -2.028 0.054838 .
treats[aa, 2]11:treats[aa, 5]5.7   -1.840e+01  4.500e+01  -0.409 0.686546
treats[aa, 2]12:treats[aa, 5]5.7   -5.960e+01  4.500e+01  -1.325 0.198909
treats[aa, 2]11:treats[aa, 5]38     9.560e+01  4.500e+01   2.125 0.045092 *
treats[aa, 2]12:treats[aa, 5]38     2.860e+01  4.500e+01   0.636 0.531583
treats[aa, 2]11:treats[aa, 5]570    9.525e+01  5.031e+01   1.893 0.071534 .
treats[aa, 2]12:treats[aa, 5]570    2.255e+02  5.031e+01   4.483 0.000186 ***
treats[aa, 4]5.15:treats[aa, 5]5.7  8.767e+01  5.809e+01   1.509 0.145483
treats[aa, 4]5.2:treats[aa, 5]5.7   4.333e+00  5.809e+01   0.075 0.941209
treats[aa, 4]5.21:treats[aa, 5]5.7  4.200e+01  5.809e+01   0.723 0.477281
treats[aa, 4]5.29:treats[aa, 5]5.7 -5.700e+01  5.809e+01  -0.981 0.337138
treats[aa, 4]6.11:treats[aa, 5]5.7         NA         NA      NA       NA
treats[aa, 4]6.17:treats[aa, 5]5.7         NA         NA      NA       NA
treats[aa, 4]6.18:treats[aa, 5]5.7         NA         NA      NA       NA
treats[aa, 4]6.5:treats[aa, 5]5.7          NA         NA      NA       NA
treats[aa, 4]5.15:treats[aa, 5]38   9.500e+01  5.809e+01   1.635 0.116190
treats[aa, 4]5.2:treats[aa, 5]38   -2.433e+01  5.809e+01  -0.419 0.679354
treats[aa, 4]5.21:treats[aa, 5]38  -9.633e+01  5.809e+01  -1.658 0.111434
treats[aa, 4]5.29:treats[aa, 5]38   2.067e+01  5.809e+01   0.356 0.725398
treats[aa, 4]6.11:treats[aa, 5]38          NA         NA      NA       NA
treats[aa, 4]6.17:treats[aa, 5]38
[R] Re-creating distributions
Dear All,

I often have to work with models in which I try to reproduce a distribution as well as I can from very little available information. Is there a package or function in R that could best reproduce a probability distribution using only the mean, median and SD, without knowing the actual distribution type to begin with and/or the covariance matrix (for more than one data set)? The mean, median and SD are usually all that is reported. I hope I made my question clear enough...

thanks,
Andras
[R] RJava: Error obtaining System.out
Hello,

Any idea why trying to obtain System.out in rJava does not work?

library(rJava)
.jinit()
s <- .jnew("java/lang/String", "Hello World!")
.jcall(s, "I", "length")
[1] 12
systemOut <- .jfield("java/lang/System", "Ljava/io/PrintStream", "out")
Error in .jfield("java/lang/System", "Ljava/io/PrintStream", "out") :
  RgetField: field out not found

Thanks!
Take care
Oliver

--
Oliver Ruebenacker
Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Knowomics, The Bioinformatics Network (http://www.knowomics.com)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
[R] degrees of freedom for contrast
Hi,

I need some help to figure out the df I should use in the t test for my contrast. I have 5 treatments and 5 phenotypes. I would like to compute the difference of treatment means for each phenotype and do a t test, such as treatment1 vs treatment2 on phenotype1. How should I calculate the pooled degrees of freedom for the t tests of all the contrasts?

Thank you very much.
Qian

mylong.lme <- lme(dscore~Trt.Pheno-1, data=mylong, random=~1 | ID, method="ML")
summary(mylong.lme)

Linear mixed-effects model fit by maximum likelihood
 Data: mylong
       AIC      BIC    logLik
  14789.14 14949.83 -7367.571

Random effects:
 Formula: ~1 | ID
        (Intercept) Residual
StdDev:     1.40765 3.039555

Fixed effects: dscore ~ Trt.Pheno - 1
                    Value Std.Error   DF   t-value p-value
TrtPheno1_1 :   -2.516975 0.2788703 2412 -9.025613  0.
Trt.Pheno2_1 :  -1.172767 0.3781179 2412 -3.101590  0.0019
Trt.Pheno3_1 :  -0.810177 0.2869447 2412 -2.823459  0.0048
Trt.Pheno4_1 :  -1.518063 0.2791157 2412 -5.438830  0.
Trt.Pheno5_1 :  -0.367947 0.3564081 2412 -1.032377  0.3020
...

coef <- fixed.effects(mylong.lme)
covmat <- mylong.lme$varFix
c # my contrast matrix
mycontr.est <- c %*% coef
mycontr.var <- c %*% covmat %*% t(c)
t2 <- t(mycontr.est) %*% solve(mycontr.var) %*% mycontr.est
P <- 2*pt(sqrt(abs(t2)), df=)

version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          14.0
year           2011
month          10
day            31
svn rev        57496
language       R
version.string R version 2.14.0 (2011-10-31)
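For a single contrast among these fixed effects, one common approach (an assumption here, not something stated in the thread) is to reuse the denominator DF that lme reports for the coefficients involved, 2412 in the summary above, and to take the lower tail of the t distribution, since 2*pt() of a positive statistic always exceeds 1. The sketch below uses the first two estimates and standard errors from the summary. The zero covariance between them is assumed purely for illustration, and contrast_t is a hypothetical helper, not part of nlme.

```r
# Two-sided t test for one linear contrast of fixed effects.
# beta: coefficient vector, V: its covariance matrix,
# cvec: contrast weights, df: denominator degrees of freedom.
contrast_t <- function(beta, V, cvec, df) {
  est  <- sum(cvec * beta)                      # contrast estimate
  se   <- sqrt(drop(t(cvec) %*% V %*% cvec))    # its standard error
  tval <- est / se
  p    <- 2 * pt(-abs(tval), df = df)           # lower tail, so p <= 1
  c(estimate = est, se = se, t = tval, df = df, p = p)
}

# Estimates and standard errors copied from the summary above.
beta <- c(-2.516975, -1.172767)
# ASSUMPTION: zero covariance between the two estimates (illustration only).
V <- diag(c(0.2788703, 0.3781179)^2)

res <- contrast_t(beta, V, cvec = c(1, -1), df = 2412)
print(res)
```

In a real analysis the full mylong.lme$varFix matrix would replace the diagonal V. For a contrast matrix with several rows, the quadratic form t2 is a Wald-type statistic rather than a squared t, so it would usually be referred to an F or chi-square distribution instead.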
[R] Query regarding SVD of binary matrix:
Hello,

I have a binary matrix of 80k sets (each set is a combination of cities) by 885 cities (dimension = 80k x 885). In the matrix, 1 means the city is part of the set and 0 means it is not. Sets are rows and cities are columns (city.test). I want to do feature reduction to keep only the important sets (most likely 2-10 sets of city combinations) and the associated cities. So I chose SVD and I am following these steps, but I am not sure how to go about the next step. Could anyone help with this?

s <- svd(city.test)
D <- diag(s$d)
d2 <- (s$d)^2
ratio <- cumsum(d2/sum(d2)) # proportion of total variance from 885 PCs

Looking at the plots, I see that about the first ~10 or 20 PCs explain most of the variation (please see the attached plot). How do I use this to extract the most relevant sets from my original matrix? Could you please help?

A friend of mine recommended plotting rowSums(abs(s$u*s$d)) and choosing only the highest-magnitude sets. I didn't understand the significance of it. Most probably, it reflects that the first PC contributes the most, hence we only care about rowSums(abs(u*d)). Is this correct?

Thanks.

[Attachment: variance-cities.pdf]
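One way to make the friend's suggestion concrete is sketched below on a small random binary matrix standing in for city.test. The 80% cutoff, the choice of five sets and all object names are arbitrary assumptions. Note that s$u * s$d recycles d down the rows of U rather than scaling column j by d[j]; sweep() (or s$u %*% diag(s$d)) gives the intended scores. Because svd() is applied to the uncentred matrix, "variance" here really means share of the total sum of squares.

```r
set.seed(1)
# Toy stand-in for city.test: 30 sets (rows) by 8 cities (columns).
X <- matrix(rbinom(30 * 8, 1, 0.3), nrow = 30)

s  <- svd(X)
d2 <- s$d^2
ratio <- cumsum(d2 / sum(d2))      # cumulative share of total sum of squares
k <- which(ratio >= 0.8)[1]        # smallest k reaching 80% (arbitrary cutoff)

# Row scores on each component: column j of U scaled by d[j] (equals X %*% V).
scores <- sweep(s$u, 2, s$d, "*")

# Weight of each set across the first k components only.
row_weight <- rowSums(abs(scores[, 1:k, drop = FALSE]))
top_sets <- order(row_weight, decreasing = TRUE)[1:5]
print(top_sets)
```

Restricting the sum to the first k columns is what ties the ranking to the leading components; summing over all columns would weight every component, including the ones judged negligible.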
Re: [R] [r] par and complex graph
Hi Francesco,

No, I haven't tried... But if you have some code I can try.

Regards,
Carlos Ortega
www.qualityexcellence.es

2012/6/7 Francesco Nutini nutini.france...@gmail.com wrote:

Oh thank you Carlos! I wasted a lot of time formatting my xyplot in PowerPoint. Did you use a similar tip for ternaryplot (vcd)? Many thanks.

Regards,
Francesco

Date: Wed, 6 Jun 2012 17:08:39 +0200
Subject: Re: [R] [r] par and complex graph
From: c...@qualityexcellence.es
To: nutini.france...@gmail.com

Hi,

Sorry, layout is a parameter you should use when plotting several charts of the same nature. If you want to combine different lattice charts you should use print(), which has a method for trellis objects. Check the help details for print.trellis or consider this example:

p11 <- histogram( ~ height | voice.part, data = singer, xlab = "Height")
p12 <- densityplot( ~ height | voice.part, data = singer, xlab = "Height")
p2 <- histogram( ~ height, data = singer, xlab = "Height")
## simple positioning by split
print(p11, split=c(1,1,1,2), more=TRUE)
print(p2, split=c(1,2,1,2))
## Combining split and position:
print(p11, position = c(0,0,.75,.75), split=c(1,1,1,2), more=TRUE)
print(p12, position = c(0,0,.75,.75), split=c(1,2,1,2), more=TRUE)
print(p2, position = c(.5,.75,1,1), more=FALSE)

Regards,
Carlos Ortega
www.qualityexcellence.es

2012/6/6 Carlos Ortega c...@qualityexcellence.es wrote:

Hi Francesco,

The parameter in the lattice package that you can use to arrange several plots on the same page is layout:

xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
       data = iris, scales = "free", layout = c(2, 2),
       auto.key = list(x = .6, y = .7, corner = c(0, 0)))

Regards,
Carlos Ortega
www.qualityexcellence.es

2012/6/6 Francesco Nutini nutini.france...@gmail.com wrote:

Thank you Brian! So that's why sometimes I can't use par(). Now I'm using the ternaryplot in [vcd], so I have to read the vcd help to look for a function similar to par(). Many thanks.

Francesco

Date: Tue, 5 Jun 2012 19:01:25 +0100
From: rip...@stats.ox.ac.uk
To: nutini.france...@gmail.com
CC: r-help@r-project.org
Subject: Re: [R] [r] par and complex graph

On 05/06/2012 11:17, Francesco Nutini wrote:

Dear R-Users,
I'd like to have some tips about printing graphs. I use the command par to print more graphs in one window:
par(mfrow=c(6,1)); par(oma=c(2.5, 2.5, 2.5, 2.5)); par(mar=c(0.5, 4, 0.5, 0.5))
But this command doesn't work with complex graphics commands (i.e. xyplot, ternaryplot). How can I print more than one graph per page when I work with these more elaborate graphs?
Many thanks!
Francesco

xyplot does lattice (hence grid) plots: you need to read ?print.trellis to find out how to lay those out. par() applies only to base graphics.

As for ternaryplot: it depends which package you got it from (and there is more than one on CRAN).

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

That does mean you, too.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] cluster algorithm with fixed cluster size
Hi, okay, and which algorithm is it? I had a closer look at the manual and could not find it, but there is quite a number of methods in there, maybe I missed it. Thanks, Martin
[R] changing font to italic for one entry in legend()
Hello,

I need to change the font for one of the items (C. elegans) in my legend to italic. Can someone suggest how to accomplish this?

legend('bottomright', bty='n', c('C. elegans range', 'Study area'), cex=0.8, fill=c('light gray', 'white'), border=c('black','black'))

I tried using lab.font=c(1,3) but R ignored it and did not write the legend at all. Any advice would be great.

Thanks.
V
Re: [R] changing font to italic for one entry in legend()
Thanks Paul. That worked beautifully.

V

On Thu, Jun 7, 2012 at 7:46 PM, Paul Murrell p.murr...@auckland.ac.nz wrote:

Hi

On 8/06/2012 12:27 p.m., Vikram Chhatre wrote:

Hello, I need to change the font for one of the items (C. elegans) in my legend to italic. Can someone suggest how to accomplish this?

legend('bottomright', bty='n', c('C. elegans range', 'Study area'), cex=0.8, fill=c('light gray', 'white'), border=c('black','black'))

I tried using lab.font=c(1,3) but R ignored it and did not write the legend at all.

The help page suggests that text.font=c(3, 1) should work. If you just want the C. elegans to be italic, try something like ...

legend('bottomright', bty='n', expression(italic('C. elegans')*' range', 'Study area'), cex=0.8, fill=c('light gray', 'white'), border=c('black','black'))

Paul

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
Re: [R] Re-creating distributions
Short answer: no, those are (in general) insufficient parameters to characterize a distribution.

Long answer: unfortunately, it's not uncommon that those summary statistics are the only ones reported, based on someone or other's limited experience with the Gaussian. There are a few things you could try, but each of them has problems:

i) Pretend that your data are in fact normal and use those parameters, because they do uniquely characterize a normal distribution. MASS (among others) provides a multivariate normal distribution [mvrnorm] if you have a covariance matrix available.

ii) If you have reason to imagine another distribution [guided by domain knowledge], try to get its parameters, in so far as possible, by moment matching. Covariance structures are much harder for the general case though.

iii) If you can get something that resembles the original data, simply work by bootstrapping / imputation.

Hope this helps,
Michael

On Thu, Jun 7, 2012 at 3:34 PM, Andras Farkas motyoc...@yahoo.com wrote:

Dear All, I often have to work with models in which I try to reproduce a distribution as well as I can from very little available information. Is there a package or function in R that could best reproduce a probability distribution using only the mean, median and SD, without knowing the actual distribution type to begin with and/or the covariance matrix (for more than one data set)? The mean, median and SD are usually all that is reported. I hope I made my question clear enough... thanks, Andras
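Moment matching, as in option ii) of the reply, can be sketched for one assumed family. The example below picks a lognormal purely as an illustration (the family is an assumption, and the mean of 10 and SD of 5 are made-up values); lnorm_from_moments is a hypothetical helper. The median the fitted distribution implies can then be compared with the reported median as a rough check on whether the assumed family is plausible.

```r
# Moment-match a lognormal to a reported mean m and standard deviation s.
# ASSUMPTION: the lognormal family itself; nothing in the summary
# statistics guarantees it.
lnorm_from_moments <- function(m, s) {
  sigma2 <- log(1 + (s / m)^2)               # variance on the log scale
  c(meanlog = log(m) - sigma2 / 2,           # mean on the log scale
    sdlog   = sqrt(sigma2))
}

m <- 10; s <- 5                              # hypothetical reported values
p <- lnorm_from_moments(m, s)

# Median implied by the fitted lognormal; compare with the reported median.
implied_median <- exp(p[["meanlog"]])

# Simulate from the matched distribution.
set.seed(42)
x <- rlnorm(1e5, meanlog = p[["meanlog"]], sdlog = p[["sdlog"]])
print(c(mean = mean(x), sd = sd(x), median = median(x)))
```

If the reported median is far from implied_median, the lognormal assumption is doubtful and another family, or the bootstrap route in option iii), may be a better fit.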
Re: [R] extracting values from txt with regular expression
Hi Dan and Rui,

Thank you for the suggestions, both were very helpful. Rui's code was quite fast... there is one more thing I want to explore for my own edification, but first I need some help fixing the code below, which is a slight modification of Dan's suggestion. It'll no doubt be tough to beat the time in which Rui's code finished the task, but I'm willing to try.

First, I need to fix the following, which 'peels' the wrong bit of text from txt_line. Instead of extracting as it now does (shown below), can the code be modified to extract the values 0.01 and -0.05, and store them in the variable 'extracted'?

txt_line <- "PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"
extracted <- strsplit(gsub("[+-]?(?:\\d+(?:\\.\\d*)|\\.\\d+)(?:[eE][+-]?\\d+)?","\\1%",txt_line),"%")
extracted
#[1] PERCENT DISCREPANCY =PERCENT DISCREPANCY =
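One possible way to pull out the numbers rather than the text around them is to match the numeric pattern directly with gregexpr() and collect the matches with regmatches(), both in base R. This is a sketch rather than the approach Dan or Rui posted. It assumes txt_line is a single string, and it makes the fractional part optional so that plain integers would also match.

```r
txt_line <- "PERCENT DISCREPANCY = 0.01 PERCENT DISCREPANCY = -0.05"

# Signed decimal number with an optional exponent.
# perl = TRUE is used so that \\d and (?: ) are handled by PCRE.
num_pat <- "[+-]?(?:\\d+(?:\\.\\d*)?|\\.\\d+)(?:[eE][+-]?\\d+)?"

m <- gregexpr(num_pat, txt_line, perl = TRUE)   # positions of all matches
extracted <- as.numeric(regmatches(txt_line, m)[[1]])
print(extracted)
```

The original gsub() call replaced each number with "\\1%", but the pattern has no capturing group, so the numbers were dropped and only the surrounding text survived the strsplit().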
Re: [R] Re-creating distributions
Related comment: "Even the data aren't sufficient." -- Brian Joiner (some years ago).

Explanation: See W.E. Deming on analytic vs enumerative statistics.

--- Bert

On Thu, Jun 7, 2012 at 8:06 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

Short answer: no, those are (in general) insufficient parameters to characterize a distribution.

Long answer: unfortunately, it's not uncommon that those summary statistics are the only ones reported, based on someone or other's limited experience with the Gaussian. There are a few things you could try, but each of them has problems:

i) Pretend that your data are in fact normal and use those parameters, because they do uniquely characterize a normal distribution. MASS (among others) provides a multivariate normal distribution [mvrnorm] if you have a covariance matrix available.

ii) If you have reason to imagine another distribution [guided by domain knowledge], try to get its parameters, in so far as possible, by moment matching. Covariance structures are much harder for the general case though.

iii) If you can get something that resembles the original data, simply work by bootstrapping / imputation.

Hope this helps,
Michael

--
Bert Gunter
Genentech Nonclinical Biostatistics