[R] recursive function - finding connections
Dear all,

I'm having some problems getting my recursive function to work. At first I thought that maybe my data was too big, so I increased option(expressions=5). Then I thought I would try it on some smaller data. Still not working. :(

I would have thought there should be a function for this already, so any suggestions for other methods are welcome. I did try igraph but couldn't get cliques() to give anything useful. Also a quick play with hclust and cut, again nothing too useful.

Basically the function is trying to find uniquely connected subgraphs, i.e. sub-networks that are connected only within themselves and not to other nodes. If everything is connected then the list (connectList) should have length 1 and have every index in the 1st slot.

cheers,
Paul

    findconnection <- function(mat, cutoff){
      toList <- function(mat, connectList, cutoff, i, idx){
        idx <- which(mat[,idx] > cutoff)
        if(length(idx) >= 1){
          connectList[[i]] <- idx
          for(z in 1:length(idx)){
            connectList <- toList(mat, connectList, cutoff, i, idx[z])
          }
        }else{
          return(connectList)
        }
      }
      connectList <- list()
      for(i in 1:ncol(mat)){
        connectList <- toList(mat, connectList, cutoff, i, i)
      }
      return(unique(connectList))
    }

    foomat <- matrix(sample(c(1,0.5,0), 100, replace=T), nrow=10)  ## example data
    foomat
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]  0.0  0.5  0.0  0.5  0.5  0.0  0.5  1.0  0.5   0.0
     [2,]  0.0  1.0  1.0  0.0  0.0  1.0  0.0  1.0  0.5   1.0
     [3,]  1.0  1.0  1.0  1.0  0.5  0.0  0.5  0.5  0.5   0.5
     [4,]  0.0  0.5  0.0  0.0  0.5  0.5  0.5  0.0  1.0   0.0
     [5,]  0.5  0.5  1.0  1.0  0.5  1.0  1.0  0.5  0.5   0.5
     [6,]  0.0  0.5  0.0  0.5  0.5  0.5  0.5  0.5  1.0   1.0
     [7,]  1.0  1.0  0.0  1.0  0.0  0.5  1.0  1.0  0.5   0.5
     [8,]  0.5  1.0  0.0  0.5  1.0  0.0  1.0  0.0  0.0   0.0
     [9,]  0.0  0.5  0.0  0.0  0.5  0.0  0.5  0.0  0.5   0.5
    [10,]  1.0  1.0  0.5  1.0  0.0  1.0  0.0  0.0  0.0   0.5

    pb <- findconnection(foomat, 0.01)
    Error: C stack usage is too close to the limit
    Error during wrapup:

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
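The grouping described in this question can also be found without recursion, which sidesteps the C stack limit. The following is only a minimal sketch, assuming the matrix is square and treated as undirected (an entry above the cutoff in either direction links two nodes); find_components is a hypothetical helper name, not an existing R function.

```r
# Hypothetical helper: connected components by iterative breadth-first search.
# Assumption: mat is square and a link exists if mat[i, j] or mat[j, i] > cutoff.
find_components <- function(mat, cutoff) {
  adj <- (mat > cutoff) | (t(mat) > cutoff)   # symmetric adjacency matrix
  n <- ncol(adj)
  comp <- rep(NA_integer_, n)                 # component id per node
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(comp[start])) next             # node already assigned
    current <- current + 1L
    comp[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      nb <- which(adj[, node] & is.na(comp))  # unvisited neighbours
      comp[nb] <- current
      queue <- c(queue, nb)
    }
  }
  unname(split(seq_len(n), comp))             # list of index vectors
}

# Small made-up example: nodes 1-2 linked, 4-5 linked, 3 isolated.
m <- matrix(0, 5, 5)
m[1, 2] <- 1
m[4, 5] <- 0.5
res <- find_components(m, 0.01)
```

Each node is visited once, so the loop terminates even when the graph contains cycles, which is where the recursive version keeps revisiting the same indices. It would be called on the example data as find_components(foomat, 0.01).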
Re: [R] Adding vertical space before and after Sweave chunk
Thanks Duncan,

the problem now is that the space between R code and R output is also increased. I would like to avoid this, i.e.

    vertical space
    R code
    NO SPACE
    R results
    vertical space

TIA,
Mark

On 14.07.2011 at 02:13, Duncan Murdoch wrote:

> On 13/07/2011 7:14 PM, Mark Heckmann wrote:
>> I would like to have a bigger default space in front of and behind every Sweave chunk. I have seen that space between input and output is removed as follows:
>>
>>     \fvset{listparameters={\setlength{\topsep}{0pt}}}
>>     \renewenvironment{Schunk}{\vspace{\topsep}}{\vspace{\topsep}}
>>
>> Still, I can't figure out how to add vertical space before and after the Sweave chunk. Can someone help?
>
> 0pt stands for zero points. Use a bigger number if you want a bigger space, e.g. 20pt.
>
> Duncan Murdoch

Mark Heckmann
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com
Re: [R] How to translate string to variable inside a command in an easy way in R
Hi!

What about as.formula()? Like this:

    form <- as.formula(paste("num", y, "~MemberID", sep=""))
    agg <- aggregate(form, right.a, sum)

Would it work as you expect?

HTH,
Ivan

On 7/13/2011 19:30, Daniel Nordlund wrote:

>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of UriB
>> Sent: Wednesday, July 13, 2011 9:28 AM
>> To: r-help@r-project.org
>> Subject: Re: [R] How to translate string to variable inside a command in an easy way in R
>>
>> Thanks. Here is another question.
>>
>> I want to have a function that gets a string, for example y="AMI", and makes commands like the following:
>>
>>     agg <- aggregate(numAMI ~MemberID, right.a, sum)
>>
>> Note that I know that numAMI is part of right.a because another function with the string AMI already generated it. Is there a way to do it without paste, parse and eval? I can do it with the following commands for y="AMI":
>>
>>     numy <- paste("num", y, sep="")
>>     texta <- paste("agg<-aggregate(", numy, sep="")
>>     text2 <- "~MemberID, right.a, sum)"
>>     text1 <- paste(texta, text2, sep="")
>>     eval(parse(text=text1))
>>
>> If you have shorter code for it, that would be productive.
>
> Well, you could do the above in one line (sorry, my email client is breaking the line):
>
>     agg <- eval(parse(text=paste("aggregate(num", y, "~MemberID,right.a,sum)", sep='')))
>
> You can put that in a function and call it like
>
>     your_function <- function(y) eval(parse(text=paste("aggregate(num", y, "~MemberID,right.a,sum)", sep='')))
>     agg <- your_function(y='AMI')
>
> However, you seem to be trying to wrestle R to the ground to get it to do things your way. Maybe if you gave R-help some context about your overall task (including why you need to construct these commands), someone could provide suggestions on solving your programming tasks in a more R-ish way. Just a thought.
>
> Dan
>
> Daniel Nordlund
> Bothell, WA USA

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Dept. Mammalogy
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
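The formula-based approach suggested in this reply can be wrapped in a small function so that no parse/eval is needed. This is a minimal sketch, assuming a data frame shaped like right.a with a MemberID column and a numAMI column; the data values and the name agg_by_member are made up for illustration.

```r
# Made-up stand-in for the poster's right.a data frame.
right.a <- data.frame(MemberID = c(1, 1, 2, 2),
                      numAMI   = c(1, 2, 3, 4))

# Hypothetical wrapper: build "num<y> ~ MemberID" and aggregate by sum.
agg_by_member <- function(y, data) {
  form <- reformulate("MemberID", response = paste0("num", y))
  aggregate(form, data = data, FUN = sum)
}

agg <- agg_by_member("AMI", right.a)
```

reformulate() builds the formula from character strings directly, so the column name only ever exists as data and never has to be pasted into source code.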
[R] Re: Sum weights of independent variables across models (AIC)
?model.avg

Look at relative importance.

On Wed, 13 Jul 2011 18:01:14 -0500, Michael Just wrote:

> Hello,
>
> I'd like to sum the weights of each independent variable across linear models that have been evaluated using AIC. For example:
>
>     library(MuMIn)
>     data(Cement)
>     lm1 <- lm(y ~ ., data = Cement)
>     dd <- dredge(lm1, beta = TRUE, eval = TRUE, rank = "AICc")
>     get.models(dd, subset = delta < 4)
>
> There are 5 models with a Delta AIC score of less than 4. I would like to sum the weights for each of the independent variables across the five models.
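To see what a summed-weight ("relative importance") figure amounts to, the calculation can be sketched by hand in base R. This is only an illustration under stated assumptions: it uses the built-in mtcars data as a stand-in (Cement ships with MuMIn), fits all subsets of three arbitrarily chosen predictors, applies the usual small-sample AICc correction, and renormalises the Akaike weights within the delta < 4 set. None of the names below come from MuMIn.

```r
vars <- c("wt", "hp", "qsec")                 # arbitrary predictors of mpg
n <- nrow(mtcars)

# One row per candidate model; TRUE means the variable is included.
incl <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(vars))))
colnames(incl) <- vars

# AICc for each candidate model.
aicc <- apply(incl, 1, function(use) {
  rhs <- if (any(use)) vars[use] else "1"     # intercept-only model
  fit <- lm(reformulate(rhs, response = "mpg"), data = mtcars)
  k <- length(coef(fit)) + 1                  # coefficients + residual variance
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
})

delta <- aicc - min(aicc)
keep <- delta < 4                             # models within 4 AICc units
w <- exp(-delta[keep] / 2)
w <- w / sum(w)                               # Akaike weights within the set

# Sum of weights over the models that contain each variable.
importance <- colSums(incl[keep, , drop = FALSE] * w)
```

A variable that appears in every retained model gets a summed weight of 1; one that appears in none gets 0.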
Re: [R] R in Batch mode
Duncan's suggestion is probably the way to go, but I will just point out that R does have a facility to perform a task when an error occurs. I have my code set up to send me an email when my batch code fails. (email() is a function I wrote that executes an SQL command to send email via dbmail.)

    .Err <- function() {email("roger@rothschild.com", "!Notify: Job FAILED on " %+% Sys.info()[4], "", "TEXT", "HIGH")}
    options(error = .Err)

Thanks,
Roger

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Wednesday, July 13, 2011 5:04 PM
To: cornejo...@gmail.com
Cc: r-help@r-project.org
Subject: Re: [R] R in Batch mode

On 11-07-13 4:15 PM, Jorge Cornejo wrote:
> Hi, I'm running long code in batch mode (as a process in Ubuntu using $nohup R CMD BATCH MyCode.R out.out), but every once in a while it fails and the process stops. When the code finishes with no problems, it sends an e-mail, but if it fails, I'm not aware unless I check the working process, and I cannot be checking the process all day! Is there any way to send me a warning when the process stops/fails?

This is really an Ubuntu question, isn't it? I'd suggest writing a script which calls your R batch file. The last thing the batch file should do is write some sort of status value. The last thing the script should do is check and act on the status. Of course, if whatever kills your batch file kills the script too, you're out of luck.

Duncan Murdoch
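The two suggestions in this thread (an error handler inside R, a status value for a wrapper script) can be combined. The sketch below is an assumption-laden illustration, not Roger's or Duncan's actual setup: the status file location and the name notify_failure are hypothetical, and a real job would replace the writeLines() call with whatever notification it has available.

```r
# Hypothetical status file that a wrapper script could inspect after R exits.
status_file <- file.path(tempdir(), "job_status.txt")

# Hypothetical handler: record the failure, then end the batch run with a
# non-zero exit status so the calling shell script can react to it.
notify_failure <- function(halt = !interactive()) {
  msg <- paste("Job FAILED on", Sys.info()[["nodename"]],
               "at", format(Sys.time()))
  writeLines(msg, status_file)
  if (halt) quit(save = "no", status = 1)
}

# Register the handler; R calls it with no arguments when an error is raised.
options(error = notify_failure)
```

The explicit quit() is there so the exit status seen by the shell reflects the failure; the halt argument only exists so the handler can be tried out interactively without closing the session.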
[R] Gui editor / viewer for large data
Dear all,

I am searching for a possibility to view large data sets (e.g. stored in ffdf objects) in a GUI window in a memory-efficient way. So far I have looked at gtkDfEdit (package RGtk2Extras) and gdf (package gWidgets). Both operate (as far as I can see) on data frames stored in memory. gtkDfEdit accepts an ff data frame as input, but there is a long delay before it shows up, so I presume the data is converted to an in-memory data.frame beforehand.

Ideal would be a solution similar to JTable in Java, where one can implement a 'table model' class with a method (getValueAt or something like that) that returns the value of the underlying data for a given table cell (by x and y coordinates). When the table is drawn, getValueAt() is called for each cell in the viewable area. Cells outside the viewable range are not queried, so there is no need to keep the whole data in memory: it can be fetched from disk or computed on demand.

Before I seriously consider coding a suitable function or class myself, I would like to ask you, the R community, if anyone knows of an existing solution. To summarize, I need

- a scrollable spreadsheet-like GUI element to view and potentially edit data
- which only loads as much of the data as it displays at one point in time,
- and is thereby able to handle datasets that are too large to fit into main memory.

Thanks for any suggestions,
Andreas

--
Andreas Borg
Medizinische Informatik
UNIVERSITÄTSMEDIZIN der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de
Telefon +49 (0) 6131 175062
E-Mail: b...@imbei.uni-mainz.de
Re: [R] Adding vertical space before and after Sweave chunk
On 11-07-14 3:23 AM, Mark Heckmann wrote:
> Thanks Duncan,
>
> the problem now is that the space between R code and R output is also increased. I would like to avoid this, i.e.
>
>     vertical space
>     R code
>     NO SPACE
>     R results
>     vertical space

Don't modify \topsep then, just put the spacing directly into the Schunk redefinition. For example,

    \renewenvironment{Schunk}{\vspace{20pt}}{\vspace{30pt}}

The way environments work in LaTeX is that the first arg (\vspace{20pt}) is put at the start, and the second one (\vspace{30pt}) at the end. Nothing happens in the middle, but the definitions for \Sinput and \Soutput implicitly make use of the \topsep size, so the definition below affected everything.

Duncan Murdoch

> TIA,
> Mark
>
> On 14.07.2011 at 02:13, Duncan Murdoch wrote:
>
>> On 13/07/2011 7:14 PM, Mark Heckmann wrote:
>>> I would like to have a bigger default space in front of and behind every Sweave chunk. I have seen that space between input and output is removed as follows:
>>>
>>>     \fvset{listparameters={\setlength{\topsep}{0pt}}}
>>>     \renewenvironment{Schunk}{\vspace{\topsep}}{\vspace{\topsep}}
>>>
>>> Still, I can't figure out how to add vertical space before and after the Sweave chunk. Can someone help?
>>
>> 0pt stands for zero points. Use a bigger number if you want a bigger space, e.g. 20pt.
>>
>> Duncan Murdoch
>
> --
> Mark Heckmann
> Blog: http://www.markheckmann.de/
> R-Blog: http://ryouready.wordpress.com/
Re: [R] R-help Digest, Vol 101, Issue 14
I will be out of the office from 7 July till 14 July with no access to my emails. In urgent cases please contact Ms. Edit Kárpáti (karpati.e...@gyemszi.hu).

With regards,
Peter Mihalicza
Re: [R] Scaling in SVM
Thanks! For testing purposes this rescaling works!

But unfortunately, due to timing constraints, I'm not able to do the rescaling of the data, so as I mentioned I have to keep working with unscaled data. So I have to calculate

    $f(\vec x) = \sum_{i \in sv} coefs_i \langle \vec x_i \cdot \vec x \rangle - \rho$,

where the sum runs over the support vectors (sorry, no LaTeX available here). The problem is how to get unscaled coefs and rho, so that the formula can be evaluated directly on unscaled data and calculation time is saved.

Greetings
Matthias
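For the linear-kernel case written in this message, the scaling can be folded into the weights and the offset once, so that new points need no rescaling. The sketch below only demonstrates the algebra on made-up numbers; it assumes a linear kernel, features standardised as (x - center) / scale, and a weight vector already collapsed from the support vectors. It does not use or describe any particular SVM package.

```r
set.seed(1)
X <- matrix(rnorm(20, mean = 5, sd = 3), nrow = 5)  # made-up unscaled points
center <- colMeans(X)                 # centring used at training time (assumed)
scale_ <- apply(X, 2, sd)             # scaling used at training time (assumed)
Xs <- scale(X, center = center, scale = scale_)

w_scaled <- c(0.5, -1, 2, 0.25)       # made-up: sum_i coefs_i * x_i, scaled space
rho <- 0.3                            # made-up offset in the scaled space

# Fold the scaling into the parameters:
#   w_scaled . (x - center) / scale - rho  =  w . x - rho_unscaled
w <- w_scaled / scale_
rho_unscaled <- rho + sum(w_scaled * center / scale_)

f_scaled <- as.vector(Xs %*% w_scaled) - rho        # evaluated on scaled data
f_raw    <- as.vector(X %*% w) - rho_unscaled       # evaluated on raw data
```

Both evaluations give the same decision values, so only w and rho_unscaled need to be stored for fast evaluation. For a non-linear kernel this shortcut does not apply in the same form.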
[R] t-test on a data-frame.
Dear R-helpers,

In a data frame I have 100 securities, monthly closing values, from 1995 to present, with which I have to do the following:

1. Sampling with replacement, make 50 samples of 10 securities each; each sample will hence be a data frame with 10 columns.
2. With uniform probability, mark a month from 2000 onwards as a special month, t=0.
3. Subtract the market index from each column of each sample and then compute the residues.
4. For each data frame of residues, compute the statistic ( Eps i0 - mean(Eps it) ) / var(Eps it). Here i and t vary over one particular data frame. i0 corresponds to the ith security residue in the special month. Basically a t-test involving a frame instead of a vector.
5. Print out a table where the statistic is significant at the 1, 5, 10% level.

Could someone give the broad ideas on doing this?
[R] Var.calc in Matching
Hi,

I already have matched samples, which were matched using different software. I need to calculate Abadie-Imbens standard errors, together with the average treatment effect. I know that the Matching package enables me to calculate these after a matching procedure, but is there any way to do it on already matched samples, which were not produced by the Match function?

Many thanks,
Louise.
[R] Problem with x labels of barplot
Hello everyone,

I am currently creating a barplot. This barplot takes a vector of ~200 data points. Each data point represents one bar.

http://img96.imageshack.us/i/human1w.png/

(OK, as you see, it is not only one barplot, but a series of barplots.) Now, these barplots represent a human chromosome. This means they are ordered. For instance, bar number 50 means position 50 in the human chromosome. I would like to have x-axis labels showing this:

    0 ... 50 ... 100 ... 150 ... 200 ... 250

Yet I do not know how to accomplish this. If you use the normal plot function, these numbers on the x-axis are autogenerated; in the case of a barplot they are not, and I do not know how to create these labels. I would be happy about a solution.
[R] Add a density line to a cumulative histogram - second try
Hi list,

this is my second try at a first post on this list (I tried to post via email and nothing appeared in my email inbox, so now I try the Nabble web interface). I hope that you will only have to read one post in your inbox!

Okay, my question ... I was able to plot a histogram and add the density() line to this plot. I was able to plot a cumulative form of this histogram. Yet, I was not able to add the density line to this cumulative histogram. You can see a picture of the histograms here:

http://www.jochen-bauer.net/downloads/histo-cumulative-density.png

Source:

    # Histogram
    histo <- hist( hgd$V1, freq=FALSE )
    # Draw the density estimate into the plot
    denspoints <- density( hgd$V1 )
    lines( denspoints, col="GREEN", lwd=2 )
    # Cumulative distribution of relative frequencies
    relcum <- cumsum( histo$counts ) / sum(histo$counts)
    barplot( relcum, names.arg=round( histo$mids, 2), col="green", ylab="Wahrscheinlichkeit")
    # Cumulative density function ?

Thanks in advance
Jochen
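One possible way to get the missing cumulative curve is to integrate the density estimate numerically and draw both on a shared numeric x axis. This is a minimal sketch under assumptions: simulated values stand in for hgd$V1, the cumulative bars are drawn with plot(type = "h") at the upper bin edges rather than with barplot() (whose bar positions are not on the data scale), and the integration is a plain Riemann sum.

```r
set.seed(42)
v <- rnorm(200)                               # stand-in for hgd$V1

histo  <- hist(v, plot = FALSE)
relcum <- cumsum(histo$counts) / sum(histo$counts)

dens    <- density(v)
step    <- diff(dens$x[1:2])                  # density() uses an even grid
cumdens <- cumsum(dens$y) * step              # Riemann-sum approximation of the CDF

# Cumulative relative frequencies at the upper edge of each bin ...
plot(histo$breaks[-1], relcum, type = "h", lwd = 8, lend = "butt",
     col = "green", ylim = c(0, 1), xlab = "V1", ylab = "Probability")
# ... with the cumulative density estimate drawn on top.
lines(dens$x, cumdens, lwd = 2)
```

An alternative that avoids the density estimate altogether is the empirical distribution function, e.g. lines(ecdf(v)).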
Re: [R] Problem with x labels of barplot
On 11-07-14 7:51 AM, Don wrote:
> Hello everyone,
>
> I am currently creating a barplot. This barplot takes a vector of ~200 data points. Each data point represents one bar.
>
> http://img96.imageshack.us/i/human1w.png/
>
> (OK, as you see, it is not only one barplot, but a series of barplots.) Now, these barplots represent a human chromosome. This means they are ordered. For instance, bar number 50 means position 50 in the human chromosome. I would like to have x-axis labels showing this:
>
>     0 ... 50 ... 100 ... 150 ... 200 ... 250
>
> Yet I do not know how to accomplish this. If you use the normal plot function, these numbers on the x-axis are autogenerated; in the case of a barplot they are not, and I do not know how to create these labels. I would be happy about a solution.

The axis() function can draw whatever you ask for. The tricky thing is to get the ticks in the right place, because a barplot doesn't have an obvious x-axis scale. The secret is to save the result of the barplot call: it contains the centres of the bars. For example:

    x <- rpois(200, 20)
    centres <- barplot(x)
    axis(1, at=centres[c(50, 100, 150)], labels=c(50, 100, 150))
    box()

Duncan Murdoch
Re: [R] t-test on a data-frame.
Hi Student (since you have no other name),

On Thu, Jul 14, 2011 at 6:42 AM, Economics Student economicsstudent2...@gmail.com wrote:
> Dear R-helpers,
>
> In a data frame I have 100 securities, monthly closing values, from 1995 to present, with which I have to do the following:
>
> 1. Sampling with replacement, make 50 samples of 10 securities each; each sample will hence be a data frame with 10 columns.
> 2. With uniform probability, mark a month from 2000 onwards as a special month, t=0.
> 3. Subtract the market index from each column of each sample and then compute the residues.
> 4. For each data frame of residues, compute the statistic ( Eps i0 - mean(Eps it) ) / var(Eps it). Here i and t vary over one particular data frame. i0 corresponds to the ith security residue in the special month. Basically a t-test involving a frame instead of a vector.
> 5. Print out a table where the statistic is significant at the 1, 5, 10% level.
>
> Could someone give the broad ideas on doing this?

Why yes. You should talk to your professor or TA. The list doesn't do homework problems, though if you ask nicely we may help with the finer points of R. And by nicely, I don't mean saying please, though it doesn't hurt. I mean providing reproducible examples, clear statements of your problem, and signing your email professionally, with an actual name.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] How to read 20 columns from the file
On 07/13/2011 10:08 PM, Kishorenalluri wrote:
> Hi Jim,
>
> Saving is not a problem. I wanted to load/read the columns from the file, followed by plotting the area plot using ggplot2. I am a basic user. I am trying to reproduce a plot similar to the example given here:
> http://processtrends.com/images/RClimate_NINO_34_latest.png

Hi Kishorenalluri,

How about this?

    # download the data file from NOAA
    # http://www.cpc.ncep.noaa.gov/data/indices/wksst.for
    # make the columns match the header row with a global
    # replace of "-" with " -" (add a space)
    # then this seems to reproduce the plot you indicated
    plot(as.Date(sst$Week,"%d%b%Y"),sst$SSTA1,type="h",
      col=2+2*(sst$SSTA1<0),
      main="Sea surface anomalies from NOAA",
      xlab="Year",ylab="Temperature anomaly")

Jim
Re: [R] How to read 20 columns from the file
On 07/14/2011 10:35 PM, Jim Lemon wrote:
> On 07/13/2011 10:08 PM, Kishorenalluri wrote:
>> Hi Jim,
>>
>> Saving is not a problem. I wanted to load/read the columns from the file, followed by plotting the area plot using ggplot2. I am a basic user. I am trying to reproduce a plot similar to the example given here:
>> http://processtrends.com/images/RClimate_NINO_34_latest.png
>
> Hi Kishorenalluri,
>
> How about this?
>
>     # download the data file from NOAA
>     # http://www.cpc.ncep.noaa.gov/data/indices/wksst.for
>     # make the columns match the header row with a global
>     # replace of "-" with " -" (add a space)
>     # then this seems to reproduce the plot you indicated
>     plot(as.Date(sst$Week,"%d%b%Y"),sst$SSTA1,type="h",
>       col=2+2*(sst$SSTA1<0),
>       main="Sea surface anomalies from NOAA",
>       xlab="Year",ylab="Temperature anomaly")

Oops, forgot to include the line reading in the data file:

    sst <- read.table("sst_1990_2011.dat",header=TRUE)

Jim
[R] SQldf with sqlite and H2
I have a large csv file (about 2GB) and wanted to import the file into R and do some filtering and analysis. I came across sqldf (a great idea and product) and was trying to play around to see what would be the best method of doing this. The csv file is comma delimited, with some columns having a comma inside the quotation, like this: "John, Doe".

I tried this first:

    library(sqldf)
    sqldf("attach testdb as new")
    In.File <- "C:/JP/Temp/2008.csv"
    read.csv.sql(In.File, sql = "create table table1 as select * from file", dbname = "testdb")

It errored out with this message:

    NULL
    Warning message:
    closing unused connection 3 (C:/JP/Temp/2008.csv)

When this failed, I converted the file from comma delimited to tab delimited and used this command:

    read.csv.sql(In.File, sql = "create table table1 as select * from file", dbname = "testdb", sep = "\t")

and this worked; it created the testdb sqlite file with a size of 3GB.

Now my question is in 3 parts.

1. Is it possible to create a data frame with appropriate column classes and use those column classes when I use the read.csv.sql command to create the table? Something like: create the table from that data frame and then update it with read.csv.sql? Any example code will be really helpful.

2. If we use the H2 database instead of the default sqlite and use the CSVREAD option, will that be faster, and is there a way to apply the idea above (data frame column classes as table column properties) and update with CSVREAD? Something like

       library(RH2)
       SELECT * FROM CSVREAD('C:/JP/Temp/2008.csv')

   Any example code will be really helpful.

3. How do we specify where the H2 file is saved? I saw something like this when I ran the example from the RH2 package, but couldn't find the file in the working directory:

       con <- dbConnect(H2(), "jdbc:h2:~/test", "sa", "")

Sorry for the long mail. I appreciate everyone for building a great community and for the wonderful software in R. Thanks to Gabor Grothendieck for bringing sqldf to this great community.

Any help or direction you can provide on this is highly appreciated. Thanks all.
Re: [R] Sum weights of independent variables across models (AIC)
This is what I was looking for. When I initially read about model.avg I didn't recognize it also provided variable scores.

Thank you kindly,
Mike
Re: [R] Problem with x labels of barplot
Thanks a lot! Great help!
[R] plotting x y z data from an irregular grid
Hi,

I'm trying to plot some data (z) that is linked to lat/long coordinates (xy). These coordinates are not, however, on a regular grid. I also have some shapefiles on which I would like to overlay the data. I can plot the shapefiles (country/city outlines) and overplot the data, but only using quilt.plot, because otherwise I always get the error message

    Error in image.default(..., add = add, col = col) :
      increasing 'x' and 'y' values expected

which has something to do with the organization of my data, but I cannot figure out how to change it correctly.

This is the code that I have that works:

    data <- read.csv('with coord_observational log data trends all years all data.8.11.10.csv', header=TRUE)

    ## this is what the 'data' looks like:
    head(data)
      X SiteCode Latitude   Longitude           p perc_per_year perc_per_year_lower
    1 1      A30 51.37357 -0.29172504 0.369164267    -0.4781589           -1.390382
    2 2      BB1 51.68299 -0.03254972 0.005546354    -3.1810064           -5.665312
    3 3      BG1 51.56375  0.17789100 0.000405606    -3.2260763           -5.344999
    4 4      BG2 51.52939  0.13285700 0.434756172    -5.1558318          -22.123800
    5 5      BH1 50.82334 -0.13724510 0.183375348    -0.8735160           -2.240289
    6 6      BH2 50.82785 -0.17051300 0.002702969    -2.1443157           -3.543378
      perc_per_year_upper         sig
    1           0.4786508 -999.000000
    2          -0.8588293   -3.181006
    3          -1.5251377   -3.226076
    4          11.0957982 -999.000000
    5           0.3431518 -999.000000
    6          -0.7966338   -2.144316

    # read in the shapefile
    england <- readShapePoly('D:/arcGIS/england boundary/england.shp')
    class(england)
    # define the projection
    proj4string(england) <- CRS('+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.999601271625 +x_0=40 +y_0=-10 +ellps=airy +units=m +no_defs')
    # transform the map into the WGS84 projection (epsg:4326):
    england.wgs <- spTransform(england, CRS('+init=epsg:4326'))
    plot(england.wgs)
    # plot data over the map:
    quilt.plot(data$Longitude, data$Latitude, data$perc_per_year, add=TRUE)

My problem is that I would like to be able to change how this data looks (not just 'grid squares', but circles, etc.), which I can't seem to do with quilt.plot. If I could do this I could plot one layer of gridded data (squares as in quilt.plot) and overlay a second layer of 'z' data as points. I have tried plot.surface and image.plot and a number of others, but because of the error message that I get above it won't work. I can use image.plot, etc. if I create a grid and interpolate my data onto the grid (see code below), but I don't want interpolated data; I would like discrete values for each lat/long.

    x <- seq(-4,2, by=0.0625)
    y <- seq(50,53, by=0.0625)
    d1 <- expand.grid(x=x, y=y)
    data.li <- interp(data$Longitude, data$Latitude, data$perc_per_year, duplicate='mean')

So my questions are: (1) is there a different function that I should be using with my data as it is, and still be able to overplot it onto the map that I've plotted? Or (2) is there a way to create this grid and integrate my data into the grid, but not interpolate it?

Any help would be very much appreciated. My R skills just are not good enough to do this yet.

Thank you!!
Erika.
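For question (1), one option that needs no grid at all is to draw each site as a coloured symbol with points(), mapping the z value to a palette. The sketch below is a hedged illustration: the four rows are rounded from the head(data) output shown in the post, the palette and number of colour classes are arbitrary choices, and an empty plot() stands in for plot(england.wgs).

```r
# Values rounded from the first rows of head(data) in the post.
sites <- data.frame(
  Longitude     = c(-0.29, -0.03, 0.18, 0.13),
  Latitude      = c(51.37, 51.68, 51.56, 51.53),
  perc_per_year = c(-0.48, -3.18, -3.23, -5.16)
)

n_col <- 10                                   # arbitrary number of colour classes
pal   <- heat.colors(n_col)
bins  <- cut(sites$perc_per_year, breaks = n_col, labels = FALSE)

# Stand-in for plot(england.wgs): an empty frame in lon/lat coordinates.
plot(sites$Longitude, sites$Latitude, type = "n",
     xlab = "Longitude", ylab = "Latitude")
# One filled circle per site, coloured by its z value.
points(sites$Longitude, sites$Latitude, pch = 21, bg = pal[bins], cex = 2)
```

Because points() adds to whatever plot is current, the same call can follow plot(england.wgs) or a quilt.plot(..., add=TRUE) layer, and pch and cex control the symbol shape and size.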
Re: [R] glm() scale parameters and predicted Values
Peter Maclean <pmaclean2011 at yahoo.com> writes:

> In glm() you can use the summary() function to recover the shape parameter (the reciprocal of the dispersion parameter). How do you recover the scale parameter? Also, in the given example, how do I estimate and save the geometric mean of the predicted values? For a simple model you can use the fitted() or predict() functions. I will appreciate any help.
>
> #Call required R packages
> require(plyr)
> require(stats)
> require(fitdistrplus)
> require(MASS)
> #Grouped vector
> n <- c(1:10)
> yr <- c(1:10)
> ny <- list(yr=yr,n=n)
> require(utils)
> ny <- expand.grid(ny)
> y = rgamma(100, shape=1.5, rate = 1, scale = 2)

It's a bit odd to specify both the rate and scale parameters, which are redundant. I would have guessed that the rate parameter would dominate, but it looks like the scale parameter dominates:

   > set.seed(1001); rgamma(1,shape=1,rate=2)
   [1] 1.622577
   > set.seed(1001); rgamma(1,shape=1,scale=2)
   [1] 6.490306
   > set.seed(1001); rgamma(1,shape=1,rate=2,scale=2)
   [1] 6.490306
   > set.seed(1001); rgamma(1,shape=1,scale=2,rate=2)
   [1] 6.490306

(I know, I could go look at the source, but it's a .Internal() function and I'm feeling lazy ...)

> Gdata <- cbind(ny,y)
> Gdata2 <- Gdata
> Gdata$x1 <- cos((3.14*yr)/365.25)
> Gdata$x2 <- sin((3.14*yr)/365.25)
> #Fitting Generalized Linear Models
> Gdata <- split(Gdata,Gdata$n)
> FGLM <- lapply(Gdata, function(x){
>   m <- as.numeric(x$y)
>   x1 <- as.numeric(x$x1)
>   x2 <- as.numeric(x$x2)
>   summary(glm(m~1+x1+x2, family=Gamma),dispersion=NULL)
> })

Note that you have simulated a null model (the data don't depend on the covariates).

> #Save the results of the estimated parameters
> str(FGLM,no.list = TRUE)
> SFGLMC <- ldply(FGLM, function(x) x$coefficients)
> SFGLMD <- ldply(FGLM, function(x) x$dispersion)
> GLM <- cbind(SFGLMC,SFGLMD)

Which scale parameter do you mean? In a real model it will vary with x1 and x2.

Let's try an experiment first:

   > set.seed(1001)
   > z <- rgamma(1,shape=2,scale=3)
   > g1 <- glm(z~1,family=Gamma)
   > summary(g1)

   Call:
   glm(formula = z ~ 1, family = Gamma)

   Deviance Residuals:
       Min       1Q   Median       3Q      Max
   -2.8434  -0.6579  -0.1702   0.3081   2.6220

   Coefficients:
               Estimate Std. Error t value Pr(>|t|)
   (Intercept) 0.167013   0.001189   140.5   <2e-16 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

   (Dispersion parameter for Gamma family taken to be 0.5066566)

       Null deviance: 5419.8  on degrees of freedom
   Residual deviance: 5419.8  on degrees of freedom
   AIC: 53526

Here the intercept estimate is 0.167, which is very nearly 1/6 = 1/(shape*scale) -- i.e. the Gamma GLM is parameterized in terms of the mean (on the inverse scale). If you want to recover the scale parameter for the intercept case, then

   summary(g1)$dispersion/coef(g1)[1]

should be pretty good, since the dispersion estimates 1/shape and 1/coef(g1)[1] estimates the mean, shape*scale.

As for the geometric means -- that's pretty tricky. *If* you used a log link, then the predicted values on the link scale (i.e. predict(g1,type="link")) would indeed give you the geometric means. On the inverse scale, though, you would have to do a bit of finagling.
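The recipe above for the intercept-only case can be checked numerically. What follows is only a minimal sketch, not part of the original reply: the sample size of 10000 is an assumption (the size used in the experiment is not legible above), and shape_hat, scale_hat and geo_mean are names introduced here for illustration. It also computes the sample geometric mean directly as exp(mean(log(z))), which sidesteps the link-scale question for raw data.

```r
# Simulate Gamma data with known shape and scale (n = 10000 is an assumption).
set.seed(1001)
z <- rgamma(10000, shape = 2, scale = 3)

# Intercept-only Gamma GLM with the default inverse link:
# the intercept estimates 1/mean = 1/(shape * scale).
g1 <- glm(z ~ 1, family = Gamma)

disp      <- summary(g1)$dispersion       # estimates 1/shape
shape_hat <- 1 / disp
scale_hat <- unname(disp / coef(g1)[1])   # dispersion * mean estimates scale

# Geometric mean of the observations, computed on the log scale.
geo_mean <- exp(mean(log(z)))

print(c(shape = shape_hat, scale = scale_hat, geo_mean = geo_mean))
```

With covariates in the model the mean differs per observation, so a single scale estimate like this would no longer apply.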
[R] calculating distance inland from coastline
Hi All,

Does anybody know of any existing functions that will calculate distance inland from a coastline? It's possible to test if a lon,lat location is land or sea using map.where(), but I need to add a buffer to this of, say, 2 km, to allow for points that are just on the coast and below the resolution of the worldHires database. I'm working with a marine mammal satellite telemetry dataset and wish to filter out spurious locations on land.

On another issue, does anybody know of any free vector map datasets that are more up to date and have a higher resolution than the worldHires database (from the 'mapdata' package, used with 'maps')?

Thanks,
Simon

-- View this message in context: http://r.789695.n4.nabble.com/calculating-distance-inland-from-coastline-tp3667464p3667464.html
[R] R package: pbatR
Dear All,

Does anybody have experience with the R package pbatR (http://cran.r-project.org/web/packages/pbatR/index.html)? I am trying to use it to analyze family-based case-control data, but the package doesn't work at all on my computer. I contacted the authors of the package, but I haven't heard anything from them. Following the package manual, I tried the simple example below:

   library(pbatR)
   library(tcltk)
   pbat.set("C:/pbat")
   x <- data.frame(pid = c(1,1,1,2,2,2,2,3,3,3),   # three families
                   id = c(1,2,3,1,2,3,4,1,2,3),
                   idfath = c(0,0,1,0,0,1,1,0,0,1),
                   idmoth = c(0,0,2,0,0,2,2,0,0,2),
                   sex = c(1,2,1,1,2,2,2,1,2,1),
                   AffectionStatus = c(0,0,1,0,0,1,0,0,0,1),   # 1 for case, 0 for control
                   m1.1 = c(1,1,2,2,1,1,2,2,2,1),   # two SNPs with two columns for each SNP
                   m1.2 = c(1,2,1,2,1,2,1,1,2,2),
                   m2.1 = c(1,1,2,2,2,1,1,1,2,1),
                   m2.2 = c(2,1,2,1,2,2,2,1,1,1))
   x1 <- as.ped(x)
   y <- data.frame(pid = c(1,1,1,2,2,2,2,3,3,3),
                   id = c(1,2,3,1,2,3,4,1,2,3),
                   age = c(55,50,22,38,37,15,11,42,41,17),
                   weight = c(185,170,130,165,170,90,60,170,160,120))
   y1 <- as.phe(y)

1. I first consider a model with the disease as a phenotype and two SNPs as predictors (no covariates), as below:

   pbat.m(AffectionStatus ~ NONE, y1, x1, fbat="gee", distribution='categorical', offset='none')

But some error messages were returned:

   Error in writeCommandStrMatch(distribution, distribution, c(default, : 'distribution' can only take on the following values: 'default', 'jiang', 'murphy', 'naive', 'observed'. You passed the invalid value 'categorical'.

Then I removed the last two arguments:

   pbat.m(AffectionStatus ~ NONE, y1, x1, fbat="gee")

This time, a box appeared on the console: "R for Windows GUI front-end has encountered a problem and needs to close."

2. I consider a model with the disease as a phenotype, and two covariates (age, weight) and two SNPs as predictors, as below:

   pbat.m(AffectionStatus ~ age + weight, y1, x1, fbat="gee")

The function ran for a very long time and returned no output until I had to stop it.

Any help would be greatly appreciated. Thanks.

Lisa

-- View this message in context: http://r.789695.n4.nabble.com/R-package-pbatR-tp3667844p3667844.html
Re: [R] R package: pbatR
I am guessing (from other evidence of lapses in attention to documentation) that you failed to pay attention when you encountered these sentences on the page you offered a link to:

   For analysis, this package provides a frontend to the PBAT software

   For analysis, users must download PBAT (developed by Christoph Lange) and accept it's license, available on the PBAT webpage.

-- david.

On Jul 14, 2011, at 11:05 AM, Lisa wrote:
[snip]

David Winsemius, MD
West Hartford, CT
Re: [R] calculating distance inland from coastline
Hi Simon,

A combination of the functions gDistance, gBuffer and gIntersects from package rgeos should do the job. Also, have a look at www.naturalearthdata.com. They have various shapefiles with coastlines and land polygons, though I don't know how the resolution compares with the worldHires database.

Cheers,
Francois Rousseu

Date: Thu, 14 Jul 2011 05:43:23 -0700
From: s.j.good...@leeds.ac.uk
To: r-help@r-project.org
Subject: [R] calculating distance inland from coastline
[snip]
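As a rough illustration of the buffering idea, independent of any particular package: the sketch below approximates distance to the coast as the great-circle distance to the nearest coastline vertex, then flags fixes that fall within a 2 km buffer. The coastline coordinates and telemetry fixes are made-up values and the function names are hypothetical. Nearest-vertex distance overestimates the true distance when coastline vertices are sparse, so for real data a geometry library working on polygons (such as the rgeos functions named above) would be the better tool.

```r
# Great-circle (haversine) distance in km between lon/lat points in degrees.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Distance from each point to the nearest coastline vertex.
dist_to_coast_km <- function(pts, coast) {
  vapply(seq_len(nrow(pts)), function(i) {
    min(haversine_km(pts$lon[i], pts$lat[i], coast$lon, coast$lat))
  }, numeric(1))
}

# Hypothetical coastline running north-south along lon = 0.
coast <- data.frame(lon = 0, lat = seq(50, 51, by = 0.01))
# Hypothetical telemetry fixes: one close to the coast, one well away from it.
pts <- data.frame(lon = c(0.01, 0.50), lat = c(50.5, 50.5))

d <- dist_to_coast_km(pts, coast)
near_coast <- d <= 2   # TRUE for fixes inside the 2 km buffer
print(data.frame(pts, dist_km = round(d, 2), near_coast))
```

Combining near_coast with the land/sea test from map.where() would then let fixes that are on land but inside the buffer be kept rather than discarded.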
[R] Adding rows based on column value
Dear all,

I have one problem and did not find any solution. (I have also attached the problem as a text file because sometimes column spacing is not good in mail.)

I have a file (file.txt) attached with this mail. I am reading it using this code to make a data frame (file):

   file <- read.table("file.txt", fill=T, colClasses="character", header=T)

file looks like this:

   Chr  Pos        CaseA  CaseC  CaseG  CaseT
   10   135344110   0.00  24.00   0.00   0.00
   10   135344110   0.00   0.00  24.00   0.00
   10   135344110   0.00   0.00  24.00   0.00
   10   135344113   0.00   0.00  24.00   0.00
   10   135344114  24.00   0.00   0.00   0.00
   10   135344114  24.00   0.00   0.00   0.00
   10   135344116   0.00   0.00   0.00  24.00
   10   135344118   0.00  24.00   0.00   0.00
   10   135344118   0.00   0.00   0.00  24.00
   10   135344122  24.00   0.00   0.00   0.00
   10   135344122   0.00  24.00   0.00   0.00
   10   135344123   0.00  24.00   0.00   0.00
   10   135344123   0.00  24.00   0.00   0.00
   10   135344123   0.00   0.00   0.00  24.00
   10   135344126   0.00   0.00  24.00   0.00

Now some of the values in column Pos are the same. For these same positions I want to add the values of columns 3:6. I will explain with an example. The output of the first row should be:

   Chr  Pos        CaseA  CaseC  CaseG  CaseT
   10   135344110   0.00  24.00  48.00   0.00

because the first three rows have the same value in the Pos column. So the whole output for the above input should be:

   Chr  Pos        CaseA  CaseC  CaseG  CaseT
   10   135344110   0.00  24.00  48.00   0.00
   10   135344113   0.00   0.00  24.00   0.00
   10   135344114  48.00   0.00   0.00   0.00
   10   135344116   0.00   0.00   0.00  24.00
   10   135344118   0.00  24.00   0.00  24.00
   10   135344122  24.00  24.00   0.00   0.00
   10   135344123   0.00  48.00   0.00  24.00
   10   135344126   0.00   0.00  24.00   0.00

Can you please help me?

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
Re: [R] Adding rows based on column value
?tapply (in base R)
?aggregate
?by (wrapper for tapply)
?ave (in base R -- based on tapply)

Also package plyr (and several others, undoubtedly).

Also google on "R summarize data by groups" or similar gets many relevant hits.

-- Bert

2011/7/14 Bansal, Vikas <vikas.ban...@kcl.ac.uk>:
[snip]

--
Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions.
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
Re: [R] question on formula and terms.formula()
I think you should replace
   terms.formula(formula)
by
   terms(formula)
When terms() is given a formula object it will execute terms.formula, but for other classes of inputs it will invoke the appropriate method. E.g., your formula may already be a terms object, in which case terms.formula(formula) will waste time reanalyzing it while terms(formula) will invoke terms.terms, which just returns its input.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: Pang Du [mailto:pan...@vt.edu]
Sent: Thursday, July 14, 2011 8:36 AM
To: William Dunlap; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Thank you so much for your suggestion, Bill. The R program I try to modify needs match.call() for something else. But the problem does seem to be caused by this statement as you suggested. Following this clue, I find out that terms.formula(formula) does essentially what I want for terms.formula(mf$formula).

Pang Du
Virginia Tech

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Wednesday, July 13, 2011 12:21 PM
To: pan...@vt.edu; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Does your code work if you omit the match.call step? terms() generally doesn't need that. Also, do not call terms.formula(formula): call just terms(formula). The terms method appropriate to the class of the formula will be used.

   > f1 <- function(formula, ...) {
   +     terms(formula, ...)
   + }
   > form <- as.formula(y ~ x1 + Error(x2))
   > f1(form, specials="Error")
   y ~ x1 + Error(x2)
   attr(,"variables")
   list(y, x1, Error(x2))
   attr(,"factors")
             x1 Error(x2)
   y          0         0
   x1         1         0
   Error(x2)  0         1
   attr(,"term.labels")
   [1] "x1"        "Error(x2)"
   attr(,"specials")
   attr(,"specials")$Error
   [1] 3
   attr(,"order")
   [1] 1 1
   attr(,"intercept")
   [1] 1
   attr(,"response")
   [1] 1
   attr(,".Environment")
   <environment: R_GlobalEnv>

Bill Dunlap
TIBCO Spotfire

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of pan...@vt.edu [pan...@vt.edu]
Sent: Tuesday, July 12, 2011 8:40 PM
To: r-help@r-project.org
Subject: [R] question on formula and terms.formula()

I'm trying to create a formula object to pass on to a function that applies the function terms.formula() to it.

   f <- function(formula, ...) {
      ...
      mf <- match.call()
      term <- terms.formula(mf$formula)
      ...
   }

However, my code below gives an error.

   form <- as.formula(y~x)
   f(form, ...)

The error message was:

   Error in terms.formula(mf$formula): argument is not a valid model.

Could anybody help me figure out the problem and find a solution here?
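The error in the original question arises because match.call() records the arguments unevaluated: when f(form) is called, mf$formula is the symbol form rather than a formula object, so terms.formula() rejects it. A small sketch of this follows; f_inspect and f_fixed are illustrative names that do not appear in the thread, and evaluating the matched expression in parent.frame() is one possible workaround rather than the approach the thread settled on.

```r
# match.call() keeps arguments unevaluated, so mf$formula is a symbol here.
f_inspect <- function(formula, ...) {
  mf <- match.call()
  class(mf$formula)
}

# Two ways to obtain a terms object while still keeping match.call() around:
#  - use the evaluated argument directly (the suggestion made in the thread);
#  - evaluate the matched expression in the caller's frame.
f_fixed <- function(formula, ...) {
  mf <- match.call()
  from_arg  <- terms(formula, ...)
  from_call <- terms(eval(mf$formula, parent.frame()), ...)
  list(from_arg = from_arg, from_call = from_call)
}

form <- as.formula(y ~ x)
print(f_inspect(form))                      # class of the unevaluated argument
res <- f_fixed(form)
print(attr(res$from_arg, "term.labels"))
print(attr(res$from_call, "term.labels"))
```

Using the evaluated argument is the simpler choice; the eval() route only matters when the code genuinely has to work from the matched call.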
[R] rgl: reproduce final state of interactive plot?
After interacting with a 3d plot (e.g. plot3d, persp3d), is there a way to capture the final settings of view angles, etc., so that the final plot could be easily reproduced? The plot functions themselves just return a vector of 'ids'.

-- View this message in context: http://r.789695.n4.nabble.com/rgl-reproduce-final-state-of-interactive-plot-tp3667866p3667866.html
Re: [R] question on formula and terms.formula()
Thank you so much for your suggestion, Bill. The R program I am trying to modify needs match.call() for something else. But the problem does seem to be caused by this statement, as you suggested. Following this clue, I found that terms.formula(formula) does essentially what I wanted from terms.formula(mf$formula).

Pang Du
Virginia Tech

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Wednesday, July 13, 2011 12:21 PM
To: pan...@vt.edu; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()
[snip]
Re: [R] question on formula and terms.formula()
Points taken and terms(formula) is used now. Thanks.

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Thursday, July 14, 2011 11:56 AM
To: Pang Du; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()
[snip]
Re: [R] rgl: reproduce final state of interactive plot?
On 14/07/2011 11:12 AM, sjaffe wrote:
> After interacting with a 3d plot (eg plot3d, persp3d), is there a way to capture the final settings of view angles, etc, so that the final plot could be easily reproduced? The plot functions themselves just return a vector of 'ids'.

Yes, saving the result of par3d() will save it. (It saves more than necessary; see ?par3d for more details.)

Duncan Murdoch
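A sketch of how the par3d() suggestion might be used in practice. It assumes the rgl package is installed and a plot is open on the current device. The helper names save_view and restore_view, the file name in the usage comments, and the choice of which parameters count as "the view" are all assumptions made here; ?par3d lists the full set of parameters.

```r
# Parameters assumed to be enough to describe the interactive view.
view_params <- c("userMatrix", "zoom", "FOV", "windowRect")

# Save the view-related settings of the current rgl device to a file.
save_view <- function(file) {
  all_settable <- rgl::par3d(no.readonly = TRUE)  # every settable parameter
  view <- all_settable[view_params]               # keep only the view subset
  saveRDS(view, file)
  invisible(view)
}

# Re-apply previously saved settings to the current rgl device.
restore_view <- function(file) {
  view <- readRDS(file)
  rgl::par3d(view)    # par3d() accepts a list of named values
  invisible(view)
}

# Typical use (not run here):
#   plot3d(x, y, z)             # then rotate and zoom with the mouse
#   save_view("view.rds")
#   ... later, possibly in a new session ...
#   plot3d(x, y, z)
#   restore_view("view.rds")
```

Keeping only a subset avoids restoring settings that are unrelated to the viewpoint, which is the "more than necessary" part mentioned above.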
Re: [R] Adding rows based on column value
I have checked it but did not get any results. Is there a way I can do it? I will be very thankful to you.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: Bert Gunter [gunter.ber...@gene.com]
Sent: Thursday, July 14, 2011 4:54 PM
To: Bansal, Vikas
Cc: r-help@r-project.org
Subject: Re: [R] Adding rows based on column value
[snip]
Re: [R] R package: pbatR
Thanks. I have installed PBAT on my computer.

-- View this message in context: http://r.789695.n4.nabble.com/R-package-pbatR-tp3667844p3667907.html
Re: [R] Adding rows based on column value
From: Bansal, Vikas
Sent: Thursday, July 14, 2011 6:07 PM
To: Bert Gunter
Subject: RE: [R] Adding rows based on column value

Yes sir. I am trying. I am using this:

   aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)

but I think this is not the right way, because we cannot use sum to add. That is why I was asking for help.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: Bert Gunter [gunter.ber...@gene.com]
Sent: Thursday, July 14, 2011 6:01 PM
To: Bansal, Vikas
Subject: Re: [R] Adding rows based on column value

Not from me -- I don't believe you've made an honest effort. Maybe someone else will help you. You might try posting reproducible code that shows your efforts -- as the posting guide requests.

-- Bert

On Thu, Jul 14, 2011 at 9:46 AM, Bansal, Vikas <vikas.ban...@kcl.ac.uk> wrote:
[snip]
Re: [R] Writing Complex Formulas
I resolved this issue. It appears that ^ won't work for this case, but ** worked. I can't find any reference to this, but where ^ seems to be used to raise a value to a numerical function, ** is used for y raised to the power of x where x is a computation.
Re: [R] computing functions with Euler's number (e^n)
I solved this in two ways:

1. ** was necessary to raise (-dummy + 1) to the power of B. ^ doesn't work here, for some reason.
2. I needed to use as.complex, which greatly simplified my code and produces the correct response. (I had to revisit math that I had not used in many years.)

W <- as.complex(((fq/cf[j])**B[j])*(exp(-(fq/cf[j])+1)**B[j]))
Re: [R] Adding rows based on column value
Bansal, Vikas vikas.bansal at kcl.ac.uk writes:

> I am using this-
> aggregate(x = file[,3:6], by = list(file[,2]), FUN = "sum")

Better, although still not reproducible (please *do* read the posting guide -- it is listed at the bottom of every R list post and is the *first* google hit for "posting guide" (!); search for "Examples").

What about removing the quotation marks around sum?

aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)

> but I think this is not a right way. Because we cannot use sum to add. That is why I was asking for help.
Re: [R] Adding rows based on column value
I have tried that also. But it is showing this error-

aggregate(file[,3:6], by = list(file[,2]), FUN = sum)
Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
Re: [R] computing functions with Euler's number (e^n)
warmstron1 wrote:

> I solved this in two ways:
> 1. ** was necessary to raise (-dummy + 1) to the power of B. ^ doesn't work here, for some reason. ...

Using which version of R on which platform? Most strange. The help page for Arithmetic operators clearly states in a Note that ** is translated in the parser to ^, ... In R-2.13.1 on Mac OS X I see no difference between results for ^ and **.

Berend
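Berend's point is easy to check at the prompt. The sketch below uses made-up numbers (x and b are placeholders, not the fq, cf or B from the original post, which were never shown). It also illustrates one possible reason as.complex appeared to help: in real arithmetic a negative base raised to a non-integer power is NaN. That is only a guess at what went wrong, since the failing code was not posted.

```r
# Hypothetical values; the original fq, cf and B were never posted.
x <- c(0.5, 2, 3.7)
b <- 1.3

# ** is rewritten to ^ by the parser, so the two results are identical.
same <- identical(x^b, x**b)

# A negative base with a fractional exponent is NaN in real arithmetic ...
neg_real <- (-2)^0.5
# ... but is defined once the base is complex.
neg_cplx <- as.complex(-2)^0.5

# Precedence trap: unary minus binds less tightly than ^,
# so this is -(2^2), not (-2)^2.
prec <- -2^2

print(same)
print(neg_real)
print(neg_cplx)
print(prec)
```

If a formula behaves differently after a rewrite, it is usually worth checking the parentheses around negative terms before suspecting the operator.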
Re: [R] Adding rows based on column value
On 07/14/2011 01:46 PM, Bansal, Vikas wrote:

> I have tried that also. But it is showing this error-
> aggregate(file[,3:6], by = list(file[,2]), FUN = sum)
> Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument

Farther down in your previous e-mail you state that you read the file in using

file=read.table("file.txt",fill=T,colClasses = "character",header=T)

The 'colClasses' argument is telling R to read in the data as type character, which of course it is having trouble summing (as the error message suggests: R's error messages are often cryptic, but in this case it seems to be telling you exactly what's wrong). (You probably put it in there so that R wouldn't mess up your second column, but it was overkill. It converted *all* the columns to character.)

Try changing your read statement to:

file=read.table("file.txt",fill=TRUE, colClasses = rep(c("character","numeric"),c(2,4)),header=TRUE)

(changing T to TRUE is safer; the different colClasses is the important part. fill=TRUE is probably unnecessary.)

If you're unsure what this is doing, please do your best to read ?read.table and ?rep, and try out examples, before responding with further queries ...

Ben Bolker
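For anyone following the thread, here is a minimal reproducible sketch of the fix. It assumes the data posted at the start of the thread, pasted inline as text in place of the attached file.txt; the names txt and out are only for illustration.

```r
# Data from the original post, inline instead of file.txt (an assumption
# made so that the example is self-contained).
txt <- "Chr Pos CaseA CaseC CaseG CaseT
10 135344110 0.00 24.00 0.00 0.00
10 135344110 0.00 0.00 24.00 0.00
10 135344110 0.00 0.00 24.00 0.00
10 135344113 0.00 0.00 24.00 0.00
10 135344114 24.00 0.00 0.00 0.00
10 135344114 24.00 0.00 0.00 0.00
10 135344116 0.00 0.00 0.00 24.00
10 135344118 0.00 24.00 0.00 0.00
10 135344118 0.00 0.00 0.00 24.00
10 135344122 24.00 0.00 0.00 0.00
10 135344122 0.00 24.00 0.00 0.00
10 135344123 0.00 24.00 0.00 0.00
10 135344123 0.00 24.00 0.00 0.00
10 135344123 0.00 0.00 0.00 24.00
10 135344126 0.00 0.00 24.00 0.00"

# Keep Chr and Pos as character, read the four Case columns as numeric.
file <- read.table(text = txt, header = TRUE,
                   colClasses = rep(c("character", "numeric"), c(2, 4)))

# Sum the Case columns within each Chr/Pos combination.
out <- aggregate(file[, 3:6],
                 by = list(Chr = file$Chr, Pos = file$Pos),
                 FUN = sum)
print(out)
```

Grouping on both Chr and Pos is a small extension of the original call (which grouped on column 2 only); it keeps the Chr column in the result and would stay correct if the same position occurred on more than one chromosome.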
Re: [R] Adding rows based on column value
Yes, because from your previous posts, you appeared to have read in the data as character:

file=read.table("file.txt",fill=T,colClasses = "character",header=T)

But, of course, without a reproducible example, one cannot be sure.

-- Bert

On Thu, Jul 14, 2011 at 10:46 AM, Bansal, Vikas vikas.ban...@kcl.ac.uk wrote:

> I have tried that also. But it is showing this error-
> aggregate(file[,3:6], by = list(file[,2]), FUN = sum)
> Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument

Bert Gunter
Genentech Nonclinical Biostatistics
Re: [R] rgl: reproduce final state of interactive plot?
Terrific! This is great to know.

I first tried saving and restoring the entire set from par3d but this produced some changes (eg bg) and also one must call par3d with no.readonly=TRUE. Clearly this is the way to go if one has changed a variety of rgl properties. But if one has only used the mouse to rotate/scale, I discovered (by looking at the documentation of view3d) that, I believe, all one needs are userMatrix, FOV, and zoom:

## create an rgl plot, interact with it, then:
snap <- par3d( c("userMatrix", "FOV", "zoom") )
## create a new rgl plot, apply the same transformation:
par3d( snap )

This can also be used to 'snap' the current view during an interactive session and restore it later during that same session, which could be quite useful. To save typing, a (trivial) pair of functions to encapsulate this:

snap.view <- function() par3d( c("userMatrix", "FOV", "zoom") )
restore.view <- function( snap ) par3d( snap )
Re: [R] SQldf with sqlite and H2
On Thu, Jul 14, 2011 at 10:33 AM, Mandans mandan...@yahoo.com wrote:

> SQldf with sqlite and H2
>
> I have a large csv file (about 2GB) and wanted to import the file into R and do some filtering and analysis. Came across sqldf (a great idea and product) and was trying to play around to see what would be the best method of doing this. The csv file is comma delimited with some columns having a comma inside the quotation, like this: "John, Doe".
>
> I tried this first
>
> library(sqldf)
> sqldf("attach testdb as new")
> In.File <- "C:/JP/Temp/2008.csv"
> read.csv.sql(In.File, sql = "create table table1 as select * from file", dbname = "testdb")
>
> It errored out with message
>
> NULL
> Warning message: closing unused connection 3 (C:/JP/Temp/2008.csv)
>
> When this failed, I converted this file from comma delimited to tab delimited and used this command
>
> read.csv.sql(In.File, sql = "create table table1 as select * from file", dbname = "testdb", sep = "\t")
>
> and this worked, it created the testdb sqlite file with the size of 3GB. Now my question is in 3 parts.
>
> 1. Is it possible to create a dataframe with appropriate column classes and use those column classes when I use the read.csv.sql command to create the table? Something like maybe create the table from that DF and then update with read.csv.sql? Any example code will be really helpful.

Here is an example of using method = "name__class". Note there are two underscores in a row. It appears I neglected to document that Date2 means convert from character representation whereas Date means convert from numeric representation. It would also be possible to use method = "raw" and then coerce the columns yourself afterwards.

# create test file
Lines <- 'A__Date2|B
2000-01-01|x,y
2000-01-02|c,d
'
tf <- tempfile()
cat(Lines, file = tf)

library(sqldf)
DF <- read.csv.sql(tf, sep = "|", method = "name__class")
str(DF)

> 2. If we use the H2 database instead of default sqlite and use the readcsv option, will that be faster, and is there a way we can specify the above thought of applying a DF class to table column properties and update with CSVREAD?
>
> library(RH2)
> something like SELECT * FROM CSVREAD('C:/JP/Temp/2008.csv')
>
> Any example code will be really helpful.

Sorry, I haven't tested the speed of this. postgresql and mysql, both supported by sqldf, also have builtin methods to read files. If I had to guess I would guess that mysql would be fastest but this would have to be tested.

> 3. How do we specify where the H2 file is saved? Saw something like this; when I ran this example from the RH2 package, I couldn't find the file in the working directory.
>
> con <- dbConnect(H2(), "jdbc:h2:~/test", "sa", "")

~ means your home directory so ~/test means test is in the home directory. Try

normalizePath("~")
normalizePath("~/test")

etc. to see what they refer to.

> Regards. Sorry for the long mail. Appreciate all for building a great community and for the wonderful software in R. Thanks to Gabor Grothendieck for bringing sqldf to this great community. Any help or direction you can provide in this is highly appreciated. Thanks all.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
[R] Simple clustering help
Hi all,

I have just begun to use R and am hoping to receive some advice about the problem I need to solve. I have a file containing xy points, and I need to find all significant clusters and write each of their xy coordinates to file (total points ~ 75000 and sig. cluster >= 2500 points). I want to use a euclidean distance threshold to determine if a point belongs to a cluster.

My initial thought is to take a random (seed) point and write a region growing method to determine how many points belong to the cluster (basically, add to the cluster all points that are within the threshold and continue this until no points are within the threshold). Once there are no neighboring points within the threshold, if the number of points added to the region (cluster) is greater than 2500 I would write all of the points' coordinates to a text file, remove them from the list of seed candidates, and begin again. If the cluster size is less than 2500 I would simply remove the points as they are not significant. The process would continue until there are fewer than 2500 points remaining.

Is there a package that would be helpful in this task?

Thanks
Don
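The region-growing idea described above amounts to finding connected groups of points in which each point is within the threshold of at least one other member. One way to sketch that in base R, without writing the growing loop by hand, is single-linkage hierarchical clustering cut at the threshold height. The toy coordinates, threshold and min_size below are invented for illustration. This approach also builds a full distance matrix, which for ~75000 points would need roughly 20 GB, so treat it as a small-data illustration rather than a solution at that scale.

```r
# Toy coordinates (hypothetical): two tight groups and one isolated point.
xy <- rbind(
  c(0, 0), c(0, 1), c(1, 1),   # group of 3, neighbours about 1 apart
  c(10, 10), c(10, 11),        # group of 2
  c(50, 50)                    # isolated point
)
threshold <- 1.5  # assumed distance threshold
min_size  <- 3    # stands in for the 2500-point cutoff

# Single linkage merges two groups when their closest points are within
# the cut height, which is the same rule as growing a region by distance.
tree       <- hclust(dist(xy), method = "single")
cluster_id <- cutree(tree, h = threshold)

# Keep only clusters that reach the minimum size.
sizes <- table(cluster_id)
big   <- as.integer(names(sizes)[sizes >= min_size])
keep  <- cluster_id %in% big

significant <- data.frame(x = xy[keep, 1], y = xy[keep, 2],
                          cluster = cluster_id[keep])
print(significant)
# write.table(significant, "clusters.txt", row.names = FALSE)  # optional
```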
[R] LME and overall treatment effects
Hello fellow R users,

I am having a problem finding the estimates for some overall treatment effects for my mixed models using 'lme' (package nlme). I hope someone can help.

Firstly then, the model:

The data: Plant biomass (log transformed)
Fixed Factors: Treatment (x3: Dry, Wet, Control), Year (x8: 2002-2009)
Random Factors: 5 plots per treatment, 5 quadrats per plot (N=594 (3*5*5*8) with 6 missing values).

I am modelling this in two ways, firstly with year as a continuous variable (interested in the difference in estimated slope over time in each treatment, 'year*treatment'), and secondly with year as a categorical variable (interested in the difference between 'treatments').

When using Year as a continuous variable, the output of the lme means that I can compare the 3 treatments within my model, i.e. it takes one of the Treatment*year interactions as the baseline and compares (contrasts) the other two to that. I can then calculate the overall treatment*year effect using anova.lme(Model).

However, the problem comes when I use Year as a categorical variable. Here, I am interested solely in the Treatment effect (not the interaction with year). However, the output for the two labelled 'Treatment's against the third comparison is not the overall effect but a comparison within a year (2002). I can still get my overall effect (using anova.lme), but how do I calculate the estimates (with P-values if possible) for each separate overall treatment comparison (not within year)?

I tried 'glht' (package 'multcomp') but this only works if there are no interactions present; otherwise it again gives a comparison for one particular year.

Very grateful for any assistance,
Mark
[R] Amelia_Multiple_Imputation_with_observational_priors_noms
I am fairly new at using R/programming in general so I apologize if I am leaving crucial parts of the puzzle out, but here goes. First and foremost this is the error I am receiving:

Error in muPriors[priors[, 1:2]] <- priors[, 3] : NAs are not allowed in subscripted assignments

This occurs only when I am using observational priors and some number of nominal variables; it does not occur when I change to specifying the variables as ordinal, or if I do not use priors. Using a doctored version of the example from the Amelia User Guide this is what I came up with:

library(Amelia)
library(MatchIt)
library(Zelig)
library(foreign)
data(freetrade)
a.out = amelia(freetrade, m=5, ts = "year", cs = "country")
newtrade = which(is.na(freetrade), arr.ind = TRUE)
newtrade
newtradeMean = c(30,29,28,31,32,34,30,34,26,32,30,29,28,31,32,34,30,34,26,32,30,29,28,31,32,34,30,34,26,
  32,30,29,28,31,32,34,30,34,26,32,30,29,28,31,32,34,30,34,26,32,31,31,32,33,34,35,20,21,
  7,8,2.0,2.1,2.2,2.0,2.5,2.3,2.5,2.1,1.9,1.8,1.7,2.9,2.0,0,0,1,.3,.4,.5,.3,.2,.3,.2,.4,.5,.3,.2,.3,.3,.4,.5,.3,.2,.3)
newishtrade = cbind(newtrade,newtradeMean)
newishtrade
newtradeSD = c(5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,3,4,5,6,4,3,4,2,
  4,1,1.2,.3,.4,.3,.4,.3,.2,.3,.4,.3,.3,.3,.2,.3,.2,.3,.3,.1,.1,.2,.2,.3,.3,.1,.1,.2,.2,.3,.3,.1,.1,.2,.2,.3)
newishtrade = cbind(newtrade,newtradeMean, newtradeSD)
newishtrade
a.out = amelia(freetrade, m=5, ts = "year", cs = "country", priors = newishtrade, p2s=2, noms="signed")
a.out = amelia(freetrade, m=5, ts = "year", cs = "country", priors = newishtrade, p2s=2, ords="signed")

I followed the trail back through the source code to try and see what was occurring. I learned some debugging techniques and tried to discern if it was something with the code or perhaps some incorrect steps in using priors with nominal variables. To summarize, I cannot get observational priors to work when I specify nominal variables.

If it helps, below is my tracing output scattered through the various functions in the source code. Thoughts?

amelia starting before prep amelia.prep started beginning prep functions Variables used: tariff polity pop gdp.pc intresmi fiveop usheg noms.signed.2 after prep running bootstrap does this work
Error in muPriors[priors[, 1:2]] <- priors[, 3] : NAs are not allowed in subscripted assignments
Re: [R] Writing Complex Formulas
On 14/07/2011 12:46 PM, warmstron1 wrote:

> I resolved this issue. It appears that ^ won't work for this case, but ** worked. I can't find any reference to this, but where ^ seems to be used to raise a value to a numerical function, ** is used for a y raised to the power of x where x it a computation.

Those should be equivalent. Can you post the code that wasn't working, and describe what "not working" meant?

Duncan Murdoch
Re: [R] rgl: reproduce final state of interactive plot?
sjaffe sjaffe at riskspan.com writes:

[snip snip]

> This can also be used to 'snap' the current view during an interactive session and restore it later during that same session, which could be quite useful. To save typing, a (trivial) pair of functions to encapsulate this:
>
> snap.view <- function() par3d( c("userMatrix", "FOV", "zoom") )
> restore.view <- function( snap ) par3d( snap )

This is very nice, I keep having to remember how to do things like it. Duncan, perhaps it could go in an example in the package ... ?

Ben
[R] Repating a loop of lm function with different columns of database
Hi,

First let me thank you for the incredible help and resource that this forum is.

I am trying to compare the repeated measurement of more than 100 analytes that have been taken in 70 subjects at 2 time points, adjusted for the time difference between sample times (TimeDifferenceDays); therefore I wanted to do it with a function that allows me to do all at once. (131 is the column difference that separates the two different measurements of the same analyte.)

I have this one:

for(i in 1:125){return(summary(lm(Data[,i] ~ Data[,(i+131)] + Data$TimeDifferenceDays)))}

But it only gives me one result.

I also wanted to get the p-values in a data frame with three columns:

First column: analyte name (that's the name of the column)
Second column: p-value of the first measure of the analyte predicting the second measure
Third column: the effect of time

I copy a smaller example:

txt <- "V1a V2a V3a V1b V2b V3b TimeDifferenceDays
2.42 72.4 3.75 2.46 55.4 4.44 608
1.66 89.7 2.54 2.17 94.0 2.15 419
2.45 112. 0.46 2.40 129.0 .42 714
2.58 55.6 5.05 2.44 135.0 5.39 721
2.61 332.0 22.6 3.55 238.0 16.4 729"
Data <- read.table(textConnection(txt), header = TRUE)

Thank you

Jon Toledo, MD
Postdoctoral fellow
University of Pennsylvania School of Medicine
Center for Neurodegenerative Disease Research
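One way to collect a row per analyte, sketched on the small example above. The likely reason the posted loop yields a single result is that return() stops at the first pass instead of storing anything. The sketch assumes the second measurement is the response (as the wording of the request suggests, although the posted loop has it the other way round), and n_analytes, offset and fit_one are illustrative names; for the real data they would be 125 and 131.

```r
txt <- "V1a V2a V3a V1b V2b V3b TimeDifferenceDays
2.42 72.4 3.75 2.46 55.4 4.44 608
1.66 89.7 2.54 2.17 94.0 2.15 419
2.45 112. 0.46 2.40 129.0 .42 714
2.58 55.6 5.05 2.44 135.0 5.39 721
2.61 332.0 22.6 3.55 238.0 16.4 729"
Data <- read.table(text = txt, header = TRUE)

n_analytes <- 3  # 125 in the real data (assumption from the posted loop)
offset     <- 3  # 131 in the real data

# Fit second ~ first + TimeDifferenceDays for analyte i and return one row.
fit_one <- function(i) {
  first  <- names(Data)[i]
  second <- names(Data)[i + offset]
  fit <- lm(reformulate(c(first, "TimeDifferenceDays"), response = second),
            data = Data)
  co <- summary(fit)$coefficients
  data.frame(analyte         = first,
             p_first_measure = co[first, "Pr(>|t|)"],
             p_time          = co["TimeDifferenceDays", "Pr(>|t|)"])
}

# One row per analyte, stacked into a single data frame.
results <- do.call(rbind, lapply(seq_len(n_analytes), fit_one))
print(results)
```

Building the formula from column names, rather than indexing Data[, i] inside lm, keeps the coefficient rows labelled with the analyte name, which makes it easy to pull out the right p-value.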
Re: [R] LME and overall treatment effects
Probably no hope of help until you do as the posting guide asks:

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and *** provide commented, minimal, self-contained, reproducible code. ***

-- Bert

On Thu, Jul 14, 2011 at 11:02 AM, Mark Bilton mark.bil...@uni-tuebingen.de wrote:

> Hello fellow R users, I am having a problem finding the estimates for some overall treatment effects for my mixed models using 'lme' (package nlme). I hope someone can help.

[snip]

Bert Gunter
Genentech Nonclinical Biostatistics
[R] cbind in aggregate formula - based on an existing object (vector)
Hello!

I am aggregating using a formula in aggregate - of the type:

aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

However, I actually have an object (vector of my variables to be aggregated):

myvars<-c("var1","var2","var3")

I'd like my aggregate formula (its "cbind" part) to be able to use my "myvars" object. Is it possible?

Thanks for your help!
Dimitri

Reproducible example:

mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4)
value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)
example<-data.frame(mydate=mydate,value1=value1,value2=value2)
example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3))
example$group<-as.factor(example$group)
(example);str(example)
example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example) # this works
(example.agg1)

### Building my object (vector of 2 names - in reality, many more):
myvars<-c("value1","value2")
example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example) ### does not work

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
Re: [R] cbind in aggregate formula - based on an existing object (vector)
On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:

> Hello! I am aggregating using a formula in aggregate - of the type:
> aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)
> However, I actually have an object (vector of my variables to be aggregated):
> myvars<-c("var1","var2","var3")
> I'd like my aggregate formula (its "cbind" part) to be able to use my "myvars" object. Is it possible? Thanks for your help!

Not sure I have gotten all the way there, but this does work:

example.agg1<-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)
example.agg1

   group     mydate example[myvars]    NA
1 group1 2008-12-01               4   4.2
2 group2 2008-12-01               6   6.2
3 group1 2009-01-01              40  40.2
4 group2 2009-01-01              60  60.2
5 group1 2009-02-01             400 400.2
6 group2 2009-02-01             600 600.2

[snip]

David Winsemius, MD
West Hartford, CT
Re: [R] cbind in aggregate formula - based on an existing object (vector)
Thank you, David, it does work. Could you please explain why? What exactly does changing it to as.matrix do?

Thank you!
Dimitri

On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net wrote:

> Not sure I have gotten all the way there, but this does work:
> example.agg1<-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)

[snip]

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
Re: [R] cbind in aggregate formula - based on an existing object (vector)
Dimitri:

Look at myvars from

myvars<-c("value1","value2")

It's just a character vector of length 2! You can't cbind a character vector of length 2! These are not references/pointers.

It's not at all clear to me what you ultimately want to do, but IF it's: pass a character vector of names to be used as the LHS of the aggregate.formula call, then something like (untested):

MyVars <- do.call(cbind, lapply(myvars, get))

and then aggregate(MyVars ~ ...) might do. But there are so many potential scoping problems here that I would not be surprised if it failed. The usual advice for this sort of thing is to use substitute() or maybe the dreaded eval(parse(...)) construction, but as I said, I don't really understand what you're after.

-- Bert

-- Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions.
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] cbind in aggregate formula - based on an existing object (vector)
Dimitri:

as.matrix makes a matrix out of the data frame that is passed to it.

As a further note, I attempted, and failed for reasons that are unclear to me, to construct a formula that would (I hoped) preserve the column names, which are being mangled in the posted effort:

form <- as.formula(paste("cbind(", paste(myvars, collapse=","), ") ~ group+mydate", sep=""))
myvars<-c("value1","value2")
example.agg1<-aggregate(formula=form,data=example, FUN=sum)
Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
traceback()
2: aggregate.formula(formula = form, data = example, FUN = sum)
1: aggregate(formula = form, data = example, FUN = sum)
form
cbind(value1, value2) ~ group + mydate
parse(text=form)
expression(~ cbind(value1, value2), group + mydate)

So it seems to be correctly dispatched to aggregate.formula but is not passing some check or another. I also tried formula() rather than as.formula, with an identical error message, and also tried without naming the argument.

-- David

David Winsemius, MD West Hartford, CT
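The formula-building idea in the message above can be sketched as follows. This is a minimal illustration, not the posters' code: it rebuilds the thread's example data, pastes the names in myvars into a cbind(...) left-hand side, and passes the formula positionally. The names lhs, agg_built and agg_direct are introduced here for illustration. The error reported in the thread came from R as of 2011; the assumption here is that a current R release accepts a formula stored in a variable.

```r
# Rebuild the example data from the original post.
mydate <- rep(seq(as.Date("2008-12-01"), length.out = 3, by = "month"), 4)
example <- data.frame(
  mydate = mydate,
  value1 = c(1, 10, 100, 2, 20, 200, 3, 30, 300, 4, 40, 400),
  value2 = c(1.1, 10.1, 100.1, 2.1, 20.1, 200.1, 3.1, 30.1, 300.1, 4.1, 40.1, 400.1),
  group  = factor(rep(rep(c("group1", "group2"), each = 3), 2))
)
myvars <- c("value1", "value2")

# Paste the column names into a cbind(...) left-hand side, then convert
# the whole string to a formula.
lhs  <- paste0("cbind(", paste(myvars, collapse = ", "), ")")
form <- as.formula(paste(lhs, "~ group + mydate"))

# Pass the formula positionally so the call does not depend on the
# name of the first argument of aggregate's formula method.
agg_built  <- aggregate(form, data = example, FUN = sum)

# Reference result with the variables typed out by hand.
agg_direct <- aggregate(cbind(value1, value2) ~ group + mydate,
                        data = example, FUN = sum)
```

If the assumption holds, agg_built keeps the column names value1 and value2 instead of the mangled "example[myvars]" and "NA" headers.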
Re: [R] Sum weights of independent variables across models (AIC)
Why go to so much trouble? Why not fit a single full model and use it? Even better, why not use a quadratic penalty on the full model to get optimum cross-validation?

Frank

nofunsally wrote:

Hello, I'd like to sum the weights of each independent variable across linear models that have been evaluated using AIC. For example:

library(MuMIn)
data(Cement)
lm1 <- lm(y ~ ., data = Cement)
dd <- dredge(lm1, beta = TRUE, eval = TRUE, rank = "AICc")
get.models(dd, subset = delta < 4)

There are 5 models with a Delta AIC score of less than 4. I would like to sum the weights for each of the independent variables across the five models. How can I do that?

Thanks, Mike

- Frank Harrell
Department of Biostatistics, Vanderbilt University
-- View this message in context: http://r.789695.n4.nabble.com/Sum-weights-of-independent-variables-across-models-AIC-tp3666306p3668513.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Repating a loop of lm function with different columns of database
If I understood your question:

x<-data.frame(matrix(rnorm(2000,10,10),ncol=50))
sapply(1:5,function(i) summary(lm(x[,i]~x[,i+10]+x[,50])))

Weidong Gu

On Thu, Jul 14, 2011 at 2:27 PM, Jon Toledo tintin...@hotmail.com wrote:

Hi, First let me thank you for the incredible help and resource that this forum is.

I am trying to compare the repeated measurements of more than 100 analytes that have been taken in 70 subjects at 2 time points, adjusted for the time difference between sampling times (TimeDifferenceDays). I therefore wanted to do it with a function that allows me to do it all at once. (131 is the column offset that separates the two measurements of the same analyte.) I have this one:

for(i in 1:125){return(summary(lm(Data[,i] ~ Data[,(i+131)] + Data$TimeDifferenceDays)))}

But it only gives me one result. I also wanted to get the p-values in a data frame with three columns:

First column: analyte name (that's the name of the column)
Second column: p-value of the first measure of the analyte predicting the second measure
Third column: the effect of time

I copy a smaller example:

txt <- "V1a V2a V3a V1b V2b V3b TimeDifferenceDays
2.42 72.4 3.75 2.46 55.4 4.44 608
1.66 89.7 2.54 2.17 94.0 2.15 419
2.45 112. 0.46 2.40 129.0 .42 714
2.58 55.6 5.05 2.44 135.0 5.39 721
2.61 332.0 22.6 3.55 238.0 16.4 729"
Data <- read.table(textConnection(txt), header = TRUE)

Thank you

Jon Toledo, MD
Postdoctoral fellow
University of Pennsylvania School of Medicine
Center for Neurodegenerative Disease Research
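One way to collect the requested three-column table is sketched below, using the small example data from the question. Several things here are assumptions rather than anything stated in the thread: the helper name fit_one, the output column names, and the direction of the regression (second measurement regressed on the first, following the written description rather than the posted loop, which has the first measurement as the response). The column offset is 3 for this small example; the question uses 131 for the full data.

```r
txt <- "V1a V2a V3a V1b V2b V3b TimeDifferenceDays
2.42 72.4 3.75 2.46 55.4 4.44 608
1.66 89.7 2.54 2.17 94.0 2.15 419
2.45 112. 0.46 2.40 129.0 .42 714
2.58 55.6 5.05 2.44 135.0 5.39 721
2.61 332.0 22.6 3.55 238.0 16.4 729"
Data <- read.table(text = txt, header = TRUE)

offset   <- 3                 # 131 in the full data set described above
analytes <- names(Data)[1:3]  # columns holding the first measurements

# Fit one model per analyte and return a one-row data frame of p-values.
fit_one <- function(i) {
  first  <- Data[[i]]
  second <- Data[[i + offset]]
  fit <- lm(second ~ first + TimeDifferenceDays, data = Data)
  p <- coef(summary(fit))[, "Pr(>|t|)"]
  data.frame(analyte         = names(Data)[i],
             p_first_measure = p[["first"]],
             p_time          = p[["TimeDifferenceDays"]])
}

# lapply() keeps every result; the posted for() loop stopped at the
# first iteration because of return().
results <- do.call(rbind, lapply(seq_along(analytes), fit_one))
print(results)
```

With only five rows the p-values themselves mean little; the point is the shape of the output.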
[R] WLS regression, lm() with weights as a matrix
Dear All,

I've been trying to run a Weighted Least Squares (WLS) regression:

Dependent variables: a 60*200 matrix (Rit) with 200 companies and 60 dates for each company
Independent variables: a 60*4 matrix (Ft) with 4 factors and 60 dates for each factor
Weights: a 60*200 matrix (Wit) with weights for 200 companies and 60 dates for each company

The WLS regression I would like to run is:

(Wit)*Rit = a*(Wit*F1t) + b*(Wit*F2t) + c*(Wit*F3t) + d*(Wit*F4t) + eit

Ideally, I want to run WLS regressions for each company i (i.e., 200 WLS regressions in total), in each regression using weights from column i in matrix Wit, and in the end obtain a 60*4 matrix with coefficients and a 200*60 matrix with residuals.

However, when I run:

lm(Rit ~ Ft, weights=Wit)

it fails because the weights argument can only be a vector, not a matrix. I have been searching old posts but couldn't find any solutions. I'm wondering if there is any other function or way to do this? I would really appreciate your help.

Thanks, Victor

-- View this message in context: http://r.789695.n4.nabble.com/WLS-regression-lm-with-weights-as-a-matrix-tp3668577p3668577.html Sent from the R help mailing list archive at Nabble.com.
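A minimal sketch of one regression per company follows, with simulated stand-ins for Rit, Ft and Wit (the data, the seed, and the names fits, coefs and resids are all made up for illustration). Two caveats: lm() treats weights as multipliers of the squared residuals, so multiplying both sides of the equation by Wit as written in the question would correspond to weights = Wit[, i]^2; and lm() adds an intercept by default, which the equation in the question does not show.

```r
set.seed(1)
n_dates <- 60
n_firms <- 200

# Simulated stand-ins with the dimensions described in the question.
Ft  <- matrix(rnorm(n_dates * 4), n_dates, 4,
              dimnames = list(NULL, paste0("F", 1:4)))
Rit <- matrix(rnorm(n_dates * n_firms), n_dates, n_firms)
Wit <- matrix(runif(n_dates * n_firms, 0.5, 2), n_dates, n_firms)

# One weighted fit per company, each using its own column of weights.
fits <- lapply(seq_len(n_firms), function(i) {
  lm(Rit[, i] ~ Ft, weights = Wit[, i])
})

coefs  <- t(sapply(fits, coef))       # 200 x 5: intercept plus 4 factors
resids <- t(sapply(fits, residuals))  # 200 x 60
```

Each row of coefs and resids then belongs to one company.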
Re: [R] Very slow optim(): solved
After Googling and trial and error, I found that the major cause of the slow optimization was not the functions but the data setup. Originally, I was using a data.frame for the likelihood calculation. Then I changed the data.frame to a vector and a matrix for the same likelihood calculation. Now convergence takes ~14 sec instead of 25 min. I certainly didn't know that this simple change would make such a huge computational difference.

Toshihide Hamachan Hamazaki, 濱崎俊秀 PhD
Alaska Department of Fish and Game: アラスカ州漁業野生動物課
Division of Commercial Fisheries: 商業漁業部
333 Raspberry Rd. Anchorage, AK 99518
Phone: (907)267-2158 Cell: (907)440-9934

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker
Sent: Wednesday, July 13, 2011 12:21 PM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Very slow optim()

Hamazaki, Hamachan (DFG) toshihide.hamazaki at alaska.gov writes:

Dear list, I am using the optim() function to MLE ~55 parameters, but it is very slow to converge (~25 min), whereas I can do the same in ~1 sec using ADMB, and ~10 sec using MS Excel Solver. Are there any tricks to speed up? Are there better optimization functions?

There's absolutely no way to tell without knowing more about your code. You might try method="CG":

Method 'CG' is a conjugate gradients method based on that by Fletcher and Reeves (1964) (but with the option of Polak-Ribiere or Beale-Sorenson updates). Conjugate gradient methods will generally be more fragile than the BFGS method, but as they do not store a matrix they may be successful in much larger optimization problems.

If ADMB works better, why not use it? You can use the R2admb package (on R forge) to wrap your ADMB calls in R code, if you prefer that workflow.

Ben
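The change described above can be illustrated with a toy likelihood. This is a sketch under assumptions, not the poster's model: the data are simulated, and nll_df and nll_mat are hypothetical names for the same normal negative log-likelihood written once against a data frame and once against a matrix and vector extracted beforehand.

```r
set.seed(42)
n  <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- 1 + 2 * df$x1 - df$x2 + rnorm(n)

# Version 1: indexes the data frame on every evaluation of the objective.
nll_df <- function(par) {
  mu <- par[1] + par[2] * df[, "x1"] + par[3] * df[, "x2"]
  -sum(dnorm(df[, "y"], mean = mu, sd = exp(par[4]), log = TRUE))
}

# Version 2: convert to a matrix and a vector once, outside the objective.
X <- cbind(1, as.matrix(df[, c("x1", "x2")]))
y <- df$y
nll_mat <- function(par) {
  mu <- drop(X %*% par[1:3])
  -sum(dnorm(y, mean = mu, sd = exp(par[4]), log = TRUE))
}

start   <- c(0, 0, 0, 0)   # last element is log(sd)
fit_df  <- optim(start, nll_df,  method = "BFGS")
fit_mat <- optim(start, nll_mat, method = "BFGS")
```

Both fits should reach the same minimum; wrapping each optim() call in system.time() shows how much the data-frame indexing costs on a given machine. The gap grows with the number of objective evaluations, which is large when ~55 parameters are estimated with numerical gradients.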
Re: [R] cbind in aggregate formula - based on an existing object (vector)
Thanks a lot! Actually, what I tried to do is very simple: just passing tons of variable names into the formula. Maybe that get() thing suggested by Bert would work...

Dimitri

-- Dimitri Liakhovitski Ninah Consulting www.ninah.com
Re: [R] cbind in aggregate formula - based on an existing object (vector)
David - I tried exactly the thing you did (and after that asked my question to the forum):

form <- as.formula(paste("cbind(", paste(myvars, collapse=","), ") ~ group+mydate", sep=""))

And it did not work, although it looks clean. In the end I wrote a loop across the individual variables with this code in the body:

myformula<-as.formula(paste(i,"~myfactor",sep=""))
temp<-aggregate(myformula,sum,data=mydata)
...

and then it worked. I really don't understand why pasting the cbind(...) text does not work.

Dimitri

-- Dimitri Liakhovitski Ninah Consulting www.ninah.com
Re: [R] plotting date data over couple of months
Thank you for your reply.

As I mentioned earlier, I want to plot the total number of birds counted for each day, one bar per day. I wanted to normalize the data, since I don't have data for an equal number of hours on all days. For example, on 2011-04-23 I have data for 2 hours, but on 2011-05-03 I have data for 8 hours. That is why I am dividing the count by the number of hours of data on that particular day.

So I will create a data set with dates for the whole month using the command you mentioned, and plot my data against these values. Since my data has some dates missing, I won't get any bars on those particular days. How can I place a * on those missing days?

Thank you

-- View this message in context: http://r.789695.n4.nabble.com/plotting-date-data-over-couple-of-months-tp3666298p3668433.html Sent from the R help mailing list archive at Nabble.com.
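The plan described above could look roughly like this. The bird counts, hours, and the names obs, all_days and daily are invented for illustration; only the two dates and their hours come from the message. The sketch merges the observed days onto a complete daily sequence, draws zero-height bars for days without data, and writes a * above those positions.

```r
# Hypothetical observations: birds counted and hours of observation per day.
obs <- data.frame(
  date  = as.Date(c("2011-04-23", "2011-04-24", "2011-04-26", "2011-05-03")),
  birds = c(40, 55, 30, 160),
  hours = c(2, 3, 2, 8)
)
obs$rate <- obs$birds / obs$hours   # birds per hour of observation

# Every day in the covered range, with NA where nothing was recorded.
all_days <- data.frame(date = seq(min(obs$date), max(obs$date), by = "day"))
daily    <- merge(all_days, obs, all.x = TRUE)
missing  <- is.na(daily$rate)

# barplot() returns the x position of each bar; reuse it to place the stars.
heights <- ifelse(missing, 0, daily$rate)
mids <- barplot(heights, names.arg = format(daily$date, "%m-%d"),
                las = 2, ylab = "birds per hour")
text(mids[missing], rep(0, sum(missing)), labels = "*", pos = 3)
```

Using the bar midpoints returned by barplot() keeps the stars aligned with the date labels regardless of bar width or spacing.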
Re: [R] LME and overall treatment effects
Ok... let's try again with some code...

---

Hello fellow R users,

I am having a problem finding the estimates for some overall treatment effects for my mixed models using 'lme' (package nlme). I hope someone can help.

Firstly then, the model:

The data: Plant biomass (log transformed)
Fixed factors: Treatment (x3: Dry, Wet, Control), Year (x8: 2002-2009)
Random factors: 5 plots per treatment, 10 quadrats per plot (N=1200 (3*5*10*8)).

I am modelling this in two ways, firstly with year as a continuous variable (interested in the difference in estimated slope over time in each treatment, 'year*treatment'), and secondly with year as a categorical variable (interested in the difference between 'treatments').

i.e. (with Year as either numeric or factor):

Model<-lme(Species~Year*Treatment,random=~1|Plot/Quadrat,na.action = na.omit,data=UDD)

When using Year as a continuous variable, the output of the lme means that I can compare the 3 treatments within my model, i.e. it takes one of the Treatment*year interactions as the baseline and compares (contrasts) the other two to that.

i.e.

Fixed effects: Species ~ Year * Treatment
                     Value Std.Error   DF   t-value p-value
(Intercept)      1514.3700  352.7552 1047  4.292978  0.0000
Year               -0.7519    0.1759 1047 -4.274786  0.0000
Treatment0       -461.9500  498.8711   12 -0.925991  0.3727
Treatment1      -1355.0450  498.8711   12 -2.716222  0.0187
Year:Treatment0     0.2305    0.2488 1047  0.926537  0.3544
Year:Treatment1     0.6776    0.2488 1047  2.724094  0.0066

so Year:Treatment0 differs from baseline Year:Treatment-1 by 0.2305, and Year:Treatment1 is significantly different (p=0.0066) from Year:Treatment-1.

I can then calculate the overall treatment*year effect using 'anova.lme(Model)'.

anova.lme(Model1)
               numDF denDF   F-value p-value
(Intercept)        1  1047 143.15245  <.0001
Year               1  1047  19.56663  <.0001
Treatment          2    12   3.73890  0.0547
Year:Treatment     2  1047   3.83679  0.0219

so there is an overall difference in slope between treatments (Year:Treatment interaction), p=0.0219.

However, the problem comes when I use Year as a categorical variable. Here, I am interested solely in the Treatment effect (not the interaction with year). However, the output for the two labelled 'Treatment's against the third comparison are not the overall effect but are a comparison within a year (2002).

Fixed effects: Species ~ Year * Treatment
                    Value Std.Error   DF   t-value p-value
(Intercept)          6.42  1.528179 1029  4.201079  0.0000
Year2003             4.10  1.551578 1029  2.642471  0.0084
Year2004             5.00  1.551578 1029  3.222526  0.0013
Year2005            -1.52  1.551578 1029 -0.979648  0.3275
Year2006            -3.08  1.551578 1029 -1.985076  0.0474
Year2007            -2.40  1.551578 1029 -1.546813  0.1222
Year2008             2.24  1.551578 1029  1.443692  0.1491
Year2009            -4.30  1.551578 1029 -2.771372  0.0057
Treatment0           0.46  2.161171   12  0.212848  0.8350
Treatment1           0.50  2.161171   12  0.231356  0.8209
Year2003:Treatment0 -2.46  2.194262 1029 -1.121106  0.2625
Year2004:Treatment0 -1.34  2.194262 1029 -0.610684  0.5415
Year2005:Treatment0  0.34  2.194262 1029  0.154950  0.8769
Year2006:Treatment0  1.60  2.194262 1029  0.729174  0.4661
Year2007:Treatment0  1.76  2.194262 1029  0.802092  0.4227
Year2008:Treatment0 -3.22  2.194262 1029 -1.467464  0.1426
Year2009:Treatment0  1.80  2.194262 1029  0.820321  0.4122
Year2003:Treatment1  0.22  2.194262 1029  0.100261  0.9202
Year2004:Treatment1  3.48  2.194262 1029  1.585954  0.1131
Year2005:Treatment1  5.00  2.194262 1029  2.278670  0.0229
Year2006:Treatment1  4.70  2.194262 1029  2.141950  0.0324
Year2007:Treatment1  6.08  2.194262 1029  2.770863  0.0057
Year2008:Treatment1  2.32  2.194262 1029  1.057303  0.2906
Year2009:Treatment1  5.56  2.194262 1029  2.533881  0.0114

so Treatment0 (in year 2002) is different to baseline Treatment-1 (in year 2002) by 0.46, p=0.8350.

I can still get my overall effect (using anova.lme), but how do I calculate the estimates (with p-values if possible) for each separate overall treatment comparison (not within year)? I can do this in SAS using 'estimate' or 'lsmeans', but for various reasons I want to do it in R as well. I tried 'glht' (package 'multcomp') but this only works if there are no interactions present; otherwise it again gives a comparison for one particular year.

i.e.

require(multcomp)
summary(glht(Model, linfct = mcp(Treatment = "Tukey")))

(sorry, I can't get this to work at home but trust me that it's
[R] nlme gls errors
Hi

I keep getting an error like this:

Error in `coef<-.corARMA`(`*tmp*`, value = c(18.3113452983211, -1.56626248550284, : Coefficient matrix not invertible

or like this:

Error in gls(archlogfl ~ co2, correlation = corARMA(p = 3)) : false convergence (8)

with the gls function in nlme.

The former example was with the model

gls(archlogflfornma~nma,correlation=corARMA(p=3))

where archlogflfornma is

[1] 2.611840 2.618454 2.503317 2.305531 2.180464 2.185764 2.221760 2.211320

and nma is

[1] 138 139 142 148 150 134 137 135

You can see the model in the latter, and archlogfl is

[1] 2.611840 2.618454 2.503317 2.305531 2.180464 2.185764 2.221760 2.211320
[9] 2.105556 2.176747

and co2 is

[1]  597.5778  917.9308 1101.0430  679.7803  886.5347  597.0668  873.4995
[8]  816.3483 1427.0190  423.8917

I'm quite a newbie, so sorry, but help would be most appreciated. I have R 2.13.1.

Roland
[R] random selection of elements from a matrix
Hi!

How can I make a random selection of n rows from a matrix? The matrix was previously created from a table with different rows and columns. I want to keep all the information (columns); I just want to reduce the number of observations.

Thanks, Ana

-- View this message in context: http://r.789695.n4.nabble.com/random-selection-of-elements-from-a-matrix-tp3668574p3668574.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] cbind in aggregate formula - based on an existing object (vector)
You may find it easier to use the data.frame method for aggregate instead of the formula method when you are using vectors of column names. E.g.,

responseVars <- c("mpg", "wt")
byVars <- c("cyl", "gear")
aggregate(mtcars[responseVars], by=mtcars[byVars], FUN=median)

gives the same result as

aggregate(cbind(mpg, wt) ~ cyl + gear, FUN=median, data=mtcars)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Re: [R] random selection of elements from a matrix
Oops... already found the solution:

   matrix2 <- matrix1[sample(samplenumber,replace=F),]
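For readers who want to try the idea, here is a minimal sketch of drawing random rows from a matrix with sample(). The input matrix and the number of rows to keep (`n_keep`) are made up for illustration and are not from the original post.

```r
set.seed(1)                         # only to make the sketch reproducible
matrix1 <- matrix(1:20, nrow = 5)   # hypothetical 5 x 4 input
n_keep  <- 3                        # hypothetical number of rows to draw

# sample(nrow(matrix1)) alone would permute all rows;
# size = n_keep draws a subset without replacement
rows    <- sample(nrow(matrix1), size = n_keep, replace = FALSE)
matrix2 <- matrix1[rows, , drop = FALSE]
print(matrix2)
```

The `drop = FALSE` keeps the result a matrix even when a single row is drawn.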
[R] Error: non-numeric argument to binary operator
Hi,

I am posting in the topic related to "non-numeric argument to binary operator" because I got a similar problem while running some netCDF code. I have attached the file to this post. It is climate data from the NOAA site. The code is as follows:

   library(survival)
   library(RNetCDF)
   library(ncdf)
   setwd("c:/projects/netcdfcsfiles")
   Conn42 = open.ncdf("128.111.220.111.46.15.32.42.nc");
   # read the time variable, which measures years, and
   # use the length of the vector to estimate the time span
   #
   timeObj = get.var.ncdf(Conn42,"time");
   file42YrRangeDays = trunc(length(timeObj))
   # Process each file separately:
   # get attributes, read the rhum data cube,
   # extract the time series from the cube,
   # and rescale the time series vector
   #
   scaleFact = att.get.ncdf(Conn42,"rhum","scale_factor")
   offset    = att.get.ncdf(Conn42,"rhum","add_offset")
   rhumObj   = get.var.ncdf(Conn42,"rhum");
   #
   # the Relative Humidity data 'cube' has 4 dimensions:
   # latitudes (4), longitudes (3), pressure levels (1),
   # and time (days - several thousand)
   #
   rhCube = array(rhumObj,dim = c(4,3,file42YrRangeDays))
   rh42Vec = (((rhCube[2,2,1:file42YrRangeDays]) * scaleFact) + offset)

Here is the point where I got the error:

   Error in (rhCube[2, 2, 1:file24YrRangeDays]) * scaleFact :
     non-numeric argument to binary operator

I would appreciate it if anyone could help me out.

Sincerely,
Anil Acharya
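One possible cause, offered as an assumption rather than a diagnosis of the poster's file: if att.get.ncdf() returns a list (with components along the lines of `hasatt` and `value`) rather than a bare number, then multiplying by that list raises exactly this error. The sketch below uses a hand-made list as a stand-in, so it runs without the ncdf package; the numbers are invented.

```r
# Hypothetical stand-ins for what att.get.ncdf() might return (assumption)
scaleFact <- list(hasatt = TRUE, value = 0.01)
offset    <- list(hasatt = TRUE, value = 302.65)
raw       <- c(-30000, -29500, -29000)   # made-up packed values

# Multiplying by the list itself fails; capture the message instead of stopping
err <- tryCatch(raw * scaleFact, error = function(e) conditionMessage(e))
print(err)

# Extracting the numeric component first avoids the error
rh <- raw * scaleFact$value + offset$value
print(rh)
```

Checking `str(scaleFact)` in the real session would show whether this applies.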
Re: [R] Very slow optim(): solved
> Date: Thu, 14 Jul 2011 12:44:18 -0800
> From: toshihide.hamaz...@alaska.gov
> To: r-h...@stat.math.ethz.ch
> Subject: Re: [R] Very slow optim(): solved
>
> After Googling and trial and error, the major cause of the slow optimization turned out to be not the functions but the data setup. Originally, I was using a data.frame for the likelihood calculation. Then I changed the data.frame to a vector and a matrix for the same likelihood calculation. Now convergence takes ~14 sec instead of 25 min. I certainly didn't know that this simple change makes such a huge computational difference.

Thanks, can you pass along any additional details, like the Google links you found, or comment on the resulting limitation (were you CPU limited converting data formats, or did this cause memory problems leading to VM thrashing)? I've often had C++ code that turned out to be IO limited when I expected I was doing really complicated computations; it never hurts to go beyond the usual suspects LOL.

> Toshihide "Hamachan" Hamazaki, 濱崎俊秀 PhD
> Alaska Department of Fish and Game: アラスカ州漁業野生動物課
> Division of Commercial Fisheries: 商業漁業部
> 333 Raspberry Rd. Anchorage, AK 99518
> Phone: (907)267-2158 Cell: (907)440-9934
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker
> Sent: Wednesday, July 13, 2011 12:21 PM
> To: r-h...@stat.math.ethz.ch
> Subject: Re: [R] Very slow optim()
>
> Hamazaki, Hamachan (DFG) <toshihide.hamazaki at alaska.gov> writes:
>> Dear list, I am using the optim() function to fit ~55 parameters by MLE, but it is very slow to converge (~25 min), whereas I can do the same in ~1 sec using ADMB, and ~10 sec using MS Excel Solver. Are there any tricks to speed it up? Are there better optimization functions?
>
> There's absolutely no way to tell without knowing more about your code. You might try method="CG":
>
>    Method 'CG' is a conjugate gradients method based on that by Fletcher and Reeves (1964) (but with the option of Polak-Ribiere or Beale-Sorenson updates). Conjugate gradient methods will generally be more fragile than the BFGS method, but as they do not store a matrix they may be successful in much larger optimization problems.
>
> If ADMB works better, why not use it? You can use the R2admb package (on R-Forge) to wrap your ADMB calls in R code, if you prefer that workflow.
>
> Ben
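To illustrate the change described in the "solved" message, here is a small sketch under stated assumptions: the data, the model (a Gaussian regression with sigma fixed at 1) and all names are invented, and no timing claim is made. It only shows the two ways of writing the objective, one reading from a data.frame on every call and one using a matrix and vector prepared once, and that both lead optim() to the same estimates.

```r
set.seed(42)                                   # made-up data for the sketch
n  <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- 1 + 2 * df$x1 - 0.5 * df$x2 + rnorm(n)

# Version 1: the objective reads columns from the data.frame on every call
nll_df <- function(beta) {
  mu <- beta[1] + beta[2] * df$x1 + beta[3] * df$x2
  0.5 * sum((df$y - mu)^2)   # Gaussian negative log-likelihood (sigma = 1), up to a constant
}

# Version 2: convert once, outside the objective, to a matrix and a vector
X <- cbind(1, as.matrix(df[c("x1", "x2")]))
y <- df$y
nll_mat <- function(beta) 0.5 * sum((y - X %*% beta)^2)

fit_df  <- optim(c(0, 0, 0), nll_df,  method = "BFGS")
fit_mat <- optim(c(0, 0, 0), nll_mat, method = "BFGS")
print(rbind(data.frame = fit_df$par, matrix = fit_mat$par))
```

Wrapping each optim() call in system.time() on real data would show whether the conversion matters for a given problem.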
[R] Correct behavior of Hmisc::capitalize()?
Hi,

from example(capitalize) of the Hmisc package (v 3.8-3) you get:

   capitalize(c("Hello", "bob", "daN"))
   [1] "Hello" "Bob"   "daN"

Is that "daN" correct? If so, the behavior indicated by the code, that only *all lowercase strings* will be capitalized, is not documented.

   Hmisc::capitalize
   function (string)
   {
       capped <- grep("^[^A-Z]*$", string, perl = TRUE)
       substr(string[capped], 1, 1) <- toupper(substr(string[capped], 1, 1))
       return(string)
   }
   <environment: namespace:Hmisc>

There are also some misspelled words in help(capitalize).

   sessionInfo()
   R version 2.13.1 Patched (2011-07-09 r56344)
   Platform: x86_64-pc-mingw32/x64 (64-bit)

   locale:
   [1] LC_COLLATE=English_United States.1252
   [2] LC_CTYPE=English_United States.1252
   [3] LC_MONETARY=English_United States.1252
   [4] LC_NUMERIC=C
   [5] LC_TIME=English_United States.1252

   attached base packages:
   [1] splines stats graphics grDevices utils datasets methods
   [8] base

   other attached packages:
   [1] Hmisc_3.8-3 survival_2.36-9

   loaded via a namespace (and not attached):
   [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-30 tools_2.13.1

/Henrik

(Hmisc maintainer cc:ed)
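The behavior can be reproduced without installing Hmisc. The sketch below copies the function body as printed in the post under a different name, and sets it next to a hypothetical alternative (`capitalize_first`, not part of Hmisc) that upper-cases the first character regardless of the rest of the string.

```r
# Copy of the function body as printed in the post, renamed to avoid masking Hmisc
capitalize_posted <- function(string) {
  capped <- grep("^[^A-Z]*$", string, perl = TRUE)   # indices of strings with no A-Z anywhere
  substr(string[capped], 1, 1) <- toupper(substr(string[capped], 1, 1))
  string
}

# Hypothetical alternative: always upper-case the first character
capitalize_first <- function(string) {
  substr(string, 1, 1) <- toupper(substr(string, 1, 1))
  string
}

x <- c("Hello", "bob", "daN")
print(capitalize_posted(x))
print(capitalize_first(x))
```

Because "daN" contains an uppercase letter, the regular expression skips it, which is why only "bob" changes in the posted version.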
Re: [R] cbind in aggregate formula - based on an existing object (vector)
Hi:

I think Bill's got the right idea for your problem, but for the fun of it, here's how Bert's suggestion would play out:

   # Kind of works, but only for the first variable in myvars...
   aggregate(get(myvars) ~ group + mydate, FUN = sum, data = example)
      group     mydate get(myvars)
   1 group1 2008-12-01           4
   2 group2 2008-12-01           6
   3 group1 2009-01-01          40
   4 group2 2009-01-01          60
   5 group1 2009-02-01         400
   6 group2 2009-02-01         600

   # Maybe sapply() with get as the function will work...
   aggregate(sapply(myvars, get) ~ group + mydate, FUN = sum, data = example)
      group     mydate myvars   get
   1 group1 2008-12-01      4   4.2
   2 group2 2008-12-01      6   6.2
   3 group1 2009-01-01     40  40.2
   4 group2 2009-01-01     60  60.2
   5 group1 2009-02-01    400 400.2
   6 group2 2009-02-01    600 600.2

Apart from the variable names, it matches example.agg1. OTOH, Bill's suggestion matches example.agg1 exactly and has an advantage in terms of code clarity:

   byVars <- c('group', 'mydate')
   aggregate(example[myvars], by = example[byVars], FUN = sum)
      group     mydate value1 value2
   1 group1 2008-12-01      4    4.2
   2 group2 2008-12-01      6    6.2
   3 group1 2009-01-01     40   40.2
   4 group2 2009-01-01     60   60.2
   5 group1 2009-02-01    400  400.2
   6 group2 2009-02-01    600  600.2

FWIW,
Dennis
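The data.frame method discussed in this thread can be run end to end with the example data from the original post. The sketch below rebuilds that data (the compact `rep()` call for `group` is a rewrite of the poster's four `rep()` calls, giving the same values) and aggregates using two character vectors of column names.

```r
# Rebuild the example data from the thread
mydate  <- rep(seq(as.Date("2008-12-01"), length.out = 3, by = "month"), 4)
example <- data.frame(
  mydate = mydate,
  value1 = c(1, 10, 100, 2, 20, 200, 3, 30, 300, 4, 40, 400),
  value2 = c(1.1, 10.1, 100.1, 2.1, 20.1, 200.1, 3.1, 30.1, 300.1, 4.1, 40.1, 400.1),
  group  = factor(rep(rep(c("group1", "group2"), each = 3), 2))
)

myvars <- c("value1", "value2")   # columns to aggregate
byVars <- c("group", "mydate")    # grouping columns

# Single-bracket indexing with a character vector keeps data.frame structure,
# so the column names carry through to the result
agg <- aggregate(example[myvars], by = example[byVars], FUN = sum)
print(agg)
```

Since both arguments are ordinary character vectors, the same call works unchanged when `myvars` holds many more names.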
[R] Cost-sensitive classification
Hi, everybody!

I want to perform a cost-sensitive classification using rpart as the base classifier. Is it possible?

Nissim
[R] (no subject)
Good Afternoon R Community,

I often work with very large databases and want to search for select cases by a particular word or numeric value. I created the following simple function to do just that. It searches a particular column for the phrase and returns a data frame with the rows that contain that phrase (for that column).

   Search <- function(term, dataframe, column.name, variation=.02,...){
     te <- substitute(term)
     te <- as.character(te)
     cn <- substitute(column.name)
     cn <- as.character(cn)
     HUNT <- agrep(te,dataframe[,cn],ignore.case =TRUE,max.distance=variation,...)
     dataframe[c(HUNT),]
   }

I would like to modify this to search all columns for the phrase, keep only the unique rows, and return a data frame of the rows in which any column contains the phrase (minus repeated rows). I assumed this would be an easy task using sapply() and unique() or union(). Because the function takes more than one argument (the column vector is not the only argument), I don't know how to set it up. Could someone tell me how to apply this function to multiple columns and return one data frame with all the agrep matches? (I'll figure out how to deal with duplicates after that; that's the easy part.)

Thank you in advance for your help,
Tyler Rinker

PS: if your idea is a for loop, please explain it well or provide the code, because I do not have a programming background and for loops are very difficult for me to wrap my head around.

Running Windows 7, R version 2.14.0 (beta)
[R] Splitting one column value into multiple rows
Hi,

I have the data in the following format:

   rent,100,1,common,674
   pipe,200,0,usual,864
   car,300,1,uncommon,392:jump,700,0,common,664
   car,200,1,uncommon,864:snap,900,1,usual,746
   stint,600,1,uncommon,257
   pull,800,0,usual,594

whereas I want the above 6 lines of data as 8 lines, as below (splitting rows 3 and 4 at ":" and sending the remainder to a new row):

   rent,100,1,common,674
   pipe,200,0,usual,864
   car,300,1,uncommon,392
   jump,700,0,common,664
   car,200,1,uncommon,864
   snap,900,1,usual,746
   stint,600,1,uncommon,257
   pull,800,0,usual,594

I request anyone who can to help me get this done.

Regards,
Madana
Re: [R] recursive function - finding connections
Sorry, bad example. My data is undirected. It's a correlation matrix, so it is probably better to look at something like:

   foomat <- cor(matrix(rnorm(100), ncol=10))
   foomat

Mine are p-values from the correlation, but it is the same idea.

On 14 Jul 2011, at 11:23, Erich Neuwirth wrote:
> cliques only works for undirected graphs. Your matrix is not symmetric, therefore the graph is directed.
>
> On 7/14/2011 8:53 AM, Benton, Paul wrote:
>> Dear all, I'm having some problems getting my recursive function to work. [...]
Re: [R] SQldf with sqlite and H2
Thanks a lot Gabor. It helped a lot. I appreciate your time and effort.

Thanks

--- On Thu, 7/14/11, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> From: Gabor Grothendieck ggrothendi...@gmail.com
> Subject: Re: [R] SQldf with sqlite and H2
> To: Mandans mandan...@yahoo.com
> Cc: r-help@r-project.org
> Date: Thursday, July 14, 2011, 2:22 PM
>
> On Thu, Jul 14, 2011 at 10:33 AM, Mandans mandan...@yahoo.com wrote:
>> SQldf with sqlite and H2
>>
>> I have a large csv file (about 2GB) and wanted to import the file into R and do some filtering and analysis. I came across sqldf (a great idea and product) and was trying to play around to see what would be the best method of doing this. The csv file is comma delimited, with some columns having a comma inside the quotation marks, like "John, Doe". I tried this first:
>>
>>    library(sqldf)
>>    sqldf("attach testdb as new")
>>    In.File <- "C:/JP/Temp/2008.csv"
>>    read.csv.sql(In.File, sql = "create table table1 as select * from file", dbname = "testdb")
>>
>> It errored out with the message:
>>
>>    NULL
>>    Warning message:
>>    closing unused connection 3 (C:/JP/Temp/2008.csv)
>>
>> When this failed, I converted the file from comma delimited to tab delimited and used this command:
>>
>>    read.csv.sql(In.File, sql = "create table table1 as select * from file", dbname = "testdb", sep = "\t")
>>
>> and this worked; it created the testdb sqlite file with a size of 3GB. Now my question is in 3 parts.
>>
>> 1. Is it possible to create a data frame with appropriate column classes and use those column classes when I use the read.csv.sql command to create the table? Something like, maybe, create the table from that DF and then update it with read.csv.sql? Any example code will be really helpful.
>
> Here is an example of using method = "name__class". Note there are two underscores in a row. It appears I neglected to document that Date2 means convert from character representation whereas Date means convert from numeric representation. It would also be possible to use method = "raw" and then coerce the columns yourself afterwards.
>
>    # create test file
>    Lines <- 'A__Date2|B
>    2000-01-01|x,y
>    2000-01-02|c,d
>    '
>    tf <- tempfile()
>    cat(Lines, file = tf)
>    library(sqldf)
>    DF <- read.csv.sql(tf, sep = "|", method = "name__class")
>    str(DF)
>
>> 2. If we use the H2 database instead of the default sqlite and use the CSVREAD option, will that be faster, and is there a way to apply the above idea of mapping DF classes to table column properties and update with CSVREAD?
>>
>>    library(RH2)
>>
>> something like
>>
>>    SELECT * FROM CSVREAD('C:/JP/Temp/2008.csv')
>>
>> Any example code will be really helpful.
>
> Sorry, I haven't tested the speed of this. postgresql and mysql, both supported by sqldf, also have built-in methods to read files. If I had to guess, I would guess that mysql would be fastest, but this would have to be tested.
>
>> 3. How do we specify where the H2 file is saved? I saw something like this when I ran this example from the RH2 package, but couldn't find the file in the working directory.
>>
>>    con <- dbConnect(H2(), "jdbc:h2:~/test", "sa", "")
>
> ~ means your home directory, so ~/test means test is in the home directory. Try
>
>    normalizePath("~")
>    normalizePath("~/test")
>
> etc. to see what they refer to.
>
>> Regards. Sorry for the long mail. I appreciate all of you for building a great community and for the wonderful software in R. Thanks to Gabor Grothendieck for bringing sqldf to this great community. Any help or direction you can provide on this is highly appreciated. Thanks all.
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
Re: [R] recursive function - finding connections
Hi Paul,

I assume you are using the argument cutoff to specify the p-value below which nodes are considered connected and above which they are not connected.

I would use single linkage hierarchical clustering. If you have two groups of nodes and any two nodes between the groups are connected (i.e. have adjacency = 1 or dissimilarity 0), then the groups have dissimilarity 0. If no two nodes between the two groups are connected, you will get dissimilarity 1. Thus you can use any tree cut height between 0 and 1 to get the clusters that correspond to "connected".

For large data you will need a large computer to hold your distance matrix, but you must have observed that already.

   subgraphs = function(mat, cut)
   {
     disconnected = mat>cut   # Change the inequality if necessary
     tree = hclust(as.dist(disconnected), method = "single")
     clusters = cutree(tree, h = 0.5)
     # clusters is already the answer, but you want it in a different
     # format, so we reformat it.
     nClusters = max(clusters)
     connectedList = list();
     for (c in 1:nClusters)
       connectedList[[c]] = which(clusters==c)
     connectedList
   }

Try it and see if this does what you want.

HTH
Peter

On Thu, Jul 14, 2011 at 4:12 PM, Benton, Paul hpaul.bento...@imperial.ac.uk wrote:
> Sorry, bad example. My data is undirected. It's a correlation matrix, so it is probably better to look at something like:
>
>    foomat <- cor(matrix(rnorm(100), ncol=10))
>    foomat
>
> Mine are p-values from the correlation, but it is the same idea.
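The single-linkage idea above can be tried on a small hand-made matrix. This is a sketch, not the poster's exact function: the argument is renamed to `cutoff`, the logical matrix is converted to 0/1 before as.dist() as a precaution, the loop is replaced by split(), and the toy p-value matrix `p` is invented so the expected components (nodes 1-3 and nodes 4-5) are known in advance.

```r
subgraphs <- function(mat, cutoff) {
  # 1 = not connected (p-value above cutoff), 0 = connected
  disconnected <- (mat > cutoff) * 1
  diag(disconnected) <- 0
  tree <- hclust(as.dist(disconnected), method = "single")
  # Any height strictly between 0 and 1 separates the connected components
  clusters <- cutree(tree, h = 0.5)
  unname(split(seq_along(clusters), clusters))
}

# Toy symmetric p-value matrix: 1-2 and 2-3 are linked, 4-5 are linked
p <- matrix(0.9, 5, 5)
diag(p) <- 0
p[1, 2] <- p[2, 1] <- 0.001
p[2, 3] <- p[3, 2] <- 0.002
p[4, 5] <- p[5, 4] <- 0.003

res <- subgraphs(p, cutoff = 0.01)
print(res)
```

Nodes 1 and 3 are not directly linked here, yet they end up in the same component through node 2, which is the property single linkage provides.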
Re: [R] recursive function - finding connections
One more thing - for large data sets, the packages flashClust and fastcluster provide much faster hierarchical clustering that (at least for flashClust, which I maintain) gives exactly the same results. Simply insert a

   library(flashClust)

before you call the function and your code will run much faster.

Peter

--
Sent from my Linux computer. Way better than iPad :)
[R] Font problem
Hi,

I'm running Red Hat Linux (I believe it is Fedora 13 or 14) with the latest version of R.

Everything works nicely, except for the fonts on some plots. I see small empty boxes instead of the numbers. It seems as if this is only the case when the fonts are small.

I've installed all of the font packages recommended in the R documentation.

Are there any steps I can take to diagnose the missing/problem fonts, and how can I correct this?

Thanks!

--
Noah Silverman
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095
Re: [R] (no subject)
On Jul 14, 2011, at 6:15 PM, Tyler Rinker wrote:
> Good Afternoon R Community, I often work with very large databases and want to search for select cases by a particular word or numeric value. I created the following simple function to do just that. It searches a particular column for the phrase and returns a data frame with the rows that contain that phrase (for that column).
>
>    Search <- function(term, dataframe, column.name, variation=.02,...){
>      te <- substitute(term)
>      te <- as.character(te)
>      cn <- substitute(column.name)
>      cn <- as.character(cn)
>      HUNT <- agrep(te,dataframe[,cn],ignore.case =TRUE,max.distance=variation,...)

     ### dataframe[c(HUNT),]
     HUNTL <- (1:NROW(dataframe) %in% HUNT)
   }

You would make life simpler by keeping your results as logical vectors the same length as your data frame. Then:

   logHunt <- sapply(dfrmname, Search, term=term, )
   indexL <- rowSums(logHunt) >= 1
   dfrmname[indexL, ]

Untested in the absence of test data.

-- David.

> I would like to modify this to search all columns for the phrase, keep only the unique rows, and return a data frame of the rows in which any column contains the phrase (minus repeated rows). [...]

David Winsemius, MD
West Hartford, CT
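The logical-vector approach suggested in the reply can be sketched as follows. This is an illustration under assumptions, not the poster's function: `search_all` and the toy data frame are made up, and the term is passed as an ordinary quoted string rather than through substitute().

```r
search_all <- function(term, dataframe, variation = 0.02, ...) {
  # One logical vector per column: TRUE where agrep() finds an approximate match
  hits <- sapply(dataframe, function(column) {
    seq_len(nrow(dataframe)) %in%
      agrep(term, as.character(column), ignore.case = TRUE,
            max.distance = variation, ...)
  })
  hits <- matrix(hits, nrow = nrow(dataframe))   # stay a matrix even for a 1-row input
  # A row is kept once if any of its columns matched, so no duplicates arise
  dataframe[rowSums(hits) >= 1, , drop = FALSE]
}

# Toy data, invented for the sketch
df <- data.frame(name = c("apple", "banana", "cherry"),
                 note = c("red", "yellow apple", "dark"),
                 stringsAsFactors = FALSE)
res <- search_all("apple", df)
print(res)
```

Combining the per-column results with rowSums() is what removes the need for unique() or union(): each row is selected at most once, however many columns match.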
Re: [R] Splitting one column value into multiple rows
On Jul 14, 2011, at 6:47 PM, Madana_Babu wrote:
> Hi, I have the data in the following format:
>
>    rent,100,1,common,674
>    pipe,200,0,usual,864
>    car,300,1,uncommon,392:jump,700,0,common,664
>    car,200,1,uncommon,864:snap,900,1,usual,746
>    stint,600,1,uncommon,257
>    pull,800,0,usual,594
>
> whereas I want the above 6 lines of data as 8 lines (splitting rows 3 and 4 at ":" and sending the remainder to a new row).

   Lines <- readLines(textConnection("rent,100,1,common,674
   pipe,200,0,usual,864
   car,300,1,uncommon,392:jump,700,0,common,664
   car,200,1,uncommon,864:snap,900,1,usual,746
   stint,600,1,uncommon,257
   pull,800,0,usual,594"))
   closeAllConnections()
   newlines <- strsplit(Lines, ":")
   newlines2 <- unlist(newlines)
   newlines2
   [1] "rent,100,1,common,674"    "pipe,200,0,usual,864"
   [3] "car,300,1,uncommon,392"   "jump,700,0,common,664"
   [5] "car,200,1,uncommon,864"   "snap,900,1,usual,746"
   [7] "stint,600,1,uncommon,257" "pull,800,0,usual,594"
   read.table(textConnection(newlines2), sep=",")
        V1  V2 V3       V4  V5
   1  rent 100  1   common 674
   2  pipe 200  0    usual 864
   3   car 300  1 uncommon 392
   4  jump 700  0   common 664
   5   car 200  1 uncommon 864
   6  snap 900  1    usual 746
   7 stint 600  1 uncommon 257
   8  pull 800  0    usual 594

David Winsemius, MD
West Hartford, CT
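The steps in the reply can be condensed into a short, self-contained sketch. It uses the `text =` argument of read.table() in place of explicit text connections, which is a substitution on my part; the data lines are the ones from the original question.

```r
Lines <- c("rent,100,1,common,674",
           "pipe,200,0,usual,864",
           "car,300,1,uncommon,392:jump,700,0,common,664",
           "car,200,1,uncommon,864:snap,900,1,usual,746",
           "stint,600,1,uncommon,257",
           "pull,800,0,usual,594")

# strsplit() returns one character vector per input line;
# unlist() flattens them so each record becomes its own element
records <- unlist(strsplit(Lines, ":", fixed = TRUE))

# Parse the flattened records as comma-separated rows
df <- read.table(text = records, sep = ",", stringsAsFactors = FALSE)
print(df)
```

With a file on disk, `Lines <- readLines("yourfile.txt")` (file name hypothetical) would replace the literal vector.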
Re: [R] Add coordinates at specific points...
To add to what David and Duncan wrote:

If you want to plot something at a point where the x coordinate is in user coordinates, but the y coordinate is something like the middle of the plot, or 1/5th of the way from the top, then you can use the grconvertY function along with the text function.

If you are going to have several labels and want to plot them so that they do not overlap, then there are functions in the plotrix and TeachingDemos packages that can be used to spread labels.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
Sent: Wednesday, July 13, 2011 6:11 PM
To: JIA Pei
Cc: r-help@r-project.org
Subject: Re: [R] Add coordinates at specific points...

On Jul 13, 2011, at 7:59 PM, JIA Pei wrote:
> Thanks David, for your prompt reply. OK... just avoid this mess. I'd love to emphasize the key point M(2.1, sin(2.1)), and just write the point's coordinates beside M. How to do that easily?

Have you worked through the examples on the help(text) page? This does not seem to be that difficult. Three essential arguments: x, y and text. You are supposed to put forth effort and show what code you have tried.

   text(2.1, sin(2.1), labels="M=(2.1, sin(2.1))")

You can play around with the positioning arguments, which IIRC are adj= and something else.

> By the way, if there is another point N(2.2, sin(2.2)), will M and N conflict with (overlap) each other?

I don't see how that could be avoided. (Last color wins.) We don't have 3D glasses for our screens yet. Transparent colors are available, but you don't seem to be ready for that yet.

> Cheers
> Pei
>
> On Wed, Jul 13, 2011 at 4:49 PM, David Winsemius dwinsem...@comcast.net wrote:
>> On Jul 13, 2011, at 7:42 PM, JIA Pei wrote:
>>> Hi, thanks David Winsemius: mtext works!! However, in the R plot, mtext will overlap/overwrite the existing coordinates, which makes the coordinates messy. Refer to http://www.visionopen.com/Rplot.png , which is produced by only 3 lines:
>>>
>>>    dev.new(width = 640, height = 480)
>>>    plot(sin, -pi, 2*pi)
>>>    mtext(text="2.1000", side = 1, line = 1, at = 2.1)
>>>
>>> In order to avoid the overlapping coordinates, can I change the position of "2.1000" from outside the box to inside the box? If possible, how? I tried outer = FALSE, but nothing special happened!!
>>
>> You should use text() rather than mtext() if you are going to plot within the user area.
>>
>>> cheers
>>> Pei
>>>
>>> On Wed, Jul 13, 2011 at 1:41 PM, David Winsemius dwinsem...@comcast.net wrote:
>>>> On Jul 13, 2011, at 1:22 PM, JIA Pei wrote:
>>>>> Hi, all: I used two lines of very simple code to draw a sin curve.
>>>>>
>>>>>    dev.new(width = 640, height = 480)
>>>>>    plot(sin, -pi, 2*pi)
>>>>
>>>> First look at:
>>>>
>>>>    ?mtext
>>>>    # then try
>>>>    mtext(text="2.5000", side = 1, line = 1, at = 2.5)
>>>>
>>>>> Now, I added a specific line (the red line in the picture at http://www.visionopen.com/Rplot.png) by using abline. However, I would still love to add the X-coordinate 2.5 outside the rectangular box. How to do it in R? Right now, I'm using Windows Paint to add the characters "2.5" to the plot drawn by R.
>>>>
>>>> --
>>>> David Winsemius, MD
>>>> West Hartford, CT
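The suggestions in this thread (text() for labels inside the plot region, grconvertY() for a y position given as a fraction of the plot height) can be combined in a short sketch. The label strings and the choice of `pos` values are illustrative assumptions; `pdf(NULL)` is used only so the code runs without opening a window.

```r
pdf(NULL)                       # off-screen device; no file is written
plot(sin, -pi, 2 * pi)
abline(v = 2.1, col = "red")

# Mark the point M and label it inside the plot region;
# pos = 4 places the label to the right of the coordinates
points(2.1, sin(2.1), pch = 19)
text(2.1, sin(2.1), labels = "M = (2.1, sin(2.1))", pos = 4)

# x in user coordinates, y at the vertical middle of the plot region:
# convert 0.5 in normalized plot coordinates ("npc") to user coordinates
y_mid <- grconvertY(0.5, from = "npc", to = "user")
text(2.1, y_mid, labels = "2.1", pos = 2)
invisible(dev.off())
```

Changing 0.5 to 0.8 would place the second label one fifth of the way down from the top of the plot region.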
[R] validate survival with val.surv
Dear R users: I want to externally validate a model with val.surv. Can I use only the calculated survival (at 1 year) and the actual survival? Or do I need the survival function and the actual survival? Thanks *Yao Zhu* *Department of Urology Fudan University Shanghai Cancer Center Shanghai, China*
Re: [R] validate survival with val.surv
The documentation for val.surv tried to be clear about that. Note that val.surv is only for the case where the predictions were developed on a separate dataset, i.e., that the validation is truly 'external'. Frank

yz wrote: Dear R users: I want to externally validate a model with val.surv. Can I use only calculated survival (at 1 year) and actual survival? Or I needed the survival function and actual survival. Thanks *Yao Zhu* *Department of Urology Fudan University Shanghai Cancer Center Shanghai, China*

- Frank Harrell Department of Biostatistics, Vanderbilt University
Re: [R] computing functions with Euler's number (e^n)
R 2.11.1 on Mac OS X. I didn't see the Note.
[R] Non-negative least squares for sparse matrix
Hello, I am attempting to solve the least squares problem Ax = b in R, where A and b are known and x is unknown. It is simple to solve for x using one of a variety of methods outlined here: http://cran.r-project.org/web/packages/Matrix/vignettes/Comparisons.pdf As far as I can tell, none of these methods will solve for x when A, x, and b are constrained to be non-negative (x >= 0). Other packages, such as nnls, can solve the non-negative least squares problem, but do not work with very large sparse matrices. The matrix A that I am using is 750,000 by 46,000 elements with 99% zeros, and b is a dense 750,000 by 1 matrix. Does an R function exist for solving the non-negative least squares problem with a sparse matrix? Thanks! Erik

sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] nnls_1.3 Matrix_0.999375-50 MASS_7.3-12 [4] lattice_0.19-23
loaded via a namespace (and not attached): [1] grid_2.13.0 tools_2.13.0
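One generic way to approach the problem as stated is projected gradient descent, which needs only sparse matrix-vector products. The sketch below is not a function from nnls or Matrix; nnls_pg, its iteration counts, and the small random test problem are all hypothetical choices made for illustration, and no claim is made about how it performs on a 750,000 by 46,000 matrix.

```r
library(Matrix)

# Projected-gradient sketch for: minimise ||A x - b||^2 subject to x >= 0.
# Only products A %*% v and t(A) %*% w are used, so A can stay sparse.
nnls_pg <- function(A, b, iters = 500, power_iters = 50) {
  n <- ncol(A)

  # Estimate the largest eigenvalue of t(A) %*% A by power iteration;
  # it bounds how large a gradient step can safely be.
  v <- rep(1 / sqrt(n), n)
  lip <- 1
  for (k in seq_len(power_iters)) {
    w <- as.numeric(crossprod(A, as.numeric(A %*% v)))
    lip <- sqrt(sum(w^2))
    v <- w / lip
  }
  step <- 1 / (1.1 * lip)  # small safety margin on the step size

  x <- numeric(n)
  for (k in seq_len(iters)) {
    grad <- as.numeric(crossprod(A, as.numeric(A %*% x) - b))
    x <- pmax(0, x - step * grad)  # gradient step, then clip to x >= 0
  }
  x
}

# Small made-up test problem with a known non-negative solution.
set.seed(1)
A <- abs(rsparsematrix(200, 20, density = 0.05))
x_true <- c(rep(0, 10), runif(10))
b <- as.numeric(A %*% x_true)
x_hat <- nnls_pg(A, b)
```

Plain projected gradient can converge slowly on ill-conditioned problems, so the iteration count here should be treated as a placeholder to tune rather than a recommendation.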
[R] how to order each element according to alphabet
Hi there, I have a large amino acid csv file like this:

input.txt:
P,LV,Q,Z
P,VL,Q,Z
P,ML,QL,Z

There is a problem with this file, since LV and VL are in fact the same thing. How do I order each element according to alphabetical order so that the desired output would look like:

output.txt:
P,LV,Q,Z
P,LV,Q,Z
P,LM,LQ,Z
[R] Export Unicode characters from R
Dear helpers, I am not able to export Unicode characters from R. Below is an example where the Unicode character is correctly rendered as long as I stay within R. When I export it, the character appears only with its basic code, and the same happens when I import it back into R. I'm using R 2.13.1 in Windows XP.

funny.g <- "\u1E21"
funny.g
[1] "ḡ"
data.frame (funny.g) -> funny.g
funny.g$funny.g
[1] ḡ
Levels: <U+1E21>
write.table (funny.g, file = "C:/~funny.g.txt", col.names = FALSE, row.names = FALSE, quote = FALSE, fileEncoding = "UTF-8")
read.table ("C:/~funny.g.txt", header = FALSE, encoding = "UTF-8") -> input.funny.g
input.funny.g$V1
[1] <U+1E21>
Levels: <U+1E21>

Best Sverre
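A commonly used workaround for this kind of problem is to keep the column as character (not factor), convert it explicitly to UTF-8, and write the raw bytes through a connection. The sketch below assumes that approach; the temporary file path and the object names df, path and back are hypothetical, and it has not been checked on R 2.13.1 under Windows XP specifically.

```r
# Sketch: round-trip a non-ASCII character through a UTF-8 text file.
funny.g <- "\u1E21"

# Keep the column as character; factor levels are what get shown as <U+1E21>.
df <- data.frame(funny.g = funny.g, stringsAsFactors = FALSE)

path <- tempfile(fileext = ".txt")  # hypothetical output location

# Write the UTF-8 bytes as they are, without re-encoding to the locale.
con <- file(path, open = "wb")
writeLines(enc2utf8(df$funny.g), con, useBytes = TRUE)
close(con)

# Read back, declaring the strings to be UTF-8.
back <- readLines(path, encoding = "UTF-8")

# TRUE when the code point survived the round trip.
utf8ToInt(back) == 0x1E21
```

Comparing code points with utf8ToInt sidesteps the question of whether the console itself can display the character.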
[R] Calculate Az (A sub z) with R?
I am looking for (or interested in writing) a function that calculates Az, an alternative measure of discriminability from SDT (alternative to d', A'). I have written my own functions for d', A', Bd, and am aware of the 'sdtalt' package, but I have yet to find a way to calculate Az, since it requires the phi operator. For a relevant paper/discussion (and formula), please see Verde and Macmillan, 2006 (Measures of sensitivity based on a single hit rate and false alarm rate: The accuracy, precision, and robustness of d', Az, and A'). Any help on this would be greatly appreciated! David Dobolyi Graduate Student Cognitive Psychology University of Virginia
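In R the phi operator (the standard normal cumulative distribution function) is pnorm, and its inverse is qnorm. Under the equal-variance Gaussian model, Az from a single hit rate and false alarm rate is usually written as Az = Phi(d'/sqrt(2)) with d' = z(H) - z(F). The sketch below assumes that definition; the function name az and the example rates are made up for illustration.

```r
# Sketch: Az from one hit rate and one false-alarm rate,
# assuming the equal-variance Gaussian model: Az = Phi(d' / sqrt(2)).
az <- function(hit, fa) {
  d_prime <- qnorm(hit) - qnorm(fa)  # z(H) - z(F)
  pnorm(d_prime / sqrt(2))           # Phi is pnorm in R
}

az(0.8, 0.2)           # a single observer
az(c(0.6, 0.9), 0.25)  # vectorised over hit rates
```

Rates of exactly 0 or 1 give infinite z-scores, so they would need whatever correction is already applied in the existing d' function.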
Re: [R] how to order each element according to alphabet
On Jul 14, 2011, at 9:18 PM, onthetopo wrote: Hi there, I have a large amino acid csv file like this: input.txt: P,LV,Q,Z P,VL,Q,Z P,ML,QL,Z

Are you also asking how to read a comma separated file?

?read.csv # and read more introductory material

There is a problem with this file, since LV and VL are in fact the same thing. How do I order each element according to alphabetical order so that the desired output would look like: output.txt: P,LV,Q,Z P,LV,Q,Z P,LM,LQ,Z

That is not a reproducible example without input code. Perhaps:

as.data.frame(lapply(input_dfrm, gsub, patt="LV", repl="VL"))

-- David Winsemius, MD West Hartford, CT
Re: [R] how to order each element according to alphabet
Hi, There are many more patterns than VL to LV. In fact, too many to be listed manually. For example ML should be ordered as LM, QL should be ordered as LQ. The order is according to the alphabet.
Re: [R] how to order each element according to alphabet
On Jul 14, 2011, at 11:19 PM, onthetopo wrote: Hi, There are many more patterns than VL to LV. In fact, too many to be listed manually. For example ML should be ordered as LM, QL should be ordered as LQ. The order is according to the alphabet.

A more complete (reproducible) answer would have been appreciated, and note that local custom dictates that context is offered for ongoing threads. Nabble provides a mechanism for doing so.

lets2 <- paste(LETTERS[sample(20, replace=TRUE)], LETTERS[sample(20, replace=TRUE)], sep="")
lets2
[1] "IA" "EP" "TE" "IT" "PS" "DO" "RO" "EJ" "DR" "DD" "LM" "OF" "RJ" "OA" "JD" "QB" "AS" "TG" "MK" "IM"
sapply( lapply( strsplit(lets2, split=""), sort), paste, collapse="")
[1] "AI" "EP" "ET" "IT" "PS" "DO" "OR" "EJ" "DR" "DD" "LM" "FO" "JR" "AO" "DJ" "BQ" "AS" "GT" "KM" "IM"

-- View this message in context: http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3669130.html Sent from the R help mailing list archive at Nabble.com.

Nabble is NOT rhelp. So PLEASE, PLEASE, PLEASE: do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

-- David Winsemius, MD West Hartford, CT
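The strsplit/sort/paste idea above also carries over to a matrix or data frame of strings if the dimensions are restored afterwards. The following is a sketch under that assumption; sort_letters is a hypothetical helper, and the matrix m is typed in from the sample file in the original question rather than read from disk.

```r
# Sketch: sort the letters inside every cell of a character matrix.
sort_letters <- function(x) {
  x <- as.matrix(x)
  sorted <- vapply(
    strsplit(as.character(x), split = ""),        # one vector of letters per cell
    function(ch) paste(sort(ch), collapse = ""),  # reorder and glue back together
    character(1)
  )
  dim(sorted) <- dim(x)            # restore the original shape
  dimnames(sorted) <- dimnames(x)
  sorted
}

# The three example rows from the question, entered by hand.
m <- matrix(c("P", "LV", "Q",  "Z",
              "P", "VL", "Q",  "Z",
              "P", "ML", "QL", "Z"),
            nrow = 3, byrow = TRUE)

sort_letters(m)
```

For a file on disk, the result of read.csv(..., header = FALSE, stringsAsFactors = FALSE) could be passed in the same way, since as.matrix() handles a data frame of character columns.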
[R] Bar chart in ascending order for each level of X
Hello List, The question is how to plot a bar chart in which bars are sorted in ascending order for each level of X. I would appreciate receiving your advice and help. Thanks, Pradip Muhuri

The following code works when producing the chart in which bars are NOT sorted. Please see the output.

Data File
5.1	8.7	1.6
3.7	7.4	2.8
10.4	12.0	3.5
4.4	8.8	1.7
2.0	3.5	0.7
6.7	11.0	3.1
5.3	6.7	1.8

#source("C:/Documents and Settings/pradip.muhuri/My Documents/disorders_chart1.R") - Please ignore this line
# R script for bar chart begins here
# Read drug data from tab-delimited data set
drug_data <- read.table("C:/Documents and Settings/pradip.muhuri/My Documents/xdrug.dat", header=FALSE, col.names=c("Age_1217", "Age_1825", "Age_26Plus"), row.names = c("White","Black","Native American/Alaska Native","Hawaiian/OPI","Asian", "More than One Race", "Hispanic"), sep="\t")
# Graph drug use disorder data with adjacent bars using rainbow colors
barplot(as.matrix(drug_data), main="Past-Year Illicit Drug Use Disorders by Race/Ethnicity", ylab= "Past-Year Use Disorder Rate (%)", beside=TRUE, col=rainbow(7))
legend("topright", c("White","Black","Native American/Alaska Native","Hawaiian/OPI","Asian", "More than One Race", "Hispanic"), cex=0.6, bty="n", fill=rainbow(7));
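One possible way to get ascending bars within each age group is to sort every column separately and reorder the colors to match, so each race/ethnicity keeps its own color across groups. This is only a sketch: the data are typed in from the post on the assumption that the run-together values are 10.4, 12.0, 3.5 and 6.7, 11.0, 3.1, and the names pal, sorted_vals and sorted_cols are made up here.

```r
# Data from the post, entered directly instead of read from xdrug.dat.
drug_data <- matrix(
  c( 5.1,  8.7, 1.6,
     3.7,  7.4, 2.8,
    10.4, 12.0, 3.5,
     4.4,  8.8, 1.7,
     2.0,  3.5, 0.7,
     6.7, 11.0, 3.1,
     5.3,  6.7, 1.8),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("White", "Black", "Native American/Alaska Native", "Hawaiian/OPI",
      "Asian", "More than One Race", "Hispanic"),
    c("Age_1217", "Age_1825", "Age_26Plus")))

pal <- rainbow(nrow(drug_data))  # one fixed color per race/ethnicity

# Sort each age group (column) on its own, and carry the colors along.
sorted_vals <- apply(drug_data, 2, function(v) sort(unname(v)))
sorted_cols <- apply(drug_data, 2, function(v) pal[order(v)])

barplot(sorted_vals, beside = TRUE, col = as.vector(sorted_cols),
        main = "Past-Year Illicit Drug Use Disorders by Race/Ethnicity",
        ylab = "Past-Year Use Disorder Rate (%)")
legend("topright", legend = rownames(drug_data), fill = pal,
       cex = 0.6, bty = "n")
```

Because the bar order now differs between groups, the legend colors are the only way to identify a bar, which is why the palette is fixed before sorting.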
Re: [R] how to order each element according to alphabet
Thank you very much for your reply, doctor. I tried to apply your command to my table but couldn't. Would you please enlighten me on how to do this when 'lets2' is a 4X4 matrix, for example?
Re: [R] how to order each element according to alphabet
On Jul 14, 2011, at 11:56 PM, onthetopo wrote: Thank you very much for your reply doctor. I tried to apply your command to my table but couldn't Would you please enlighten me on how to do this when 'lets2' is a 4X4 matrix for example.

The message doesn't seem to be getting through. Let's see if higher volume works:

** PLEASE do read the posting guide http://www.R-project.org/posting-guide.html PLEASE do read the posting guide http://www.R-project.org/posting-guide.html ** AND * and provide commented, minimal, self-contained, reproducible code. and provide commented, minimal, self-contained, reproducible code. *

We have yet to see this 4X4 matrix that you are suggesting as an example.

David Winsemius, MD West Hartford, CT