Re: [R] Merging two files together in R
Try looking at ?merge

If your data are in two data frames, df1 and df2:

    merge(df1, df2)

(This will merge on SNPID because that column is common to both data frames.)

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Morassa Mohseni
Sent: 24 August 2007 15:41
To: r-help@stat.math.ethz.ch
Subject: [R] Merging two files together in R

Hi,

Thanks in advance for reading this post. I received some Affymetrix genotyping data back recently (250K, Nsp array). However, in order to do any analysis on this data set, I need to append the annotation file to it. Basically I want to do something that looks like this:

Snpfile (tab delimited):

    SNPID  Genotype  X     Y
    123    AA        13.4  1.2
    456    AB        10.1  12.2
    789    BB        2.7   14.4

Annotation file (csv file):

    rs#, SNPID, Chromosome
    rs23525, 456, 12
    rs78423, 123, 4
    rs82342, 789, 9

What I am trying to get is an output file that looks like this:

    SNPID  rs#      Chromosome  Genotype  X     Y
    123    rs78423  4           AA        13.4  1.2
    456    rs23525  12          AB        10.1  12.2
    789    rs82342  9           BB        2.7   14.4

The SNPID is the same in both files, so I would like to use that to match them up. But the rows are not in the same order in both files, so I want to make sure that I am appending and merging the two files correctly.

So far all I have really been able to do is import the files into R. I have been looking through the posts and was wondering if I could use cbind(...) to merge the files? Not sure though.

Thanks again!!

Morassa Mohseni
PhD Student
Johns Hopkins Dept. of Human Genetics
Baltimore, MD

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
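For the merge question above, a minimal sketch with toy copies of the two files. The data frame names snp and annot, and the column name rs (standing in for rs#, which read.csv would rename anyway), are assumptions for illustration. merge() matches rows by SNPID value, so row order does not matter; cbind() would only paste the rows side by side in whatever order they arrive.

```r
# Toy copies of the two files from the question
snp <- data.frame(SNPID = c(123, 456, 789),
                  Genotype = c("AA", "AB", "BB"),
                  X = c(13.4, 10.1, 2.7),
                  Y = c(1.2, 12.2, 14.4))
annot <- data.frame(rs = c("rs23525", "rs78423", "rs82342"),
                    SNPID = c(456, 123, 789),
                    Chromosome = c(12, 4, 9))

# Match rows on SNPID regardless of row order
merged <- merge(annot, snp, by = "SNPID")

# Put the columns in the order asked for
merged <- merged[, c("SNPID", "rs", "Chromosome", "Genotype", "X", "Y")]
print(merged)
```

With real files, df1 and df2 would come from read.delim() and read.csv() respectively, and write.table() would write the result back out.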
Re: [R] problem with reading data files with different numbers oflines to skip
Hi Tom

It looks as if you are reading in GenePix files. I believe the format of the start lines includes a second line that says how many header lines to skip. Something like this, specifying 27 lines to skip:

    ATF 1
    27 43
    Type=GenePix Results 1.4
    DateTime=2003/11/14 17:18:30

If so, here is a function I use to do what you want to do. If your files have a different format then you need to modify how you set the number of lines to skip.

    # Preprocess the GenePix files: strip off the first header lines
    dopix <- function(genepixfiles, workingdir) {
        pre <- "Pre"
        # Read in each GenePix file, strip unwanted rows and write out again
        for (pixfile in genepixfiles) {
            pixfileout <- paste(workingdir, pre, basename(pixfile), sep="")
            # Second line of the file: first field is the header line count
            secondline <- read.table(pixfile, skip=1, nrows=1)
            # Add 2 for the two ATF lines that precede the counted header lines
            skiplines <- as.numeric(secondline[1, 1]) + 2
            outdf <- read.table(pixfile, header=TRUE, skip=skiplines, sep="\t")
            write.table(outdf, file=pixfileout, sep="\t", row.names=FALSE)
        }
    }

Regards

John Seers

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Cohen
Sent: 03 August 2007 13:04
To: r-help@stat.math.ethz.ch
Subject: Re: [R] problem with reading data files with different numbers oflines to skip

Thanks to Ted and Gabor for your response. I apologize for not being clear in my previous description of the problem. I tried your suggestions using readLines but couldn't make them work. I will now explain the problem in more detail and hope that you can help me out.

I have 30 data files. Some of them have 33 lines that I want to skip and the rest have 31 (examples below with bold text). That is, I only want to keep the lines starting from

    Block Column Row Name ID

I read in the data files with a loop like the one below. The problem is: how do I tell the loop to skip 31 lines in some data files and 33 in the rest?
    for (i in 1:num.files) {
        a <- read.table(file=data[i], header=T, skip=31, sep='\t', na.strings="NA")
    }

Thanks for your help,
Tom

# 33 lines to skip

    Type=GenePix Results 3
    DateTime=2006/10/20 13:35:11
    Settings=
    GalFile=G:\Avdelningar\viv\translational immunologi\Peptide-arrays\Gal-files\742-human-pep2.gal
    PixelSize=10
    Wavelengths=635
    ImageFiles=M:\Peptidearrays\061020\742-2.tif 1
    NormalizationMethod=None
    NormalizationFactors=1
    JpegImage=C:\Documents and Settings\Shahnaz Mahdavifar\Skrivbord\Human pep,742\742-2.s1.jpg
    StdDev=Type 1
    RatioFormulations=W1/W2 (635/)
    FeatureType=Circular
    Barcode=
    BackgroundSubtraction=LocalFeature
    ImageOrigin=560, 1360
    JpegOrigin=1940, 3670
    Creator=GenePix Pro 6.0.1.25
    Scanner=GenePix 4000B [84948]
    FocusPosition=0
    Temperature=30.2
    LinesAveraged=1
    Comment=
    PMTGain=600
    ScanPower=100
    LaserPower=3.36
    Filters=Empty
    ScanRegion=56,136,2123,6532
    Supplier=Genetix Ltd.
    ArrayerSoftwareName=MicroArraying
    ArrayerSoftwareVersion=QSoft XP Build 6450 (Revision 131)
    Block Column Row Name ID X Y Dia. F635 Median F635 Mean
    1 1 1 IgG-human none 2390 4140 200 301 317
    1 2 1 PGDR_HUMAN (P09619) AHASDEIYEIMQK 2630 4140 200 254 250
    1 3 1 ML1X_HUMAN (Q13585) AIAHPVSDDSDLP 2860 4140 200 268 252
    1000 more rows

# 31 lines to skip

    ATF 1.0
    29 41
    Type=GenePix Results 3
    DateTime=2006/10/20 13:05:20
    Settings=
    GalFile=G:\Avdelningar\viv\translational immunologi\Peptide-arrays\Gal-files\742-s2.gal
    PixelSize=10
    Wavelengths=635
    ImageFiles=M:\Peptidearrays\061020\742-4.tif 1
    NormalizationMethod=None
    NormalizationFactors=1
    JpegImage=C:\Documents and Settings\Shahnaz Mahdavifar\Skrivbord\Human pep,742\742-4.s2.jpg
    StdDev=Type 1
    RatioFormulations=W1/W2 (635/)
    FeatureType=Circular
    Barcode=
    BackgroundSubtraction=LocalFeature
    ImageOrigin=560, 1360
    JpegOrigin=1950, 24310
    Creator=GenePix Pro 6.0.1.25
    Scanner=GenePix 4000B [84948]
    FocusPosition=0
    Temperature=28.49
    LinesAveraged=1
    Comment=
    PMTGain=600
    ScanPower=100
    LaserPower=3.32
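An alternative to counting header lines, sketched here under the assumption that the data always start at the line beginning with "Block" (as in the question): find that line with readLines() and grep(), then skip everything before it. The function name read_after_header and the small demo file are made up for illustration; the demo file is not real GenePix output.

```r
# Read a file whose data start at the line beginning with "Block",
# however many header lines come before it.
read_after_header <- function(path, marker = "^Block") {
    lines <- readLines(path)
    hdr <- grep(marker, lines)[1]          # first line matching the marker
    if (is.na(hdr)) stop("no header line found in ", path)
    read.table(path, header = TRUE, skip = hdr - 1, sep = "\t",
               quote = "", comment.char = "", check.names = FALSE)
}

# Small self-made example file (not real GenePix output)
tmp <- tempfile(fileext = ".gpr")
writeLines(c("ATF\t1.0", "2\t3", "Type=Demo", "Creator=Demo",
             "Block\tColumn\tName", "1\t1\tIgG", "1\t2\tPGDR"), tmp)
dat <- read_after_header(tmp)
print(dat)
```

In the loop from the question this would replace the fixed skip=31, for example a <- read_after_header(data[i]), so files with 31 and 33 header lines are handled the same way.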
Re: [R] Finding matches in 2 files
Something like:

    # Sample data
    g1 <- c("gene1", "gene2", "gene3", "gene4", "gene5", "gene9", "gene10", "geneA")
    g2 <- c("gene6", "gene9", "gene1", "gene2", "gene7", "gene8", "gene9", "gene1", "gene10")
    # data.frame() rather than cbind(), so that expr stays numeric
    df1 <- data.frame(gene=g1, expr=runif(length(g1)))
    df2 <- data.frame(gene=g2, expr=runif(length(g2)))
    # Merge
    mdf <- merge(df1, df2, by="gene", sort=TRUE)
    # Unique list
    ug <- unique(mdf[,"gene"])

You may find the match command useful and/or the %in% operator.

JS

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of jenny tan
Sent: 26 July 2007 04:35
To: r-help@stat.math.ethz.ch
Subject: [R] Finding matches in 2 files

I have 2 files containing data analysed by 2 different methods. I would like to find out which genes appear in both analyses. Can someone show me how to do this?
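If only the gene names are needed, and not the merged expression values, the set functions do the job directly. A small sketch with made-up gene names:

```r
g1 <- c("gene1", "gene2", "gene3", "gene9", "gene10")
g2 <- c("gene6", "gene9", "gene1", "gene2", "gene9", "gene10")

common  <- intersect(g1, g2)       # unique names present in both
in_both <- g1[g1 %in% g2]          # same idea, keeps the order of g1
print(common)
```

intersect() drops duplicates; the %in% form keeps every matching element of g1, which matters if a gene can appear more than once.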
Re: [R] Previously saved workspace restored
Hi

If you enter the command ls() you will see a list of the names that came with the .Rdata file you double-clicked. If you enter one of these names at the command prompt you will see the data. So, for example, if you have some data called mydata:

    > ls()
    [1] "mydata" "repos"
    > mydata
         [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9

Regards

John

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kristi Glover
Sent: 11 July 2007 05:18
To: r-help@stat.math.ethz.ch
Subject: [R] Previously saved workspace restored

Hi there,

I am a beginner with R. Someone has sent me a file (the extension is .Rdata). I have installed R on my computer and I just double-clicked the file. It automatically opened the R programme and displayed the following message:

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    [Previously saved workspace restored]

But how can I see the data (table) which is saved (in R format) in R? I hope you will help me.

Kristi Glover
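The same inspection can be done without double-clicking, by calling load() on the file. The sketch below first creates a small .RData file to stand in for the one that was sent (the object name mydata is only an example), then loads it into a separate environment so that nothing already in the session is overwritten.

```r
# Make a small .RData file to stand in for the one that was sent
mydata <- matrix(1:9, nrow = 3)
f <- tempfile(fileext = ".RData")
save(mydata, file = f)
rm(mydata)

# Load into a separate environment so nothing in the session is overwritten
e <- new.env()
loaded <- load(f, envir = e)   # load() returns the names it restored
print(loaded)
print(e$mydata)
```

str(e$mydata) or head(e$mydata) are useful next steps when the object is large.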
Re: [R] Loop and cbind
Hi

In what way does it not work?

My guess is that you have not created your objects before the for loop. R cannot assign to ewma[i] if ewma does not exist yet, so you need to create them first. Since each result is a whole series rather than a single number, a list is the safer container, filled with [[i]]:

    ewma <- vector("list", 12)
    standard <- vector("list", 12)
    for ... {
    }

John Seers

---

Hi,

I would like to apply the following function for i between 1 and 12, and then construct a list of the return series.

    for (i in 1:12) {
        ewma[i] <- emaTA(calm[[i]]^2, 0.03)
        standard[i] <- calm[[i]]/sqrt(ewma[i])
        standard <- cbind(standard[i])
    }

But it does not work. Could anyone give me some advice on how I can achieve this?

Many thanks

--
View this message in context: http://www.nabble.com/Loop-and-cbind-tf4024291.html#a11430500
Sent from the R help mailing list archive at Nabble.com.
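A sketch of the loop-free version of the question above, using lapply() and then binding the results as columns. emaTA() is not used here; ewma_stub() is a hypothetical stand-in written only so the example runs, and calm is made-up data, so treat this as an outline rather than the poster's actual calculation.

```r
# Stand-in for emaTA(): a simple exponentially weighted moving average.
# (Hypothetical helper, written here only so the example runs.)
ewma_stub <- function(x, lambda) {
    out <- numeric(length(x))
    out[1] <- x[1]
    for (t in seq_along(x)[-1]) {
        out[t] <- lambda * x[t] + (1 - lambda) * out[t - 1]
    }
    out
}

set.seed(1)
calm <- replicate(12, rnorm(50), simplify = FALSE)   # made-up list of 12 series

# One standardised series per element of calm, then bind them as columns
standard <- lapply(calm, function(x) x / sqrt(ewma_stub(x^2, 0.03)))
standard <- do.call(cbind, standard)
print(dim(standard))
```

Keeping the intermediate result as a list avoids the problem in the original loop, where standard was overwritten by cbind() on every pass.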
Re: [R] Repeat if
Hi

I think a for loop would be more what you want. Something along the lines of:

    V <- list(a=c(1,2,3), b=c(2,3,4))   # list of 2 vectors
    for (i in 1:2) {                    # 2 vectors (replace with 85 ...)
        print(range(V[[i]], na.rm = TRUE))
    }

Regards

JS

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Birgit Lemcke
Sent: 28 June 2007 10:48
To: R Hilfe
Subject: [R] Repeat if

Hello,

(Power Book G4, Mac OS X, R 2.5.0)

I would like to repeat the function range for 85 vectors (V1-V85). I tried with this code:

    i <- 0
    repeat {
    + i <- i+1
    + if (i<85) next
    + range (Vi, na.rm = TRUE)
    + if (i==85) break
    + }
Re: [R] Repeat if
Hi Birgit

No, you do not have to write all 85 vectors in the first line. I just did not fully appreciate what you were trying to do.

You could use the get option, as was suggested elsewhere. So, if your vectors are V1 to V2 (V85 in your case), say, something like:

    V1 <- c(1,2,3)
    V2 <- c(5,2,7)
    ...
    V <- paste("V", 1:2, sep="")
    for (i in 1:length(V)) {
        print(range(get(V[i]), na.rm = TRUE))
    }

Regards

JS

---

Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Birgit Lemcke
Sent: 28 June 2007 15:12
To: john seers (IFR)
Cc: R Hilfe
Subject: Re: [R] Repeat if

Hello John,

I tried this code, but I got only the ranges of V1 and V2, which is easily understandable. Do I have to write all 85 vectors in the first line?

    V <- list(a=c(V1), b=c(V2))
    for ( i in 1:85 ) { # 2 vectors (replace with 85 ...)
    + print(range (V[i], na.rm = TRUE))
    + }

    sapply(1:85, function(i) eval(parse(text=paste(range(V, i, , na.rm=T), sep=

But thanks anyway.

Greetings

Birgit

On 28.06.2007 at 12:23, john seers (IFR) wrote:

Hi

I think a for loop would be more what you want. Something along the lines of:

    V <- list(a=c(1,2,3), b=c(2,3,4))   # list of 2 vectors
    for ( i in 1:2 ) {                  # 2 vectors (replace with 85 ...)
        print(range (V[i], na.rm = TRUE))
    }

Regards

JS

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Birgit Lemcke
Sent: 28 June 2007 10:48
To: R Hilfe
Subject: [R] Repeat if

Hello,

(Power Book G4, Mac OS X, R 2.5.0)

I would like to repeat the function range for 85 vectors (V1-V85). I tried with this code:

    i <- 0
    repeat {
    + i <- i+1
    + if (i<85) next
    + range (Vi, na.rm = TRUE)
    + if (i==85) break
    + }

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]
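The get() loop from this thread can be written more compactly with mget(), which fetches several objects by name at once, followed by sapply(). A sketch with three made-up vectors in place of the 85:

```r
V1 <- c(1, 2, 3)
V2 <- c(5, NA, 7)
V3 <- c(-4, 0, 9)

nms <- paste("V", 1:3, sep = "")                 # "V1" "V2" "V3"
ranges <- sapply(mget(nms, envir = environment()), range, na.rm = TRUE)
print(ranges)
```

The result is a matrix with one column per vector, minimum in the first row and maximum in the second. In the longer run it is usually easier to keep such vectors in a single list or data frame from the start.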
Re: [R] R: x-y data
    tt <- read.table("C:/temp/test.csv", header=TRUE, sep=",")

    # Try:
    tt$x
    tt$y

    # OR
    tt["x"]
    tt["y"]

    # OR
    tt[["x"]]
    tt[["y"]]

    # OR
    tt[1]
    tt[2]
    tt[[1]]
    tt[[2]]

    # Is this what you want?

I have an Excel file with x-y data. I saved this file as a csv file. Then I used the read.table() function to read the data into R. If I have a formula like (x+y)/2, how would I access x and y in R? I have the table named as something. But how do I access the individual columns if I want to plug them into the formula?
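To plug the columns into the formula from the question, either name them explicitly or evaluate the formula inside the data frame with with(). The small data frame below stands in for the csv file; its values are made up.

```r
tt <- data.frame(x = c(1, 2, 3), y = c(3, 6, 9))   # stands in for the csv file

avg1 <- (tt$x + tt$y) / 2          # columns picked out explicitly
avg2 <- with(tt, (x + y) / 2)      # same formula, evaluated inside tt
tt$avg <- avg1                     # store the result as a new column
print(tt)
```

Note that tt["x"] returns a one-column data frame while tt[["x"]] and tt$x return the vector itself, which is what arithmetic normally wants.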
Re: [R] autoload libraries at startup
Hi

I do not know if this is the best way, but have a look at .Rprofile, a text file of R commands that is executed at startup. R looks for it in the current working directory and then in your home directory; a site-wide version, Rprofile.site, lives in the etc directory under the R installation. You could put library() commands in that.

See ?Startup for more information.

Regards

JS

---

John Seers
Institute of Food Research
Norwich Research Park
Colney
Norwich NR4 7UA
tel +44 (0)1603 251497
fax +44 (0)1603 507723
e-mail [EMAIL PROTECTED]
e-disclaimer at http://www.ifr.ac.uk/edisclaimer/
Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: 08 March 2007 20:46
To: r-help@stat.math.ethz.ch
Subject: [R] autoload libraries at startup

Hi All

I was wondering if there is a way I can specify in R that it should load libraries automatically at startup, so that I do not have to manually issue the command.

Thanks

Toby
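As an illustration of what such a startup file might contain, here is a sketch along the lines of the example in ?Startup: it appends a package to the defaultPackages option, which R consults when it attaches packages at startup. MASS is only a placeholder for whichever package is wanted.

```r
# Possible contents of a .Rprofile (MASS is a placeholder package name)
local({
    old <- getOption("defaultPackages")
    options(defaultPackages = c(old, "MASS"))
})
```

Run interactively this only changes the option; it takes effect when placed in the startup file and R is restarted. A plain library(MASS) line in .Rprofile is the simpler alternative suggested in the reply above.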
Re: [R] gsub
Is this what you want?:

    gsub("cpue|nogd", "", string)

John

---

Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Luis Ridao Cruz
Sent: 15 November 2006 13:29
To: r-help@stat.math.ethz.ch
Subject: [R] gsub

R-help,

I want to remove the following strings: "cpue" and "nogd"

    string <- c("upsanogd", "toskanogd", "hysunogd", "konganogd",
                "gullaksnogd", "longunogd", "blalongunogd", "brosmunogd")

I could use first:

    first <- gsub("cpue", "", string)

and then:

    second <- gsub("nogd", "", first)

Can it be done at once?

Thanks in advance

    > version
                   _
    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          4.0
    year           2006
    month          10
    day            03
    svn rev        39566
    language       R
    version.string R version 2.4.0 (2006-10-03)
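A quick check of the one-step pattern. The vertical bar in a regular expression means "either alternative", so both suffixes are removed in a single call. The two entries ending in cpue are made up so that both alternatives get exercised.

```r
string <- c("upsanogd", "toskanogd", "hysucpue", "kongacpue")

cleaned <- gsub("cpue|nogd", "", string)   # "|" means "either pattern"
print(cleaned)
```

To remove the strings only when they occur at the end of a name, the pattern "(cpue|nogd)$" would be the stricter choice.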
[R] Closing R fails
Hello All

I cannot close R easily:

    > q()
    Error in .Last() : could not find function "finalizeSession"

This seems to have started after I used the R.utils package. If I load the R.utils package I can close R successfully. But I do not want to have to do this every time I run R.

Is there any way I can switch off or reverse this behaviour? (I have had a look in the archives and documentation but cannot find a solution.)

Thanks for any help.

John Seers

    > R.version$os
    [1] "mingw32"
    > R.version.string
    [1] "R version 2.4.0 Patched (2006-10-29 r39744)"
Re: [R] Closing R fails
Hi Gavin Thanks for helping. Does R close properly if you invoke R with the --vanilla flag? It sure does. So, deleting .RData from my workspace seems to fix it more neatly. But, of course, the problem come backs whenever I use R.utils. (Unless I remember not to save my workspace). A good improvement though. Thanks. John --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail [EMAIL PROTECTED] e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com -Original Message- From: Gavin Simpson [mailto:[EMAIL PROTECTED] Sent: 09 November 2006 12:46 To: john seers (IFR) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Closing R fails On Thu, 2006-11-09 at 12:18 +, john seers (IFR) wrote: Hello All I cannot close R easily: q() Error in .Last() : could not find function finalizeSession This seems to have started after I used the R.utils package. If I load the R.utils package I can close R successfully. But I do not want to have to do this every time I run R. John, Does R close properly if you invoke R with the --vanilla flag? I forget the best way to do this in Windows, but editing the shortcut is one way. Right click the R shortcut, find the Target, and add --vanilla to the end of it. You might need to enclose the whole path and command in , e.g.: C:\R\R.exe --vanilla. If I've gotten this wrong then I'm sure a Windows user will let us know. Your problem may be related to R loading a previous session that you saved when exiting, where you'd used R.utils, which expects to find R.utils loaded and throws and error when you exit because it isn't. Running R --vanilla will stop R automatically loading the saved R session file .RData. HTH G Is there any way I can switch off /reverse this behaviour? (I have had a look in the archives/documentation but cannot find a solution). Thanks for any help. 
John Seers -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC ENSIS, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
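Following Gavin's explanation, one possible fix that keeps the rest of the saved workspace is to remove the stale .Last() before saving again. This is a sketch resting on an assumption: that the workspace holds a .Last() function in the global environment which calls finalizeSession(). The first statement only simulates that situation so the example is self-contained.

```r
# Assumption: the saved workspace contains a .Last() left behind by R.utils.
# Simulate that situation here (finalizeSession is never actually called).
assign(".Last", function() finalizeSession(), envir = globalenv())

# Remove the stale .Last so that q() no longer tries to call it
if (exists(".Last", envir = globalenv(), inherits = FALSE)) {
    rm(list = ".Last", envir = globalenv())
}
```

After this, saving the workspace on exit should write an .RData file without the offending function, though it will come back if R.utils adds it again in a later session.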
Re: [R] read file problem
Hi

If the file is tab delimited you could try something like this:

    a <- read.delim(file, skip = 9, header=FALSE, na.strings="NA")

Are you sure you want to skip 10 lines? (Is there a blank line somewhere?)

J

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Luis Ridao Cruz
Sent: 03 November 2006 14:03
To: r-help@stat.math.ethz.ch
Subject: [R] read file problem

R-help,

I have the following file I want to import to R (some lines removed):

    Calibrated CTD data for station:00280001
    Calibrated:23/8 2001, Salinity Unsmoothed, Fluorescence Uncalibrated
    Maximum observed depth:36 m
    QUAL has one digit for each of pressure, temp., sal. and fluor.
    QUAL=1:Uncal., QUAL=2:OK, QUAL=6:Interp., QUAL=9:No data
    DEPTH  CTDPRS  CTDTMP   CTDSAL   RAWFLU  NUMB.  QUAL
    M      DBAR    IPTS-68  PSS-78           OBS.
    ***    ***     ***      ***
    1      1.0                                      2999
    2      2.0     5.9793   35.1629  .107    17     2221
    3      3.0     5.9797   35.1631  .101    17     2221
    4      4.0     5.9809   35.1631  .118    12     2221
    5      5.1     5.9811   35.1629  .115    42     2221
    6      6.1     5.9810   35.1631  .116    18     2221
    7      7.1     5.9797   35.1631  .116    15     2221
    8      8.1     5.9798   35.1630  .102    13     2221
    9      9.1     5.9792   35.1629  .113    11     2221
    ...

If I use:

    read.table(file, skip = 10)

it works fine, but sometimes the missing data are not only in line number 1 (1 1.0 2999) but in lines 1, 2, 3, ... and therefore R fails to import the data file.

How can I fix it? I have tried the arguments strip.white = TRUE, fill = TRUE and blank.lines.skip = TRUE, but I still do not get what I want.

Thanks in advance

    > version
                   _
    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          4.0
    year           2006
    month          10
    day            03
    svn rev        39566
    language       R
    version.string R version 2.4.0 (2006-10-03)
Re: [R] splitting very long character string
Hi Arne

If you are reading in from files and they are just one number per line, it would be more efficient to use scan directly. See ?scan

For example:

    > filen <- "C:/temp/tt.txt"
    > i <- scan(filen)
    Read 5 items
    > i
    [1]   12345  564376    5674 6356656    5666

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: 01 November 2006 15:47
To: r-help@stat.math.ethz.ch
Subject: [R] splitting very long character string

Hello,

I have a very long character string (500k characters) that needs to be split by '\n', resulting in an array of about 60k numbers. The help on strsplit says to use perl=TRUE to get better performance, but it still takes several minutes to split this string.

The massive string is the return value of a call to xmlElementsByTagName from the XML library and looks like this:

    ...
    12345
    564376
    5674
    6356656
    5666
    ...

I have to read about a hundred of these files and was wondering whether there is a more efficient way to turn this string into an array of numerics. Any ideas?

Thanks a lot for your help and kind regards,

Arne
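When the numbers are already in memory as one long string, as in the question, scan() can parse the string directly through its text argument, avoiding both strsplit() and a temporary file. A sketch with a short stand-in string:

```r
big <- "12345\n564376\n5674\n6356656\n5666"   # short stand-in for the long string

nums <- scan(text = big, quiet = TRUE)        # parses whitespace-separated numbers
print(nums)
```

On R versions that predate the text argument, scan(textConnection(big)) is the equivalent, although text connections can be slow for very large strings.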
[R] Vectorise a for loop?
Hi R guru coders I wrote a bit of code to add a new column onto a topTable dataframe. That is a list of genes processed using the limma package. I used a for loop but I kept feeling there was a better way using a more vector oriented approach. I looked at several commands such as apply, by etc but could not find a good way to do it. I have this feeling there is a command or technique eluding me. (Is there an expr:value1?value2 construction in R?) Can anybody suggest an elegant solution? Details: So, the topTable looks like this: topa1[1:5,c(1,2,3,4)] IDName GB_accession M 11195 245828 SIGKEC9 AX135029 -7.670197 10966107FHL1 B14446 -5.089926 6287 25744 M90LL137340 -4.531744 777 2288 VSNL1 LF039555 -4.035472 11310 272294 M98LL031650 3.866422 I want to add a fold column so it will look like this: topa1[1:5,c(1,2,3,4,10)] IDName GB_accession M fold 11195 245828 SIGKEC9 AX135029 -7.670197 203.68521 10966107FHL1 B14446 -5.089926 34.05810 6287 25744 M90LL137340 -4.531744 23.13082 777 2288 VSNL1 LF039555 -4.035472 16.39828 11310 272294 M98LL031650 3.866422 14.58508 The fold values is calculated from the M column which is a log2 value. The calculation is different depending on whether the M value is negative or positive. That is if the gene is down regulated the reciprocal value has to be used to calculate a fold value. 
Here is my clunky, not vectorised code:

    # Function to add a fold column to the toptable
    ttfold <- function(tt) {
        fold <- NULL
        for (i in 1:length(tt$M)) {
            if (tt$M[i] < 0) {
                # Down-regulated: use the reciprocal
                fold[i] <- 1/(2^tt$M[i])
            } else {
                fold[i] <- 2^tt$M[i]
            }
        }
        tt <- cbind(tt, fold)
    }

    # Add fold column to top tables
    topa1 <- ttfold(topa1)

Regards

J

---

John Seers
Institute of Food Research
Re: [R] Vectorise a for loop?
Hi Jacques

Yes, that looks a whole lot better. That ifelse is exactly what I was searching for.

Merci.

J

---

John Seers
Institute of Food Research

-----Original Message-----
From: Jacques VESLOT [mailto:[EMAIL PROTECTED]
Sent: 26 September 2006 14:02
To: john seers (IFR)
Cc: R-help
Subject: Re: [R] Vectorise a for loop?

    tt$fold <- ifelse(tt$M < 0, 1/(2^tt$M), 2^tt$M)

---

Jacques VESLOT
CNRS UMR 8090
I.B.L (2ème étage)
1 rue du Professeur Calmette
B.P. 245
59019 Lille Cedex
Tel : 33 (0)3.20.87.10.44
Fax : 33 (0)3.20.87.10.31
http://www-good.ibl.fr
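A small check of the vectorised version against the M values quoted earlier in the thread. Since 1/(2^M) equals 2^(-M), the whole calculation can also be written as 2^abs(M); that shortcut is an observation added here, not part of the original replies.

```r
M <- c(-7.670197, -5.089926, -4.531744, -4.035472, 3.866422)

fold_ifelse <- ifelse(M < 0, 1 / (2^M), 2^M)
fold_abs    <- 2^abs(M)          # equivalent, since 1/(2^M) == 2^(-M)
print(round(fold_ifelse, 5))
```

Both forms work on the whole column at once, so no loop or preallocation is needed.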
Re: [R] library(plgem)
Perhaps here?

http://www.bioconductor.org/packages/bioc/1.6/src/contrib/html/plgem.html

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amir Safari
Sent: 04 September 2006 14:44
To: R-help@stat.math.ethz.ch
Subject: [R] library(plgem)

Dear Users,

library(plgem) doesn't exist directly in the list of available packages of R. Where could it be found?

Thanks so much for help.

Amir
Re: [R] R server
There is the Rserve package which you can look at here: http://stats.math.uni-augsburg.de/Rserve/down.shtml JS --- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sort matrix by sum of columns
Albert

Is this what you want?:

    a[,order(colSums(a))]

John S

---

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Albert Vilella
Sent: 21 June 2006 11:38
To: r help
Subject: [R] sort matrix by sum of columns

Hi all,

I would like to know how I can sort the columns of a matrix by the sum of their elements.

    a <- matrix(as.integer(rnorm(25,4,2)),10,5)
    colnames(a) = c("alfa","bravo","charlie","delta","echo")

I guess I should use colSums, and then rearrange the matrix somehow according to the result. My idea is to display a sorted barplot:

    barplot(a, horiz=TRUE, legend.text=T)

Thanks in advance,

Albert.
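A self-contained sketch of the column sort, with a made-up matrix of small integers. order() returns the column positions from smallest to largest sum, and indexing with it rearranges the columns while keeping their names attached.

```r
set.seed(42)
a <- matrix(sample(0:9, 50, replace = TRUE), 10, 5)
colnames(a) <- c("alfa", "bravo", "charlie", "delta", "echo")

a_sorted <- a[, order(colSums(a))]                       # ascending column sums
a_desc   <- a[, order(colSums(a), decreasing = TRUE)]    # descending instead
print(colSums(a_sorted))
```

Passing a_sorted (or a_desc) to barplot() then gives the bars in sorted order.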
[R] HTML nsmall vector format problem
Hello All

I am having a bit of trouble formatting my HTML with the desired number of digits after the decimal place. Am I doing something wrong or misunderstanding, or is it a bug?

Looking at the example supplied with ?HTML.data.frame:

    HTML(iris[1:2,1:2],nsmall=c(3,1),file="")

gives HTML output that includes the lines:

    </tr>
    <tr><td class=firstcolumn>1</td><td class=cellinside>5.100</td><td class=cellinside>3.500</td></tr>
    <tr><td class=firstcolumn>2</td><td class=cellinside>4.900</td><td class=cellinside>3.000</td></tr>

My understanding of how nsmall works, as a vector, is that the output should be something like:

    </tr>
    <tr><td class=firstcolumn>1</td><td class=cellinside>5.100</td><td class=cellinside>3.5</td></tr>
    <tr><td class=firstcolumn>2</td><td class=cellinside>4.900</td><td class=cellinside>3.0</td></tr>

i.e. the first column with 3 digits after the decimal place and the second column with 1 digit after the decimal place. It appears to use only the first value in the vector.

Has anybody got any suggestions?

Thanks for any help.

John Seers
Re: [R] HTML nsmall vector format problem
Hi Tom

Thanks for the reply. I see what you are saying: format does not format using an nsmall vector, though the documentation (of HTML.data.frame) and the example suggest nsmall uses a vector.

Even if I went through and changed the columns with a loop, HTML.data.frame would reformat them according to its own formatting, so I would still get all the columns with one value of nsmall. Specifically, I want 0 for my first three columns and 4 for the remaining columns in my data frame. (I would also like to control the widths, but I guess I may run into the same problem.)

I could reprocess the HTML output, but that makes generating the HTML a bit redundant!

Regards

John

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of tshort
Sent: 09 June 2006 11:54
To: r-help@stat.math.ethz.ch
Subject: Re: [R] HTML nsmall vector format problem

John,

I don't think nsmall uses a vector. Try the following with format (which HTML.data.frame uses):

    format(iris[1:2,1:2], nsmall=c(3,1))
      Sepal.Length Sepal.Width
    1        5.100       3.500
    2        4.900       3.000

It looks like you'll have to do a format column by column with a loop.

- Tom
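The goal stated in this message (0 decimals for the first three columns, 4 for the rest, plus some control over widths) can be sketched in base R with formatC, which takes a fixed number of decimals and a minimum field width per call. This is only an illustration under assumptions: the `digits` vector and the width of 8 are made-up values, and the sketch stops at producing character columns rather than writing any HTML.

```r
# Sketch: fixed decimals and a common width, one column at a time.
# 'digits' and 'width = 8' are assumed values chosen for illustration.
df <- iris[1:2, 1:4]
digits <- c(0, 0, 0, 4)   # decimals wanted for each column

fixed <- df
for (j in seq_along(df)) {
  # format = "f" gives fixed notation with exactly digits[j] decimals;
  # width pads on the left to at least 8 characters.
  fixed[[j]] <- formatC(df[[j]], format = "f", digits = digits[j], width = 8)
}

print(fixed)
```

Unlike nsmall, which only sets a minimum number of decimals, `digits` with `format = "f"` rounds to exactly that many.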
Re: [R] HTML nsmall vector format problem
Hi Tom

Excellent! Exactly what I need. I see what you mean now.

Thanks very much for your help.

John

-Original Message-
From: Short, Tom [mailto:[EMAIL PROTECTED]
Sent: 09 June 2006 12:36
To: john seers (IFR); r-help@stat.math.ethz.ch
Subject: RE: [R] HTML nsmall vector format problem

John, you can use format ahead of time (this converts to character columns, so HTML won't reformat), and then use HTML:

    z = format(iris[1:2,1:2], nsmall=3)
    z[,2] = format(iris[1:2,2], nsmall=1)
    HTML(z, file="")

- Tom
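Tom's pre-formatting approach extends to any number of columns by looping over a vector of nsmall values. The following is a minimal sketch, assuming a hypothetical helper named `format_columns` (it is not part of R2HTML or base R) that takes one nsmall value per column.

```r
# Hypothetical helper: apply a different nsmall to each numeric column.
# Returns a data frame whose numeric columns have become character columns.
format_columns <- function(df, nsmall) {
  stopifnot(length(nsmall) == ncol(df))
  out <- df
  for (j in seq_along(df)) {
    if (is.numeric(df[[j]])) {
      out[[j]] <- format(df[[j]], nsmall = nsmall[j])
    }
  }
  out
}

z <- format_columns(iris[1:2, 1:2], nsmall = c(3, 1))
print(z)
```

The result `z` could then be passed to `HTML(z, file="")` as in Tom's message. Note that nsmall is a minimum, so a value such as 3.25 would still print with two decimals under `nsmall = 1`.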
Re: [R] R-Help
I think this does what you require.

    # Read your data in whatever way you wish:
    d1 <- data.frame(Date=c("2005/1/1","2005/2/1","2005/1/3","2005/1/4","2005/1/7","2005/3/5"),
                     x=c(119,123,-110,114,11,200),
                     y=c(230,-125,300,-21,299,311))
    d2 <- data.frame(Date=c("2005/1/3","2005/1/4","2005/1/5","2005/1/6","2005/3/5"),
                     x=c(-220,116,888,-239,201),
                     y=c(301,-23,3000,122,312))
    d3 <- data.frame(Date=c("2005/1/4","2005/1/5","2005/3/5","2005/4/23"),
                     x=c(392,511,600,723),
                     y=c(-81,6699,9311,1200))

    # Make a list
    listof <- list(d1, d2, d3)

    # Loop over any number of datasets, merging as you go.
    # The parentheses matter: 1:length(listof)-1 would start at 0.
    # [[ ]] extracts the data frame itself; [ ] would return a list.
    # all=TRUE keeps every date; use all=FALSE to keep only the common dates.
    for (dataset in 1:(length(listof)-1)) {
      if (dataset == 1) {
        res <- merge(listof[[dataset]], listof[[dataset+1]], all=TRUE, by="Date")
      } else {
        res <- merge(res, listof[[dataset+1]], all=TRUE, by="Date")
      }
    }

Hope that helps

JS

---
John Seers
Institute of Food Research
Norwich Research Park
Colney
Norwich NR4 7UA
tel +44 (0)1603 251490
fax +44 (0)1603 255167
e-mail [EMAIL PROTECTED]
e-disclaimer at http://www.ifr.ac.uk/edisclaimer/
Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of stat stat
Sent: 20 April 2006 09:17
To: r-help@stat.math.ethz.ch
Subject: [R] R-Help

Dear r-users,

Suppose I have three datasets:

Dataset-1:

    Date           x     y
    Jan-1,2005   120   230
    Jan-2,2005   123  -125
    Jan-3,2005  -110   300
    Jan-4,2005   114   -21
    Jan-7,2005    11   299
    Mar-5,2005   200   311

Dataset-2:

    Date           x     y
    Jan-2,2005   123  -125
    Jan-3,2005  -110   300
    Jan-4,2005   114   -21
    Jan-5,2005    11   299
    Jan-6,2005   -23    12
    Mar-5,2005   200   311

Dataset-3:

    Date           x     y
    Jan-3,2005  -110   300
    Jan-4,2005   114   -21
    Jan-5,2005    11   299
    Mar-5,2005   200   311
    Apr-23,2005  123   200

Now I want to get the common dates, along with x and y, from the above three datasets, keeping the dates in their original order. For example, I want to get:

                 (from dataset-1)  (from dataset-2)  (from dataset-3)
    Date           x     y           x     y           x     y
    Jan-3,2005  -110   300        -110   300        -110   300
    Jan-4,2005   114   -21         114   -21         114   -21
    Mar-5,2005   200   311         200   311         200   311

Can anyone give me any R code to implement this for any number of datasets?

Thanks in advance, and regards
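The question asks for the dates common to all datasets, which is an inner join; merging with `all=TRUE` keeps every date and fills the gaps with NA instead. One possible sketch uses Reduce to fold merge over a list of any length. The helper name `merge_common` and the per-dataset column suffixes are assumptions made for illustration, and the data below is retyped from the question with the dates stored as Date objects so that the default sort in merge is chronological.

```r
# Hypothetical helper: inner-join any number of data frames on one key column.
merge_common <- function(dfs, key = "Date") {
  # Suffix non-key columns with the dataset index so names stay unique.
  tagged <- lapply(seq_along(dfs), function(i) {
    df <- dfs[[i]]
    other <- names(df) != key
    names(df)[other] <- paste0(names(df)[other], ".", i)
    df
  })
  # all = FALSE keeps only keys that appear in every data frame.
  Reduce(function(a, b) merge(a, b, by = key, all = FALSE), tagged)
}

d1 <- data.frame(
  Date = as.Date(c("2005-01-01", "2005-01-02", "2005-01-03",
                   "2005-01-04", "2005-01-07", "2005-03-05")),
  x = c(120, 123, -110, 114, 11, 200),
  y = c(230, -125, 300, -21, 299, 311))
d2 <- data.frame(
  Date = as.Date(c("2005-01-02", "2005-01-03", "2005-01-04",
                   "2005-01-05", "2005-01-06", "2005-03-05")),
  x = c(123, -110, 114, 11, -23, 200),
  y = c(-125, 300, -21, 299, 12, 311))
d3 <- data.frame(
  Date = as.Date(c("2005-01-03", "2005-01-04", "2005-01-05",
                   "2005-03-05", "2005-04-23")),
  x = c(-110, 114, 11, 200, 123),
  y = c(300, -21, 299, 311, 200))

res <- merge_common(list(d1, d2, d3))
print(res)
```

Suffixing the columns up front avoids the name clashes that repeated calls to merge would otherwise produce once more than three datasets share the column names x and y.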