Re: [R] Paste every two columns together
Hi Kate,

Maybe you want:

seq(2, length(x), by=2)

Jim

On Thu, Jan 29, 2015 at 10:55 AM, Kate Ignatius kate.ignat...@gmail.com wrote:

I have genetic data as follows (simple example, actual data is much larger):

comb =

ID1 A A T G C T G C G T C G T A
ID2 G C T G C C T G C T G T T T

And I wish to get an output like this:

ID1 AA TG CT GC GT CG TA
ID2 GC TG CC TG CT GT TT

That is, paste every two columns together.

I have this code, but I get the error:

Error in seq.default(2, nchar(x), 2) : 'to' must be of length 1

conc <- function(x) {
  s <- seq(2, nchar(x), 2)
  paste0(x[s], x[s+1])
}

combn <- as.data.frame(lapply(comb, conc), stringsAsFactors=FALSE)

Thanks in advance!

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
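A minimal sketch of why the suggestion above helps, assuming x is a single row held as a character vector with the ID in position 1 (that layout is an assumption, not something stated in the thread). nchar(x) returns one count per element, so seq() receives a 'to' argument of length greater than 1, whereas length(x) is a single number.

```r
# Hypothetical single row: ID first, then the allele letters.
x <- c("ID1", "A", "A", "T", "G", "C", "T")

nchar(x)    # one value per element, so it cannot serve as 'to' in seq()
length(x)   # a single number

# Using nchar(x) as 'to' fails; capture the error instead of stopping.
err <- tryCatch(seq(2, nchar(x), 2), error = function(e) e)

# With length(x), s holds the first position of each pair (skipping the ID).
s <- seq(2, length(x), by = 2)
pairs <- paste0(x[s], x[s + 1])
pairs
```

Starting the sequence at 2 only makes sense when position 1 holds the ID; for a vector of letters alone, the sequence would start at 1.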
[R] Paste every two columns together
I have genetic data as follows (simple example, actual data is much larger):

comb =

ID1 A A T G C T G C G T C G T A
ID2 G C T G C C T G C T G T T T

And I wish to get an output like this:

ID1 AA TG CT GC GT CG TA
ID2 GC TG CC TG CT GT TT

That is, paste every two columns together.

I have this code, but I get the error:

Error in seq.default(2, nchar(x), 2) : 'to' must be of length 1

conc <- function(x) {
  s <- seq(2, nchar(x), 2)
  paste0(x[s], x[s+1])
}

combn <- as.data.frame(lapply(comb, conc), stringsAsFactors=FALSE)

Thanks in advance!
Re: [R] Paste every two columns together
eek! Chel Hee, anything that complicated should engender fear and trembling.

Much simpler and more efficient (if I understand correctly):

i <- seq.int(1L, length(ID1), by = 2L)
paste0(ID1[i], ID1[i+1])

That gives a vector of paired letters. If you want a single character string, just collapse with a " " (space):

paste0(ID1[i], ID1[i+1], collapse = " ")

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
Clifford Stoll

On Wed, Jan 28, 2015 at 7:41 PM, Chel Hee Lee chl...@mail.usask.ca wrote:

I am using just the first row of your data (i.e. ID1).

ID1 <- c("A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T", "A")
do.call(c, lapply(tapply(ID1, gl(7,2), c), paste, collapse=""))
   1    2    3    4    5    6    7
"AA" "TG" "CT" "GC" "GT" "CG" "TA"

Is this what you are looking for? I hope this helps.

Chel Hee Lee
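The pairing idiom above works on one vector at a time. A sketch of how it might be extended to every row of the data at once follows, assuming the data sit in a data frame with an ID column followed by an even number of single-letter columns. The construction of comb and the names m, first, paired and out are illustrative only and do not come from the thread.

```r
# Hypothetical reconstruction of the example data: ID column plus 14 letters.
comb <- data.frame(
  ID = c("ID1", "ID2"),
  rbind(c("A","A","T","G","C","T","G","C","G","T","C","G","T","A"),
        c("G","C","T","G","C","C","T","G","C","T","G","T","T","T")),
  stringsAsFactors = FALSE
)

m <- as.matrix(comb[-1])              # character matrix of the letters only
first <- seq(1, ncol(m), by = 2)      # first column of each pair: 1, 3, ..., 13

# paste0() works elementwise on the two sub-matrices and returns a plain
# vector in column-major order, so reshape it back to one row per ID.
paired <- matrix(paste0(m[, first], m[, first + 1]), nrow = nrow(m))

out <- data.frame(ID = comb$ID, paired, stringsAsFactors = FALSE)
out
```

Working on the whole matrix avoids looping over rows, which may matter for the much larger real data mentioned in the question, though no timing was done here.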
Re: [R] Paste every two columns together
Hi Bert!

Yes, you are VERY correct!!! Why am I making this simple thing so complicated??? ;)

Thank you so much for your nice lesson!

Chel Hee Lee

On 01/28/2015 09:59 PM, Bert Gunter wrote:

eek! Chel Hee, anything that complicated should engender fear and trembling.

Much simpler and more efficient (if I understand correctly):

i <- seq.int(1L, length(ID1), by = 2L)
paste0(ID1[i], ID1[i+1])

That gives a vector of paired letters. If you want a single character string, just collapse with a " " (space):

paste0(ID1[i], ID1[i+1], collapse = " ")

Cheers,
Bert
Re: [R] Paste every two columns together
I am using just the first row of your data (i.e. ID1).

ID1 <- c("A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T", "A")
do.call(c, lapply(tapply(ID1, gl(7,2), c), paste, collapse=""))
   1    2    3    4    5    6    7
"AA" "TG" "CT" "GC" "GT" "CG" "TA"

Is this what you are looking for? I hope this helps.

Chel Hee Lee
Re: [R] Paste every two columns together
Kate, here's a solution that uses regular expressions, rather than vector manipulation:

mystr = "ID1 A A T G C T G C G T C G T A"
gsub("([ACGT]) ([ACGT])", "\\1\\2", mystr)
[1] "ID1 AA TG CT GC GT CG TA"

-John

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Chel Hee Lee
Sent: Wednesday, January 28, 2015 11:07 PM
To: Bert Gunter
Cc: r-help
Subject: Re: [R] Paste every two columns together

Hi Bert!

Yes, you are VERY correct!!! Why am I making this simple thing so complicated??? ;)

Thank you so much for your nice lesson!

Chel Hee Lee
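Since gsub() is vectorised over its input, the same pattern can be applied to several rows in one call. The sketch below assumes each row is available as one space-separated string; the names rows, paired and fields are illustrative and not from the thread.

```r
# Hypothetical input: one space-separated string per row of the data.
rows <- c("ID1 A A T G C T G C G T C G T A",
          "ID2 G C T G C C T G C T G T T T")

# Each match consumes "letter space letter", so the pairs do not overlap.
paired <- gsub("([ACGT]) ([ACGT])", "\\1\\2", rows)

# Split back into fields if separate columns are needed afterwards.
fields <- strsplit(paired, " ", fixed = TRUE)
paired
```

One caveat with this pattern: an ID that ends in A, C, G or T would itself be matched and merged with the first letter, so it is only safe when the IDs cannot end that way, as with ID1 and ID2 here.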
Re: [R] Paste every two columns together
Hi:

Don't know about performance, but this is fairly simple for operating on atomic vectors:

x <- c("A", "A", "G", "T", "C", "G")
apply(embed(x, 2), 1, paste0, collapse = "")
[1] "AA" "GA" "TG" "CT" "GC"

Check the help page of embed() for details.

Dennis
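As the output above shows, embed(x, 2) produces every overlapping window, with the later element first in each row. If non-overlapping pairs in the original order are wanted, as in the question, one possible adjustment is to keep every other row and swap the two columns. This is a sketch of that adjustment, not something proposed in the thread; the names e, keep and pairs are illustrative.

```r
x <- c("A", "A", "G", "T", "C", "G")

e <- embed(x, 2)                   # row k holds (x[k + 1], x[k])
keep <- seq(1, nrow(e), by = 2)    # rows 1, 3, 5: windows that do not overlap

# Column 2 is the earlier element, column 1 the later one.
pairs <- paste0(e[keep, 2], e[keep, 1])
pairs
```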
Re: [R] Paste every two columns together
Hi,

Here is my implementation:

combine <- function(x){
  odd <- x[1:length(x) %% 2 == 1]
  even <- x[1:length(x) %% 2 == 0]
  paste0(odd, even)}

temp <- letters[1:24]
temp
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x"
combine(temp)
[1] "ab" "cd" "ef" "gh" "ij" "kl" "mn" "op" "qr" "st" "uv" "wx"
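The odd/even selection above can also be written with recycled logical indices, which may read a little more directly. The sketch below is a variant, not the poster's code; pair_up is a hypothetical name, and the length check is an added assumption, because paste0() recycles the shorter argument when the input has an odd number of elements.

```r
# Hypothetical helper: pair elements 1-2, 3-4, ... of an even-length vector.
pair_up <- function(x) {
  # Guard against odd lengths, where paste0() would silently recycle.
  stopifnot(length(x) %% 2 == 0)
  # c(TRUE, FALSE) is recycled along x and picks positions 1, 3, 5, ...;
  # c(FALSE, TRUE) picks positions 2, 4, 6, ...
  paste0(x[c(TRUE, FALSE)], x[c(FALSE, TRUE)])
}

res <- pair_up(letters[1:6])
res
```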