Re: [R] GREP - Choosing values between two borders
Another way to do it, if the data has the pattern shown in your sample, is to select all the lines that start with a number:

> input <- "FILE-CONTENT ##
+ EXAM NUM:2
+ -
+ EXAM #1
+ ASTIG:-2.4D
+ AXIS:4.8
+ START OF HEIGHT DATA
+ 0 0.0 0.
+ 0 0.1 0.00055643
+ 9 4.9 1.67278117
+ 9 5.0 1.74873257
+ 10 0.0 0.
+ 10 0.1 0.00075557
+ 99 5.3 1.94719490
+ END OF HEIGHT DATA
+ X POS:-0.299mm
+ Y POS:0.442mm
+ Z POS:-0.290mm
+ -
+ EXAM #2
+ ASTIG:-2.4D
+ AXIS:4.8
+ START OF HEIGHT DATA
+ 0 0.0 0.
+ 0 0.1 0.00055643
+ 9 4.9 1.67278117
+ 9 5.0 1.74873257
+ 10 0.0 0.
+ 10 0.1 0.00075557
+ 99 5.3 1.94719490
+ END OF HEIGHT DATA
+ X POS:-0.299mm
+ Y POS:0.442mm
+ Z POS:-0.290mm
+ "
> x <- readLines(textConnection(input))
> x <- x[grep("^\\s*\\d", x, perl=TRUE)]
> x.in <- scan(textConnection(x), what=0)
Read 42 items
> x.in <- matrix(x.in, ncol=3, byrow=TRUE)
> x.in
      [,1] [,2]       [,3]
 [1,]    0  0.0 0.
 [2,]    0  0.1 0.00055643
 [3,]    9  4.9 1.67278117
 [4,]    9  5.0 1.74873257
 [5,]   10  0.0 0.
 [6,]   10  0.1 0.00075557
 [7,]   99  5.3 1.94719490
 [8,]    0  0.0 0.
 [9,]    0  0.1 0.00055643
[10,]    9  4.9 1.67278117
[11,]    9  5.0 1.74873257
[12,]   10  0.0 0.
[13,]   10  0.1 0.00075557
[14,]   99  5.3 1.94719490

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] GREP - Choosing values between two borders
Hello,

I import data from a file with readLines(), but I need only part of the measurements in this file: the values between the two borders START and END. Can you tell me the grep() syntax for choosing values between two borders? My R code was not successful, and I can't find anything in the help.

Thanks a lot.
Felix

# R-CODE ###
file <- "file-content"
Measure <- grep("[START-END]", file)
#Measure <- grep("[START|END]", file)

FILE-CONTENT ##
EXAM NUM:2
-
EXAM #1
ASTIG:-2.4D
AXIS:4.8
START OF HEIGHT DATA
0 0.0 0.
0 0.1 0.00055643
9 4.9 1.67278117
9 5.0 1.74873257
10 0.0 0.
10 0.1 0.00075557
99 5.3 1.94719490
END OF HEIGHT DATA
X POS:-0.299mm
Y POS:0.442mm
Z POS:-0.290mm
-
EXAM #2
ASTIG:-2.4D
AXIS:4.8
START OF HEIGHT DATA
0 0.0 0.
0 0.1 0.00055643
9 4.9 1.67278117
9 5.0 1.74873257
10 0.0 0.
10 0.1 0.00075557
99 5.3 1.94719490
END OF HEIGHT DATA
X POS:-0.299mm
Y POS:0.442mm
Z POS:-0.290mm
Re: [R] GREP - Choosing values between two borders
You can adapt this to your situation:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/22195.html
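The START/END question in this thread can also be answered by using grep() only to locate the border lines and then indexing the lines between them. The following is a minimal sketch, not code from the thread: the names `lines`, `start`, `end`, `blocks` and `exams` are illustrative assumptions, and the sample is a shortened copy of the posted file content.

```r
# Shortened sample of the posted file; in practice this would come from readLines().
lines <- c("EXAM #1", "ASTIG:-2.4D", "START OF HEIGHT DATA",
           "0 0.1 0.00055643", "9 4.9 1.67278117",
           "END OF HEIGHT DATA", "X POS:-0.299mm",
           "EXAM #2", "START OF HEIGHT DATA",
           "10 0.1 0.00075557",
           "END OF HEIGHT DATA", "Y POS:0.442mm")

# grep() returns the positions of the border lines, not the values between them.
start <- grep("^START OF HEIGHT DATA", lines)
end   <- grep("^END OF HEIGHT DATA", lines)

# Sanity check: every START must have a later END.
stopifnot(length(start) == length(end), all(start < end))

# For each START/END pair keep only the lines strictly between the borders.
blocks <- Map(function(s, e) lines[s + seq_len(e - s - 1)], start, end)

# Parse each block into a data frame (read.table's default names V1, V2, V3).
exams <- lapply(blocks, function(b) read.table(text = b))
```

Each element of `exams` then holds the height data of one exam. Note that a pattern such as "[START-END]" is a character class (a set of single characters), which is why the original attempts did not select a range of lines.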
[R] grep searching for sequence of 3 consecutive upper case letters
Hello,

I need to identify all elements which have a sequence of 3 consecutive upper case letters, anywhere in the string. I tested my grep expression on this site: http://regexlib.com/RETester.aspx but when I try it in R, it does not filter anything.

> str <- c("AGH", "this WOUld be good", "Not Good at All")
> str[grep('[A-Z]{3}',str)] #looking for a sequence of 3 consecutive upper case letters
[1] "AGH"                "this WOUld be good" "Not Good at All"

Any idea?

Pierre
Re: [R] grep searching for sequence of 3 consecutive upper case letters
Try

str[grep('[[:upper:]]{3}',str)]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Re: [R] grep searching for sequence of 3 consecutive upper case letters
Quoting David Barron:

> Try
> str[grep('[[:upper:]]{3}',str)]

or, more directly:

grep('[[:upper:]]{3}', str, value = TRUE)
Re: [R] grep searching for sequence of 3 consecutive upper case letters
Lapointe, Pierre writes:

> I tested my grep expression on this site: http://regexlib.com/RETester.aspx
> But when I try it in R, it does not filter anything.

There are multiple versions of REs, and fine details resolve in different ways. Don't expect the RETester to hold the Final Truth; it seems to relate to a particular programming environment, which is not R.

> grep('[A-Z]{3}', str, perl=TRUE)
[1] 1 2

Not only that, but

> grep('[ABCDEFGHIJKLMNOPQRSTUVWXYZ]{3}', str)
[1] 1 2

Hint: What is your collating sequence?

> Sys.setlocale("LC_COLLATE", "C")
[1] "C"
> grep('[A-Z]{3}', str)
[1] 1 2

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
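The replies in this thread point to the collating sequence: with some locales a range such as [A-Z] can also match lower-case letters in R's default regular expression engine. A minimal sketch of the two spellings suggested above that avoid depending on the collation order; the vector is the one from the original post, renamed `s` here (an illustrative choice) so that it does not mask utils::str.

```r
s <- c("AGH", "this WOUld be good", "Not Good at All")

# POSIX character class: matches upper-case letters regardless of how
# the locale orders the range A..Z.
has_upper3_posix <- grepl("[[:upper:]]{3}", s)

# Perl-compatible engine: ranges are resolved by character code.
has_upper3_perl <- grepl("[A-Z]{3}", s, perl = TRUE)

# Keep only the elements with three consecutive upper-case letters.
s[has_upper3_posix]
```

grepl() returns a logical vector of the same length as its input, which is often easier to combine with other conditions than the index vector returned by grep().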
Re: [R] grep function with patterns list...
>>>>> "Anupam" == Anupam Tyagi on Mon, 16 Oct 2006 18:15:06 (UTC) writes:

    Anupam> Hi Stephane,
    Anupam> Stéphane CRUVEILLER <scruveil at genoscope.cns.fr> writes:
    >> is there a way to pass a list of patterns to the grep function?
    >> I vaguely remember something with %in% operator...

    Anupam> I think you are looking for the %in% and %nin% which
    Anupam> are part of Design package, and also in Hmisc
    Anupam> library. You have to install and load these packages
    Anupam> to access these functions.

Hmm, '%in%' has been part of standard R for years ...

Martin
[R] grep function with patterns list...
Dear R-users,

is there a way to pass a list of patterns to the grep function? I vaguely remember something with the %in% operator...

Thanks,

Stéphane.

--
"Science has achieved some wonderful things, of course, but I'd far rather be happy than right any day." D. Adams
--
AGC website http://www.genoscope.cns.fr/agc
Stéphane CRUVEILLER Ph. D.
Genoscope - Centre National de Séquencage
Atelier de Génomique Comparative
2, Rue Gaston Cremieux CP 5706
91057 Evry Cedex - France
Phone: +33 (0)1 60 87 84 58
Fax: +33 (0)1 60 87 25 14
Re: [R] grep function with patterns list...
Try this:

> grep("b|c|d", letters, value = TRUE)
[1] "b" "c" "d"
Re: [R] grep function with patterns list...
Thanks for the hint, but what would I use if b, c and d were values of a data frame, for instance?

Stéphane.
Re: [R] grep function with patterns list...
Oops, sorry for the HTML tags... I just forgot to edit the message before sending it. So, back to my question:

Thanks for the hint, but what would I use if b, c and d were values of a data frame, for instance?

X is, for instance, a data frame:

> X
  Mypatterns
1   pattern1
2   pattern2
3   pattern3

Y is another data frame. If I do:

grep(X$Mypatterns, Y)

this will take into account only the first pattern... I could use a loop, but I vaguely remember an elegant trick that combined grep and %in%.

Stéphane.
Re: [R] grep function with patterns list...
> DF <- data.frame(pat = letters[1:3])
> grep(paste(DF$pat, collapse = "|"), letters, value = TRUE)
[1] "a" "b" "c"
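To make the difference between the approaches mentioned in this thread concrete, here is a minimal sketch comparing the collapsed alternation, a per-pattern loop, and %in%. The data frame `X` mirrors the one described in the question; the target vector `y` and the result names are hypothetical and chosen only for illustration.

```r
# Hypothetical pattern table and target vector, for illustration only.
X <- data.frame(Mypatterns = c("pattern1", "pattern2"),
                stringsAsFactors = FALSE)
y <- c("has pattern1 inside", "nothing here", "pattern2", "pattern3")

# 1. Collapse the patterns into one alternation, as suggested in the thread.
combined <- paste(X$Mypatterns, collapse = "|")
by_regex <- grep(combined, y, value = TRUE)

# 2. One grepl() per pattern, combined with a logical OR; fixed = TRUE treats
#    each pattern as a literal string rather than a regular expression.
hit <- Reduce(`|`, lapply(X$Mypatterns, grepl, x = y, fixed = TRUE))
by_loop <- y[hit]

# 3. %in% tests whole-element equality, so it finds no substring matches.
exact <- y[y %in% X$Mypatterns]
```

The alternation is the shortest form, but it interprets regular expression metacharacters in the patterns; the per-pattern variant with fixed = TRUE is one way around that when the patterns are plain strings.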
Re: [R] grep function with patterns list...
Hi Stephane,

Stéphane CRUVEILLER <scruveil at genoscope.cns.fr> writes:

> is there a way to pass a list of patterns to the grep function?
> I vaguely remember something with %in% operator...

I think you are looking for %in% and %nin%, which are part of the Design package and are also in the Hmisc package. You have to install and load these packages to access these functions.

Anupam.
Re: [R] grep question
This finds the matching indices of 'Farrah' and 'Common' and then creates a set that does not include them:

> x <- c('Farrah', 'more', 'Common', 'last')
> got.F <- grep('Farrah', x)
> got.C <- grep('Common', x)
> not.ForC <- setdiff(seq(along=x), c(got.F, got.C))
> x[not.ForC]
[1] "more" "last"

On 8/31/06, Bob Green wrote:
> I am hoping for some advice as to how to modify the following syntax,
> so that instead of saving all records which refer to Farrah, I select
> all instances that do not include Farrah, or the word Coolum.
>
> test <- read.csv("c:\\newdat.csv", as.is=TRUE, header=T)
> sure <- test[grep('Farrah', paste(test$V3.HD, test$V3.LP, test$V3.TD)),]
> write.csv(sure, "c:/farrah4.csv")
>
> Any assistance is appreciated,
>
> regards
>
> Bob Green

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
Re: [R] grep question
Or using the same x:

> setdiff(x, grep("Farrah|Common", x, value = TRUE))
[1] "more" "last"
Re: [R] grep question
You have to be careful if the strings are embedded:

> x <- c('xxxFarrahxxx', 'more than last time', 'some Common numbers', 'last one')
> setdiff(x, grep('Farrah|Common', x)) # not correct
[1] "xxxFarrahxxx"        "more than last time" "some Common numbers" "last one"
> ForC <- grep('Farrah|Common', x)
> x[setdiff(seq(along=x), ForC)]
[1] "more than last time" "last one"

--
Jim Holtman
Re: [R] grep question
Forget the last reply. I left the 'value=TRUE' off the grep.

> x <- c('xxxFarrahxxx', 'more than last time', 'some Common numbers', 'last one')
> setdiff(x, grep('Farrah|Common', x, value=TRUE))
[1] "more than last time" "last one"
> ForC <- grep('Farrah|Common', x)
> x[setdiff(seq(along=x), ForC)]
[1] "more than last time" "last one"

--
Jim Holtman
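The exclusion discussed in this thread can also be written as a logical negation, which keeps duplicate elements (setdiff() returns unique values only) and behaves sensibly when nothing matches (x[-grep(...)] returns an empty vector in that case). A minimal sketch, not taken from the thread; the data frame `test` and its columns HD, LP and TD are hypothetical stand-ins for the columns in the original question.

```r
x <- c("xxxFarrahxxx", "more than last time", "some Common numbers", "last one")

# grepl() gives TRUE/FALSE per element; negate it to keep the non-matches.
keep <- !grepl("Farrah|Common", x)
kept <- x[keep]

# grep() can do the same in one call with invert = TRUE.
kept_inverted <- grep("Farrah|Common", x, invert = TRUE, value = TRUE)

# The same idea applied to rows of a data frame (hypothetical columns).
test <- data.frame(HD = c("Farrah fan", "other", "plain"),
                   LP = c("a", "Common b", "text"),
                   TD = c("c", "d", "only"),
                   stringsAsFactors = FALSE)
sure <- test[!grepl("Farrah|Common", paste(test$HD, test$LP, test$TD)), ]
```

Here `sure` keeps only the rows in which none of the pasted columns mention either word.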
Re: [R] grep help needed
Dear Denis,

I don't believe that anyone fielded your question -- my apologies if I missed a response.

> -----Original Message-----
> From: Denis Chabot
> Sent: Monday, July 25, 2005 9:46 PM
> To: R list
> Subject: [R] grep help needed
>
> Hi,
>
> In another thread ("PBSmapping and shapefiles") I asked for an easy
> way to read shapefiles and transform them into data that PBSmapping
> could use. One person is exploring some ways of doing this, but it is
> possible I'll have to do this manually.
>
> With package maptools I am able to extract the information I need
> from a shapefile, but it is formatted like this:
>
> [[1]]
>            [,1]     [,2]
>  [1,] -55.99805 51.68817
>  [2,] -56.00222 51.68911
>  [3,] -56.01694 51.68911
>  [4,] -56.03781 51.68606
>  [5,] -56.04639 51.68759
>  [6,] -56.04637 51.69445
>  [7,] -56.03777 51.70207
>  [8,] -56.02301 51.70892
>  [9,] -56.01317 51.71578
> [10,] -56.00330 51.73481
> [11,] -55.99805 51.73840
> attr(,"pstart")
> attr(,"pstart")$from
> [1] 1
>
> attr(,"pstart")$to
> [1] 11
>
> attr(,"nParts")
> [1] 1
> attr(,"shpID")
> [1] NA
>
> [[2]]
>           [,1]     [,2]
> [1,] -57.76294 50.88770
> [2,] -57.76292 50.88693
> [3,] -57.76033 50.88163
> [4,] -57.75668 50.88091
> [5,] -57.75551 50.88169
> [6,] -57.75562 50.88550
> [7,] -57.75932 50.88775
> [8,] -57.76294 50.88770
> attr(,"pstart")
> attr(,"pstart")$from
> [1] 1
>
> attr(,"pstart")$to
> [1] 8
>
> attr(,"nParts")
> [1] 1
> attr(,"shpID")
> [1] NA
>
> I do not quite understand the structure of this data object (list of
> lists I think)

Actually, it looks like a list of matrices, each with some attributes (which, I gather, aren't of interest to you).

> but at this point I resorted to printing it on the console and
> imported that text into Excel for further cleaning, which is easy
> enough. I'd like to complete the process within R to save time and to
> circumvent Excel's limit of around 64000 lines. But I have a hard
> time figuring out how to clean up this text in R.

If I understand correctly what you want, this seems a very awkward way to proceed. Why not just extract the matrices from the list, stick on the additional columns that you want, stick the matrices together, name the columns, and then output the data to a file?

M1 <- Data[[1]]  # assuming that the original list is named Data
M2 <- Data[[2]]
M1 <- cbind(1, 1:nrow(M1), M1)
M2 <- cbind(2, 1:nrow(M2), M2)
M <- rbind(M1, M2)
colnames(M) <- c("PID", "POS", "X", "Y")
write.table(M, "Data.txt", row.names=FALSE, quote=FALSE)

It wouldn't be hard to generalize this to any number of matrices and to automate the process.

I hope that this helps,
John

> What I need to produce for PBSmapping is a file where each block of
> coordinates shares one ID number, called PID, and a variable POS
> indicates the position of each coordinate within a shape. All other
> lines must disappear. So the above would become:
>
> PID POS X Y
> 1 1 -55.99805 51.68817
> 1 2 -56.00222 51.68911
> 1 3 -56.01694 51.68911
> 1 4 -56.03781 51.68606
> 1 5 -56.04639 51.68759
> 1 6 -56.04637 51.69445
> 1 7 -56.03777 51.70207
> 1 8 -56.02301 51.70892
> 1 9 -56.01317 51.71578
> 1 10 -56.00330 51.73481
> 1 11 -55.99805 51.73840
> 2 1 -57.76294 50.88770
> 2 2 -57.76292 50.88693
> 2 3 -57.76033 50.88163
> 2 4 -57.75668 50.88091
> 2 5 -57.75551 50.88169
> 2 6 -57.75562 50.88550
> 2 7 -57.75932 50.88775
> 2 8 -57.76294 50.88770
>
> First I imported this text file into R:
>
> test <- read.csv2("test file.txt", header=F, sep=";", colClasses = "character")
>
> I used sep=";" to ensure there would be only one variable in this
> file, as it contains no ";".
>
> To remove lines that do not contain coordinates, I used the fact that
> longitudes are expressed as negative numbers, so with my very limited
> knowledge of grep searches, I thought of this, which is probably not
> the best way to go:
>
> a <- rep("-", length(test$V1))
> b <- grep(a, test$V1)
>
> This gives me a warning ("Warning message: the condition has length
> > 1 and only the first element will be used in: if (is.na(pattern)) {")
> but seems to do what I need anyway.
>
> c <- seq(1, length(test$V1))
> d <- c %in% b
> e <- test$V1[d]
>
> Partial victory, now I only have lines that look like
>
> [1,] -57.76294 50.88770
>
> But I don't know how to go further: the number in square brackets can
> be used for variable POS, after removing the square brackets and the
> comma, but this requires a better knowledge of grep than I have.
> Furthermore, I don't know how to add a PID (polygon ID) variable,
> i.e. all lines of a polygon must have the same ID, as in the example
> above (i.e. each time POS == 1, a new polygon starts and PID needs to
> be incremented by 1, and PID is kept constant for lines where
> POS != 1).
>
> Any help will be much appreciated.
>
> Sincerely,
>
> Denis Chabot
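The generalization to any number of matrices mentioned in the reply above could look roughly like the following. This is a sketch under assumptions, not code from the thread: the function name `to_polyset_matrix` is hypothetical, and `Data` is a small stand-in for the list of coordinate matrices, built from the first rows of the posted example.

```r
# Stand-in for the list of coordinate matrices (first rows of the posted data).
Data <- list(
  matrix(c(-55.99805, -56.00222, 51.68817, 51.68911), ncol = 2),
  matrix(c(-57.76294, -57.76292, -57.76033, 50.88770, 50.88693, 50.88163),
         ncol = 2)
)

# Hypothetical helper: add PID (polygon number) and POS (row within polygon)
# to every matrix, then stack the results.
to_polyset_matrix <- function(mats) {
  parts <- lapply(seq_along(mats), function(i) {
    m <- mats[[i]]
    cbind(PID = i, POS = seq_len(nrow(m)), X = m[, 1], Y = m[, 2])
  })
  do.call(rbind, parts)
}

M <- to_polyset_matrix(Data)
```

The result can be written out with write.table(M, "Data.txt", row.names=FALSE, quote=FALSE) as in the reply; working on the list directly avoids parsing the printed console output with grep at all.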
Re: [R] grep help needed
Thanks for your help, the proposed solutions were much more elegant than what I was attempting. I adopted a slight modification of Tom Mulholland's solution with a piece from John Fox's solution, but many of you had very similar solutions.

require(maptools)
nc <- read.shape(system.file("shapes/sids.shp", package = "maptools")[1])
mappolys <- Map2poly(nc, as.character(nc$att.data$FIPSNO))
# The comparison operator was lost in the archive; ">" is assumed here.
selected.shapes <- which(nc$att.data$SID74 > 20)  # just to make it a smaller example
submap <- subset(mappolys, nc$att.data$SID74 > 20)
final.data <- NULL
for (j in 1:length(selected.shapes)){
    temp.verts <- matrix(as.vector(submap[[j]]), ncol = 2)
    n <- length(temp.verts[,1])
    temp.order <- 1:n
    temp.data <- cbind(rep(j,n), temp.order, temp.verts)
    final.data <- rbind(final.data, temp.data)
}
colnames(final.data) <- c("PID", "POS", "X", "Y")
final.data
my.data <- as.data.frame(final.data)
class(my.data) <- c("PolySet", "data.frame")
attr(my.data, "projection") <- "LL"
meta <- nc[2]$att.data[selected.shapes,]
PID <- seq(1, length(submap))
meta.data <- cbind(PID, meta)
class(meta.data) <- c("PolyData", "data.frame")
attr(meta.data, "projection") <- "LL"

It would be nice if a variant of this was incorporated into PBSmapping to make it easier to import data from shapefiles!

Thanks again for your help,

Denis Chabot

On 05-07-26 at 00:48, Mulholland, Tom wrote:

> -----Original Message-----
> From: Denis Chabot
> Sent: Tuesday, 26 July 2005 10:46 AM
> To: R list
> Subject: [R] grep help needed
>
> Hi,
>
> In another thread ("PBSmapping and shapefiles") I asked for an easy
> way to read shapefiles and transform them into data that PBSmapping
> could use. One person is exploring some ways of doing this, but it is
> possible I'll have to do this manually.
[R] grep help needed
Hi,

In another thread (PBSmapping and shapefiles) I asked for an easy way to read shapefiles and transform them into data that PBSmapping could use. One person is exploring some ways of doing this, but it is possible I'll have to do this manually.

With package maptools I am able to extract the information I need from a shapefile, but it is formatted like this:

    [[1]]
               [,1]     [,2]
     [1,] -55.99805 51.68817
     [2,] -56.00222 51.68911
     [3,] -56.01694 51.68911
     [4,] -56.03781 51.68606
     [5,] -56.04639 51.68759
     [6,] -56.04637 51.69445
     [7,] -56.03777 51.70207
     [8,] -56.02301 51.70892
     [9,] -56.01317 51.71578
    [10,] -56.00330 51.73481
    [11,] -55.99805 51.73840
    attr(,"pstart")
    attr(,"pstart")$from
    [1] 1
    attr(,"pstart")$to
    [1] 11
    attr(,"nParts")
    [1] 1
    attr(,"shpID")
    [1] NA

    [[2]]
              [,1]     [,2]
    [1,] -57.76294 50.88770
    [2,] -57.76292 50.88693
    [3,] -57.76033 50.88163
    [4,] -57.75668 50.88091
    [5,] -57.75551 50.88169
    [6,] -57.75562 50.88550
    [7,] -57.75932 50.88775
    [8,] -57.76294 50.88770
    attr(,"pstart")
    attr(,"pstart")$from
    [1] 1
    attr(,"pstart")$to
    [1] 8
    attr(,"nParts")
    [1] 1
    attr(,"shpID")
    [1] NA

I do not quite understand the structure of this data object (list of lists, I think), but at this point I resorted to printing it on the console and importing that text into Excel for further cleaning, which is easy enough. I'd like to complete the process within R to save time and to circumvent Excel's limit of around 64000 lines. But I have a hard time figuring out how to clean up this text in R.

What I need to produce for PBSmapping is a file where each block of coordinates shares one ID number, called PID, and a variable POS indicates the position of each coordinate within a shape. All other lines must disappear. So the above would become:

    PID POS         X        Y
      1   1 -55.99805 51.68817
      1   2 -56.00222 51.68911
      1   3 -56.01694 51.68911
      1   4 -56.03781 51.68606
      1   5 -56.04639 51.68759
      1   6 -56.04637 51.69445
      1   7 -56.03777 51.70207
      1   8 -56.02301 51.70892
      1   9 -56.01317 51.71578
      1  10 -56.00330 51.73481
      1  11 -55.99805 51.73840
      2   1 -57.76294 50.88770
      2   2 -57.76292 50.88693
      2   3 -57.76033 50.88163
      2   4 -57.75668 50.88091
      2   5 -57.75551 50.88169
      2   6 -57.75562 50.88550
      2   7 -57.75932 50.88775
      2   8 -57.76294 50.88770

First I imported this text file into R:

    test <- read.csv2("test file.txt", header=F, sep=";", colClasses = "character")

I used sep=";" to ensure there would be only one variable in this file, as it contains no ";".

To remove lines that do not contain coordinates, I used the fact that longitudes are expressed as negative numbers, so with my very limited knowledge of grep searches, I thought of this, which is probably not the best way to go:

    a <- rep("-", length(test$V1))
    b <- grep(a, test$V1)

This gives me a warning

    Warning message:
    the condition has length > 1 and only the first element will be used in: if (is.na(pattern)) {

but seems to do what I need anyway.

    c <- seq(1, length(test$V1))
    d <- c %in% b
    e <- test$V1[d]

Partial victory, now I only have lines that look like

    [1,] -57.76294 50.88770

But I don't know how to go further: the number in square brackets can be used for variable POS, after removing the square brackets and the comma, but this requires a better knowledge of grep than I have. Furthermore, I don't know how to add a PID (polygon ID) variable, i.e. all lines of a polygon must have the same ID, as in the example above (i.e. each time POS == 1, a new polygon starts and PID needs to be incremented by 1, and PID is kept constant for lines where POS != 1).

Any help will be much appreciated.

Sincerely,
Denis Chabot
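The clean-up asked about here can be sketched in base R in two ways. This is only an illustration under stated assumptions, not one of the solutions posted to the list: the objects `txt`, `coord`, `parts`, `poly`, `shapes` and `from_list` are hypothetical names, and the sample data is a shortened stand-in for the printed output above.

```r
# (a) Working on the printed text, as attempted in the post.
# Hypothetical, shortened stand-in for the console output.
txt <- c("[[1]]",
         "          [,1]     [,2]",
         " [1,] -55.99805 51.68817",
         " [2,] -56.00222 51.68911",
         "attr(,\"pstart\")",
         "[1] 1",
         "[[2]]",
         "          [,1]     [,2]",
         " [1,] -57.76294 50.88770",
         " [2,] -57.76292 50.88693",
         " [3,] -57.76033 50.88163")

# Keep only lines that start with a row label such as "[1,]".
coord <- grep("^[[:space:]]*\\[[0-9]+,\\]", txt, value = TRUE)

# Drop "[", "]" and "," so each line reads "pos x y", then parse it.
parts <- read.table(text = gsub("[][,]", "", coord),
                    col.names = c("POS", "X", "Y"))

# A new polygon starts whenever POS is 1, so a running count gives PID.
parts$PID <- cumsum(parts$POS == 1)
poly <- parts[, c("PID", "POS", "X", "Y")]

# (b) Working on the list itself, skipping the text round trip.
# Hypothetical list of two-column coordinate matrices.
shapes <- list(
  cbind(c(-55.99805, -56.00222), c(51.68817, 51.68911)),
  cbind(c(-57.76294, -57.76292, -57.76033), c(50.88770, 50.88693, 50.88163))
)
from_list <- do.call(rbind, lapply(seq_along(shapes), function(j) {
  m <- shapes[[j]]
  data.frame(PID = rep(j, nrow(m)), POS = seq_len(nrow(m)),
             X = m[, 1], Y = m[, 2])
}))
```

Variant (b) avoids parsing printed output altogether, which is usually less fragile when the list object is still available in the session.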
[R] grep negation
hi,

using the example in the grep help:

    txt <- c("arm","foot","lefroo", "bafoobar")
    i <- grep("foo",txt); i
    [1] 2 4

but how can i get the negation (1,3) when looking for 'foo'?

thanks,
m.
Re: [R] grep negation
?setdiff

e.g.,

    txt <- c("arm","foot","lefroo", "bafoobar")
    i <- grep("foo",txt); i
    [1] 2 4
    setdiff(seq(length(txt)),grep("foo",txt))
    [1] 1 3

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]    +1 (513) 723-2929

On 06/23/2005 08:59, Marcus Leinweber [EMAIL PROTECTED] wrote to r-help@stat.math.ethz.ch:

> but how can i get the negation (1,3) when looking for 'foo'?
Re: [R] grep negation
try this:

    seq(along = txt)[-i]

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

----- Original Message -----
From: Marcus Leinweber [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Thursday, June 23, 2005 2:59 PM
Subject: [R] grep negation

> but how can i get the negation (1,3) when looking for 'foo'?
Re: [R] grep negation
If all you need to do is extract the subset of elements of txt that do not contain 'foo', then

    txt[-i]

will do the job. Provided that at least one element of txt contains 'foo', that is.

-Don

At 2:59 PM +0200 6/23/05, Marcus Leinweber wrote:

> but how can i get the negation (1,3) when looking for 'foo'?

--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
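The caveat about `txt[-i]` is easy to trip over, so here is a small sketch that puts the suggestions from this thread side by side and shows what happens when nothing matches. The variable names other than `txt` and `i` are hypothetical; `grepl()` and the `invert` argument of `grep()` are standard base R in current versions, though they were not mentioned in the thread itself.

```r
txt <- c("arm", "foot", "lefroo", "bafoobar")
i <- grep("foo", txt)                       # indices of matches: 2 and 4

# The approaches suggested in the thread.
via_setdiff <- setdiff(seq_along(txt), i)
via_minus   <- seq_along(txt)[-i]

# Alternatives that do not depend on a non-empty match.
via_grepl   <- which(!grepl("foo", txt))
via_invert  <- grep("foo", txt, invert = TRUE)

# The pitfall: with no match, grep() returns integer(0), and negative
# indexing with integer(0) selects nothing instead of everything.
none <- grep("xyz", txt)
dropped_all <- txt[-none]                   # character(0)
kept_all    <- txt[!grepl("xyz", txt)]      # all four elements
```

A logical index built with `grepl()` behaves the same whether or not anything matches, which is why it is often the safer choice inside functions.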
[R] grep
Hi,

I want to use the first digit of the elements of a vector. I've tried grep but it didn't work.

Any help is welcome.

Thanks

EJ

    grep("^[0-9]", as.character(runif(100,0,2)))
      [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
     [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
     [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
     [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
     [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
     [91]  91  92  93  94  95  96  97  98  99 100
Re: [R] grep
Ernesto Jardim wrote:

> I want to use the first digit of the elements of a vector. I've tried
> grep but didn't work.
>
> Any help is welcome.

    substr(as.character(runif(100,0,2)), 1, 1)

see ?substr

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894
Re: [R] grep
Ernesto Jardim wrote:

> Hi,
>
> I want to use the first digit of the elements of a vector. I've tried
> grep but didn't work.
>
> grep("^[0-9]", as.character(runif(100,0,2)))
>   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
>  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
>  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
>  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
>  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
>  [91]  91  92  93  94  95  96  97  98  99 100

Not surprising. Try ?substring instead.

    substring(runif(100, 0, 2), 1, 1)

-sundar
Re: [R] grep
On Fri, 2004-03-12 at 11:08, Ernesto Jardim wrote:

> I want to use the first digit of the elements of a vector. I've tried
> grep but didn't work.

How about ?substr

    substr(as.character(runif(100, 0, 2)), 1, 1)
     [1] "0" "1" "0" "1" "0" "0" "0" "0" "0" "1" "1" "1" "0" "0" "1" "0"
    [17] "1" "0" "1" "1" "1" "1" "0" "1" "0" "1" "0" "1" "1" "0" "1" "0"
    [33] "1" "0" "1" "0" "0" "1" "0" "0" "0" "0" "0" "0" "0" "0" "1" "0"
    [49] "1" "1" "0" "0" "0" "1" "1" "1" "0" "1" "0" "1" "0" "1" "1" "1"
    [65] "1" "0" "1" "1" "1" "1" "1" "0" "1" "1" "0" "1" "0" "1" "0" "0"
    [81] "1" "1" "0" "0" "1" "1" "0" "1" "0" "0" "0" "0" "0" "0" "0" "1"
    [97] "0" "0" "0" "1"

or

    substr(as.character(1:100), 1, 1)
     [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "1" "1" "1" "1" "1" "1" "1"
    [17] "1" "1" "1" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "3" "3" "3"
    [33] "3" "3" "3" "3" "3" "3" "3" "4" "4" "4" "4" "4" "4" "4" "4" "4"
    [49] "4" "5" "5" "5" "5" "5" "5" "5" "5" "5" "5" "6" "6" "6" "6" "6"
    [65] "6" "6" "6" "6" "6" "7" "7" "7" "7" "7" "7" "7" "7" "7" "7" "8"
    [81] "8" "8" "8" "8" "8" "8" "8" "8" "8" "9" "9" "9" "9" "9" "9" "9"
    [97] "9" "9" "9" "1"

HTH,

Marc Schwartz
Re: [R] grep
Ernesto -

Use

    as.numeric(substr(as.character(x), 1, 1))

- tom blackwell - u michigan medical school - ann arbor -

On Fri, 12 Mar 2004, Ernesto Jardim wrote:

> I want to use the first digit of the elements of a vector. I've tried
> grep but didn't work.
Re: [R] grep
    as.integer(x/10^(as.integer(log10(x))))

-----Original Message-----
From: Ernesto Jardim [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 12, 2004 12:08 PM
To: Mailing List R
Subject: [R] grep

> I want to use the first digit of the elements of a vector. I've tried
> grep but didn't work.
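To make the replies in this thread concrete: `grep()` returns the positions of the elements that match, and every number converted to text starts with a digit, which is why the original call returned 1 to 100. The sketch below compares the string-based and arithmetic suggestions on a small, hypothetical vector `x` (the `floor()` variant is an addition for illustration, not something posted to the list).

```r
x <- c(0.37, 1.52, 42, 907.1)   # hypothetical positive values

# String approach from the thread: first character of the text form.
first_char  <- substr(as.character(x), 1, 1)
first_digit <- as.integer(first_char)

# Arithmetic approach from the thread. as.integer() truncates toward zero,
# so values below 1 get exponent 0 and a leading digit of 0.
lead <- as.integer(x / 10^as.integer(log10(x)))

# Illustrative variant: floor() the exponent to get the first
# significant digit instead (3 for 0.37).
sig <- as.integer(floor(x / 10^floor(log10(x))))

# Caveat for the string approach: very small or very large numbers are
# written in scientific notation, e.g. "1e-05", and negative numbers
# start with "-".
tiny <- substr(as.character(1e-05), 1, 1)
```

Which variant is appropriate depends on whether "first digit" means the first printed character or the first significant digit; the arithmetic forms assume strictly positive input.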
Re: [R] grep and gsub on backslash and quotes
Simon Fear [EMAIL PROTECTED] writes:

> The following code works, to gsub single quotes to double quotes:
>
>     line <- gsub("'", '"', line)
>
> (that's a single quote within doubles then a double within singles if
> your viewer's font is not good).
>
> But The R Language Manual tells me that
>
>     Quotes and other special characters within strings are specified
>     using escape sequences: \' single quote, \" double quote
>
> so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any
> other number of backslashes (have tried all up to n=6 just for good
> measure).

There's a backslash missing in the replacement. This works:

    line <- "ab\\\'cd"
    gsub("\\\\'", "\\\\\"", line)

and will replace \' with \"

> BTW is it documented anywhere that you need four backslashes in an RE
> to match one in the target, when it is being passed as an argument to
> gsub or grep? How would I know how many levels of doubling up to use
> for any other functions? (I got to 4 consecutive \ by trial and error
> in this case, but have a dim memory of having read about it somewhere.)

There are two levels because backslashes are escape characters both to R strings and regular expressions. So in the above, line is ab\'cd, the match pattern is \\' which matches \', and the replacement is \\" which becomes \"

More interesting is

    gsub("\\'", "a", line)
    [1] "ab\\'cda"
    gsub("\\'", "a", line, perl=T)
    [1] "ab\\acd"

so \' matches a single quote with PCRE but not with ordinary RE. (Yes, there's a reason...)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
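The two levels of escaping described in this reply can be checked by counting characters. The following is a minimal sketch in current base R; the names `pattern` and `out` are hypothetical, and the regex engine behind the default mode has changed since 2003, so only the PCRE-independent parts are shown.

```r
# Level 1: the R parser. "\\" in source code is ONE backslash in the string.
line <- "ab\\'cd"            # the six characters  a b \ ' c d
n_line <- nchar(line)

# Level 2: the regex engine. To match one literal backslash the regex needs
# \\ , which in R source is written with four backslashes.
pattern <- "\\\\'"           # the three characters  \ \ '
n_pattern <- nchar(pattern)

# In the replacement, \\ stands for one literal backslash, so the string
# \\" (five backslashes plus an escaped quote in source) yields \" .
out <- gsub(pattern, "\\\\\"", line)

# cat() shows the real characters; print() shows the escaped source form.
cat(line, "\n")
cat(out, "\n")

# With fixed = TRUE the pattern is not a regex, so only level 1 applies.
has_backslash_quote <- grepl("\\'", line, fixed = TRUE)
```

Comparing `print(line)` with `cat(line)` is a quick way to see which backslashes exist only in the printed or source representation.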
[R] grep and gsub on backslash and quotes
The following code works, to gsub single quotes to double quotes:

    line <- gsub("'", '"', line)

(that's a single quote within doubles then a double within singles if your viewer's font is not good).

But The R Language Manual tells me that

    Quotes and other special characters within strings are specified
    using escape sequences: \' single quote, \" double quote

so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any other number of backslashes (have tried all up to n=6 just for good measure).

BTW is it documented anywhere that you need four backslashes in an RE to match one in the target, when it is being passed as an argument to gsub or grep? How would I know how many levels of doubling up to use for any other functions? (I got to 4 consecutive \ by trial and error in this case, but have a dim memory of having read about it somewhere.)

TIA

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
RE: [R] grep and gsub on backslash and quotes
Thank you. The single backslash version, the first thing I tried (I thought), works just fine when I copy and paste, ergo I must have got confused by some stupid typo of mine. Sorry to waste everyone's time over this.

(Still, I am probably not the only confused user when it comes to RE handling - I hope the examples posted will be of as much use to others as they are to me.)

Simon

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]]
Sent: 12 August 2003 17:13
To: Simon Fear
Cc: [EMAIL PROTECTED]
Subject: Re: [R] grep and gsub on backslash and quotes
Re: [R] grep and gsub on backslash and quotes
On Tue, 12 Aug 2003, Simon Fear wrote:

> The following code works, to gsub single quotes to double quotes:
>
>     line <- gsub("'", '"', line)
>
> (that's a single quote within doubles then a double within singles if
> your viewer's font is not good).
>
> But The R Language Manual tells me that
>
>     Quotes and other special characters within strings are specified
>     using escape sequences: \' single quote, \" double quote
>
> so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any
> other number of backslashes (have tried all up to n=6 just for good
> measure).
>
> BTW is it documented anywhere that you need four backslashes in an RE
> to match one in the target, when it is being passed as an argument to
> gsub or grep?

It's not true, so I hope it is not documented anywhere. You may need 6, as in the following from methods():

    res <- sort(grep(gsub("([.[])", "\\\\\\1", name), an, value = TRUE))

since that is \\ \1 without the space. Each backslash in the target only needs to be doubled.

In your example

    gsub("\'", "\"", line)

or even

    gsub("'", "\"", line)

is all you need: only R strings need the escape.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
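Both points in this reply can be tried directly. The sketch below assumes a hypothetical input `name <- "summary.lm"` and a hypothetical candidate vector `an`; only the `gsub()` pattern and the six-backslash replacement are taken from the message.

```r
# A quote is not special in a regular expression, so the only escaping
# needed is the one the R parser requires inside a string literal.
same_string <- identical("\'", "'")          # "\'" is just a single quote
quoted      <- gsub("'", "\"", "it's")       # the four characters  i t " s

# Six backslashes in source are three in the string: \\ followed by \1.
# In a replacement, \\ is a literal backslash and \1 is the first group,
# so each "." or "[" in name gets a backslash in front of it.
name <- "summary.lm"                          # hypothetical input
esc  <- gsub("([.[])", "\\\\\\1", name)      # the characters  summary\.lm

# The escaped name can then be used as a regex that matches "." literally.
an  <- c("summary.lm", "summaryXlm")          # hypothetical candidates
hit <- grep(paste0("^", esc, "$"), an, value = TRUE)
```

Without the escaping step, the unescaped "." would match any character and `"summaryXlm"` would be returned as well.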