Re: [R] Regular expressions and 2 dots

2019-06-28 Thread Rui Barradas
Hello, Please always cc the list. To know more about the regular expressions used by R, read help("regex"). The one I used is not very complicated. \\. matches a dot; it is a meta-character so it needs to be escaped. {2,} repeated at least 2 times, at most an undetermined number of times. .*

Re: [R] Regular expressions and 2 dots

2019-06-28 Thread Rui Barradas
Hello, Try s <- c( "colone..xx.","coltwo.ft..rr.","colthree.gh..az.","colfour.DG..lm.") sub("\\.{2,}.*$", "", s) #[1] "colone" "coltwo.ft" "colthree.gh" "colfour.DG" At 09:00 on 28/06/19, lionel sicot via R-help wrote: c(

[R] Regular expressions and 2 dots

2019-06-28 Thread lionel sicot via R-help
Hello, I have files from a piece of equipment with column names including dots. I would like to simplify these names but all my attempts with sub and regular expressions are unsuccessful. I have c( "colone..xx.","coltwo.ft..rr.","colthree.gh..az.","colfour.DG..lm.") and I would like to have c(
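
A minimal sketch of the pattern discussed in this thread, using the column names from the original question. The variable name `cleaned` is only illustrative.

```r
s <- c("colone..xx.", "coltwo.ft..rr.", "colthree.gh..az.", "colfour.DG..lm.")

# \\.{2,}  two or more literal dots in a row
# .*$      everything after them, up to the end of the string
cleaned <- sub("\\.{2,}.*$", "", s)
print(cleaned)
```

A single dot (as in "coltwo.ft") is left alone because the quantifier requires at least two consecutive dots.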

Re: [R] Regular expressions, genbank

2014-02-06 Thread arun
Hi, One way would be: vec1 <- c("CDS 3300..4037", "CDS complement(3300..4037)", "CDS 3300..4037", "CDS join(21467..26641,27577..28890)", "CDS complement(join(30708..31700,31931..31984))", "CDS 3300..4037") library(stringr)

Re: [R] Regular expressions, genbank

2014-02-06 Thread arun
You could also try: library(gsubfn) strapply(gsub(\\d+|\\d+,,vec1),([0-9]+),as.numeric,simplify=c) A.K. On Thursday, February 6, 2014 1:55 PM, arun smartpink...@yahoo.com wrote: Hi, One way would be: vec1 - c(CDS 3300..4037,  CDS complement(3300..4037), CDS 

Re: [R] Regular expressions, genbank

2014-02-06 Thread arun
HI, May be this helps: lines1 - readLines(textConnection('text to be ignored... CDS 687..3158 /gene=AXL2 /note=plasma membrane glycoprotein other text to be ignored... CDS complement(3300..4037)
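
The replies in this thread are cut off, so the following is only a base-R sketch of one way to pull the coordinates out of CDS lines like those quoted above. It assumes every run of digits is a coordinate; the name `coords` is hypothetical.

```r
vec1 <- c("CDS 3300..4037",
          "CDS complement(3300..4037)",
          "CDS join(21467..26641,27577..28890)")

# gregexpr() finds every run of digits; regmatches() extracts them,
# giving one numeric vector per input line
coords <- lapply(regmatches(vec1, gregexpr("[0-9]+", vec1)), as.numeric)
print(coords)
```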

[R] Regular expressions on filenames

2014-01-15 Thread Fisher Dennis
R 3.0.2 OS X Colleagues I am writing code to read a large number of files in a particular folder. In some situations, there may be two versions of the file with different extensions, e.g.: FILE.csv FILE.xls I extracted the portion before the extension with: sub("\\..*$",

Re: [R] Regular expressions on filenames

2014-01-15 Thread jim holtman
try this: x <- c("FILE.XXX.csv", "FILE.YYY.xls") sub("\\.[^.]*$", "", x) [1] "FILE.XXX" "FILE.YYY" the '[^.]*' says to match anything BUT a period. Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On

Re: [R] Regular expressions on filenames

2014-01-15 Thread arun
Hi, Try: FILELIST <- list.files() FILELIST #[1] "FILE.csv" "FILE.XXX.csv" "FILE.YYY.xls" sub("(.*)\\..*$", "\\1", basename(FILELIST)) #[1] "FILE" "FILE.XXX" "FILE.YYY" A.K. On Wednesday, January 15, 2014 7:35 PM, Fisher Dennis fis...@plessthan.com wrote: R 3.0.2 OS X Colleagues I am writing

Re: [R] Regular expressions on filenames

2014-01-15 Thread Jeff Newmiller
You want to match a period and anything that follows to the end of the string, as long as what follows has no period in it: "\\.[^.]*$"

Re: [R] Regular expressions on filenames

2014-01-15 Thread David Winsemius
On Jan 15, 2014, at 4:37 PM, Fisher Dennis wrote: R 3.0.2 OS X Colleagues I am writing code to read a large number of files in a particular folder. In some situations, there may be two versions of the file with different extensions, e.g.: FILE.csv FILE.xls I extracted

Re: [R] Regular expressions on filenames

2014-01-15 Thread Wojtek Poppe
Try sub("\\.[^.]+$", "", basename(FILELIST)) Thanks, Wojtek On Wed, Jan 15, 2014 at 4:37 PM, Fisher Dennis fis...@plessthan.com wrote: R 3.0.2 OS X Colleagues I am writing code to read a large number of files in a particular folder. In some situations, there may be two versions of the
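
A short sketch comparing the patterns proposed in this thread on the file names from the question. The comparison with tools::file_path_sans_ext() is an addition, not something suggested in the messages above.

```r
files <- c("FILE.csv", "FILE.XXX.csv", "FILE.YYY.xls")

# ".*" also matches dots, so everything from the FIRST dot is removed
from_first_dot <- sub("\\..*$", "", files)

# "[^.]*" cannot cross a dot, so only the LAST extension is removed
last_ext_only <- sub("\\.[^.]*$", "", files)

# base-distribution helper that behaves the same way for these names
via_tools <- tools::file_path_sans_ext(files)

print(from_first_dot)
print(last_ext_only)
print(via_tools)
```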

[R] R Regular Expressions - Metacharacters

2013-02-05 Thread Seth Dickey
I thought that I could use metacharacters such as \w to match word characters with one backslash. But for some reason, I need to include two backslashes. grepl(pattern='\w', x="what") Error: '\w' is an unrecognized escape in character string starting "\w" grepl(pattern='\\w', x="what") [1] TRUE I

Re: [R] R Regular Expressions - Metacharacters

2013-02-05 Thread Duncan Murdoch
On 05/02/2013 12:49 PM, Seth Dickey wrote: I thought that I could use metacharacters such as \w to match word characters with one backslash. But for some reason, I need to include two backslashes. grepl(pattern='\w', x="what") Error: '\w' is an unrecognized escape in character string starting "\w"

Re: [R] R Regular Expressions - Metacharacters

2013-02-05 Thread David Winsemius
On Feb 5, 2013, at 9:49 AM, Seth Dickey wrote: I thought that I could use metacharacters such as \w to match word characters with one backslash. But for some reason, I need to include two backslashes. grepl(pattern='\w', x="what") Error: '\w' is an unrecognized escape in character string
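
The replies are truncated here, so this is only a sketch of the usual explanation: the R parser processes backslash escapes in a string literal before the regex engine ever sees the pattern. The variable names are illustrative.

```r
pattern <- "\\w"          # typed with two backslashes in source code

# The string actually stored has two characters: a backslash and "w"
n_chars <- nchar(pattern)
cat(pattern, "\n")        # prints the pattern as the regex engine sees it

matched <- grepl(pattern, "what")

# A single backslash before "w" is rejected by the parser itself;
# parse() is used here so the error can be caught instead of stopping the script
parse_result <- tryCatch(parse(text = "'\\w'"),
                         error = function(e) conditionMessage(e))
print(parse_result)
```

The error therefore comes from reading the string literal, not from grepl().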

[R] Regular expressions: stuck again...

2012-08-24 Thread Bart Joosen
Hi, I'm currently reworking a report, originating from a MS Access database, but should be implemented in R. Now I'm facing the task to convert a lot of queries to postgreSQL. What I want to do is make a function which takes the MS Access query as an argument and returns the pgSQL version. So:

Re: [R] Regular expressions: stuck again...

2012-08-24 Thread Noia Raindrops
Hello, try this: x <- c("SELECT [public_tblFiche].[Fichenr], [public_tblArtnr].[Artnr]", "SELECT public_tblFiche.Fichenr, public_tblArtnr.Artnr") # The square brackets [ and ] should be removed x <- gsub("[][]", "", x) # and xxx_xxx.xxx should become \xxx\.\xxx\\.\xxx\ x <-

Re: [R] Regular Expressions in grep - Solution and function to determine significant figures of a number

2012-08-23 Thread Dr. Holger van Lishaut
On 22.08.2012 at 21:46, Dr. Holger van Lishaut h.v.lish...@gmx.de wrote: SignifStellen <- function(x){ strx=as.character(x); nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1 } returns the significant figures of a number. Perhaps this can help someone. Sorry,

Re: [R] Regular Expressions in grep - Solution and function to determine significant figures of a number

2012-08-22 Thread Dr. Holger van Lishaut
Dear all, regmatches works. And, since this has been asked here before: SignifStellen <- function(x){ strx=as.character(x); nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1 } returns the significant figures of a number. Perhaps this can help someone. Thanks best

Re: [R] Regular Expressions in grep - Solution and function to determine significant figures of a number

2012-08-22 Thread Bert Gunter
... On Wed, Aug 22, 2012 at 12:46 PM, Dr. Holger van Lishaut h.v.lish...@gmx.de wrote: Dear all, regmatches works. And, since this has been asked here before: SignifStellen <- function(x){ strx=as.character(x); nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1 }
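
A hedged, vectorised variant of the function posted in this thread. The name `signif_digits` and the NA handling are assumptions added here; the poster's follow-up correction is cut off above and is not reproduced.

```r
signif_digits <- function(x) {
  strx <- as.character(x)
  # same pattern as in the thread: first non-zero digit through last non-zero decimal
  m <- regexpr("[1-9][0-9]*\\.[0-9]*[1-9]", strx)
  out <- rep(NA_integer_, length(strx))
  # regmatches() returns only the elements that matched, in order
  out[m > 0] <- nchar(regmatches(strx, m)) - 1L   # minus one for the decimal point
  out
}

res <- signif_digits(c("01020.909200", "12.5", "42"))
print(res)
```

Inputs without a decimal point, or with only zeros before it, do not match this pattern and come back as NA, so the sketch covers only the case discussed in the thread.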

[R] Regular Expressions in grep

2012-08-21 Thread Dr. Holger van Lishaut
Dear r-help members, I have a number in the form of a string, say: a <- "-01020.909200" I'd like to extract "1020." as well as ".9092" Front <- grep(pattern="[1-9]+[0-9]*\\.", value=TRUE, x=a, fixed=FALSE) End <- grep(pattern="\\.[0-9]*[1-9]+", value=TRUE, x=a, fixed=FALSE) However, both strings give

Re: [R] Regular Expressions in grep

2012-08-21 Thread Bert Gunter
grep() returns the matching elements, not the matched substrings. You want regexpr() and regmatches() -- Bert On Tue, Aug 21, 2012 at 12:24 PM, Dr. Holger van Lishaut h.v.lish...@gmx.de wrote: Dear r-help members, I have a number in the form of a string, say: a <- "-01020.909200" I'd like to extract "1020." as well as ".9092"

Re: [R] Regular Expressions in grep

2012-08-21 Thread Noia Raindrops
'grep' does not change strings. Use 'gsub' or 'regmatches': # gsub Front <- gsub("^.*?([1-9][0-9]*\\.).*?$", "\\1", a) End <- gsub("^.*?(\\.[0-9]*[1-9]).*?$", "\\1", a) # regexpr and regmatches (R >= 2.14.0) Front <- regmatches(a, regexpr("[1-9][0-9]*\\.", a)) End <- regmatches(a, regexpr("\\.[0-9]*[1-9]", a))

Re: [R] Regular Expressions in grep

2012-08-21 Thread R. Michael Weylandt
You're misreading the docs: from grep, value: if ‘FALSE’, a vector containing the (‘integer’) indices of the matches determined by ‘grep’ is returned, and if ‘TRUE’, a vector containing the matching elements themselves is returned. Since there's a match somewhere

Re: [R] Regular Expressions in grep

2012-08-21 Thread arun
HI, Try this: gsub("^-\\d(\\d{4}.).*","\\1",a) #[1] "1020." gsub("^.*(.\\d{5}).","\\1",a) #[1] ".90920" A.K. - Original Message - From: Dr. Holger van Lishaut h.v.lish...@gmx.de To: r-help@r-project.org r-help@r-project.org Cc: Sent: Tuesday, August 21, 2012 3:24 PM Subject: [R] Regular
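
To make the difference raised in this thread concrete, here is a small sketch contrasting grep(value = TRUE) with regexpr() plus regmatches(), using the string from the question. The variable names are illustrative.

```r
a <- "-01020.909200"

# grep(value = TRUE) returns the whole element that contains a match
whole <- grep("[1-9][0-9]*\\.", a, value = TRUE)

# regexpr() locates the match and regmatches() extracts just that substring
front <- regmatches(a, regexpr("[1-9][0-9]*\\.", a))
end   <- regmatches(a, regexpr("\\.[0-9]*[1-9]", a))

print(whole)
print(front)
print(end)
```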

[R] Regular Expressions + Matrices

2012-08-10 Thread Fred G
Hi all, My code looks like the following: inname = read.csv("ID_error_checker.csv", as.is=TRUE) outname = read.csv("output.csv", as.is=TRUE) #My algorithm is the following: #for line in inname #if first string up to whitespace in row in inname$name = first string up to whitespace in row + 1 in

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Fred G
New York Mets 1900ESPN #2 2 New York Yankees 1920 Cooperstown A.K. - Original Message - From: Fred G bayespoker...@gmail.com To: r-help@r-project.org Cc: Sent: Friday, August 10, 2012 1:41 PM Subject: [R] Regular Expressions + Matrices Hi all, My code looks like

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Rui Barradas
Hello, Try the following. d - read.table(textConnection( ID NAME YEAR SOURCE 1 'New York Mets' 1900 ESPN 2 'New York Yankees' 1920 Cooperstown 3 'Boston Redsox' 1918 ESPN 4 'Washington Nationals' 2010

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread arun
: Sent: Friday, August 10, 2012 1:41 PM Subject: [R] Regular Expressions + Matrices Hi all, My code looks like the following: inname = read.csv("ID_error_checker.csv", as.is=TRUE) outname = read.csv("output.csv", as.is=TRUE) #My algorithm is the following: #for line in inname #if first string up

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Rui Barradas
NAME YEAR SOURCE #1 1New York Mets 1900ESPN #2 2 New York Yankees 1920 Cooperstown A.K. - Original Message - From: Fred G bayespoker...@gmail.com To: r-help@r-project.org Cc: Sent: Friday, August 10, 2012 1:41 PM Subject: [R] Regular Expressions + Matrices Hi

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Fred G
SOURCE #1 1New York Mets 1900ESPN #2 2 New York Yankees 1920 Cooperstown A.K. - Original Message - From: Fred G bayespoker...@gmail.com To: r-help@r-project.org Cc: Sent: Friday, August 10, 2012 1:41 PM Subject: [R] Regular Expressions + Matrices Hi all

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread William Dunlap
To: Fred G Cc: r-help Subject: Re: [R] Regular Expressions + Matrices Hello, Try the following. d - read.table(textConnection( ID NAME YEAR SOURCE 1 'New York Mets' 1900 ESPN 2 'New York Yankees' 1920 Cooperstown 3

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Fred G
- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas Sent: Friday, August 10, 2012 11:18 AM To: Fred G Cc: r-help Subject: Re: [R] Regular Expressions + Matrices Hello, Try the following. d - read.table(textConnection( ID

[R] regular expressions in R

2011-12-21 Thread Alaios
Dear all I would like to ask the dir function in R (?dir) to give me only the files that end with .txt or .doc. The dir function supports the use of patterns (are those not regular expressions?) for doing that. print(dir(i,full.names=TRUE,pattern=".")) Could you please help me compose such

Re: [R] regular expressions in R

2011-12-21 Thread Sarah Goslee
From the help for dir: File naming conventions are platform dependent. The pattern matching works with the case of file names as returned by the OS. On my linux system, this works: dir(pattern="*.txt") [1] "a.txt" "b.txt" dir(pattern="*.doc") [1] "c.doc" dir(pattern="*.doc|*.txt") [1] "a.txt"

Re: [R] regular expressions in R

2011-12-21 Thread R. Michael Weylandt
Do you wish to include .docx files as well or just .doc? Michael On Wed, Dec 21, 2011 at 10:04 AM, Alaios ala...@yahoo.com wrote: Dear all I would like to ask from dir function in R (?dir) to give me only the files that end with .txt or .doc. The dir functions supports the use of patterns

Re: [R] regular expressions in R

2011-12-21 Thread jim holtman
To be correct for the regular expression, it should be: dir(pattern = "\\.(txt|doc)$") The form dir(pattern="*.txt") will match 'txt' appearing anywhere in the name; this looks like the argument you would have used to Sys.glob which is a UNIX style file name match and not a regular expression. .
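
The point made in the last reply can be checked without touching the file system by applying the patterns to a character vector with grepl(). The file names below are made up for illustration; utils::glob2rx() is shown as one way to translate a shell-style wildcard into a regular expression.

```r
files <- c("a.txt", "b.txt", "c.doc", "d.docx", "txtnotes.csv")

# Anchored regular expression: a literal dot, then txt or doc, at the end
anchored <- files[grepl("\\.(txt|doc)$", files)]

# An unanchored "txt" matches anywhere in the name
anywhere <- files[grepl("txt", files)]

# glob2rx() converts a wildcard pattern to an anchored regular expression
from_glob <- files[grepl(utils::glob2rx("*.txt"), files)]

print(anchored)
print(anywhere)
print(from_glob)
```

The same anchored pattern can be passed as the pattern argument of dir() or list.files().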

Re: [R] Regular expressions in R

2011-11-16 Thread Michael Griffiths
Thanks to everyone who contributed to my questions. As ever, I am extremely grateful to all those on the R-list who make it what it is. Regards Mike Griffiths On Tue, Nov 15, 2011 at 5:47 PM, Joshua Wiley jwiley.ps...@gmail.comwrote: Hi Michael, Your strings were long so I made a bit

[R] Regular expressions in R

2011-11-15 Thread Michael Griffiths
Good afternoon list, I have the following character strings; one with spaces between the maths operators and variable names, and one without said spaces. form <- c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL + benefit + benefit / benefit1 + product + action * mean + CTA + help +

Re: [R] Regular expressions in R

2011-11-15 Thread Sarah Goslee
Hi Michael, You need to take another look at the examples you were given, and at the help for ?sub(): The two ‘*sub’ functions differ only in that ‘sub’ replaces only the first occurrence of a ‘pattern’ whereas ‘gsub’ replaces all occurrences. If ‘replacement’ contains

Re: [R] Regular expressions in R

2011-11-15 Thread Joshua Wiley
Hi Michael, Your strings were long so I made a bit smaller example. Sarah made one good point, you want to be using gsub() not sub(), but when I use your code, I do not think it even works precisely for one instance. Try this on for size, you were 99% there: ## simplified cases form1 <-
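
The sub() versus gsub() distinction quoted from the help page can be seen on a shortened version of the formula string. The goal assumed here (dropping the spaces around operators) is a guess at the task, since the original code is cut off above.

```r
form <- "~ Sentence + LEGAL + Intro / Intro1"

# Optional spaces, one operator captured as group 1, optional spaces
pat <- " *([+*/]) *"

first_only <- sub(pat, "\\1", form)    # replaces the first occurrence only
all_of_them <- gsub(pat, "\\1", form)  # replaces every occurrence

print(first_only)
print(all_of_them)
```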

[R] Regular Expressions for Large Data Set

2011-06-07 Thread Abraham Mathew
I'm running R 2.13 on Ubuntu 10.10 I have a data set which is comprised of character strings. site = readLines('http://www.census.gov/tiger/tms/gazetteer/zips.txt') dat - c(01, 35004, AL, ACMAR, 86.51557, 33.584132, 6055, 0.001499) dat I want to loop through the data and construct a data frame

Re: [R] Regular Expressions for Large Data Set

2011-06-07 Thread Marc Schwartz
On Jun 7, 2011, at 3:55 PM, Abraham Mathew wrote: I'm running R 2.13 on Ubuntu 10.10 I have a data set which is comprised of character strings. site = readLines('http://www.census.gov/tiger/tms/gazetteer/zips.txt') dat - c(01, 35004, AL, ACMAR, 86.51557, 33.584132, 6055, 0.001499) dat

[R] Regular Expressions in Column Headings

2011-03-09 Thread Matthew DeAngelis
Hi all, I am hoping that someone can help me with a problem I am having with column headings. I have read a table into R using read.table: the rows are documents, and the columns are counts of regular expression matches (so that the column heading is the given regular expression). My problem is

Re: [R] Regular Expressions in Column Headings

2011-03-09 Thread Gabor Grothendieck
On Wed, Mar 9, 2011 at 8:52 AM, Matthew DeAngelis roni...@gmail.com wrote: Hi all, I am hoping that someone can help me with a problem I am having with column headings.  I have read a table into R using read.table: the rows are documents, and the columns are counts of regular expression

[R] Regular Expressions

2010-11-05 Thread Noah Silverman
Hi, I'm trying to figure out how to use capturing parentheses in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string: 10 Nov 13.00 (PFE1020K13) I want to capture the first two digits

Re: [R] Regular Expressions

2010-11-05 Thread Prof Brian Ripley
On Thu, 4 Nov 2010, Noah Silverman wrote: Hi, I'm trying to figure out how to use capturing parenthesis in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string:10 Nov 13.00

Re: [R] Regular Expressions

2010-11-05 Thread Noah Silverman
That's perfect! Don't know how I missed that. I want to start playing with some modeling of financial data and the only format I can download is rather ugly. So my plan is to use a series of Regex to extract what I want. Noticed that you are a Prof. in applied stats. I'm at UCLA working on

Re: [R] Regular Expressions

2010-11-05 Thread Brian Diggs
On 11/5/2010 12:09 AM, Prof Brian Ripley wrote: On Thu, 4 Nov 2010, Noah Silverman wrote: Hi, I'm trying to figure out how to use capturing parenthesis in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For

Re: [R] Regular Expressions

2010-11-05 Thread Gabor Grothendieck
2010/11/5 Brian Diggs dig...@ohsu.edu: Is there a standard, built in way to get both (all) backreferences at the same time with just one call to sub (or the appropriate function)? I can cobble something together specifically for 2 backreferences (not extensively tested): both_backrefs <-
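
The answers in this thread are truncated, so the following is a sketch rather than what was actually proposed. It shows a single backreference with sub(), and all groups at once with regexec() and regmatches(), which were added to base R (2.14.0) after this thread. The group layout is an assumption about what the poster wanted from the example string.

```r
x <- "10 Nov 13.00 (PFE1020K13)"

# One group: keep only what the first pair of parentheses matched
first_two <- sub("^([0-9]{2}).*$", "\\1", x)

# All groups in one call: element 1 is the full match, the rest are the groups
m <- regmatches(x, regexec("^([0-9]+) ([[:alpha:]]+) ([0-9.]+)", x))[[1]]
groups <- m[-1]

print(first_two)
print(groups)
```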

Re: [R] Regular expressions: offsets of groups

2010-09-30 Thread Titus von der Malsburg
Ok, we decided to have a shot at modifying gregexpr. Let's see how it works out. If anybody is interested in discussing this please contact me. R-help doesn't seem like the right place for further discussion. Is there a default place for discussing things like that? Thanks everybody for your

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Titus von der Malsburg
Bill, Michael, good to see I'm not the only one who sees potential for improvements in the regexpr domain. Adding a subpattern argument is certainly a step in the right direction and would make my life much easier. However, in my application I need to know not only the position of one group but

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Michael Bedward
I'd definitely be a customer for it Titus. And it does seem like an obvious hole in regex processing in R that cries out to be filled. Um, ggregexpr isn't the sexiest of function names :) Perhaps we can think of something a little easier ? How is your C coding ? Bill ? Anyone else ? I could

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Titus von der Malsburg
On Wed, Sep 29, 2010 at 1:58 PM, Michael Bedward michael.bedw...@gmail.com wrote: How is your C coding ? Bill ? Anyone else ? I could have a go at writing some prototype code to test in the next few days, though if someone else with decent C skills is itching to do it please speak up. We

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Michael Bedward
What Titus wants to do is akin to retrieving capturing groups from a Matcher object in Java. I also thought there must be an existing, elegant solution to this some time ago and searched for it, including looking at the sources (albeit with not much expertise) but came up blank. I also looked at

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Titus von der Malsburg
On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward michael.bedw...@gmail.com wrote: What Titus wants to do is akin to retrieving capturing groups from a Matcher object in Java. Precisely. Here's the description:

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Gabor Grothendieck
On Tue, Sep 28, 2010 at 6:52 AM, Titus von der Malsburg malsb...@gmail.com wrote: On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward michael.bedw...@gmail.com wrote: What Titus wants to do is akin to retrieving capturing groups from a Matcher object in Java. Precisely.  Here's the description:

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread William Dunlap
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Bedward Sent: Tuesday, September 28, 2010 12:46 AM To: Titus von der Malsburg Cc: r-help@r-project.org Subject: Re: [R] Regular expressions: offsets of groups What

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Michael Bedward
Ah, that's interesting - thanks Bill. That's certainly on the right track for me (Titus, you too ?) especially if the subpattern argument accepted a vector of multiple group indices. As you say, this is straightforward in C. I'd be happy to (try to) make a patch for the R sources if there was

[R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg
Dear list! gregexpr("a+(b+)", "abcdaabbc") [[1]] [1] 1 5 attr(,"match.length") [1] 2 4 What I want is the offsets of the matches for the group (b+), i.e. 2 and 7, not the offsets of the complete matches. Is there a way in R to get that? I know about gsubfn and strapply, but they only give me the

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread jim holtman
try this: x <- gregexpr("a+(b+)", "abcdaabbcaaacaaab") justA <- gregexpr("a+", "abcdaabbcaaacaaab") # find matches in 'x' for 'justA' indx <- which(justA[[1]] %in% x[[1]]) # now determine where 'b' starts justA[[1]][indx] + attr(justA[[1]], 'match.length')[indx] [1] 2 7 17 On Mon, Sep 27, 2010

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg
Thank you Jim, but just as the solution that I discussed, your proposal involves deconstructing the pattern and searching several times. I'm looking for a general and efficient solution. Internally, the regexpr engine has all necessary information after one pass through the string. What I need

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg
On Mon, Sep 27, 2010 at 7:16 PM, Henrique Dallazuanna www...@gmail.com wrote: You've tried: gregexpr("b+", "abcdaabbc") But this would match the third occurrence of b+ in "abcdaabbcbb". But in this example I'm only interested in b+ if it's preceded by a+. Titus

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Gabor Grothendieck
On Mon, Sep 27, 2010 at 11:48 AM, Titus von der Malsburg malsb...@gmail.com wrote: Dear list! gregexpr("a+(b+)", "abcdaabbc") [[1]] [1] 1 5 attr(,"match.length") [1] 2 4 What I want is the offsets of the matches for the group (b+), i.e. 2 and 7, not the offsets of the complete matches. Is

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg
On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Try this zero width negative look behind expression: gregexpr((?!a+)(b+), abcdaabbc, perl = TRUE) [[1]] [1] 2 7 attr(,match.length) [1] 1 2 Thanks Gabor, but this gives me the same result as gregexpr(b+,

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Henrique Dallazuanna
You could do this: gregexpr("ab+", "abcdaabbcbb")[[1]] + 1 On Mon, Sep 27, 2010 at 2:25 PM, Titus von der Malsburg malsb...@gmail.com wrote: On Mon, Sep 27, 2010 at 7:16 PM, Henrique Dallazuanna www...@gmail.com wrote: You've tried: gregexpr("b+", "abcdaabbc") But this would match the third

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Henrique Dallazuanna
You've tried: gregexpr("b+", "abcdaabbc") On Mon, Sep 27, 2010 at 12:48 PM, Titus von der Malsburg malsb...@gmail.com wrote: Dear list! gregexpr("a+(b+)", "abcdaabbc") [[1]] [1] 1 5 attr(,"match.length") [1] 2 4 What I want is the offsets of the matches for the group (b+), i.e. 2 and 7, not

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Gabor Grothendieck
On Mon, Sep 27, 2010 at 1:34 PM, Titus von der Malsburg malsb...@gmail.com wrote: On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Try this zero width negative look behind expression: gregexpr((?!a+)(b+), abcdaabbc, perl = TRUE) [[1]] [1] 2 7
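
For readers arriving at this thread later: in more recent versions of R, regexpr() and gregexpr() called with perl = TRUE attach capture.start and capture.length attributes to their result, which gives the group offsets asked for in the original question. This is a sketch using the example string from the thread, not the patch discussed by the posters.

```r
m <- gregexpr("a+(b+)", "abcdaabbc", perl = TRUE)[[1]]

# One row per match, one column per capturing group
group_start  <- as.vector(attr(m, "capture.start"))
group_length <- as.vector(attr(m, "capture.length"))

print(group_start)
print(group_length)
```

With several groups in the pattern, the attributes are matrices with one column per group, so the positions of all groups are available after a single pass.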

[R] regular expressions

2009-10-26 Thread baptiste auguie
Dear list, I have the following text to parse (originating from readLines as some lines have unequal size), st = c("START text1 1 text2 2.3", "whatever intermediate text", "START text1 23.4 text2 3.1415") from which I'd like to extract the lines starting with START, and group the subsequent fields in

Re: [R] regular expressions

2009-10-26 Thread Gabor Grothendieck
Assuming only START fields match pat: ## this one has more fields: how do I generalize the regular expression? st2 = c("START text1 1 text2 2.3 text3 5", "whatever intermediate text", "START text1 23.4 text2 3.1415 text3 6") pat <- "[[:alnum:]]+ +([0-9.]+)" s <- strapply(st2, pat, c, simplify =

Re: [R] regular expressions

2009-10-26 Thread baptiste auguie
Perfect, thanks! baptiste 2009/10/26 Gabor Grothendieck ggrothendi...@gmail.com: Assuming only START fields match pat: ## this one has more fields: how do I generalize the regular expression? st2 = c("START text1 1 text2 2.3 text3 5", "whatever intermediate text", "START text1 23.4 text2 3.1415
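
The accepted answer uses strapply() from the gsubfn package. As a rough base-R alternative (an assumption added here, not what was posted), the numeric fields can be picked out with gregexpr() and lookaround assertions, then bound into a matrix.

```r
st <- c("START text1 1 text2 2.3",
        "whatever intermediate text",
        "START text1 23.4 text2 3.1415")

# Keep only the lines that begin with START
keep <- st[grepl("^START", st)]

# A numeric field is a run of digits/dots preceded by a space and
# followed by a space or the end of the line
nums <- regmatches(keep, gregexpr("(?<= )[0-9.]+(?= |$)", keep, perl = TRUE))

# One row per START line, one column per numeric field
values <- do.call(rbind, lapply(nums, as.numeric))
print(values)
```

Because the pattern does not mention the field labels, the same code handles lines with more label/value pairs, as long as every START line has the same number of them.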

[R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Duncan Murdoch
I'm trying to write a gsub() call that takes a string and escapes all the unescaped quote marks in it. So the string \" would be left unchanged, but \\" would be changed to \\\" because the double backslash doesn't act as an escape for the quote, the first just escapes the second. I have

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Gabor Grothendieck
Try adding perl = TRUE On Sun, Jul 6, 2008 at 5:17 PM, Duncan Murdoch [EMAIL PROTECTED] wrote: I'm trying to write a gsub() call that takes a string and escapes all the unescaped quote marks in it. So the string \" would be left unchanged, but \\" would be changed to \\\" because the

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Ted Harding
On 06-Jul-08 21:17:04, Duncan Murdoch wrote: I'm trying to write a gsub() call that takes a string and escapes all the unescaped quote marks in it. So the string \" would be left unchanged, but \\" would be changed to \\\" because the double backslash doesn't act as an escape

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Duncan Murdoch
On 06/07/2008 5:37 PM, (Ted Harding) wrote: On 06-Jul-08 21:17:04, Duncan Murdoch wrote: I'm trying to write a gsub() call that takes a string and escapes all the unescaped quote marks in it. So the string \" would be left unchanged, but \\" would be changed to \\\" because the double

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Gabor Grothendieck
Look at the discussion of zero width lookahead assertions in ?regex . Use perl = TRUE as previously indicated. On Sun, Jul 6, 2008 at 7:29 PM, Duncan Murdoch [EMAIL PROTECTED] wrote: On 06/07/2008 5:37 PM, (Ted Harding) wrote: On 06-Jul-08 21:17:04, Duncan Murdoch wrote: I'm trying to write

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Duncan Murdoch
On 06/07/2008 7:37 PM, Gabor Grothendieck wrote: Look at the discussion of zero width lookahead assertions in ?regex . Use perl = TRUE as previously indicated. Thanks, this seems to work: gsub("(?<!E)((EE)*)q", "\\1Eq", x, perl=TRUE) Duncan Murdoch On Sun, Jul 6, 2008 at 7:29 PM, Duncan
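
A sketch of the approach this thread converges on, with E standing in for the escape character and q for the quote, as in the final message. The archive drops "<" characters, so reading the assertion as a negative lookbehind, (?<!E), is an assumption; the test strings are made up.

```r
x <- c("q", "Eq", "EEq", "EEEq")

# (?<!E)   the match must not be preceded by an escape character
# ((EE)*)  any number of escaped escapes, captured as group 1
# q        a quote that is therefore unescaped
escaped <- gsub("(?<!E)((EE)*)q", "\\1Eq", x, perl = TRUE)
print(escaped)
```

An odd number of E characters before q means the quote is already escaped and is left alone; an even number (including zero) means it is not, so one E is inserted.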

[R] Regular Expressions

2008-05-13 Thread Shubha Vishwanath Karanth
Hi R, Again stuck with regular expressions... Suppose, S=c("World_is_beautiful", "one_two_three_four","My_book") I need to extract the last but one element of the strings. So, my output should look like: Ans=c("is","three","My") gsub() can do this...but wondering how do I give the

Re: [R] Regular Expressions

2008-05-13 Thread Dimitris Rizopoulos
: Tuesday, May 13, 2008 11:02 AM Subject: [R] Regular Expressions Hi R, Again stuck with regular expressions... Suppose, S=c("World_is_beautiful", "one_two_three_four","My_book") I need to extract the last but one element of the strings. So, my output should look like: Ans=c("is","three","My"

Re: [R] Regular Expressions

2008-05-13 Thread Richard . Cotton
S=c("World_is_beautiful", "one_two_three_four","My_book") I need to extract the last but one element of the strings. So, my output should look like: Ans=c("is","three","My") gsub() can do this...but wondering how do I give the regular expression sapply(strsplit(S, "_"), function(x)

Re: [R] Regular Expressions

2008-05-13 Thread Gabor Grothendieck
On Tue, May 13, 2008 at 5:02 AM, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote: Suppose, S=c("World_is_beautiful", "one_two_three_four","My_book") I need to extract the last but one element of the strings. So, my output should look like: Ans=c("is","three","My") gsub() can do this...but
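
The posted answers are cut off, so here are two hedged sketches for the "last but one element" question: one splitting on the underscore, one using a single regular expression. Neither is claimed to be what the repliers wrote.

```r
S <- c("World_is_beautiful", "one_two_three_four", "My_book")

# Split on "_" and take the second-to-last piece of each string
parts <- strsplit(S, "_", fixed = TRUE)
by_split <- vapply(parts, function(p) p[length(p) - 1], character(1))

# Regex version: optional leading pieces, then the wanted piece (group 2),
# then exactly one final piece
by_regex <- sub("^(.*_)?([^_]+)_[^_]+$", "\\2", S, perl = TRUE)

print(by_split)
print(by_regex)
```

Both versions assume every string contains at least one underscore.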

[R] Regular Expressions Help

2008-04-19 Thread maud
I am having some trouble learning regular expressions. Let me describe the general problem I am dealing with. Consider the following setup: Joe <- c(1,2,3) Bob <- c(2,4,6) Alice <- c(9,8,7) Matrix <- cbind(Joe, Bob, Alice) St <- c("Bob", "Alice", "Alice:Bob") Now I want to make a new matrix having only the

Re: [R] Regular Expressions Help

2008-04-19 Thread Hans-Jörg Bibiko
On 19.04.2008, at 06:46, maud wrote: I am having some trouble learning regular expressions. Let me describe the general problem I am dealing with. Consider the following setup: Joe <- c(1,2,3) Bob <- c(2,4,6) Alice <- c(9,8,7) Matrix <- cbind(Joe, Bob, Alice) St <- c("Bob", "Alice", "Alice:Bob")

[R] regular expressions

2008-03-12 Thread GOUACHE David
Hello all, Still fighting with regular expressions and such, I am again stuck: Suppose I have a vector of character chains. In this vector, I wish to identify which character chains start with a given pattern, and then replace everything that comes after said pattern. Here is a quick

Re: [R] regular expressions

2008-03-12 Thread Christos Hatzis
] regular expressions Hello all, Still fighting with regular expressions and such, I am again stuck: Suppose I have a vector of character chains. In this vector, I wish to identify which character chains start with a given pattern, and then replace everything that comes after said pattern