[R] regular expressions : extracting numbers

2007-07-30 Thread GOUACHE David
Hello all,

I have a vector of character strings, in which I have letters, numbers, and 
symbols. What I wish to do is obtain a vector of the same length with just the 
numbers.
A quick example -

extract of the original vector :
"lema, rb 2%" "rb 2%" "rb 3%" "rb 4%" "rb 3%" "rb 2%,mineuse" "rb" "rb" "rb 12"
"rb" "rj 30%" "rb" "rb" "rb 25%" "rb" "rb" "rb" "rj, rb"

and the type of thing I wish to end up with :
"2" "2" "3" "4" "3" "2" "" "" "12" "" "30" "" "" "25" "" "" "" ""

or, instead of "", NA would be acceptable (actually it would almost be better 
for me)

Anyways, I've been battling with gsub() and things of the sort, but I'm 
drowning in the regular expressions, despite a few hours of looking at Perl 
tutorials...
So if anyone can help me out, it would be greatly appreciated!!

In advance, thanks very much.

David Gouache
Arvalis - Institut du Végétal
Station de La Minière
78280 Guyancourt
Tel: 01.30.12.96.22 / Port: 06.86.08.94.32

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Romain Francois
Bonjour David,

What about one of these :

R> gsub( "[^[:digit:]]", "", x )

or using perl regular expressions:

R> gsub( "\\D", "", x, perl = TRUE )
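
The two calls above can be compared on a small sample. This is only a sketch: the vector x below is an assumed subset of the strings from the original post, and the names posix_result and perl_result are made up for illustration.

```r
# Sample input: a few of the strings from the original post (assumed subset)
x <- c("lema, rb 2%", "rb", "rb 12", "rj 30%")

# POSIX character class: delete every character that is not a digit
posix_result <- gsub("[^[:digit:]]", "", x)

# Perl-style shorthand: \D matches any non-digit character
perl_result <- gsub("\\D", "", x, perl = TRUE)

print(posix_result)
print(identical(posix_result, perl_result))
```

Strings without any digit come back as "", so the result keeps the length of the input.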

Cheers,

Romain

-- 
Mango Solutions
data analysis that delivers

Tel:  +44(0) 1249 467 467
Fax:  +44(0) 1249 467 468
Mob:  +44(0) 7813 526 123



Re: [R] regular expressions : extracting numbers

2007-07-30 Thread jim holtman
Is this what you want:

> x
 [1] "lema, rb 2%"   "rb 2%"         "rb 3%"         "rb 4%"         "rb 3%"         "rb 2%,mineuse"
 [7] "rb"            "rb"            "rb 12"         "rb"            "rj 30%"        "rb"
[13] "rb"            "rb 25%"        "rb"            "rb"            "rb"            "rj, rb"
> gsub("[^0-9]*([0-9]*)[^0-9]*", "\\1", x)
 [1] "2"  "2"  "3"  "4"  "3"  "2"  ""   ""   "12" ""   "30" ""   ""   "25" ""   ""   ""   ""






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Vladimir Eremeev




> chv <- scan(what="character", sep=" ")
# then copy the text from your message to the clipboard and paste it to the R console
> chv
 [1] "lema, rb 2%"   "rb 2%"         "rb 3%"         "rb 4%"
 [5] "rb 3%"         "rb 2%,mineuse" "rb"            "rb"
 [9] "rb 12"         "rb"            "rj 30%"        "rb"
[13] "rb"            "rb 25%"        "rb"            "rb"
[17] "rb"            "rj, rb"

# actual replacements :

# replace non-digits with nothing
> chv.digits <- gsub("[^0-9]", "", chv)
> chv.digits
 [1] "2"  "2"  "3"  "4"  "3"  "2"  ""   ""   "12" ""   "30" ""   ""   "25" ""
[16] ""   ""   ""

# replace empty strings with NA
> chv.digits[chv.digits==""] <- NA
> chv.digits
 [1] "2"  "2"  "3"  "4"  "3"  "2"  NA   NA   "12" NA   "30" NA   NA   "25" NA
[16] NA   NA   NA

 
-- 
View this message in context: 
http://www.nabble.com/regular-expressions-%3A-extracting-numbers-tf4169660.html#a11862597
Sent from the R help mailing list archive at Nabble.com.



Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Christian Ritz
Dear David,

does the following work for you?


sVec <- c("lema, rb 2%", "rb 2%", "rb 3%", "rb 4%", "rb 3%", "rb 2%,mineuse",
"rb", "rb", "rb 12", "rb", "rj 30%", "rb", "rb", "rb 25%", "rb", "rb", "rb",
"rj, rb")

reVec <- regexpr("[[:digit:]]+", sVec)
# see ?regex for details on '[:digit:]' and '+'

substr(sVec, start = reVec, stop = reVec + attr(reVec, "match.length") - 1)
# see ?substr for details
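
Note that regexpr() locates only the first run of digits in each string, which differs from deleting all non-digits when a string holds more than one number. A minimal sketch of the same idea using regmatches(), with NA for strings that have no match; the element "a1b22" and the name first_num are hypothetical, added only to show the difference.

```r
# Assumed sample: some strings from the post plus one hypothetical
# element ("a1b22") that contains two separate runs of digits
sVec <- c("lema, rb 2%", "rb", "rb 12", "rj 30%", "a1b22")

# Position of the first run of digits in each string (-1 if none)
m <- regexpr("[[:digit:]]+", sVec)

# Start with NA everywhere, then fill in the strings that matched;
# regmatches() returns one value per matching element only
first_num <- rep(NA_character_, length(sVec))
first_num[m != -1] <- regmatches(sVec, m)

print(first_num)
```

Here "a1b22" yields "1", whereas gsub("[^0-9]", "", "a1b22") would give "122".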



Christian



Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Marc Schwartz
On Mon, 2007-07-30 at 13:58 +0200, GOUACHE David wrote:
> [...]

Try this:

> Vec
 [1] "lema, rb 2%"   "rb 2%"         "rb 3%"         "rb 4%"
 [5] "rb 3%"         "rb 2%,mineuse" "rb"            "rb"
 [9] "rb 12"         "rb"            "rj 30%"        "rb"
[13] "rb"            "rb 25%"        "rb"            "rb"
[17] "rb"            "rj, rb"

> gsub("[^0-9]", "", Vec)
 [1] "2"  "2"  "3"  "4"  "3"  "2"  ""   ""   "12" ""   "30" ""   ""
[14] "25" ""   ""   ""   ""


The search pattern regex here is "[^0-9]", which matches any single character
that is not (^) in the character range of 0 through 9. gsub() replaces each
such character with the empty string, so only the digits remain.

See ?regex and/or http://www.regular-expressions.info/
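
One caveat of deleting every non-digit is that separate numbers in the same string are fused and decimal points are lost. The sketch below uses two hypothetical inputs (they are not from the original vector) and shows one possible way to pull out each number separately with gregexpr() and regmatches(); the names tricky, fused and all_nums are made up for illustration.

```r
# Hypothetical inputs: two numbers in one string, and a decimal value
tricky <- c("rb 2% rj 30%", "rb 1.5%")

# Deleting all non-digits concatenates whatever digits are left
fused <- gsub("[^0-9]", "", tricky)
print(fused)

# Alternative: match every number (optionally with a decimal part)
# and return the matches as a list with one element per input string
all_nums <- regmatches(tricky, gregexpr("[0-9]+(\\.[0-9]+)?", tricky))
print(all_nums)
```

For data like the original example, where each string holds at most one integer, the simpler gsub() call is enough.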

HTH,

Marc Schwartz



Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Gabor Grothendieck
I assume if you want the "" components to be NA then you really intend
the result to be a numeric vector.  The following replaces all non-digits
with "" (thereby removing them) and then uses as.numeric to convert the
result to numeric.  Just omit the conversion if you want a character
vector result:

s <- c("lema, rb 2%", "rb 2%", "rb 3%", "rb 4%", "rb 3%", "rb 2%,mineuse",
   "rb", "rb", "rb 12", "rb", "rj 30%", "rb", "rb", "rb 25%", "rb", "rb",
   "rb", "rj, rb")

as.numeric(gsub("[^[:digit:]]+", "", s))
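
A short sketch of what the conversion step does, run on an assumed subset of the vector (the names digits and nums are made up for illustration): as.numeric() turns the empty strings into NA, which is the result the original poster said would be preferable.

```r
# Assumed subset of the strings from the original post
s <- c("lema, rb 2%", "rb", "rb 12", "rj 30%")

# Step 1: remove every run of non-digits; strings without digits become ""
digits <- gsub("[^[:digit:]]+", "", s)

# Step 2: convert to numeric; "" cannot be parsed and becomes NA
nums <- as.numeric(digits)

print(digits)
print(nums)
```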



Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Jacques VESLOT
> gsub(" ", "", gsub("%", "", gsub("[a-z]", "", c("tr3","jh40%qs  dqd"))))
[1] "3"  "40"


Jacques VESLOT

INRA - Biostatistique & Processus Spatiaux
Site Agroparc 84914 Avignon Cedex 9, France

Tel: +33 (0) 4 32 72 21 58
Fax: +33 (0) 4 32 72 21 84





Re: [R] regular expressions : extracting numbers

2007-07-30 Thread Kuhn, Max
This might work:

> numOnly <- function(x) gsub("[^0-9]", "", x)
> numOnly("lema, rb 2%")
[1] "2"
> numOnly("rb")
[1] ""

Max
